Zhuravlev, I couldn't reproduce your problem here. Since both problems(unexpected failover and segfault) were occured in "worker process", I would like to take a backtrace of the process. The process looks like:
23567 ? S 0:00 pgpool: worker process -- Tatsuo Ishii SRA OSS, Inc. Japan English: http://www.sraoss.co.jp/index_en.php Japanese: http://www.sraoss.co.jp > Thanks for the report. We are looking into this. > -- > Tatsuo Ishii > SRA OSS, Inc. Japan > English: http://www.sraoss.co.jp/index_en.php > Japanese: http://www.sraoss.co.jp > >> Hello, >> >> I am having a problem with selection of master node when I manually >> attach slave node (with id #1) which came up after postgresql server reboot. >> When I attach the slave, pgpool tries to select master node, and for >> some reason reports that previously working master node (with id #0) is no >> longer >> available, marks the master node as unavailable, and selects the newly >> attached node (id #1) as master. If after that I manually attach the node >> with >> id #0 back to the pool it becomes a master. >> >> More detailed explanation. >> >> == Setup == >> Two postgresql nodes (running postgresql 9.0.3 on FreeBSD 8.0): >> Master node (storage1) and a slave one (storage2). >> The system uses streaming replication with the slave node running in hot >> standby mode. >> pgpool-II-3.0.3-beta1 is installed on storage1 node with the following >> configuration file: >> >> ====================================================================== >> zaa:...local/etc# cat pgpool.conf | sort | grep -v '^#' | grep -v '^$' >> authentication_timeout = 30 >> backend_hostname0 = 'storage1' >> backend_hostname1 = 'storage2' >> backend_port0 = 5432 >> backend_port1 = 5432 >> backend_weight0 = 0.01 >> backend_weight1 = 0.99 >> black_function_list = 'nextval,setval' >> child_idle_limit = 0 >> child_life_time = 300 >> child_max_connections = 0 >> connection_cache = true >> connection_life_time = 0 >> debug_level = 1 >> enable_pool_hba = false >> fail_over_on_backend_error = false >> failover_command = '' >> health_check_period = 5 >> health_check_user = 'pgpool' >> listen_addresses = 'localhost' >> load_balance_mode = true >> log_connections = true >> log_hostname = false >> log_per_node_statement = true >> log_standby_delay = 'always' >> log_statement = true >> master_slave_mode = true >> master_slave_sub_mode = 'stream' >> max_pool = 5 >> num_init_children = 2 >> parallel_mode = false >> pid_file_name = '/var/run/pgpool/pgpool.pid' >> port = 9999 >> print_timestamp = true >> replication_mode = false >> reset_query_list = 'ABORT; RESET ALL; SET SESSION AUTHORIZATION DEFAULT' >> socket_dir = '/tmp' >> white_function_list = '' >> ====================================================================== >> >> I can successfully connect to local port 9999 and issue DML and select >> queries. >> >> == Problem == >> >> 1. I shutdown postgresql on slave node storage2 and see the following >> lines in syslog on the node with pgpool: >> >> ====================================================================== >> Feb 21 23:22:33 zaa pgpool: 2011-02-21 23:22:33 DEBUG: pid 53354: >> detect_stop_postmaster_error: receive admin shutdown error from a node. >> Feb 21 23:22:33 zaa pgpool: 2011-02-21 23:22:33 LOG: pid 53354: postmaster >> on DB node 1 was shutdown by administrative command >> Feb 21 23:22:33 zaa pgpool: 2011-02-21 23:22:33 LOG: pid 53354: >> notice_backend_error: 1 fail over request from pid 53354 >> Feb 21 23:22:33 zaa pgpool: 2011-02-21 23:22:33 DEBUG: pid 52999: >> failover_handler called >> Feb 21 23:22:33 zaa pgpool: 2011-02-21 23:22:33 DEBUG: pid 52999: >> failover_handler: starting to select new master node >> Feb 21 23:22:33 zaa pgpool: 2011-02-21 23:22:33 LOG: pid 52999: starting >> degeneration. shutdown host storage2(5432) >> Feb 21 23:22:33 zaa pgpool: 2011-02-21 23:22:33 DEBUG: pid 53354: child >> received shutdown request signal 3 >> Feb 21 23:22:33 zaa pgpool: 2011-02-21 23:22:33 DEBUG: pid 52999: >> failover_handler: kill 53354 >> Feb 21 23:22:33 zaa pgpool: 2011-02-21 23:22:33 DEBUG: pid 52999: >> failover_handler: kill 53355 >> Feb 21 23:22:33 zaa pgpool: 2011-02-21 23:22:33 DEBUG: pid 53355: child >> received shutdown request signal 3 >> Feb 21 23:22:33 zaa pgpool: 2011-02-21 23:22:33 LOG: pid 52999: >> failover_handler: set new master node: 0 >> ... >> Feb 21 23:22:34 zaa pgpool: 2011-02-21 23:22:34 DEBUG: pid 52999: starting >> health checking >> Feb 21 23:22:34 zaa pgpool: 2011-02-21 23:22:34 DEBUG: pid 52999: >> health_check: 0 th DB node status: 2 >> Feb 21 23:22:34 zaa pgpool: 2011-02-21 23:22:34 DEBUG: pid 52999: >> health_check: 1 th DB node status: 3 >> ====================================================================== >> >> As you can see, pgpool correctly marks storage2 node as being down. >> >> 2. After a while I start postgres on node storage2. >> 3. In order to return the node back into the pool I run the following >> command: >> >> pcp_attach_node -d 120 localhost 9898 postgres postgres 1 >> >> and see those lines in syslog on storage1: >> >> ====================================================================== >> Feb 21 23:28:03 zaa pgpool: 2011-02-21 23:28:03 LOG: pid 53003: >> send_failback_request: fail back 1 th node request from pid 53003 >> Feb 21 23:28:03 zaa pgpool: 2011-02-21 23:28:03 DEBUG: pid 52999: >> failover_handler called >> Feb 21 23:28:03 zaa pgpool: 2011-02-21 23:28:03 DEBUG: pid 52999: >> failover_handler: starting to select new master node >> Feb 21 23:28:03 zaa pgpool: 2011-02-21 23:28:03 LOG: pid 52999: starting >> fail back. reconnect host storage2(5432) >> Feb 21 23:28:03 zaa pgpool: 2011-02-21 23:28:03 DEBUG: pid 52999: >> failover_handler: kill 56131 >> Feb 21 23:28:03 zaa pgpool: 2011-02-21 23:28:03 DEBUG: pid 56131: child >> received shutdown request signal 3 >> Feb 21 23:28:03 zaa pgpool: 2011-02-21 23:28:03 DEBUG: pid 52999: >> failover_handler: kill 56132 >> Feb 21 23:28:03 zaa pgpool: 2011-02-21 23:28:03 DEBUG: pid 56132: child >> received shutdown request signal 3 >> Feb 21 23:28:03 zaa pgpool: 2011-02-21 23:28:03 LOG: pid 52999: >> failover_handler: set new master node: 0 >> Feb 21 23:28:03 zaa pgpool: 2011-02-21 23:28:03 LOG: pid 52999: failback >> done. reconnect host storage2(5432) >> Feb 21 23:28:03 zaa pgpool: 2011-02-21 23:28:03 DEBUG: pid 56205: I am 56205 >> Feb 21 23:28:03 zaa pgpool: 2011-02-21 23:28:03 DEBUG: pid 52999: pool_ssl: >> SSL requested but SSL support is not available >> Feb 21 23:28:03 zaa pgpool: 2011-02-21 23:28:03 DEBUG: pid 56206: I am 56206 >> Feb 21 23:28:03 zaa pgpool: 2011-02-21 23:28:03 DEBUG: pid 52999: s_do_auth: >> auth kind: 0 >> Feb 21 23:28:03 zaa pgpool: 2011-02-21 23:28:03 DEBUG: pid 52999: s_do_auth: >> parameter status data received >> Feb 21 23:28:03 zaa last message repeated 10 times >> Feb 21 23:28:03 zaa pgpool: 2011-02-21 23:28:03 DEBUG: pid 52999: s_do_auth: >> backend key data received >> Feb 21 23:28:03 zaa pgpool: 2011-02-21 23:28:03 DEBUG: pid 52999: s_do_auth: >> transaction state: I >> Feb 21 23:28:03 zaa pgpool: 2011-02-21 23:28:03 DEBUG: pid 52999: do_query: >> kind: T >> Feb 21 23:28:03 zaa pgpool: 2011-02-21 23:28:03 DEBUG: pid 52999: >> num_fileds: 1 >> Feb 21 23:28:03 zaa pgpool: 2011-02-21 23:28:03 DEBUG: pid 52999: do_query: >> kind: D >> Feb 21 23:28:03 zaa pgpool: 2011-02-21 23:28:03 DEBUG: pid 52999: do_query: >> kind: C >> Feb 21 23:28:03 zaa pgpool: 2011-02-21 23:28:03 DEBUG: pid 52999: do_query: >> kind: Z >> Feb 21 23:28:03 zaa pgpool: 2011-02-21 23:28:03 DEBUG: pid 52999: do_query: >> kind: T >> Feb 21 23:28:03 zaa pgpool: 2011-02-21 23:28:03 DEBUG: pid 52999: >> num_fileds: 1 >> Feb 21 23:28:03 zaa pgpool: 2011-02-21 23:28:03 DEBUG: pid 52999: do_query: >> kind: D >> Feb 21 23:28:03 zaa pgpool: 2011-02-21 23:28:03 DEBUG: pid 52999: do_query: >> kind: C >> Feb 21 23:28:03 zaa pgpool: 2011-02-21 23:28:03 DEBUG: pid 52999: do_query: >> kind: Z >> Feb 21 23:28:03 zaa pgpool: 2011-02-21 23:28:03 LOG: pid 52999: >> find_primary_node: primary node id is 0 >> Feb 21 23:28:03 zaa pgpool: 2011-02-21 23:28:03 DEBUG: pid 52999: >> reap_handler called >> Feb 21 23:28:03 zaa pgpool: 2011-02-21 23:28:03 DEBUG: pid 52999: >> reap_handler: call wait3 >> Feb 21 23:28:03 zaa pgpool: 2011-02-21 23:28:03 DEBUG: pid 52999: child >> 56132 exits with status 0 >> Feb 21 23:28:03 zaa pgpool: 2011-02-21 23:28:03 DEBUG: pid 52999: child >> 56131 exits with status 0 >> Feb 21 23:28:03 zaa pgpool: 2011-02-21 23:28:03 DEBUG: pid 52999: >> reap_handler: normally exited >> Feb 21 23:28:05 zaa pgpool: 2011-02-21 23:28:05 DEBUG: pid 52999: starting >> health checking >> Feb 21 23:28:05 zaa pgpool: 2011-02-21 23:28:05 DEBUG: pid 52999: >> health_check: 0 th DB node status: 2 >> Feb 21 23:28:05 zaa pgpool: 2011-02-21 23:28:05 DEBUG: pid 52999: >> health_check: 1 th DB node status: 1 >> Feb 21 23:28:08 zaa pgpool: 2011-02-21 23:28:08 DEBUG: pid 56021: do_query: >> kind: T >> Feb 21 23:28:08 zaa pgpool: 2011-02-21 23:28:08 DEBUG: pid 56021: >> num_fileds: 1 >> Feb 21 23:28:08 zaa pgpool: 2011-02-21 23:28:08 DEBUG: pid 56021: do_query: >> kind: D >> Feb 21 23:28:08 zaa pgpool: 2011-02-21 23:28:08 DEBUG: pid 56021: do_query: >> kind: C >> Feb 21 23:28:08 zaa pgpool: 2011-02-21 23:28:08 DEBUG: pid 56021: do_query: >> kind: Z >> Feb 21 23:28:08 zaa pgpool: 2011-02-21 23:28:08 DEBUG: pid 56021: do_query: >> kind: E >> Feb 21 23:28:08 zaa pgpool: 2011-02-21 23:28:08 ERROR: pid 56021: pool_read: >> read failed (Connection reset by peer) >> Feb 21 23:28:08 zaa pgpool: 2011-02-21 23:28:08 LOG: pid 56021: >> notice_backend_error: 0 fail over request from pid 56021 >> Feb 21 23:28:08 zaa postgres[56090]: [5-1] LOG: unexpected EOF on client >> connection >> Feb 21 23:28:08 zaa pgpool: 2011-02-21 23:28:08 DEBUG: pid 52999: >> failover_handler called >> Feb 21 23:28:08 zaa pgpool: 2011-02-21 23:28:08 DEBUG: pid 52999: >> failover_handler: starting to select new master node >> Feb 21 23:28:08 zaa pgpool: 2011-02-21 23:28:08 LOG: pid 52999: starting >> degeneration. shutdown host storage1(5432) >> Feb 21 23:28:08 zaa pgpool: 2011-02-21 23:28:08 DEBUG: pid 52999: >> failover_handler: kill 56205 >> Feb 21 23:28:08 zaa pgpool: 2011-02-21 23:28:08 DEBUG: pid 56205: child >> received shutdown request signal 3 >> Feb 21 23:28:08 zaa pgpool: 2011-02-21 23:28:08 DEBUG: pid 52999: >> failover_handler: kill 56206 >> Feb 21 23:28:08 zaa pgpool: 2011-02-21 23:28:08 LOG: pid 52999: >> failover_handler: set new master node: 1 >> Feb 21 23:28:08 zaa pgpool: 2011-02-21 23:28:08 DEBUG: pid 56206: child >> received shutdown request signal 3 >> Feb 21 23:28:08 zaa pgpool: 2011-02-21 23:28:08 LOG: pid 52999: failover >> done. shutdown host storage1(5432) >> Feb 21 23:28:08 zaa pgpool: 2011-02-21 23:28:08 DEBUG: pid 56213: I am 56213 >> Feb 21 23:28:08 zaa pgpool: 2011-02-21 23:28:08 DEBUG: pid 56214: I am 56214 >> Feb 21 23:28:08 zaa pgpool: 2011-02-21 23:28:08 DEBUG: pid 52999: pool_ssl: >> SSL requested but SSL support is not available >> Feb 21 23:28:08 zaa pgpool: 2011-02-21 23:28:08 DEBUG: pid 52999: s_do_auth: >> auth kind: 0 >> Feb 21 23:28:08 zaa pgpool: 2011-02-21 23:28:08 DEBUG: pid 52999: s_do_auth: >> parameter status data received >> Feb 21 23:28:08 zaa last message repeated 10 times >> Feb 21 23:28:08 zaa pgpool: 2011-02-21 23:28:08 DEBUG: pid 52999: s_do_auth: >> backend key data received >> Feb 21 23:28:08 zaa pgpool: 2011-02-21 23:28:08 DEBUG: pid 52999: s_do_auth: >> transaction state: I >> Feb 21 23:28:08 zaa pgpool: 2011-02-21 23:28:08 DEBUG: pid 52999: do_query: >> kind: T >> Feb 21 23:28:08 zaa pgpool: 2011-02-21 23:28:08 DEBUG: pid 52999: >> num_fileds: 1 >> Feb 21 23:28:08 zaa pgpool: 2011-02-21 23:28:08 DEBUG: pid 52999: do_query: >> kind: D >> Feb 21 23:28:08 zaa pgpool: 2011-02-21 23:28:08 DEBUG: pid 52999: do_query: >> kind: C >> Feb 21 23:28:08 zaa pgpool: 2011-02-21 23:28:08 DEBUG: pid 52999: do_query: >> kind: Z >> Feb 21 23:28:08 zaa pgpool: 2011-02-21 23:28:08 DEBUG: pid 52999: do_query: >> kind: T >> Feb 21 23:28:08 zaa pgpool: 2011-02-21 23:28:08 DEBUG: pid 52999: >> num_fileds: 1 >> Feb 21 23:28:08 zaa pgpool: 2011-02-21 23:28:08 DEBUG: pid 52999: do_query: >> kind: E >> Feb 21 23:28:08 zaa pgpool: 2011-02-21 23:28:08 DEBUG: pid 52999: do_query: >> kind: Z >> Feb 21 23:28:08 zaa pgpool: 2011-02-21 23:28:08 LOG: pid 52999: >> find_primary_node: do_query returns no rows >> Feb 21 23:28:08 zaa pgpool: 2011-02-21 23:28:08 LOG: pid 52999: >> find_primary_node: do_query returns no data >> Feb 21 23:28:08 zaa pgpool: 2011-02-21 23:28:08 LOG: pid 52999: >> find_primary_node: primary node id is 1 >> Feb 21 23:28:08 zaa pgpool: 2011-02-21 23:28:08 DEBUG: pid 52999: >> reap_handler called >> Feb 21 23:28:08 zaa pgpool: 2011-02-21 23:28:08 DEBUG: pid 52999: >> reap_handler: call wait3 >> Feb 21 23:28:08 zaa pgpool: 2011-02-21 23:28:08 DEBUG: pid 52999: child >> 56206 exits with status 0 >> Feb 21 23:28:08 zaa pgpool: 2011-02-21 23:28:08 DEBUG: pid 52999: child >> 56205 exits with status 0 >> Feb 21 23:28:08 zaa pgpool: 2011-02-21 23:28:08 ERROR: pid 52999: Child >> process 56021 was terminated by segmentation fault >> Feb 21 23:28:08 zaa pgpool: 2011-02-21 23:28:08 LOG: pid 52999: worker >> child 56021 exits with status 11 by signal 11 >> Feb 21 23:28:08 zaa pgpool: 2011-02-21 23:28:08 LOG: pid 52999: fork a new >> worker child pid 56216 >> Feb 21 23:28:08 zaa pgpool: 2011-02-21 23:28:08 DEBUG: pid 52999: >> reap_handler: normally exited >> Feb 21 23:28:08 zaa pgpool: 2011-02-21 23:28:08 DEBUG: pid 56216: I am 56216 >> Feb 21 23:28:08 zaa pgpool: 2011-02-21 23:28:08 DEBUG: pid 56216: pool_ssl: >> SSL requested but SSL support is not available >> Feb 21 23:28:08 zaa pgpool: 2011-02-21 23:28:08 DEBUG: pid 56216: s_do_auth: >> auth kind: 0 >> Feb 21 23:28:08 zaa pgpool: 2011-02-21 23:28:08 DEBUG: pid 56216: s_do_auth: >> parameter status data received >> Feb 21 23:28:08 zaa last message repeated 10 times >> Feb 21 23:28:08 zaa pgpool: 2011-02-21 23:28:08 DEBUG: pid 56216: s_do_auth: >> backend key data received >> Feb 21 23:28:08 zaa pgpool: 2011-02-21 23:28:08 DEBUG: pid 56216: s_do_auth: >> transaction state: I >> Feb 21 23:28:10 zaa pgpool: 2011-02-21 23:28:10 DEBUG: pid 52999: starting >> health checking >> Feb 21 23:28:10 zaa pgpool: 2011-02-21 23:28:10 DEBUG: pid 52999: >> health_check: 0 th DB node status: 3 >> Feb 21 23:28:10 zaa pgpool: 2011-02-21 23:28:10 DEBUG: pid 52999: >> health_check: 1 th DB node status: 1 >> ====================================================================== >> >> As you can see, at first pgpool correctly chooses storage1 node as master, >> but >> then for some reason it gets information about the master node (storage1) >> unavailability: >> >> ====================================================================== >> Feb 21 23:28:08 zaa pgpool: 2011-02-21 23:28:08 ERROR: pid 56021: pool_read: >> read failed (Connection reset by peer) >> Feb 21 23:28:08 zaa pgpool: 2011-02-21 23:28:08 LOG: pid 56021: >> notice_backend_error: 0 fail over request from pid 56021 >> Feb 21 23:28:08 zaa postgres[56090]: [5-1] LOG: unexpected EOF on client >> connection >> Feb 21 23:28:08 zaa pgpool: 2011-02-21 23:28:08 DEBUG: pid 52999: >> failover_handler called >> Feb 21 23:28:08 zaa pgpool: 2011-02-21 23:28:08 DEBUG: pid 52999: >> failover_handler: starting to select new master node >> Feb 21 23:28:08 zaa pgpool: 2011-02-21 23:28:08 LOG: pid 52999: starting >> degeneration. shutdown host storage1(5432) >> ====================================================================== >> >> slave node storage2 becomes the master node and health check shows storage1 >> node as unavailable. >> >> If I run "pcp_attach_node -d 120 localhost 9898 postgres postgres 0" command, >> storage1 node again becomes the master node as it should be. >> >> Could someone please let me know what might be causing the issue? Is this a >> bug >> or I am doing something wrong? >> >> PS. One more interesting issue is presence of those two lines: >> ====================================================================== >> Feb 21 23:28:08 zaa pgpool: 2011-02-21 23:28:08 ERROR: pid 52999: Child >> process 56021 was terminated by segmentation fault >> Feb 21 23:28:08 zaa pgpool: 2011-02-21 23:28:08 LOG: pid 52999: worker >> child 56021 exits with status 11 by signal 11 >> ====================================================================== >> >> -- >> Alexander Zhuravlev >> http://ikato.com/ >> _______________________________________________ >> Pgpool-general mailing list >> [email protected] >> http://pgfoundry.org/mailman/listinfo/pgpool-general > _______________________________________________ > Pgpool-general mailing list > [email protected] > http://pgfoundry.org/mailman/listinfo/pgpool-general _______________________________________________ Pgpool-general mailing list [email protected] http://pgfoundry.org/mailman/listinfo/pgpool-general
