Re: [GENERAL] BDR: ALTER statement hanging
Thanks Craig, the problem was that (if I remember correctly) there were absolutely no errors or warnings logged when I issued the ALTER statement. Everything seemed to operate normally except that the execution never completed. Even the fact that the node_status was set to 'i' didn't result in any log messages and the replication was working as it should. -Selim From: Craig Ringer [cr...@2ndquadrant.com] Sent: Sunday, December 06, 2015 7:05 PM To: Selim Tuvi Cc: Sylvain MARECHAL; pgsql-general@postgresql.org Subject: Re: [GENERAL] BDR: ALTER statement hanging If you're not sure what's going on on a node, look at its logs. The background worker API and PostgreSQL's lack of autonomous transactions makes it quite challenging for BDR workers to capture logs and expose them to users at the SQL level. So always, if in doubt, examine the log files.
Re: [GENERAL] BDR: ALTER statement hanging
If you're not sure what's going on on a node, look at its logs. The background worker API and PostgreSQL's lack of autonomous transactions makes it quite challenging for BDR workers to capture logs and expose them to users at the SQL level. So always, if in doubt, examine the log files.
Re: [GENERAL] BDR: ALTER statement hanging
Yes they seem to be active: deliver=# select * from pg_replication_slots; slot_name | plugin | slot_type | datoid | database | active | xmin | catalog_xmin | restart_lsn --++---++--++--+--+- bdr_533136_6223770712502831127_1_16389__ | bdr| logical | 533136 | deliver | t | | 182302 | 0/9C8A5598 bdr_533136_6223800735012265413_1_16389__ | bdr| logical | 533136 | deliver | t | | 182302 | 0/9C8A5598 (2 rows) Although when I look at bdr.bdr_nodes I see the status as still initializing for the other two nodes, I don't know if that could cause this problem: deliver=# select * from bdr.bdr_nodes; node_sysid | node_timeline | node_dboid | node_status | node_name | node_local_dsn | node_init_from_dsn -+---++-+-+--- -+-- 6212648563684174798 | 1 | 533136 | r | pe-deliverdb-sf-01v | host=pe-deliverdb-sf-01v port=5432 dbname=deliver user=deliver_admin password=x | 6223770712502831127 | 1 | 16389 | i | pe-deliverdb-sing-01v | host=pe-deliverdb-sing-01v port=5432 dbname=deliver user=deliver_admin password=x | host=pe-deliverdb-sf-01v port=5432 dbname=deliver user=deliver_admin password=x 6223800735012265413 | 1 | 16389 | i | pe-deliverdb-lon-01v | host=pe-deliverdb-lon-01v port=5432 dbname=deliver user=deliver_admin password=x | host=pe-deliverdb-sf-01v port=5432 dbname=deliver user=deliver_admin password=x -Selim From: pgsql-general-ow...@postgresql.org [pgsql-general-ow...@postgresql.org] on behalf of Andreas Kretschmer [akretsch...@spamfence.net] Sent: Thursday, December 03, 2015 10:49 PM To: pgsql-general@postgresql.org Subject: Re: [GENERAL] BDR: ALTER statement hanging Selim Tuvi <st...@ilm.com> wrote: > Hi, I am running a three node BDR cluster. BDR version is 0.9.3. Postgres > version is 9.4.5. > > With 0.9.2, I used to be able to issue ALTER statements using psql and it > would > go through. This time it is just hanging. The statement is this: for ddl-commands all nodes MUST be active in replication, so have you checked that in pg_replication_slots? Andreas -- Really, I'm not out to destroy Microsoft. That will just be a completely unintentional side effect. (Linus Torvalds) "If I was god, I would recompile penguin with --enable-fly." (unknown) Kaufbach, Saxony, Germany, Europe. N 51.05082°, E 13.56889° -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] BDR: ALTER statement hanging
Thanks, I removed the other nodes from bdr.bdr_nodes table, deleted all the bdr_connections and pg_replication_identifier entries, dropped the pg_replication_slots restarted the instance and then trying the ALTER statement resulted in: ERROR: No peer nodes or peer node count unknown, cannot acquire DDL lock HINT: BDR is probably still starting up, wait a while The only way I could issue the statement is run the following to convert the node to a standalone instance: BEGIN; SET LOCAL bdr.permit_unsafe_ddl_commands = true; SET LOCAL bdr.skip_ddl_locking = true; security label for 'bdr' on database deliver is '{"bdr": false}'; COMMIT; I am still puzzled as to why the bdr_nodes node_status was reporting "i" when there were no errors in the logs. -Selim From: pgsql-general-ow...@postgresql.org [pgsql-general-ow...@postgresql.org] on behalf of Andreas Kretschmer [andr...@a-kretschmer.de] Sent: Friday, December 04, 2015 9:59 AM To: pgsql-general@postgresql.org Subject: Re: [GENERAL] BDR: ALTER statement hanging > Selim Tuvi <st...@ilm.com> hat am 4. Dezember 2015 um 18:46 geschrieben: > > > Yes they seem to be active: > > deliver=# select * from pg_replication_slots; > slot_name | plugin | slot_type | datoid | > database | active | xmin | catalog_xmin | restart_lsn > --++---++--++--+--+- > bdr_533136_6223770712502831127_1_16389__ | bdr| logical | 533136 | > deliver | t | | 182302 | 0/9C8A5598 > bdr_533136_6223800735012265413_1_16389__ | bdr| logical | 533136 | > deliver | t | | 182302 | 0/9C8A5598 > (2 rows) > > Although when I look at bdr.bdr_nodes I see the status as still initializing > for the other two nodes, I don't know if that could cause this problem: > > deliver=# select * from bdr.bdr_nodes; > node_sysid | node_timeline | node_dboid | node_status | > node_name | > node_local_dsn > | > node_init_from_dsn > -+---++-+-+--- > -+-- > 6212648563684174798 | 1 | 533136 | r | > pe-deliverdb-sf-01v | host=pe-deliverdb-sf-01v port=5432 dbname=deliver > user=deliver_admin password=x | > 6223770712502831127 | 1 | 16389 | i | > pe-deliverdb-sing-01v | host=pe-deliverdb-sing-01v port=5432 dbname=deliver > user=deliver_admin password=x | host=pe-deliverdb-sf-01v port=5432 > dbname=deliver user=deliver_admin password=x > 6223800735012265413 | 1 | 16389 | i | > pe-deliverdb-lon-01v | host=pe-deliverdb-lon-01v port=5432 dbname=deliver > user=deliver_admin password=x | host=pe-deliverdb-sf-01v port=5432 > dbname=deliver user=deliver_admin password=x > > -Selim > I think, the state 'i' is the main reason for your problem, because of: "i- Joining: The node is doing initial slot creation or an initial dump and load". But i can't tell you why this nodes are in this state. Regards, Andreas -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] BDR: ALTER statement hanging
Yes, bdr_connections had the same number of rows: deliver=# select * from bdr.bdr_connections; conn_sysid | conn_timeline | conn_dboid | conn_origin_sysid | conn_origin_timeline | conn_origin_dboid | conn_is_unidirectional | conn_dsn| conn_apply_delay | conn_replication_sets -+---++---+--+---++--- -+--+--- 6212648563684174798 | 1 | 533136 | 0 | 0 | 0 | f | host=pe-deliverdb-sf-01v port=5432 dbname=deliver user=deliver_admin password=x | | {default} 6223770712502831127 | 1 | 16389 | 0 | 0 | 0 | f | host=pe-deliverdb-sing-01v port=5432 dbname=deliver user=deliver_admin password=x | | {default} 6223800735012265413 | 1 | 16389 | 0 | 0 | 0 | f | host=pe-deliverdb-lon-01v port=5432 dbname=deliver user=deliver_admin password=x | | {default} (3 rows) One other thing I noticed is that the conn_dboid is the same for two of the nodes. Is that normal? -Selim From: pgsql-general-ow...@postgresql.org [pgsql-general-ow...@postgresql.org] on behalf of Sylvain MARECHAL [marechal.sylva...@gmail.com] Sent: Friday, December 04, 2015 10:14 AM To: pgsql-general@postgresql.org Subject: Re: [GENERAL] BDR: ALTER statement hanging Le 04/12/2015 18:59, Andreas Kretschmer a écrit : > > > I think, the state 'i' is the main reason for your problem, because of: "i- > Joining: The node is doing initial slot creation or an initial dump and load". > > But i can't tell you why this nodes are in this state. > > > Regards, Andreas > > Did-you check the bdr.bdr_connections table? It should have as many lines as the bdr.bdr_nodes tables. Sylvain -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] BDR: ALTER statement hanging
Le 04/12/2015 18:59, Andreas Kretschmer a écrit : I think, the state 'i' is the main reason for your problem, because of: "i- Joining: The node is doing initial slot creation or an initial dump and load". But i can't tell you why this nodes are in this state. Regards, Andreas Did-you check the bdr.bdr_connections table? It should have as many lines as the bdr.bdr_nodes tables. Sylvain -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] BDR: ALTER statement hanging
> Selim Tuvihat am 4. Dezember 2015 um 18:46 geschrieben: > > > Yes they seem to be active: > > deliver=# select * from pg_replication_slots; > slot_name | plugin | slot_type | datoid | > database | active | xmin | catalog_xmin | restart_lsn > --++---++--++--+--+- > bdr_533136_6223770712502831127_1_16389__ | bdr| logical | 533136 | > deliver | t | | 182302 | 0/9C8A5598 > bdr_533136_6223800735012265413_1_16389__ | bdr| logical | 533136 | > deliver | t | | 182302 | 0/9C8A5598 > (2 rows) > > Although when I look at bdr.bdr_nodes I see the status as still initializing > for the other two nodes, I don't know if that could cause this problem: > > deliver=# select * from bdr.bdr_nodes; > node_sysid | node_timeline | node_dboid | node_status | > node_name | > node_local_dsn > | > node_init_from_dsn > -+---++-+-+--- > -+-- > 6212648563684174798 | 1 | 533136 | r | > pe-deliverdb-sf-01v | host=pe-deliverdb-sf-01v port=5432 dbname=deliver > user=deliver_admin password=x | > 6223770712502831127 | 1 | 16389 | i | > pe-deliverdb-sing-01v | host=pe-deliverdb-sing-01v port=5432 dbname=deliver > user=deliver_admin password=x | host=pe-deliverdb-sf-01v port=5432 > dbname=deliver user=deliver_admin password=x > 6223800735012265413 | 1 | 16389 | i | > pe-deliverdb-lon-01v | host=pe-deliverdb-lon-01v port=5432 dbname=deliver > user=deliver_admin password=x | host=pe-deliverdb-sf-01v port=5432 > dbname=deliver user=deliver_admin password=x > > -Selim > I think, the state 'i' is the main reason for your problem, because of: "i- Joining: The node is doing initial slot creation or an initial dump and load". But i can't tell you why this nodes are in this state. Regards, Andreas -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] BDR: ALTER statement hanging
Selim Tuviwrote: > Hi, I am running a three node BDR cluster. BDR version is 0.9.3. Postgres > version is 9.4.5. > > With 0.9.2, I used to be able to issue ALTER statements using psql and it > would > go through. This time it is just hanging. The statement is this: for ddl-commands all nodes MUST be active in replication, so have you checked that in pg_replication_slots? Andreas -- Really, I'm not out to destroy Microsoft. That will just be a completely unintentional side effect. (Linus Torvalds) "If I was god, I would recompile penguin with --enable-fly." (unknown) Kaufbach, Saxony, Germany, Europe. N 51.05082°, E 13.56889° -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
[GENERAL] BDR: ALTER statement hanging
Hi, I am running a three node BDR cluster. BDR version is 0.9.3. Postgres version is 9.4.5. With 0.9.2, I used to be able to issue ALTER statements using psql and it would go through. This time it is just hanging. The statement is this: alter table pts alter column shot drop not null; I also tried to add a column but that hangs as well: alter table pts add column shot_tmp text; Thanks -Selim