Re: [HACKERS] pg_rewind hangs if --source-server is used and syncrep is enabled
On Thu, Oct 6, 2016 at 7:37 PM, Heikki Linnakangas wrote: > Committed, thanks! I moved the call to where we establish the connection, > that felt slightly more natural. Thanks for the commit. Indeed that's better with the other sanity checks. -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_rewind hangs if --source-server is used and syncrep is enabled
On 10/06/2016 02:24 AM, Michael Paquier wrote: On Wed, Oct 5, 2016 at 11:53 PM, Michael Banck wrote: My colleague Christoph Berg pointed out that pg_rewind could just set synchronous_commit = local before creating the temporary table, which indeed works, proof-of-concept patch attached Even synchronous_commit = off would not matter much, and we could just use that for performance reasons. The temporary table used in this context is just used to track the delta chunks of blocks, so this solution sounds better to me. I'll patch 9.4's pg_rewind similarly to what is decided here, and we could as well use an extra PQexec call to bring more clarity for the code, now an extra round-trip could be a big deal where network lag matters, but compared to the COPY chunks sent out that would not matter much I guess. Committed, thanks! I moved the call to where we establish the connection, that felt slightly more natural. - Heikki -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_rewind hangs if --source-server is used and syncrep is enabled
On Wed, Oct 5, 2016 at 11:53 PM, Michael Banck wrote: > On Wed, Oct 05, 2016 at 04:39:39PM +0200, Michael Banck wrote: >> To the user, the last thing printed is "need to copy MB [...]". If >> the user cancels the pg_rewind command with ^C, the backend keeps >> hanging around even in --dry-run mode. That won't hurt too much as it >> does not seem to block future pg_rewind runs after synchronous_commit >> has been set to a different value, but looks surprising to me. Oops. >> Not sure whether pg_rewind could error out gracefully without hanging in >> this case, > > My colleague Christoph Berg pointed out that pg_rewind could just set > synchronous_commit = local before creating the temporary table, which > indeed works, proof-of-concept patch attached Even synchronous_commit = off would not matter much, and we could just use that for performance reasons. The temporary table used in this context is just used to track the delta chunks of blocks, so this solution sounds better to me. I'll patch 9.4's pg_rewind similarly to what is decided here, and we could as well use an extra PQexec call to bring more clarity for the code, now an extra round-trip could be a big deal where network lag matters, but compared to the COPY chunks sent out that would not matter much I guess. I am just posting another version, and added a CF entry to not lose track of it: https://commitfest.postgresql.org/11/811/ -- Michael diff --git a/src/bin/pg_rewind/libpq_fetch.c b/src/bin/pg_rewind/libpq_fetch.c index 9239009..4bc2bc3 100644 --- a/src/bin/pg_rewind/libpq_fetch.c +++ b/src/bin/pg_rewind/libpq_fetch.c @@ -403,6 +403,17 @@ libpq_executeFileMap(filemap_t *map) int i; /* + * Set up connection context, with synchronous replication enabled the + * following command could stuck the connection. + */ + sql = "SET synchronous_commit = off;"; + res = PQexec(conn, sql); + if (PQresultStatus(res) != PGRES_COMMAND_OK) + pg_fatal("could not set up connection context: %s", + PQresultErrorMessage(res)); + PQclear(res); + + /* * First create a temporary table, and load it with the blocks that we * need to fetch. */ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_rewind hangs if --source-server is used and syncrep is enabled
On Wed, Oct 05, 2016 at 04:39:39PM +0200, Michael Banck wrote: > if pg_rewind is told to fetch data via a libpq connection > (--source-server), synchronous replication is enabled and there is only > one sync standby (pilot error, but sill); pg_rewinding the old master > hangs at the CREATE TEMPORARY TABLE step (CREATE TABLE waiting for > X/). At least this happened to one of our clients while > evaluating pg_rewind. > > To the user, the last thing printed is "need to copy MB [...]". If > the user cancels the pg_rewind command with ^C, the backend keeps > hanging around even in --dry-run mode. That won't hurt too much as it > does not seem to block future pg_rewind runs after synchronous_commit > has been set to a different value, but looks surprising to me. > > Not sure whether pg_rewind could error out gracefully without hanging in > this case, My colleague Christoph Berg pointed out that pg_rewind could just set synchronous_commit = local before creating the temporary table, which indeed works, proof-of-concept patch attached Michael -- Michael Banck Projektleiter / Senior Berater Tel.: +49 2166 9901-171 Fax: +49 2166 9901-100 Email: michael.ba...@credativ.de credativ GmbH, HRB Mönchengladbach 12080 USt-ID-Nummer: DE204566209 Trompeterallee 108, 41189 Mönchengladbach Geschäftsführung: Dr. Michael Meskes, Jörg Folz, Sascha Heuer diff --git a/src/bin/pg_rewind/libpq_fetch.c b/src/bin/pg_rewind/libpq_fetch.c index 9239009..a5cb426 100644 --- a/src/bin/pg_rewind/libpq_fetch.c +++ b/src/bin/pg_rewind/libpq_fetch.c @@ -406,7 +406,7 @@ libpq_executeFileMap(filemap_t *map) * First create a temporary table, and load it with the blocks that we * need to fetch. */ - sql = "CREATE TEMPORARY TABLE fetchchunks(path text, begin int4, len int4);"; + sql = "SET synchronous_commit = local; CREATE TEMPORARY TABLE fetchchunks(path text, begin int4, len int4);"; res = PQexec(conn, sql); if (PQresultStatus(res) != PGRES_COMMAND_OK) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] pg_rewind hangs if --source-server is used and syncrep is enabled
Hi, if pg_rewind is told to fetch data via a libpq connection (--source-server), synchronous replication is enabled and there is only one sync standby (pilot error, but sill); pg_rewinding the old master hangs at the CREATE TEMPORARY TABLE step (CREATE TABLE waiting for X/). At least this happened to one of our clients while evaluating pg_rewind. To the user, the last thing printed is "need to copy MB [...]". If the user cancels the pg_rewind command with ^C, the backend keeps hanging around even in --dry-run mode. That won't hurt too much as it does not seem to block future pg_rewind runs after synchronous_commit has been set to a different value, but looks surprising to me. Not sure whether pg_rewind could error out gracefully without hanging in this case, but maybe it could/should clean up the query on SIGTERM? And at the least, maybe this caveat should be documented? Michael -- Michael Banck Projektleiter / Senior Berater Tel.: +49 2166 9901-171 Fax: +49 2166 9901-100 Email: michael.ba...@credativ.de credativ GmbH, HRB Mönchengladbach 12080 USt-ID-Nummer: DE204566209 Trompeterallee 108, 41189 Mönchengladbach Geschäftsführung: Dr. Michael Meskes, Jörg Folz, Sascha Heuer -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers