Re: [HACKERS] [PATCH] Allow breaking out of hung connection attempts
On 13.07.2012 14:35, Ryan Kelly wrote: On Mon, Jul 09, 2012 at 05:35:15PM +0900, Shigeru HANADA wrote: On Mon, Jun 25, 2012 at 9:00 PM, Ryan Kellyrpkell...@gmail.com wrote: - Copying only select() part of pqSocketPoll seems not enough, should we use poll(2) if it is supported? I did not think the additional complexity was worth it in this case. Unless you see some reason to use poll(2) that I do not. I checked where select() is used in PG, and noticed that poll is used at only few places. Agreed to use only select() here. poll() potentially performs better, but that hardly matters here, so select() is ok. However, on some platforms, signals don't interrupt select(). On such platforms, hitting CTRL-C will still not interrupt the connection attempt. Also there's a race condition: if you hit CTRL-C just after checking that the global variable is not set, but before entering select(), the select() will again not be interrupted (until the user gets frustrated and hits CTRL-C again). I think we need to sleep one second at a time, and check the global variable on every iteration, until the timeout is reached. That's what we do in non-critical sleep loops like this in the backend. We've actually been trying to get rid of such polling in the backend code lately, to reduce the number of unnecessary interrupts which consume power on an otherwise idle system, but I think that technique would still be OK for psql's connection code. Once the issues above are fixed, IMO this patch can be marked as Ready for committer. I have also added additional documentation reflecting Heikki's suggestion that PQconnectTimeout be recommended for use by applications using the async API. Attached is v6 of the patch. Thanks! With this patch, any connect_timeout parameter given in the original connection info string is copied to the \connect command. That sounds reasonable at first glance, but I don't think it's quite right: First of all, if you give a new connect_timeout string directly in the \connect command, the old connect_timeout still overrides the new one: $ ./psql postgres://localhost/postgres?connect_timeout=3 psql (9.3devel) Type help for help. postgres=# \connect postgres://192.168.1.123/postgres?connect_timeout=1000 after three seconds timeout expired Previous connection kept Secondly, the docs say that the new connection only inherits a few explicitly listed options from the old connection: dbname, username, host and port. If you add connect_timeout to that list, at least it requires a documentation update. But I don't think it should be inherited in the first place. Or if it should, what about other options, like application_name? Surely they should then be inherited too. All in all I think we should just leave out the changes to \connect, and not inherit connect_timeout from the old connection. If we want to change the behavior of all options, that's separate discussion and a separate patch. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Allow breaking out of hung connection attempts
On Mon, Jul 09, 2012 at 05:35:15PM +0900, Shigeru HANADA wrote: Hi Ryan, On Mon, Jun 25, 2012 at 9:00 PM, Ryan Kelly rpkell...@gmail.com wrote: Connection attempt by \connect command could be also canceled by pressing Ctrl+C on psql prompt. In addition, I tried setting PGCONNECT_TIMEOUT to 0 (infinite), but psql gave up after few seconds, for both start-up and re-connect. Is this intentional behavior? A timeout of 0 (infinite) means to keep trying until we succeed or fail, not keep trying forever. As you mentioned above, your connection attempts error out after a few seconds. This is what is happening. In my environment no such error occurs and as a result psql continues to attempt to connect for as long as I let it. For handy testing, I wrote simple TCP server which accepts connection and blocks until client gives up connection attempt (or forever). When I tested your patch with setting PGCONNECT_TIMEOUT=0, I found that psql's CPU usage goes up to almost 100% (10~ usr + 80~ sys) while connection attempt by \c command is undergoing. After reading code for a while, I found that FD_ZERO was not called. This makes select() to return immediately in the loop every time and caused busy loop. # Maybe you've forgotten adding FD_ZERO when you were merging Heikki's v2 patch. Yes, I had lost them somewhere. My apologies. - Checking status by calling PQstatus just after PQconnectStartParams is necessary. Yes, I agree. Checked. - Copying only select() part of pqSocketPoll seems not enough, should we use poll(2) if it is supported? I did not think the additional complexity was worth it in this case. Unless you see some reason to use poll(2) that I do not. I checked where select() is used in PG, and noticed that poll is used at only few places. Agreed to use only select() here. - Don't we need to clear error message stored in PGconn after PQconnectPoll returns OK status, like connectDBComplete? I do not believe there is a client interface for clearing the error message. Additionally, the documentation states that PQerrorMessage Returns the error message most recently generated by an operation on the connection. This seems to indicate that the error message should be cleared as this behavior is part of the contract of PQerrorMessage. My comment was pointless. Sorry for noise. Here is my additional comments for v5 patch: - Using static array for fixed-length connection parameters was suggested in comments for another CF item, using fallback_application_name for some command line tools, and IMO this approach would also suit for this patch. http://archives.postgresql.org/message-id/CA+TgmoYZiayts=fjsytzqlg7rewlwkdkey5f+fhp5v5_nu_...@mail.gmail.com I have done this as well. - Some comments go past 80 columns, and opening brace in line 1572 should go on next line. Please refer coding conventions written in PostgreSQL wiki. http://www.postgresql.org/docs/devel/static/source-format.html I have corrected these issues. Once the issues above are fixed, IMO this patch can be marked as Ready for committer. I have also added additional documentation reflecting Heikki's suggestion that PQconnectTimeout be recommended for use by applications using the async API. Attached is v6 of the patch. Regards, -- Shigeru Hanada * 不明 - 自動検出 * 英語 * 日本語 * 英語 * 日本語 javascript:void(0); -Ryan Kelly diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml index 5c5dd68..654f5f5 100644 --- a/doc/src/sgml/libpq.sgml +++ b/doc/src/sgml/libpq.sgml @@ -436,8 +436,12 @@ switch(PQstatus(conn)) The literalconnect_timeout/literal connection parameter is ignored when using functionPQconnectPoll/function; it is the application's responsibility to decide whether an excessive amount of time has elapsed. - Otherwise, functionPQconnectStart/function followed by a - functionPQconnectPoll/function loop is equivalent to + It is recommended to use functionPQconnectTimeout/function to get the + value of literalconnect_timeout/literal and use that as the timeout + in the application. That way the user gets the same timeout behavior + regardless of whether the application uses functionPQconnectPoll/function + or the blocking connection API. Otherwise, functionPQconnectStart/function + followed by a functionPQconnectPoll/function loop is equivalent to functionPQconnectdb/function. /para @@ -1496,6 +1500,24 @@ char *PQoptions(const PGconn *conn); /para /listitem /varlistentry + +varlistentry id=libpq-pqconnecttimeout + term + functionPQconnectTimeout/function + indexterm + primaryPQconnectTimeout/primary + /indexterm + /term + + listitem + para + Returns the connect_timeout property as given to libpq. +synopsis +char *PQconnectTimeout(const PGconn *conn); +/synopsis + /para +
Re: [HACKERS] [PATCH] Allow breaking out of hung connection attempts
On 09.07.2012 11:35, Shigeru HANADA wrote: Once the issues above are fixed, IMO this patch can be marked as Ready for committer. Thanks. The docs on async connections says: The connect_timeout connection parameter is ignored when using PQconnectPoll; it is the application's responsibility to decide whether an excessive amount of time has elapsed. I think we should recommend using PQconnectTimeout(), similar to what you did in psql, in all client applications that use the non-blocking connection API. Perhaps something like: The connect_timeout connection parameter is ignored when using PQconnectPoll; it is the application's responsibility to decide whether an excessive amount of time has elapsed. It is recommended to use the PQconnectTimeout() to get value of the parameter, and use that as the timeout in the application. That way the user gets the same timeout behavior, regardless of whether the application uses PQconnectPoll or the nonblocking connection API. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Allow breaking out of hung connection attempts
Hi Ryan, On Mon, Jun 25, 2012 at 9:00 PM, Ryan Kelly rpkell...@gmail.com wrote: Connection attempt by \connect command could be also canceled by pressing Ctrl+C on psql prompt. In addition, I tried setting PGCONNECT_TIMEOUT to 0 (infinite), but psql gave up after few seconds, for both start-up and re-connect. Is this intentional behavior? A timeout of 0 (infinite) means to keep trying until we succeed or fail, not keep trying forever. As you mentioned above, your connection attempts error out after a few seconds. This is what is happening. In my environment no such error occurs and as a result psql continues to attempt to connect for as long as I let it. For handy testing, I wrote simple TCP server which accepts connection and blocks until client gives up connection attempt (or forever). When I tested your patch with setting PGCONNECT_TIMEOUT=0, I found that psql's CPU usage goes up to almost 100% (10~ usr + 80~ sys) while connection attempt by \c command is undergoing. After reading code for a while, I found that FD_ZERO was not called. This makes select() to return immediately in the loop every time and caused busy loop. # Maybe you've forgotten adding FD_ZERO when you were merging Heikki's v2 patch. - Checking status by calling PQstatus just after PQconnectStartParams is necessary. Yes, I agree. Checked. - Copying only select() part of pqSocketPoll seems not enough, should we use poll(2) if it is supported? I did not think the additional complexity was worth it in this case. Unless you see some reason to use poll(2) that I do not. I checked where select() is used in PG, and noticed that poll is used at only few places. Agreed to use only select() here. - Don't we need to clear error message stored in PGconn after PQconnectPoll returns OK status, like connectDBComplete? I do not believe there is a client interface for clearing the error message. Additionally, the documentation states that PQerrorMessage Returns the error message most recently generated by an operation on the connection. This seems to indicate that the error message should be cleared as this behavior is part of the contract of PQerrorMessage. My comment was pointless. Sorry for noise. Here is my additional comments for v5 patch: - Using static array for fixed-length connection parameters was suggested in comments for another CF item, using fallback_application_name for some command line tools, and IMO this approach would also suit for this patch. http://archives.postgresql.org/message-id/CA+TgmoYZiayts=fjsytzqlg7rewlwkdkey5f+fhp5v5_nu_...@mail.gmail.com - Some comments go past 80 columns, and opening brace in line 1572 should go on next line. Please refer coding conventions written in PostgreSQL wiki. http://www.postgresql.org/docs/devel/static/source-format.html Once the issues above are fixed, IMO this patch can be marked as Ready for committer. Regards, -- Shigeru Hanada * 不明 - 自動検出 * 英語 * 日本語 * 英語 * 日本語 javascript:void(0); -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Allow breaking out of hung connection attempts
Shigeru: Thank you very much for your review. Comments are inline below, and a new patch is attached. On Fri, Jun 22, 2012 at 10:06:53AM +0900, Shigeru HANADA wrote: (2012/05/01 0:30), Ryan Kelly wrote: On Mon, Apr 30, 2012 at 09:02:33AM -0400, Alvaro Herrera wrote: Well, do *you* want it? Of course. That way I can stop patching my psql and go back to using the one that came with my release :) Here is result of my review of v4 patch. Submission -- - The patch is in context diff format - It needs rebase for HEAD, but patch command takes care automatically. - Make finishes cleanly, and all regression tests pass Functionality - After applying this patch, I could cancel connection attempt at start-up by pressing Ctrl+C on terminal just after invoking psql command with an unused IP address. Without this patch, such attempt ends up with error such as No route to host after uninterruptable few seconds (maybe the duration until error would depend on environment). Connection attempt by \connect command could be also canceled by pressing Ctrl+C on psql prompt. In addition, I tried setting PGCONNECT_TIMEOUT to 0 (infinite), but psql gave up after few seconds, for both start-up and re-connect. Is this intentional behavior? A timeout of 0 (infinite) means to keep trying until we succeed or fail, not keep trying forever. As you mentioned above, your connection attempts error out after a few seconds. This is what is happening. In my environment no such error occurs and as a result psql continues to attempt to connect for as long as I let it. Detail of changes - Basically this patch consists of three changes: 1) defer setup of cancel handler until start-up connection has established 2) new libpq API PQconnectTimeout which returns connect_timeout value of current session 3) use asynchronous connection API at \connect psql command, this requires API added by 2) These seem reasonable to achieve canceling connection attempt at start-up and \connect, but, as Ryan says, codes added to do_connect might need more simplification. I have some random comments. - Checking status by calling PQstatus just after PQconnectStartParams is necessary. Yes, I agree. - Copying only select() part of pqSocketPoll seems not enough, should we use poll(2) if it is supported? I did not think the additional complexity was worth it in this case. Unless you see some reason to use poll(2) that I do not. - Don't we need to clear error message stored in PGconn after PQconnectPoll returns OK status, like connectDBComplete? I do not believe there is a client interface for clearing the error message. Additionally, the documentation states that PQerrorMessage Returns the error message most recently generated by an operation on the connection. This seems to indicate that the error message should be cleared as this behavior is part of the contract of PQerrorMessage. Regards, -- Shigeru HANADA -Ryan Kelly diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml index 5c5dd68..79f4bc0 100644 --- a/doc/src/sgml/libpq.sgml +++ b/doc/src/sgml/libpq.sgml @@ -1496,6 +1496,24 @@ char *PQoptions(const PGconn *conn); /para /listitem /varlistentry + +varlistentry id=libpq-pqconnecttimeout + term + functionPQconnectTimeout/function + indexterm + primaryPQconnectTimeout/primary + /indexterm + /term + + listitem + para + Returns the connect_timeout property as given to libpq. +synopsis +char *PQconnectTimeout(const PGconn *conn); +/synopsis + /para + /listitem +/varlistentry /variablelist /para diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c index 0926786..23b0106 100644 --- a/src/bin/psql/command.c +++ b/src/bin/psql/command.c @@ -1506,7 +1506,7 @@ static bool do_connect(char *dbname, char *user, char *host, char *port) { PGconn *o_conn = pset.db, - *n_conn; + *n_conn = NULL; char *password = NULL; if (!dbname) @@ -1539,7 +1539,7 @@ do_connect(char *dbname, char *user, char *host, char *port) while (true) { -#define PARAMS_ARRAY_SIZE 8 +#define PARAMS_ARRAY_SIZE 9 const char **keywords = pg_malloc(PARAMS_ARRAY_SIZE * sizeof(*keywords)); const char **values = pg_malloc(PARAMS_ARRAY_SIZE * sizeof(*values)); @@ -1557,30 +1557,140 @@ do_connect(char *dbname, char *user, char *host, char *port) values[5] = pset.progname; keywords[6] = client_encoding; values[6] = (pset.notty || getenv(PGCLIENTENCODING)) ? NULL : auto; - keywords[7] = NULL; - values[7] = NULL; + keywords[7] = connect_timeout; + values[7] = PQconnectTimeout(o_conn); + keywords[8] = NULL; + values[8] = NULL; - n_conn = PQconnectdbParams(keywords, values, true); - - free(keywords); - free(values); - - /* We can immediately discard the password -- no longer needed */ - if (password) - free(password); - - if (PQstatus(n_conn) ==
Re: [HACKERS] [PATCH] Allow breaking out of hung connection attempts
(2012/05/01 0:30), Ryan Kelly wrote: On Mon, Apr 30, 2012 at 09:02:33AM -0400, Alvaro Herrera wrote: Well, do *you* want it? Of course. That way I can stop patching my psql and go back to using the one that came with my release :) Here is result of my review of v4 patch. Submission -- - The patch is in context diff format - It needs rebase for HEAD, but patch command takes care automatically. - Make finishes cleanly, and all regression tests pass Functionality - After applying this patch, I could cancel connection attempt at start-up by pressing Ctrl+C on terminal just after invoking psql command with an unused IP address. Without this patch, such attempt ends up with error such as No route to host after uninterruptable few seconds (maybe the duration until error would depend on environment). Connection attempt by \connect command could be also canceled by pressing Ctrl+C on psql prompt. In addition, I tried setting PGCONNECT_TIMEOUT to 0 (infinite), but psql gave up after few seconds, for both start-up and re-connect. Is this intentional behavior? Detail of changes - Basically this patch consists of three changes: 1) defer setup of cancel handler until start-up connection has established 2) new libpq API PQconnectTimeout which returns connect_timeout value of current session 3) use asynchronous connection API at \connect psql command, this requires API added by 2) These seem reasonable to achieve canceling connection attempt at start-up and \connect, but, as Ryan says, codes added to do_connect might need more simplification. I have some random comments. - Checking status by calling PQstatus just after PQconnectStartParams is necessary. - Copying only select() part of pqSocketPoll seems not enough, should we use poll(2) if it is supported? - Don't we need to clear error message stored in PGconn after PQconnectPoll returns OK status, like connectDBComplete? Regards, -- Shigeru HANADA -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Allow breaking out of hung connection attempts
On Sun, Apr 29, 2012 at 10:12:40PM -0400, Alvaro Herrera wrote: Excerpts from Ryan Kelly's message of sáb ene 14 16:22:21 -0300 2012: I have attached a new patch which handles the connect_timeout option by adding a PQconnectTimeout(conn) function to access the connect_timeout which I then use to retrieve the existing value from the old connection. Was this patch dropped entirely? If not and it hasn't been committed yet, I think it belongs in the open CF here: https://commitfest.postgresql.org/action/commitfest_view?id=14 Needs some freshening if anyone still wants it. Update against latest HEAD attached. -- Álvaro Herrera alvhe...@commandprompt.com The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support -Ryan Kelly diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml index 8a820ac..bf4d110 100644 --- a/doc/src/sgml/libpq.sgml +++ b/doc/src/sgml/libpq.sgml @@ -1479,6 +1479,24 @@ char *PQoptions(const PGconn *conn); /para /listitem /varlistentry + +varlistentry id=libpq-pqconnecttimeout + term + functionPQconnectTimeout/function + indexterm + primaryPQconnectTimeout/primary + /indexterm + /term + + listitem + para + Returns the connect_timeout property as given to libpq. +synopsis +char *PQconnectTimeout(const PGconn *conn); +/synopsis + /para + /listitem +/varlistentry /variablelist /para diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c index dd59aa1..90dfe13 100644 --- a/src/bin/psql/command.c +++ b/src/bin/psql/command.c @@ -1504,7 +1504,7 @@ static bool do_connect(char *dbname, char *user, char *host, char *port) { PGconn *o_conn = pset.db, - *n_conn; + *n_conn = NULL; char *password = NULL; if (!dbname) @@ -1537,7 +1537,7 @@ do_connect(char *dbname, char *user, char *host, char *port) while (true) { -#define PARAMS_ARRAY_SIZE 8 +#define PARAMS_ARRAY_SIZE 9 const char **keywords = pg_malloc(PARAMS_ARRAY_SIZE * sizeof(*keywords)); const char **values = pg_malloc(PARAMS_ARRAY_SIZE * sizeof(*values)); @@ -1555,17 +1555,120 @@ do_connect(char *dbname, char *user, char *host, char *port) values[5] = pset.progname; keywords[6] = client_encoding; values[6] = (pset.notty || getenv(PGCLIENTENCODING)) ? NULL : auto; - keywords[7] = NULL; - values[7] = NULL; + keywords[7] = connect_timeout; + values[7] = PQconnectTimeout(o_conn); + keywords[8] = NULL; + values[8] = NULL; - n_conn = PQconnectdbParams(keywords, values, true); + /* attempt connection asynchronously */ + n_conn = PQconnectStartParams(keywords, values, true); + + if (sigsetjmp(sigint_interrupt_jmp, 1) != 0) + { + /* interrupted during connection attempt */ + PQfinish(n_conn); + n_conn = NULL; + } + else + { + time_t end_time = -1; + + /* + * maybe use a connection timeout. this code essentially stolen + * from src/interfaces/libpq/fe-connect.c connectDBComplete + */ + if (PQconnectTimeout(n_conn) != NULL) + { +int timeout = atoi(PQconnectTimeout(n_conn)); +if (timeout 0) +{ + /* + * Rounding could cause connection to fail; need at least 2 secs + */ + if (timeout 2) + timeout = 2; + /* calculate the finish time based on start + timeout */ + end_time = time(NULL) + timeout; +} + } + + while(end_time 0 || time(NULL) end_time) + { +int poll_res; +int rc; +fd_set read_mask, + write_mask; +struct timeval timeout; +struct timeval *ptr_timeout; + +poll_res = PQconnectPoll(n_conn); +if (poll_res == PGRES_POLLING_OK || + poll_res == PGRES_POLLING_FAILED) +{ + break; +} + +if (poll_res == PGRES_POLLING_READING) + FD_SET(PQsocket(n_conn), read_mask); +if (poll_res == PGRES_POLLING_WRITING) + FD_SET(PQsocket(n_conn), write_mask); + +/* + * Compute appropriate timeout interval. essentially stolen + * from src/interfaces/libpq/fe-misc.c pqSocketPoll. Maybe + * that function could be made public? we could then replace + * the whole inside of this while loop, assuming it is safe + * to longjmp out from there. + */ +if (end_time == ((time_t) -1)) + ptr_timeout = NULL; +else +{ + time_t now = time(NULL); + + if (end_time now) + timeout.tv_sec = end_time - now; + else + timeout.tv_sec = 0; + timeout.tv_usec = 0; + ptr_timeout = timeout; +} + +sigint_interrupt_enabled = true; +if (cancel_pressed) +{ + PQfinish(n_conn); + n_conn = NULL; + sigint_interrupt_enabled = false; + break; +} +rc = select(PQsocket(n_conn) + 1, + read_mask, write_mask, NULL, + ptr_timeout); +sigint_interrupt_enabled = false; + +if (rc 0 errno != EINTR) + break; + } + + if (PQstatus(n_conn) != CONNECTION_OK +end_time 0 time(NULL) = end_time)
Re: [HACKERS] [PATCH] Allow breaking out of hung connection attempts
Excerpts from Ryan Kelly's message of lun abr 30 07:10:14 -0400 2012: On Sun, Apr 29, 2012 at 10:12:40PM -0400, Alvaro Herrera wrote: Excerpts from Ryan Kelly's message of sáb ene 14 16:22:21 -0300 2012: I have attached a new patch which handles the connect_timeout option by adding a PQconnectTimeout(conn) function to access the connect_timeout which I then use to retrieve the existing value from the old connection. Was this patch dropped entirely? If not and it hasn't been committed yet, I think it belongs in the open CF here: https://commitfest.postgresql.org/action/commitfest_view?id=14 Needs some freshening if anyone still wants it. Update against latest HEAD attached. Well, do *you* want it? -- Álvaro Herrera alvhe...@commandprompt.com The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Allow breaking out of hung connection attempts
On Mon, Apr 30, 2012 at 09:02:33AM -0400, Alvaro Herrera wrote: Excerpts from Ryan Kelly's message of lun abr 30 07:10:14 -0400 2012: On Sun, Apr 29, 2012 at 10:12:40PM -0400, Alvaro Herrera wrote: Excerpts from Ryan Kelly's message of sáb ene 14 16:22:21 -0300 2012: I have attached a new patch which handles the connect_timeout option by adding a PQconnectTimeout(conn) function to access the connect_timeout which I then use to retrieve the existing value from the old connection. Was this patch dropped entirely? If not and it hasn't been committed yet, I think it belongs in the open CF here: https://commitfest.postgresql.org/action/commitfest_view?id=14 Needs some freshening if anyone still wants it. Update against latest HEAD attached. Well, do *you* want it? Of course. That way I can stop patching my psql and go back to using the one that came with my release :) -Ryan Kelly -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Allow breaking out of hung connection attempts
Excerpts from Ryan Kelly's message of sáb ene 14 16:22:21 -0300 2012: I have attached a new patch which handles the connect_timeout option by adding a PQconnectTimeout(conn) function to access the connect_timeout which I then use to retrieve the existing value from the old connection. Was this patch dropped entirely? If not and it hasn't been committed yet, I think it belongs in the open CF here: https://commitfest.postgresql.org/action/commitfest_view?id=14 -- Álvaro Herrera alvhe...@commandprompt.com The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Allow breaking out of hung connection attempts
On Tue, Jan 10, 2012 at 11:29:58AM +0200, Heikki Linnakangas wrote: On 09.01.2012 15:49, Ryan Kelly wrote: On Mon, Jan 09, 2012 at 10:35:50AM +0200, Heikki Linnakangas wrote: That assumes that it's safe to longjmp out of PQconnectdbParams at any instant. It's not. I'm guessing because it could result in a resource leak? Yes, and other unfinished business, too. I think you'd need to use the asynchronous connection functions PQconnectStartParams() and PQconnectPoll(), and select(). New patch attached. Thanks, some comments: * Why do you need the timeout? * If a SIGINT arrives before you set sigint_interrupt_enabled, it just sets cancel_pressed but doesn't jump out of the connection attempt. You need to explicitly check cancel_pressed after setting sigint_interrupt_enabled to close that race condition. * You have to reinitialize the fd mask with FD_ZERO/SET before each call to select(). select() modifies the mask. * In case of PGRES_POLLING_WRITING, you have to wait until the socket becomes writable, not readable. Attached is a new version that fixes those. Yup, I'm an idiot. There's one caveat in the libpq docs about PQconnectStart/PQconnectPoll: The connect_timeout connection parameter is ignored when using PQconnectPoll; it is the application's responsibility to decide whether an excessive amount of time has elapsed. Otherwise, PQconnectStart followed by a PQconnectPoll loop is equivalent to PQconnectdb. So after this patch, connect_timeout will be ignored in \connect. That probably needs to be fixed. You could incorporate a timeout fairly easily into the select() calls, but unfortunately there's no easy way to get the connect_timeout value. You could to parse the connection string the user gave with PQconninfoParse(), but the effective timeout setting could come from a configuration file, too. Not sure what to do about that. If there was a PQconnectTimeout(conn) function, similar to PQuser(conn), PQhost(conn) et al, you could use that. Maybe we should add that, or even better, a generic function that could be used to return not just connect_timeout, but all the connection options in effect in a connection. I have attached a new patch which handles the connect_timeout option by adding a PQconnectTimeout(conn) function to access the connect_timeout which I then use to retrieve the existing value from the old connection. I also borrowed some code from other places: * connectDBComplete had some logic surrounding handling the timeout value (src/interfaces/libpq/fe-connect.c:1402). * The timeout code is lifted from pqSocketPoll which, if made public, could essentially replace all the select/timeout code in that loop (src/interfaces/libpq/fe-misc.c:1034). Before I get flamed for duplicating code, yes, I know it's a bad thing to do. If anyone has any guidance I'd be glad to implement their suggestions. -Ryan Kelly diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml index 72c9384..0ee2df1 100644 --- a/doc/src/sgml/libpq.sgml +++ b/doc/src/sgml/libpq.sgml @@ -1375,6 +1375,24 @@ char *PQoptions(const PGconn *conn); /para /listitem /varlistentry + +varlistentry id=libpq-pqconnecttimeout + term + functionPQconnectTimeout/function + indexterm + primaryPQconnectTimeout/primary + /indexterm + /term + + listitem + para + Returns the connect_timeout property as given to libpq. +synopsis +char *PQconnectTimeout(const PGconn *conn); +/synopsis + /para + /listitem +/varlistentry /variablelist /para diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c index 69fac83..7ba22d0 100644 --- a/src/bin/psql/command.c +++ b/src/bin/psql/command.c @@ -1516,7 +1516,7 @@ static bool do_connect(char *dbname, char *user, char *host, char *port) { PGconn *o_conn = pset.db, - *n_conn; + *n_conn = NULL; char *password = NULL; if (!dbname) @@ -1549,7 +1549,7 @@ do_connect(char *dbname, char *user, char *host, char *port) while (true) { -#define PARAMS_ARRAY_SIZE 8 +#define PARAMS_ARRAY_SIZE 9 const char **keywords = pg_malloc(PARAMS_ARRAY_SIZE * sizeof(*keywords)); const char **values = pg_malloc(PARAMS_ARRAY_SIZE * sizeof(*values)); @@ -1567,17 +1567,120 @@ do_connect(char *dbname, char *user, char *host, char *port) values[5] = pset.progname; keywords[6] = client_encoding; values[6] = (pset.notty || getenv(PGCLIENTENCODING)) ? NULL : auto; - keywords[7] = NULL; - values[7] = NULL; + keywords[7] = connect_timeout; + values[7] = PQconnectTimeout(o_conn); + keywords[8] = NULL; + values[8] = NULL; - n_conn = PQconnectdbParams(keywords, values, true); + /* attempt connection asynchronously */ + n_conn = PQconnectStartParams(keywords, values, true); + + if (sigsetjmp(sigint_interrupt_jmp, 1) != 0) + { + /* interrupted during connection attempt */ + PQfinish(n_conn); + n_conn = NULL; + } + else +
Re: [HACKERS] [PATCH] Allow breaking out of hung connection attempts
On 09.01.2012 15:49, Ryan Kelly wrote: On Mon, Jan 09, 2012 at 10:35:50AM +0200, Heikki Linnakangas wrote: That assumes that it's safe to longjmp out of PQconnectdbParams at any instant. It's not. I'm guessing because it could result in a resource leak? Yes, and other unfinished business, too. I think you'd need to use the asynchronous connection functions PQconnectStartParams() and PQconnectPoll(), and select(). New patch attached. Thanks, some comments: * Why do you need the timeout? * If a SIGINT arrives before you set sigint_interrupt_enabled, it just sets cancel_pressed but doesn't jump out of the connection attempt. You need to explicitly check cancel_pressed after setting sigint_interrupt_enabled to close that race condition. * You have to reinitialize the fd mask with FD_ZERO/SET before each call to select(). select() modifies the mask. * In case of PGRES_POLLING_WRITING, you have to wait until the socket becomes writable, not readable. Attached is a new version that fixes those. There's one caveat in the libpq docs about PQconnectStart/PQconnectPoll: The connect_timeout connection parameter is ignored when using PQconnectPoll; it is the application's responsibility to decide whether an excessive amount of time has elapsed. Otherwise, PQconnectStart followed by a PQconnectPoll loop is equivalent to PQconnectdb. So after this patch, connect_timeout will be ignored in \connect. That probably needs to be fixed. You could incorporate a timeout fairly easily into the select() calls, but unfortunately there's no easy way to get the connect_timeout value. You could to parse the connection string the user gave with PQconninfoParse(), but the effective timeout setting could come from a configuration file, too. Not sure what to do about that. If there was a PQconnectTimeout(conn) function, similar to PQuser(conn), PQhost(conn) et al, you could use that. Maybe we should add that, or even better, a generic function that could be used to return not just connect_timeout, but all the connection options in effect in a connection. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c index 69fac83..135d022 100644 --- a/src/bin/psql/command.c +++ b/src/bin/psql/command.c @@ -1515,8 +1515,8 @@ param_is_newly_set(const char *old_val, const char *new_val) static bool do_connect(char *dbname, char *user, char *host, char *port) { - PGconn *o_conn = pset.db, - *n_conn; + PGconn *o_conn = pset.db; + PGconn *n_conn = NULL; char *password = NULL; if (!dbname) @@ -1570,14 +1570,67 @@ do_connect(char *dbname, char *user, char *host, char *port) keywords[7] = NULL; values[7] = NULL; - n_conn = PQconnectdbParams(keywords, values, true); + /* attempt connection asynchronously */ + n_conn = PQconnectStartParams(keywords, values, true); + + if (sigsetjmp(sigint_interrupt_jmp, 1) != 0) + { + /* interrupted during connection attempt */ + PQfinish(n_conn); + n_conn = NULL; + } + else + { + while (true) + { +int poll_res; +int rc; +fd_set read_mask, + write_mask; + +poll_res = PQconnectPoll(n_conn); +if (poll_res == PGRES_POLLING_OK || + poll_res == PGRES_POLLING_FAILED) +{ + break; +} + +FD_ZERO(read_mask); +FD_ZERO(write_mask); + +if (poll_res == PGRES_POLLING_READING) + FD_SET(PQsocket(n_conn), read_mask); +if (poll_res == PGRES_POLLING_WRITING) + FD_SET(PQsocket(n_conn), write_mask); + +sigint_interrupt_enabled = true; +if (cancel_pressed) +{ + /* interrupted during connection attempt */ + PQfinish(n_conn); + n_conn = NULL; + sigint_interrupt_enabled = false; + break; +} +rc = select(PQsocket(n_conn) + 1, + read_mask, write_mask, NULL, + NULL); +sigint_interrupt_enabled = false; + +if (rc 0 errno != EINTR) + break; + } + } free(keywords); free(values); /* We can immediately discard the password -- no longer needed */ if (password) + { free(password); + password = NULL; + } if (PQstatus(n_conn) == CONNECTION_OK) break; @@ -1586,7 +1639,7 @@ do_connect(char *dbname, char *user, char *host, char *port) * Connection attempt failed; either retry the connection attempt with * a new password, or give up. */ - if (!password PQconnectionNeedsPassword(n_conn) pset.getPassword != TRI_NO) + if (PQconnectionNeedsPassword(n_conn) pset.getPassword != TRI_NO) { PQfinish(n_conn); password = prompt_for_password(user); @@ -1600,7 +1653,8 @@ do_connect(char *dbname, char *user, char *host, char *port) */ if (pset.cur_cmd_interactive) { - psql_error(%s, PQerrorMessage(n_conn)); + if (n_conn) +psql_error(%s, PQerrorMessage(n_conn)); /* pset.db is left unmodified */ if (o_conn) @@ -1608,7 +1662,9 @@ do_connect(char *dbname, char *user, char *host, char
Re: [HACKERS] [PATCH] Allow breaking out of hung connection attempts
On 08.01.2012 22:18, Ryan Kelly wrote: @@ -1570,7 +1570,13 @@ do_connect(char *dbname, char *user, char *host, char *port) keywords[7] = NULL; values[7] = NULL; - n_conn = PQconnectdbParams(keywords, values, true); + if (sigsetjmp(sigint_interrupt_jmp, 1) != 0) { + /* got here with longjmp */ + } else { + sigint_interrupt_enabled = true; + n_conn = PQconnectdbParams(keywords, values, true); + sigint_interrupt_enabled = false; + } free(keywords); free(values); That assumes that it's safe to longjmp out of PQconnectdbParams at any instant. It's not. I think you'd need to use the asynchronous connection functions PQconnectStartParams() and PQconnectPoll(), and select(). -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Allow breaking out of hung connection attempts
On Mon, Jan 09, 2012 at 10:35:50AM +0200, Heikki Linnakangas wrote: On 08.01.2012 22:18, Ryan Kelly wrote: @@ -1570,7 +1570,13 @@ do_connect(char *dbname, char *user, char *host, char *port) keywords[7] = NULL; values[7] = NULL; -n_conn = PQconnectdbParams(keywords, values, true); +if (sigsetjmp(sigint_interrupt_jmp, 1) != 0) { +/* got here with longjmp */ +} else { +sigint_interrupt_enabled = true; +n_conn = PQconnectdbParams(keywords, values, true); +sigint_interrupt_enabled = false; +} free(keywords); free(values); That assumes that it's safe to longjmp out of PQconnectdbParams at any instant. It's not. I'm guessing because it could result in a resource leak? I think you'd need to use the asynchronous connection functions PQconnectStartParams() and PQconnectPoll(), and select(). New patch attached. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -Ryan Kelly diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c index 69fac83..7c31891 100644 --- a/src/bin/psql/command.c +++ b/src/bin/psql/command.c @@ -1516,7 +1516,7 @@ static bool do_connect(char *dbname, char *user, char *host, char *port) { PGconn *o_conn = pset.db, - *n_conn; + *n_conn = NULL; char *password = NULL; if (!dbname) @@ -1570,14 +1570,69 @@ do_connect(char *dbname, char *user, char *host, char *port) keywords[7] = NULL; values[7] = NULL; - n_conn = PQconnectdbParams(keywords, values, true); + /* attempt connection asynchronously */ + n_conn = PQconnectStartParams(keywords, values, true); + + if (sigsetjmp(sigint_interrupt_jmp, 1) != 0) + { + /* interrupted during connection attempt */ + PQfinish(n_conn); + n_conn = NULL; + } else { + int waiting = true, select_ret; + struct timeval timeout; + fd_set input_mask; + + FD_ZERO(input_mask); + FD_SET(PQsocket(n_conn), input_mask); + + timeout.tv_sec = 0; + timeout.tv_usec = 50; /* 0.5 sec */ + + sigint_interrupt_enabled = true; + select_ret = select(PQsocket(n_conn) + 1, input_mask, NULL, NULL, timeout); + sigint_interrupt_enabled = false; + + /* don't even bother waiting, something wrong with the socket */ + if (select_ret 0 errno != EINTR) + { +waiting = false; + } + + while(waiting) + { +switch(PQconnectPoll(n_conn)) +{ + case PGRES_POLLING_READING: + case PGRES_POLLING_WRITING: + timeout.tv_sec = 0; + timeout.tv_usec = 50; + + sigint_interrupt_enabled = true; + select_ret = select(PQsocket(n_conn) + 1, input_mask, NULL, NULL, timeout); + sigint_interrupt_enabled = false; + + if (select_ret 0 errno != EINTR) + { + waiting = false; + } + break; + default: + waiting = false; + break; +} + } + } free(keywords); free(values); /* We can immediately discard the password -- no longer needed */ if (password) + { free(password); + password = NULL; + } if (PQstatus(n_conn) == CONNECTION_OK) break; @@ -1586,7 +1641,7 @@ do_connect(char *dbname, char *user, char *host, char *port) * Connection attempt failed; either retry the connection attempt with * a new password, or give up. */ - if (!password PQconnectionNeedsPassword(n_conn) pset.getPassword != TRI_NO) + if (PQconnectionNeedsPassword(n_conn) pset.getPassword != TRI_NO) { PQfinish(n_conn); password = prompt_for_password(user); @@ -1600,7 +1655,8 @@ do_connect(char *dbname, char *user, char *host, char *port) */ if (pset.cur_cmd_interactive) { - psql_error(%s, PQerrorMessage(n_conn)); + if (n_conn) +psql_error(%s, PQerrorMessage(n_conn)); /* pset.db is left unmodified */ if (o_conn) @@ -1608,7 +1664,9 @@ do_connect(char *dbname, char *user, char *host, char *port) } else { - psql_error(\\connect: %s, PQerrorMessage(n_conn)); + if (n_conn) +psql_error(\\connect: %s, PQerrorMessage(n_conn)); + if (o_conn) { PQfinish(o_conn); diff --git a/src/bin/psql/startup.c b/src/bin/psql/startup.c index 8b1864c..e53d84c 100644 --- a/src/bin/psql/startup.c +++ b/src/bin/psql/startup.c @@ -111,8 +111,6 @@ main(int argc, char *argv[]) setvbuf(stderr, NULL, _IONBF, 0); #endif - setup_cancel_handler(); - pset.progname = get_progname(argv[0]); pset.db = NULL; @@ -245,8 +243,9 @@ main(int argc, char *argv[]) } /* - * Now find something to do + * Now find something to do (and handle cancellation, if applicable) */ + setup_cancel_handler(); /* * process file given by -f -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] [PATCH] Allow breaking out of hung connection attempts
When attempting to connect to a non-existent host with psql, the connection will hang and ^C will not break the attempt. This affects two places: startup and in interactive mode with \c. During startup, the new behavior will be to exit psql. In interactive mode, the new behavior will be to return to the interactive shell. --- src/bin/psql/command.c | 17 + src/bin/psql/startup.c |5 ++--- 2 files changed, 15 insertions(+), 7 deletions(-) diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c index 69fac83..74e406b 100644 --- a/src/bin/psql/command.c +++ b/src/bin/psql/command.c @@ -1516,7 +1516,7 @@ static bool do_connect(char *dbname, char *user, char *host, char *port) { PGconn *o_conn = pset.db, - *n_conn; + *n_conn = NULL; char *password = NULL; if (!dbname) @@ -1570,7 +1570,13 @@ do_connect(char *dbname, char *user, char *host, char *port) keywords[7] = NULL; values[7] = NULL; - n_conn = PQconnectdbParams(keywords, values, true); + if (sigsetjmp(sigint_interrupt_jmp, 1) != 0) { + /* got here with longjmp */ + } else { + sigint_interrupt_enabled = true; + n_conn = PQconnectdbParams(keywords, values, true); + sigint_interrupt_enabled = false; + } free(keywords); free(values); @@ -1600,7 +1606,8 @@ do_connect(char *dbname, char *user, char *host, char *port) */ if (pset.cur_cmd_interactive) { - psql_error(%s, PQerrorMessage(n_conn)); + if (n_conn) + psql_error(%s, PQerrorMessage(n_conn)); /* pset.db is left unmodified */ if (o_conn) @@ -1608,7 +1615,9 @@ do_connect(char *dbname, char *user, char *host, char *port) } else { - psql_error(\\connect: %s, PQerrorMessage(n_conn)); + if (n_conn) + psql_error(\\connect: %s, PQerrorMessage(n_conn)); + if (o_conn) { PQfinish(o_conn); diff --git a/src/bin/psql/startup.c b/src/bin/psql/startup.c index 8b1864c..e53d84c 100644 --- a/src/bin/psql/startup.c +++ b/src/bin/psql/startup.c @@ -111,8 +111,6 @@ main(int argc, char *argv[]) setvbuf(stderr, NULL, _IONBF, 0); #endif - setup_cancel_handler(); - pset.progname = get_progname(argv[0]); pset.db = NULL; @@ -245,8 +243,9 @@ main(int argc, char *argv[]) } /* -* Now find something to do +* Now find something to do (and handle cancellation, if applicable) */ + setup_cancel_handler(); /* * process file given by -f -- 1.7.7.4 Previously, these changes were submitted to -bugs: http://archives.postgresql.org/pgsql-bugs/2012-01/msg00030.php http://archives.postgresql.org/pgsql-bugs/2012-01/msg00036.php Though, I think that might not have been the correct forum? I still could be wrong with -hackers, so just let me know where to go with this if I'm still off-base. -Ryan Kelly -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers