Re: closesocket behavior in different platforms
On Thu, Jan 30, 2020 at 9:04 AM Amit Kapila wrote: > > On Wed, Jan 29, 2020 at 8:29 PM Robert Haas wrote: > > > > On Wed, Jan 29, 2020 at 6:04 AM vignesh C wrote: > > > Thanks for your review and suggestion. I have made a patch based on > > > similar lines. Attached patch has the doc update with the explanation. > > > Thoughts? > > > > Documenting this doesn't seem very useful to me. > > > > I thought of documenting it because this has been reported/discussed > multiple times (see some of the links of discussions at the end of the > first email) and every time we need to spend time explaining the same > thing. However, if we decide not to do that I am fine with it. > Does anybody else have any opinion on whether it makes sense to document this behavior? To summarize for others, the behavior difference as noted by Vignesh in his patch is: + + + You will get server closed the connection unexpectedly message while + trying to execute sql command on disconnected connection. The message is + slightly different in windows and non-windows. In non-windows, you will + see a FATAL message before the error message: + +FATAL: terminating connection due to idle-in-transaction timeout +server closed the connection unexpectedly +This probably means the server terminated abnormally +before or while processing the request. +The connection to the server was lost. Attempting reset: Succeeded. + + In windows, you might not see the FATAL message: + +server closed the connection unexpectedly +This probably means the server terminated abnormally +before or while processing the request. +The connection to the server was lost. Attempting reset: Succeeded. We have spent a decent amount of time on this and it is due to windows API behaving differently. By documenting, we might avoid the future effort of explaining this to users. -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
Re: closesocket behavior in different platforms
On Wed, Jan 29, 2020 at 8:29 PM Robert Haas wrote: > > On Wed, Jan 29, 2020 at 6:04 AM vignesh C wrote: > > Thanks for your review and suggestion. I have made a patch based on > > similar lines. Attached patch has the doc update with the explanation. > > Thoughts? > > Documenting this doesn't seem very useful to me. > I thought of documenting it because this has been reported/discussed multiple times (see some of the links of discussions at the end of the first email) and every time we need to spend time explaining the same thing. However, if we decide not to do that I am fine with it. > If we could fix the > code, that would be useful, but otherwise I think I'd just do nothing. > Yeah, that is our first choice as well, but there doesn't seem to be a good solution to it as this is a platform-specific behavior. -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
Re: closesocket behavior in different platforms
On Wed, Jan 29, 2020 at 6:04 AM vignesh C wrote: > Thanks for your review and suggestion. I have made a patch based on > similar lines. Attached patch has the doc update with the explanation. > Thoughts? Documenting this doesn't seem very useful to me. If we could fix the code, that would be useful, but otherwise I think I'd just do nothing. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: closesocket behavior in different platforms
On Wed, Jan 29, 2020 at 4:34 PM vignesh C wrote: > On Tue, Jan 21, 2020 at 11:22 AM Amit Kapila > wrote: > > > > On Fri, Dec 6, 2019 at 11:24 AM vignesh C wrote: > > > [...] > > Thanks for your review and suggestion. I have made a patch based on > similar lines. Attached patch has the doc update with the explanation. > Thoughts? > Hi Vignesh, I have looked into the patch, realised that some format tagging and the grammar changes are needed. Commented inline below: + + + You will get server closed the connection unexpectedly message while The error message is usually wrapped inside the tag. + trying to execute sql command on disconnected connection. The message is Also, s/disconnected connection/the disconnected connection + slightly different in windows and non-windows. In non-windows, you will s/in windows/on Windows s/In non-windows/On non-windows + see a FATAL message before the error message: How about : On non-window you'll see a fatal error as below. + +FATAL: terminating connection due to idle-in-transaction timeout +server closed the connection unexpectedly +This probably means the server terminated abnormally +before or while processing the request. +The connection to the server was lost. Attempting reset: Succeeded. + + In windows, you might not see the FATAL message: s/In windows /On Windows s/FATAL message/fatal error + +server closed the connection unexpectedly +This probably means the server terminated abnormally +before or while processing the request. +The connection to the server was lost. Attempting reset: Succeeded. + + This message "FATAL: terminating connection due to idle-in-transaction Usually for doubt-quoting is used. Here think, we should remove FATA L and wrap the error message text inside tag. + timeout" that is sent from server will not be displayed in windows, How about : that is sent from the server will not be displayed on windows. + however it will be present in the log file. The reason for this is, in s/however/However s/in/on + windows the client cannot receive the message sent by the server when the s/windows/Windows or Windows + server has closed the client connection. This behavior can be noticed when + the client connection has been disconnected because of + idle_in_transaction_session_timeout, pg_terminate_backend, pg_ctl kill + TERM and drop database with (force). s/idle_in_transaction_session_timeout/ s/pg_terminate_backend/pg_terminate_backend() s/pg_ctl kill TERM/DROP DATABASE ... WITH ( FORCE ) Regards, Amul
Re: closesocket behavior in different platforms
On Tue, Jan 21, 2020 at 11:22 AM Amit Kapila wrote: > > On Fri, Dec 6, 2019 at 11:24 AM vignesh C wrote: > > > > It is noticed that in all the 4 cases the message "FATAL: terminating > > connection due to administrator command" does not appear in windows. > > > > However the following message is present in the server log file: > > FATAL: terminating connection due to administrator command > > > > The reason for this looks like: > > When the server closes a connection, it sends the ErrorResponse packet, and > > then closes the socket and terminates the backend process. If the packet is > > received before the server closes the connection, the error message is > > received in both windows and linux. If the packet is not received before > > the server closes the connection, the error message is not received in case > > of windows where as in linux it is received. > > > > There have been a couple of discussion earlier also on this [1] & [2], but > > we could not find any alternate solution. > > > > One of the options that msdn suggests in [3] is to use SO_LINGER option, we > > had tried this option with no luck in solving. One other thing that we had > > tried was to sleep for 1 second before closing the socket, this solution > > works if the client is active, whereas in case of inactive clients it does > > not solves the problem. One other thought that we had was to simultaneously > > check the connection from psql, when we are waiting for query input in > > gets_interactive function or have a separate thread to check the connection > > status periodically, this might work only in case of psql but will not work > > for application which uses libpq. Amit had also suggested one solution in > > [4], where he proposed 'I have also tried calling closesocket() explicitly > > in our function socket_close which has changed the error message to "could > > not receive data from server: Software caused connection abort > > (0x2745/10053)".' > > > > Based on previous investigation and information in this email, I don't > see anything we can do about this. > > > Should we add some documentation for the above behavior. > > > > That sounds reasonable to me. Any proposal for the same? One idea > could be to add something like "Client Disconnection Problems" after > the "Client Connection Problems" section in docs [1]. > Thanks for your review and suggestion. I have made a patch based on similar lines. Attached patch has the doc update with the explanation. Thoughts? Regards, Vignesh EnterpriseDB: http://www.enterprisedb.com From e8fc0f70c4bb0bdcff8edc748ae7e04a0fcbf7b4 Mon Sep 17 00:00:00 2001 From: vignesh Date: Wed, 29 Jan 2020 16:23:51 +0530 Subject: [PATCH] Doc update for the server message not being displayed when the client connection is closed by the server. Doc update for the server message not being displayed when the client connection is closed by the server in case of idle_in_transaction_session_timeout, pg_terminate_backend, pg_ctl kill TERM and drop database with (force). When the server closes client connection, it sends the ErrorResponse packet, and then closes the socket and terminates the backend process. If the packet is received before the server closes the connection, the error message is received in both windows and linux. If the packet is not received before the server closes the connection, the error message is not received in case of windows where as in non-windows it is received. --- doc/src/sgml/runtime.sgml | 30 ++ 1 file changed, 30 insertions(+) diff --git a/doc/src/sgml/runtime.sgml b/doc/src/sgml/runtime.sgml index a34d31d..3a0abd7 100644 --- a/doc/src/sgml/runtime.sgml +++ b/doc/src/sgml/runtime.sgml @@ -627,6 +627,36 @@ psql: could not connect to server: No such file or directory indicate more fundamental problems, like lack of network connectivity. + + + You will get server closed the connection unexpectedly message while + trying to execute sql command on disconnected connection. The message is + slightly different in windows and non-windows. In non-windows, you will + see a FATAL message before the error message: + +FATAL: terminating connection due to idle-in-transaction timeout +server closed the connection unexpectedly +This probably means the server terminated abnormally +before or while processing the request. +The connection to the server was lost. Attempting reset: Succeeded. + + In windows, you might not see the FATAL message: + +server closed the connection unexpectedly +This probably means the server terminated abnormally +before or while processing the request. +The connection to the server was lost. Attempting reset: Succeeded. + + This message "FATAL: terminating connection due to idle-in-transaction + timeout" that is sent from server will not be displayed in windows, + however it will be present in the log file. The reason
Re: closesocket behavior in different platforms
On Fri, Dec 6, 2019 at 11:24 AM vignesh C wrote: > > It is noticed that in all the 4 cases the message "FATAL: terminating > connection due to administrator command" does not appear in windows. > > However the following message is present in the server log file: > FATAL: terminating connection due to administrator command > > The reason for this looks like: > When the server closes a connection, it sends the ErrorResponse packet, and > then closes the socket and terminates the backend process. If the packet is > received before the server closes the connection, the error message is > received in both windows and linux. If the packet is not received before the > server closes the connection, the error message is not received in case of > windows where as in linux it is received. > > There have been a couple of discussion earlier also on this [1] & [2], but we > could not find any alternate solution. > > One of the options that msdn suggests in [3] is to use SO_LINGER option, we > had tried this option with no luck in solving. One other thing that we had > tried was to sleep for 1 second before closing the socket, this solution > works if the client is active, whereas in case of inactive clients it does > not solves the problem. One other thought that we had was to simultaneously > check the connection from psql, when we are waiting for query input in > gets_interactive function or have a separate thread to check the connection > status periodically, this might work only in case of psql but will not work > for application which uses libpq. Amit had also suggested one solution in > [4], where he proposed 'I have also tried calling closesocket() explicitly in > our function socket_close which has changed the error message to "could not > receive data from server: Software caused connection abort > (0x2745/10053)".' > Based on previous investigation and information in this email, I don't see anything we can do about this. > Should we add some documentation for the above behavior. > That sounds reasonable to me. Any proposal for the same? One idea could be to add something like "Client Disconnection Problems" after the "Client Connection Problems" section in docs [1]. Anybody else has any better suggestions on this topic? [1] - https://www.postgresql.org/docs/devel/server-start.html -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
closesocket behavior in different platforms
Hi, In few scenarios the message displayed in psql console is not consistent in windows and linux. The execution results from few scenarios in windows and linux is listed below: In CentOS *After transaction idle timeout*postgres=# SET idle_in_transaction_session_timeout=300; SET postgres=# BEGIN; BEGIN postgres=# SELECT * FROM pg_class; FATAL: terminating connection due to idle-in-transaction timeout server closed the connection unexpectedly This probably means the server terminated abnormally before or while processing the request. The connection to the server was lost. Attempting reset: Succeeded. *After pg_terminate_backend from another session:*postgres=# select * from dual; FATAL: terminating connection due to administrator command server closed the connection unexpectedly This probably means the server terminated abnormally before or while processing the request. The connection to the server was lost. Attempting reset: Succeeded. *Similarly in pg_ctl kill TERM and drop database with (force).* In Windows *After transaction idle timeout*postgres=# set idle_in_transaction_session_timeout=300; SET postgres=# begin; BEGIN postgres=# select * from dual postgres-# ; server closed the connection unexpectedly This probably means the server terminated abnormally before or while processing the request. The connection to the server was lost. Attempting reset: Succeeded. *After pg_terminate_backend from another session:*postgres=# select * from dual; server closed the connection unexpectedly This probably means the server terminated abnormally before or while processing the request. The connection to the server was lost. Attempting reset: Succeeded. *Similarly in pg_ctl kill TERM and drop database with (force).*There may be some more scenarios which I'm missing. It is noticed that in all the 4 cases the message "FATAL: terminating connection due to administrator command" does not appear in windows. However the following message is present in the server log file: FATAL: terminating connection due to administrator command The reason for this looks like: When the server closes a connection, it sends the ErrorResponse packet, and then closes the socket and terminates the backend process. If the packet is received before the server closes the connection, the error message is received in both windows and linux. If the packet is not received before the server closes the connection, the error message is not received in case of windows where as in linux it is received. There have been a couple of discussion earlier also on this [1] & [2], but we could not find any alternate solution. One of the options that msdn suggests in [3] is to use SO_LINGER option, we had tried this option with no luck in solving. One other thing that we had tried was to sleep for 1 second before closing the socket, this solution works if the client is active, whereas in case of inactive clients it does not solves the problem. One other thought that we had was to simultaneously check the connection from psql, when we are waiting for query input in gets_interactive function or have a separate thread to check the connection status periodically, this might work only in case of psql but will not work for application which uses libpq. Amit had also suggested one solution in [4], where he proposed 'I have also tried calling closesocket() explicitly in our function socket_close which has changed the error message to "could not receive data from server: Software caused connection abort (0x2745/10053)".' Or Should we add some documentation for the above behavior. Thoughts? [1] https://www.postgresql.org/message-id/flat/CA%2BhUKGJowQypXSKsjws9A%2BnEQDD0-mExHZqFXtJ09N209rCO5A%40mail.gmail.com#0629f079bc59ecdaa0d6ac9f8f2c18ac [2] https://www.postgresql.org/message-id/87k1iy44fd@news-spur.riddles.org.uk [3] https://docs.microsoft.com/en-us/windows/win32/api/winsock/nf-winsock-closesocket [4] https://www.postgresql.org/message-id/CAA4eK1%2BGNyjaPK77y%2Beuh5eAgM75pncG1JYZhxYZF%2BSgS6NpjA%40mail.gmail.com Regards, Vignesh EnterpriseDB: http://www.enterprisedb.com