Re: [GENERAL] libpq: indefinite block on poll during network problems

Albe Laurenz Tue, 27 May 2014 03:37:25 -0700

Dmitry Samonenko wrote:
> I have an application which uses libpq to interact with a remote PostgreSQL
> 9.2.4 server. Client and server nodes run Linux, and connections are
> established over TCP/IPv4. The client application has some small
> fault-tolerance features, which are activated when server-related problems
> are encountered.
> 
> One day some bad things happened to the network hardware and, long story
> short, the host with the PostgreSQL server got isolated. TCP messages routed
> to the server node were NOT delivered or acknowledged in any way. According
> to the debugger, the client application was blocked in libpq code.
> 
> I have successfully reproduced the problem in a laboratory environment.
> These iptables commands should be run on the server node after some period
> of client <-> server interaction:
> 
> # iptables -A OUTPUT -p tcp --sport 5432 -j DROP
> # iptables -A INPUT  -p tcp --dport 5432 -j DROP
> 
> 
> I glanced over the master branch of the libpq sources and some questions
> arose. Namely:
> 
> 1. The connection to the PostgreSQL server is made without an option to
> specify SO_RCVTIMEO and SO_SNDTIMEO. Why is that? Is setting socket timeouts
> considered harmful?
> 
> 2. PQexec ultimately leads to pqWait, which after some function calls "lands"
> in pqSocketCheck and pqSocketPoll. These two functions take an end_time
> parameter, which is set to -1 in the PQexec scenario and leads to an infinite
> poll timeout in pqSocketPoll. Is it possible to implement a configurable
> timeout for PQexec calls? Are there any existing features that should be used
> to handle situations like this?
> 
> Currently, I have changed the Linux kernel's TCPv4 retransmission counters,
> so the OS actually closes the socket after some period. This is detected by
> the poll call in pqSocketPoll, and libpq handles the situation correctly: an
> error is reported to my application. But it's just a workaround.
> 
> So, this infinite poll looks like an imperfection to me, and I think it
> should be considered a bug. Is it?


In PostgreSQL you can handle the problem of dying connections by setting the
tcp_keepalives_* parameters (see 
http://www.postgresql.org/docs/current/static/runtime-config-connection.html).

That should take care of the problem, right?
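
As a rough illustration of what such keepalive settings do at the socket
level, the sketch below enables keepalive probes on a TCP socket using the
Linux option names TCP_KEEPIDLE, TCP_KEEPINTVL and TCP_KEEPCNT. The helper
name enable_keepalive and the example values are assumptions, not PostgreSQL
code. On the client side, libpq accepts the connection parameters keepalives,
keepalives_idle, keepalives_interval and keepalives_count for the same
purpose.

```c
#include <assert.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/*
 * Hypothetical helper (Linux option names assumed).
 * idle_s:     seconds of idleness before the first probe
 * interval_s: seconds between unanswered probes
 * count:      unanswered probes before the connection is dropped
 * Returns 0 on success, -1 on failure with errno set by setsockopt().
 */
static int enable_keepalive(int fd, int idle_s, int interval_s, int count)
{
    int on = 1;

    assert(fd >= 0);
    if (setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof on) != 0)
        return -1;
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPIDLE, &idle_s, sizeof idle_s) != 0)
        return -1;
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPINTVL, &interval_s,
                   sizeof interval_s) != 0)
        return -1;
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPCNT, &count, sizeof count) != 0)
        return -1;
    return 0;
}
```

With, say, enable_keepalive(fd, 30, 5, 3), an idle connection whose peer has
vanished would be reported as broken after roughly 30 + 3 * 5 seconds.
Keepalive probes are sent only while the connection is otherwise idle.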

Yours,
Laurenz Albe

