On Thu, Feb 04, 2010 at 10:19:40AM +0900, Tatsuo Ishii wrote:
> > I noticed that if i take one of the replication nodes offline, and if
> > the failure is noticed from a client transaction instead of a health
> > check, the failure propagates up to the client and causes an exception
> > (which in turn would likely cause a 500 Internal Server Error for
> > a visitor to a postgres-backed webapp).  Shouldn't it be the case that
> > the error results in the silent (to the frontend, anyway) degradation
> > of the node and another attempt at the transmission of the request?
> 
> Yeah, it's desirable but pretty hard to implement.
> 
> For example, consider a load balanced SELECT. If a node goes offline
> while the SELECT result is being returned to the frontend, pgpool needs
> to remember all the rows it has already received and restart the
> transmission from that point against a different node.
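To make the bookkeeping concrete, here is a rough sketch of what resuming a half-streamed result on another node would involve. All names (`stream_select`, `node.execute`, `send_to_client`) are hypothetical illustrations, not pgpool internals:

```python
# Hypothetical sketch: resume a partially-streamed SELECT on another node.
# The proxy must count every row already forwarded to the frontend and
# skip that many rows when replaying the query elsewhere.

def stream_select(query, nodes, send_to_client):
    rows_sent = 0                       # rows the frontend already has
    for node in nodes:                  # preferred node first, then fallbacks
        try:
            for i, row in enumerate(node.execute(query)):
                if i < rows_sent:       # skip rows already delivered
                    continue
                send_to_client(row)
                rows_sent += 1
            return rows_sent            # completed without error
        except ConnectionError:
            continue                    # replay remaining rows on next node
    raise ConnectionError("all nodes failed")
```

Note that this only works if every node returns rows in the same order, which a plain SELECT without ORDER BY does not guarantee -- one more reason this is hard in practice.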

Indeed, that'd be pretty complicated to do.  But i think there are some
specific cases that would be less complicated to catch:

 * a SELECT is load balanced to a node that is offline (i.e. the write
   to the backend with the query packet fails before any response is
   received).
 * an UPDATE or INSERT succeeds on the master node but fails on one or more
   replication nodes (i.e. they are detected as out of sync and failed).
 * an UPDATE or INSERT fails on the master node before any response is
   received (similar to the first case), such that the "mastership" is
   transferred to another node.

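The common thread in all three cases is that the failure is detected before any response bytes have reached the frontend, so a retry against another node is invisible to the client. A minimal sketch of that rule (hypothetical names, not pgpool's actual code):

```python
# Hypothetical sketch: retry on another node only when the failure
# happens before anything has been sent back to the client, and
# silently degrade the dead node rather than surfacing the error.

def run_with_failover(query, nodes):
    for node in nodes:
        try:
            return node.execute(query)   # no bytes sent yet: safe to retry
        except ConnectionError:
            node.mark_degraded()         # detach the node, hide the failure
            continue
    raise ConnectionError("no healthy node available")
```

Once even one row has gone out, this shortcut no longer applies and the harder resume-from-offset problem above takes over.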
I'd think that in all of these cases it wouldn't be unreasonable to
transparently catch the failure and hide it from the frontend; what do you think?


        sean


_______________________________________________
Pgpool-hackers mailing list
[email protected]
http://pgfoundry.org/mailman/listinfo/pgpool-hackers
