Re: [HACKERS] Determine if an error is transient by its error code.
Craig Ringer writes:
> On 20 March 2017 at 10:26, Dominick O'Dierno wrote:
>> Essentially I want to determine by the error code if it is worth retrying
>> the call (transient) or if the error was due to a bad query or programmer
>> error, in which case don't retry.

> In general you'll need classes of retry:
> * just reissue the query (deadlock retry, etc)
> * reconnect and retry

Yeah. There's a pretty significant fraction of these where just blindly
repeating the failing query isn't likely to help; the error code is meant
to suggest that the DBA has to fix something, eg adjust configuration
limits. I'm also pretty dubious about the value of a blind retry for,
eg, disk_full.

One you missed that I think *is* supposed to imply "just retry" is
40001 serialization_failure. You have to retry the whole transaction
though, not just one query.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
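A minimal sketch of the "retry the whole transaction" point for 40001 serialization_failure, in Python. Everything named here is an assumption for illustration: `DbError` is a hypothetical stand-in for whatever exception a driver raises with a SQLSTATE attached, `run_transaction` is a made-up helper, and `txn` is assumed to be a callable that performs BEGIN, all statements, and COMMIT in one go. 40P01 deadlock_detected is included on the assumption that the "deadlock retry" case is handled the same way.

```python
import time


class DbError(Exception):
    """Hypothetical stand-in for a driver error that exposes a SQLSTATE."""

    def __init__(self, sqlstate):
        super().__init__(sqlstate)
        self.sqlstate = sqlstate


# SQLSTATEs assumed here to mean "rerun the whole transaction".
RETRY_WHOLE_TRANSACTION = {
    "40001",  # serialization_failure
    "40P01",  # deadlock_detected
}


def run_transaction(txn, max_attempts=3, base_delay=0.0):
    """Call txn() until it succeeds or the attempts run out.

    txn must redo the entire transaction each time it is called,
    not just the statement that failed.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return txn()
        except DbError as exc:
            retryable = exc.sqlstate in RETRY_WHOLE_TRANSACTION
            if not retryable or attempt == max_attempts:
                raise
            # Simple linear backoff before the next attempt.
            time.sleep(base_delay * attempt)
```

Any other SQLSTATE, disk_full for example, is re-raised on the first failure rather than retried blindly.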
Re: [HACKERS] Determine if an error is transient by its error code.
On 20 March 2017 at 10:26, Dominick O'Dierno wrote:
> Hello folks,
>
> I'm trying to define a transient fault detection strategy for a client
> application when calling a postgres database.
>
> Essentially I want to determine by the error code if it is worth retrying
> the call (transient) or if the error was due to a bad query or programmer
> error, in which case don't retry.
>
> Going through the codes as posted here
> https://www.postgresql.org/docs/9.6/static/errcodes-appendix.html I had a go
> at making a list of error codes which may be transient:
>
> 53000: insufficient_resources
> 53100: disk_full
> 53200: out_of_memory
> 53300: too_many_connections
> 53400: configuration_limit_exceeded
> 57000: operator_intervention
> 57014: query_canceled
> 57P01: admin_shutdown
> 57P02: crash_shutdown
> 57P03: cannot_connect_now
> 57P04: database_dropped
> 58000: system_error
> 58030: io_error

Depends on how transient you mean, really.

I/O error, disk full, cannot_connect_now, etc may or may not require
admin intervention.

I would argue that database_dropped isn't transient. But I guess you
might be re-creating it?

> These next few I am not sure whether they should be treated as transient or
> not, but I am guessing so
>
> 55P03: lock_not_available

Yeah, I'd say that's transient.

> 55006: object_in_use

Same.

> 55000: object_not_in_prerequisite_state

Varies. This can be a bit of a catchall error, encompassing things
that need configuration changes, things that need system state changes
(won't work in recovery or whatever), and things that will change in a
short span of time.

In general you'll need classes of retry:

* just reissue the query (deadlock retry, etc)
* reconnect and retry

etc.

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
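The "classes of retry" idea could be sketched as a lookup from SQLSTATE to a retry class, shown here in Python. The assignments are assumptions drawn loosely from the comments in this message, not an authoritative mapping: `RetryClass` and `classify` are hypothetical names, the shutdown and cannot_connect_now codes are assumed to need a reconnect, 08007 is assumed unsafe to rerun because the outcome of the commit is unknown, and anything unlisted defaults to no blind retry.

```python
from enum import Enum


class RetryClass(Enum):
    REISSUE = "reissue on the same connection"
    RECONNECT_AND_RETRY = "open a new connection, then retry"
    NO_BLIND_RETRY = "report it; something probably needs fixing first"


# Assumed per-code assignments; adjust to the application's needs.
_EXACT = {
    "40P01": RetryClass.REISSUE,              # deadlock_detected
    "55P03": RetryClass.REISSUE,              # lock_not_available
    "55006": RetryClass.REISSUE,              # object_in_use
    "57P01": RetryClass.RECONNECT_AND_RETRY,  # admin_shutdown
    "57P02": RetryClass.RECONNECT_AND_RETRY,  # crash_shutdown
    "57P03": RetryClass.RECONNECT_AND_RETRY,  # cannot_connect_now
    "57P04": RetryClass.NO_BLIND_RETRY,       # database_dropped
    # Assumption: the commit may or may not have happened, so do not rerun.
    "08007": RetryClass.NO_BLIND_RETRY,       # transaction_resolution_unknown
}


def classify(sqlstate):
    """Map a five-character SQLSTATE to an assumed retry class."""
    if sqlstate in _EXACT:
        return _EXACT[sqlstate]
    if sqlstate.startswith("08"):
        # Remaining connection_exception codes: assume the session is gone.
        return RetryClass.RECONNECT_AND_RETRY
    # Includes 55000, 53xxx and 58xxx, which may need admin intervention.
    return RetryClass.NO_BLIND_RETRY
```

For example, `classify("08006")` falls into the reconnect class, while `classify("53100")` (disk_full) and `classify("55000")` both land in `NO_BLIND_RETRY`, since neither is reliably fixed by waiting.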
[HACKERS] Determine if an error is transient by its error code.
Hello folks,

I'm trying to define a transient fault detection strategy for a client
application when calling a postgres database.

Essentially I want to determine by the error code if it is worth retrying
the call (transient) or if the error was due to a bad query or programmer
error, in which case don't retry.

Going through the codes as posted here
https://www.postgresql.org/docs/9.6/static/errcodes-appendix.html I had a go
at making a list of error codes which may be transient:

53000: insufficient_resources
53100: disk_full
53200: out_of_memory
53300: too_many_connections
53400: configuration_limit_exceeded
57000: operator_intervention
57014: query_canceled
57P01: admin_shutdown
57P02: crash_shutdown
57P03: cannot_connect_now
57P04: database_dropped
58000: system_error
58030: io_error

These next few I am not sure whether they should be treated as transient or
not, but I am guessing so

55P03: lock_not_available
55006: object_in_use
55000: object_not_in_prerequisite_state
08000: connection_exception
08003: connection_does_not_exist
08006: connection_failure
08001: sqlclient_unable_to_establish_sqlconnection
08004: sqlserver_rejected_establishment_of_sqlconnection
08007: transaction_resolution_unknown

Are there any codes listed above where retrying would actually not be
helpful?

Are there any codes that I did not include that I should have?

Thanks,
-Dominick