Andres Freund <> writes:
>> How would postmaster know when to restart a worker that stopped?
> I had imagined we would assign some return codes special
> meaning. Currently 0 basically means "restart immediately", 1 means
> "crashed, wait for some time", everything else results in a postmaster
> restart. It seems we can just assign returncode 2 as "done", probably
> with some enum or such hiding the numbers.
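
To make the quoted proposal concrete, here is a rough sketch of the
return code mapping as I read it. The names WorkerExit and
postmaster_action are made up for illustration, and code 2 meaning
"done" is only the proposal above, not existing behaviour.

```python
from enum import IntEnum


class WorkerExit(IntEnum):
    """Hypothetical names hiding the numeric return codes."""
    RESTART_NOW = 0   # currently: "restart immediately"
    CRASHED = 1       # currently: "crashed, wait for some time"
    DONE = 2          # proposed: worker is done, do not restart it


def postmaster_action(returncode):
    """Map a worker return code to what the postmaster would do."""
    if returncode == WorkerExit.RESTART_NOW:
        return "restart worker immediately"
    if returncode == WorkerExit.CRASHED:
        return "restart worker after a delay"
    if returncode == WorkerExit.DONE:
        return "do not restart worker"
    # Everything else results in a postmaster restart.
    return "restart postmaster"
```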

In Erlang, the library that takes care of such things is called OTP,
and it proposes a supervisor model that knows when to restart a worker.
The specs for the restart behaviour are:

  Restart = permanent | transient | temporary

Restart defines when a terminated child process should be restarted.

  - A permanent child process is always restarted.

  - A temporary child process is never restarted (not even when the
    supervisor's restart strategy is rest_for_one or one_for_all and a
    sibling's death causes the temporary process to be terminated).

  - A transient child process is restarted only if it terminates
    abnormally, i.e. with another exit reason than normal, shutdown or
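
The three restart types above boil down to a small decision function.
What follows is only a sketch in Python, not OTP code: should_restart
and CLEAN_REASONS are names I made up, and since the quoted text is cut
off after "shutdown or", the set of clean exit reasons here is an
assumption and may be incomplete.

```python
from enum import Enum


class Restart(Enum):
    PERMANENT = "permanent"
    TRANSIENT = "transient"
    TEMPORARY = "temporary"


# Exit reasons treated as a normal stop. Assumption: only the two
# reasons that survive in the quoted text are listed here.
CLEAN_REASONS = {"normal", "shutdown"}


def should_restart(policy, exit_reason):
    """Return True if a terminated child should be restarted."""
    if policy is Restart.PERMANENT:
        return True            # always restarted
    if policy is Restart.TEMPORARY:
        return False           # never restarted
    # Transient: restarted only after an abnormal termination.
    return exit_reason not in CLEAN_REASONS
```

For example, should_restart(Restart.TRANSIENT, "normal") is False, while
the same call with a crash reason is True.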

Then about restart frequency, what they have is:

    The supervisors have a built-in mechanism to limit the number of
    restarts which can occur in a given time interval. This is
    determined by the values of the two parameters MaxR and MaxT in the
    start specification returned by the callback function [ ... ]

    If more than MaxR number of restarts occur in the last MaxT seconds,
    then the supervisor terminates all the child processes and then
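
The MaxR/MaxT rule quoted above can be sketched as a sliding window
over restart timestamps. This is a minimal illustration, not how OTP
implements it: the RestartLimiter class is hypothetical, and whether a
restart exactly MaxT seconds old still counts is my own choice here.

```python
from collections import deque


class RestartLimiter:
    """Track restarts and report when more than max_r happen in max_t seconds."""

    def __init__(self, max_r, max_t):
        self.max_r = max_r
        self.max_t = max_t
        self._times = deque()   # timestamps of recent restarts, oldest first

    def record_restart(self, now):
        """Record a restart at time `now` (in seconds).

        Returns True while the restart rate is within limits, False once
        more than max_r restarts fall inside the last max_t seconds.
        """
        self._times.append(now)
        # Forget restarts that are older than the window.
        while self._times and now - self._times[0] > self.max_t:
            self._times.popleft()
        return len(self._times) <= self.max_r
```

With RestartLimiter(max_r=2, max_t=10), restarts at t=0 and t=1 are
fine and a third one at t=2 trips the limit, at which point the
supervisor would give up on its children.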

You can read the whole thing here:

I think we should get some inspiration from them here.

Dimitri Fontaine     PostgreSQL : Expertise, Formation et Support
