Hi,
On 2015-01-05 17:16:43 +0900, Michael Paquier wrote:
> + <varlistentry id="restore-command-retry-interval"
> xreflabel="restore_command_retry_interval">
> + <term><varname>restore_command_retry_interval</varname>
> (<type>integer</type>)
> + <indexterm>
> + <primary><varname>restore_command_retry_interval</> recovery
> parameter</primary>
> + </indexterm>
> + </term>
> + <listitem>
> + <para>
> + If <varname>restore_command</> returns nonzero exit status code,
> retry
> + command after the interval of time specified by this parameter,
> + measured in milliseconds if no unit is specified. Default value is
> + <literal>5s</>.
> + </para>
> + <para>
> + This command is particularly useful for warm standby configurations
> + to leverage the amount of retries done to restore a WAL segment file
> + depending on the activity of a master node.
> + </para>
Don't really like this paragraph. What's "leveraging the amount of
retries done" supposed to mean?
> +# restore_command_retry_interval
> +#
> +# specifies an optional retry interval for restore_command command, if
> +# previous command returned nonzero exit status code. This can be useful
> +# to increase or decrease the number of calls of restore_command.
It's not really the number but frequency, right?
> + else if (strcmp(item->name, "restore_command_retry_interval")
> == 0)
> + {
> + const char *hintmsg;
> +
> + if (!parse_int(item->value,
> &restore_command_retry_interval, GUC_UNIT_MS,
> + &hintmsg))
> + ereport(ERROR,
> +
> (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
> + errmsg("parameter \"%s\"
> requires a temporal value",
> +
> "restore_command_retry_interval"),
> + hintmsg ? errhint("%s",
> _(hintmsg)) : 0));
"temporal value" sounds odd to my ears. I'd rather keep in line with
what guc.c outputs in such a case.
> + now = GetCurrentTimestamp();
> + if
> (!TimestampDifferenceExceeds(last_fail_time, now,
> +
> restore_command_retry_interval))
> {
> - pg_usleep(1000000L * (5 - (now
> - last_fail_time)));
> - now = (pg_time_t) time(NULL);
> + long secs;
> + int
> microsecs;
> +
> +
> TimestampDifference(last_fail_time, now, &secs, µsecs);
> +
> pg_usleep(restore_command_retry_interval * 1000L - (1000000L * secs + 1L *
> microsecs));
> + now = GetCurrentTimestamp();
> }
> last_fail_time = now;
> currentSource = XLOG_FROM_ARCHIVE;
Am I missing something or does this handle/affect streaming failures
just the same? That might not be a bad idea, because the current
interval is far too long for e.g. a sync replication setup. But it
certainly makes the name a complete misnomer.
Not this patch's fault, but I'm getting a bit tired seing the above open
coded. How about adding a function that does the sleeping based on a
timestamptz and a ms interval?
Greetings,
Andres Freund
--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
--
Sent via pgsql-hackers mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers