Hello.
Thanks, I understand now that I was looking at the wrong part of the code.
I hope the changes are right this time.
To avoid modifying the current pg_usleep calculation, I changed
restore_command_retry_interval to be expressed in seconds (not milliseconds).
As a result, the minimum value is 1 second.
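To illustrate the unit choice, here is a minimal standalone sketch of the wait computation. The helper name retry_sleep_usec is hypothetical and not part of the patch, and it uses int64_t instead of the long arithmetic in xlog.c, so treat it as an illustration rather than the actual code.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not in the patch): microseconds left to sleep before
 * the next restore_command attempt. Times are in whole seconds, as with
 * time(NULL); the result is what would be passed to pg_usleep().
 */
int64_t
retry_sleep_usec(int64_t now, int64_t last_fail_time, int retry_interval_s)
{
    int64_t elapsed = now - last_fail_time;

    /* Mirrors the patch's validation: the interval is at least 1 second. */
    assert(retry_interval_s >= 1);

    if (elapsed >= retry_interval_s)
        return 0;               /* interval already elapsed, retry at once */

    return INT64_C(1000000) * (retry_interval_s - elapsed);
}
```

Because time(NULL) has one-second resolution, keeping the parameter in seconds lets the existing comparison and multiplication stay as they are.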
Mon, 29 Dec 2014 00:15:03 +0900 from Michael Paquier <[email protected]>:
>On Sat, Dec 27, 2014 at 3:42 AM, Alexey Vasiliev < [email protected] > wrote:
>> Thanks for suggestions.
>>
>> Patch updated.
>
>Cool, thanks. I just had an extra look at it.
>
>+ This is useful, if I using for restore of wal logs some
>+ external storage (like AWS S3) and no matter what the slave database
>+ will lag behind the master. The problem, what for each request to
>+ AWS S3 need to pay, what is why for N nodes, which try to get next
>+ wal log each 5 seconds will be bigger price, than for example each
>+ 30 seconds.
>I reworked this portion of the docs, it is rather incorrect as the
>documentation should not use first-person subjects, and I don't
>believe that referencing any commercial products is a good thing in
>this context.
>
>+# specifies an optional timeout after nonzero code of restore_command.
>+# This can be useful to increase/decrease number of a restore_command calls.
>This is still referring to a timeout. That's not good. And the name of
>the parameter at the top of this comment block is missing.
>
>+static int restore_command_retry_interval = 5000L;
>I think that it would be more adapted to set that to 5000, and
>multiply by 1L. I am also wondering about having a better lower bound,
>like 100ms to avoid some abuse with this feature in the retries?
>
>+ ereport(ERROR,
>+
>(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
>+ errmsg("\"%s\" must
>be bigger zero",
>+ "restore_command_retry_interval")));
>I'd rather rewrite that to "must have a strictly positive value".
>
>- * Wait for more WAL to
>arrive. Time out after 5 seconds,
>+ * Wait for more WAL to
>arrive. Time out after
>+ *
>restore_command_retry_interval (5 seconds by default),
> * like when polling the
>archive, to react to a trigger
> * file promptly.
> */
>
>WaitLatch(&XLogCtl->recoveryWakeupLatch,
> WL_LATCH_SET
>| WL_TIMEOUT,
>- 5000L);
>+
>restore_command_retry_interval);
>I should have noticed earlier, but in its current state your patch
>actually does not work. What you are doing here is tuning the time
>process waits for WAL from stream. In your case what you want to
>control is the retry time for a restore_command in archive recovery,
>no?
>--
>Michael
>
>
>--
>Sent via pgsql-hackers mailing list ([email protected])
>To make changes to your subscription:
>http://www.postgresql.org/mailpref/pgsql-hackers
--
Alexey Vasiliev
diff --git a/doc/src/sgml/recovery-config.sgml b/doc/src/sgml/recovery-config.sgml
index ef78bc0..38420a5 100644
--- a/doc/src/sgml/recovery-config.sgml
+++ b/doc/src/sgml/recovery-config.sgml
@@ -145,6 +145,26 @@ restore_command = 'copy "C:\\server\\archivedir\\%f" "%p"' # Windows
</listitem>
</varlistentry>
+ <varlistentry id="restore-command-retry-interval" xreflabel="restore_command_retry_interval">
+ <term><varname>restore_command_retry_interval</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>restore_command_retry_interval</> recovery parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ If <varname>restore_command</> returns a nonzero exit status code, retry
+ the command after the interval of time specified by this parameter.
+ The default value is <literal>5s</>.
+ </para>
+ <para>
+ This is useful when WAL archives are restored from
+ external storage and it is acceptable for the standby
+ to lag behind the master.
+ </para>
+ </listitem>
+ </varlistentry>
+
</variablelist>
</sect1>
diff --git a/src/backend/access/transam/recovery.conf.sample b/src/backend/access/transam/recovery.conf.sample
index b777400..5b63f60 100644
--- a/src/backend/access/transam/recovery.conf.sample
+++ b/src/backend/access/transam/recovery.conf.sample
@@ -58,6 +58,11 @@
#
#recovery_end_command = ''
#
+# specifies an optional interval to wait before retrying restore_command after it returns a nonzero exit status code.
+# This can be useful to increase or decrease the number of restore_command calls.
+#
+#restore_command_retry_interval = 5s
+#
#---------------------------------------------------------------------------
# RECOVERY TARGET PARAMETERS
#---------------------------------------------------------------------------
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index e5dddd4..83a6db0 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -235,6 +235,7 @@ static TimestampTz recoveryTargetTime;
static char *recoveryTargetName;
static int recovery_min_apply_delay = 0;
static TimestampTz recoveryDelayUntilTime;
+static int restore_command_retry_interval = 5;
/* options taken from recovery.conf for XLOG streaming */
static bool StandbyModeRequested = false;
@@ -4881,6 +4882,28 @@ readRecoveryCommandFile(void)
(errmsg_internal("trigger_file = '%s'",
TriggerFile)));
}
+ else if (strcmp(item->name, "restore_command_retry_interval") == 0)
+ {
+ const char *hintmsg;
+
+ if (!parse_int(item->value, &restore_command_retry_interval, GUC_UNIT_S,
+ &hintmsg))
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("parameter \"%s\" requires a temporal value",
+ "restore_command_retry_interval"),
+ hintmsg ? errhint("%s", _(hintmsg)) : 0));
+ ereport(DEBUG2,
+ (errmsg_internal("restore_command_retry_interval = '%s'", item->value)));
+
+ if (restore_command_retry_interval < 1)
+ {
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("\"%s\" must have a strictly positive value",
+ "restore_command_retry_interval")));
+ }
+ }
else if (strcmp(item->name, "recovery_min_apply_delay") == 0)
{
const char *hintmsg;
@@ -10495,13 +10518,13 @@ WaitForWALToBecomeAvailable(XLogRecPtr RecPtr, bool randAccess,
* machine, so we've exhausted all the options for
* obtaining the requested WAL. We're going to loop back
* and retry from the archive, but if it hasn't been long
- * since last attempt, sleep 5 seconds to avoid
- * busy-waiting.
+ * since last attempt, sleep restore_command_retry_interval
+ * (by default 5 seconds) to avoid busy-waiting.
*/
now = (pg_time_t) time(NULL);
- if ((now - last_fail_time) < 5)
+ if ((now - last_fail_time) < restore_command_retry_interval)
{
- pg_usleep(1000000L * (5 - (now - last_fail_time)));
+ pg_usleep(1000000L * (restore_command_retry_interval - (now - last_fail_time)));
now = (pg_time_t) time(NULL);
}
last_fail_time = now;