Re: Add Information during standby recovery conflicts

Fujii Masao Thu, 29 Oct 2020 20:25:31 -0700



On 2020/10/30 10:29, Masahiko Sawada wrote:

,

On Thu, 29 Oct 2020 at 00:16, Fujii Masao <[email protected]> wrote:




On 2020/10/27 9:41, Masahiko Sawada wrote:

On Tue, 20 Oct 2020 at 22:02, Drouvot, Bertrand <[email protected]> wrote:


Hi,

On 10/15/20 9:15 AM, Masahiko Sawada wrote:

CAUTION: This email originated from outside of the organization. Do not click 
links or open attachments unless you can confirm the sender and know the 
content is safe.



On Thu, 15 Oct 2020 at 14:52, Kyotaro Horiguchi <[email protected]> wrote:

At Thu, 15 Oct 2020 14:28:57 +0900, Masahiko Sawada 
<[email protected]> wrote in

ereport(..(errmsg("%s", _("hogehoge")))) results in
fprintf((translated("%s")), translate("hogehoge")).

So your change (errmsg("%s", gettext_noop("hogehoge")) results in

fprintf((translated("%s")), DONT_translate("hogehoge")).

which leads to a translation problem.

(errmsg(gettext_noop("hogehoge"))

This seems equivalent to (errmsg("hogehoge")), right?

Yes and no.  However eventually the two works the same way,
"(errmsg(gettext_noop("hogehoge"))" is a shorthand of

1: char *msg = gettext_noop("hogehoge");
...
2: .. (errmsg(msg));

That is, the line 1 only registers a message id "hogehoge" and doesn't
translate. The line 2 tries to translate the content of msg and it
finds the translation for the message id "hogehoge".

Understood.

I think I could understand translation stuff. Given we only report the
const string returned from get_recovery_conflict_desc() without
placeholders, the patch needs to use errmsg_internal() instead while
not changing _() part. (errmsg(get_recovery_conflict_desc())) is not
good (warned by -Wformat-security).

Ah, right. we get a complain if no value parameters added. We can
silence it by adding a dummy parameter to errmsg, but I'm not sure
which is preferable.

Okay, I'm going to use errmsg_internal() for now until a better idea comes.

I've attached the updated patch that fixed the translation part.


Thanks for reviewing and helping on this patch!

The patch tester bot is currently failing due to:

"proc.c:1290:5: error: ‘standbyWaitStart’ may be used uninitialized in
this function [-Werror=maybe-uninitialized]"

I've attached a new version with the minor change to fix it.


Thank you for updating the patch!

I've looked at the patch and revised a bit the formatting etc.

After some thoughts, I think it might be better to report the waiting
time as well. it would help users and there is no particular reason
for logging the report only once. It also helps make the patch clean
by reducing the variables such as recovery_conflict_logged. I’ve
implemented it in the v8 patch.


I read v8 patch. Here are review comments.


Thank you for your review.


When recovery conflict with buffer pin happens, log message is output
every deadlock_timeout. Is this intentional behavior? If yes, IMO that's
not good because lots of messages can be output.


Agreed.

if we were to log the recovery conflict only once in bufferpin
conflict case, we would log it multiple times only in lock conflict
case. So I guess it's better to do that in all conflict cases.


Yes, I agree that this behavior basically should be consistent between all 
cases.


+       if (log_recovery_conflict_waits)
+               waitStart = GetCurrentTimestamp();

What happens if log_recovery_conflict_waits is off at the beginning and
then it's changed during waiting for the conflict? In this case, waitStart is
zero, but log_recovery_conflict_waits is on. This may cause some problems?


Hmm, I didn't see a path that happens to reload the config file during
waiting for buffer cleanup lock. Even if the startup process receives
SIGHUP during that, it calls HandleStartupProcInterrupts() at the next
convenient time. It could be the beginning of main apply loop or
inside WaitForWALToBecomeAvailable() and so on but I didn’t see it in
the wait loop for taking a buffer cleanup.


Yes, you're right. I seem to have read the code wrongly.

However, I think it’s
better to use (waitStart > 0) for safety when checking if we log the
recovery conflict instead of log_recovery_conflict_waits.


+                       if (report_waiting)
+                               ts = GetCurrentTimestamp();

GetCurrentTimestamp() doesn't need to be called every cycle
in the loop after "logged" is true and "new_status" is not NULL.


Agreed


+extern const char *get_procsignal_reason_desc(ProcSignalReason reason);

Is this garbage?


Yes.


When log_lock_waits is enabled, both "still waiting for ..." and "acquired ..."
messages are output. Like this, when log_recovery_conflict_waits is enabled,
not only "still waiting ..." but also "resolved ..." message should be output?
The latter message might not need to be output if the conflict is canceled
due to max_standby_xxx_delay parameter. The latter message would be
useful to know how long the recovery was waiting for the conflict. Thought?
It's ok to implement this as a separate patch later, though.


There was a discussion that the latter message without waiting time is
not necessarily needed because the canceled process will log
"canceling statement due to conflict with XXX" as you mentioned. I
agreed with that. But I agree that the latter message with waiting
time would help users, for instance when the startup process is
waiting for multiple processes and it takes a time to cancel all of
them.


I agree that it's useful to output the wait time.

But as I told in previous email, it's ok to focus on the current patch
for now and then implement this as a separate patch later.

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION

Re: Add Information during standby recovery conflicts

Reply via email to