Re: [PATCH] fix a performance issue with multiple logical-decoding walsenders

Pierre Ducroquet Sat, 28 Dec 2019 04:56:22 -0800

On Thursday, December 26, 2019 8:18:46 PM CET Julien Rouhaud wrote:
> Hello Pierre,
> 
> On Thu, Dec 26, 2019 at 5:43 PM Pierre Ducroquet <[email protected]> 
wrote:
> > The second one was tested on PG 10 and PG 12 (with 48 lines offset). It
> > has on PG12 the same effect it has on a PG10+isAlive patch. Instead of
> > calling each time GetFlushRecPtr, we call it only if we notice we have
> > reached the value of the previous call. This way, when the senders are
> > busy decoding, we are no longer fighting for a spinlock to read the
> > FlushRecPtr.
> 
> The patch is quite straightforward and looks good to me.
> 
> -    XLogRecPtr    flushPtr;
> +    static XLogRecPtr flushPtr = 0;
> 
> You should use InvalidXLogRecPtr instead though, and maybe adding some
> comments to explain why the static variable is a life changer here.
> 
> > Here are some benchmark results.
> > On PG 10, to decode our replication stream, we went from 3m 43s to over 5
> > minutes after removing the first hot spot, and then down to 22 seconds.
> > On PG 12, we had to change the benchmark (due to GIN indexes creation
> > being
> > more optimized) so we can not compare directly with our previous bench. We
> > went from 15m 11s down to 59 seconds.
> > If needed, we can provide scripts to reproduce this situation. It is quite
> > simple: add ~20 walsenders doing logical replication in database A, and
> > then generate a lot of data in database B. The walsenders will be woken
> > up by the activity on database B, but not sending it thus keeping hitting
> > the same locks.
> 
> Quite impressive speedup!




Hi

Thank you for your comments.
Attached to this email is a patch with better comments regarding the 
XLogSendLogical change.
We've spent quite some time yesterday benching it again, this time with 
changes that must be fully processed by the decoder. The speed-up is obviously 
much smaller, we are only ~5% faster than without the patch.

Regards

diff --git a/src/backend/replication/walsender.c 
b/src/backend/replication/walsender.c
index 74330f8c84..fbb30c6847 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -2765,7 +2765,13 @@ XLogSendLogical(void)
 {
        XLogRecord *record;
        char       *errm;
-       XLogRecPtr      flushPtr;
+
+       /*
+        * We'll use the current flush point to determine whether we've caught 
up.
+        * This variable is static in order to cache it accross calls. This 
caching
+        * is needed because calling GetFlushRecPtr needs to acquire an 
expensive lock.
+        */
+       static XLogRecPtr flushPtr = InvalidXLogRecPtr;
 
        /*
         * Don't know whether we've caught up yet. We'll set WalSndCaughtUp to
@@ -2782,11 +2788,6 @@ XLogSendLogical(void)
        if (errm != NULL)
                elog(ERROR, "%s", errm);
 
-       /*
-        * We'll use the current flush point to determine whether we've caught 
up.
-        */
-       flushPtr = GetFlushRecPtr();
-
        if (record != NULL)
        {
                /*
@@ -2799,7 +2800,16 @@ XLogSendLogical(void)
                sentPtr = logical_decoding_ctx->reader->EndRecPtr;
        }
 
-       /* Set flag if we're caught up. */
+
+       /* Initialize flushPtr if needed */
+       if (flushPtr == InvalidXLogRecPtr)
+               flushPtr = GetFlushRecPtr();
+
+       /* If EndRecPtr is past our flushPtr, we must update it to know if we 
caught up */
+       if (logical_decoding_ctx->reader->EndRecPtr >= flushPtr)
+               flushPtr = GetFlushRecPtr();
+
+       /* If EndRecPtr is still past our flushPtr, it means we caught up */
        if (logical_decoding_ctx->reader->EndRecPtr >= flushPtr)
                WalSndCaughtUp = true;

Re: [PATCH] fix a performance issue with multiple logical-decoding walsenders

Reply via email to