On 12/17/2014 10:35 AM, Simon Riggs wrote:
On 16 December 2014 at 21:17, Simon Riggs <si...@2ndquadrant.com> wrote:

This patch is a WIP version of doing that, but only currently attempts

With the patch, XLogSendLogical uses the same logic to calculate SendRqstPtr
that XLogSendPhysical does. It would be good to refactor that into a common
function, rather than copy-paste.

Some of the logic is similar, but not all.

SendRqstPtr isn't actually used for anything in XLogSendLogical.

It exists to allow the call which resets TLI.

I'll see if I can make it exactly identical; I didn't think so when I
first looked, will look again.

Yes, that works. New version attached

Some comments, mostly on readability (not all of these were this patch's fault):

                        /*
                         * Check that the timeline the client requested for 
exists, and
                         * the requested start location is on that timeline.
                         */
                        (void) ReadSendTimeLine(cmd->timeline);

                        /*
                         * Found the requested timeline in the history. Check 
that
                         * requested startpoint is on that timeline in our 
history.
                         *
                         * This is quite loose on purpose. We only check that 
we didn't
                         * fork off the requested timeline before the 
switchpoint. We
                         * don't check that we switched *to* it before the 
requested
                         * starting point. This is because the client can 
legitimately
                         * request to start replication from the beginning of 
the WAL
                         * segment that contains switchpoint, but on the new 
timeline, so
                         * that it doesn't end up with a partial segment. If 
you ask for a
                         * too old starting point, you'll get an error later 
when we fail
                         * to find the requested WAL segment in pg_xlog.
                         *
                         * XXX: we could be more strict here and only allow a 
startpoint
                         * that's older than the switchpoint, if it's still in 
the same
                         * WAL segment.
                         */

The first comment implies that the ReadSendTimeLine call checks that the requested start location is on the timeline, but that's actually done by the code that follows the second comment. I would merge these two comments, and move the ReadSendTimeLine call below the merged comment.

@@ -577,8 +571,8 @@ StartReplication(StartReplicationCmd *cmd)
                         * that's older than the switchpoint, if it's still in 
the same
                         * WAL segment.
                         */
-                       if (!XLogRecPtrIsInvalid(switchpoint) &&
-                               switchpoint < cmd->startpoint)
+                       if (!XLogRecPtrIsInvalid(sendTimeLineValidUpto) &&
+                               sendTimeLineValidUpto < cmd->startpoint)
                        {
                                ereport(ERROR,
                                                (errmsg("requested starting point 
%X/%X on timeline %u is not in this server's history",

IMHO using the local 'switchpoint' variable was more clear.

@@ -941,6 +936,8 @@ StartLogicalReplication(StartReplicationCmd *cmd)
         * Force a disconnect, so that the decoding code doesn't need to care
         * about an eventual switch from running in recovery, to running in a
         * normal environment. Client code is expected to handle reconnects.
+        * This covers the race condition where we are promoted half way
+        * through starting up.
         */
        if (am_cascading_walsender && !RecoveryInProgress())
        {

We could exit recovery immediately after this check. Why is this check needed?

        /*
+        * Find the timeline for the start location, or throw an error.
+        *
+        * Logical replication relies upon replication slots. Each slot has a
+        * single timeline history baked into it, so this should be easy.
+        */

I don't understand what "baked in" means here.

+       /*
+        * Get the SendRqstPtr and follow any timeline changes.
+        */
+       SendRqstPtr = GetLatestRequestPtr();

The old comment used to say "Figure out how far we can safely send the WAL". I think that was much more clear. It's not clear what following timeline changes means here, and the fact that it "gets the SendRqstPtr" is obvious from the code.

+
+static XLogRecPtr
+GetLatestRequestPtr(void)

This function desperately needs comment to explain what it does. I don't much like its name either.

+static TimeLineID
+ReadSendTimeLine(TimeLineID tli)

Ditto. This function is also missing a "return".

I think it would slightly more intuitive if this function didn't set the global variables directly, but simply returned the returned values to the caller.

- Heikki



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to