Re: [HACKERS] Changeset Extraction v7.6.1

Robert Haas Wed, 19 Feb 2014 10:02:34 -0800

On Sat, Feb 15, 2014 at 6:59 PM, Andres Freund <[email protected]> wrote:
> On 2014-02-15 17:29:04 -0500, Robert Haas wrote:
>> On Fri, Feb 14, 2014 at 4:55 AM, Andres Freund <[email protected]> 
>> wrote:
>> > [ new patches ]
>>
>> 0001 already needs minor
>
> Hm?
>
> If there are conflicts, I'll push/send a rebased tomorrow or monday.


As you guessed, the missing word was "rebasing".  It's a trivial
conflict though, so please don't feel the need to repost just for
that.

>> +        * the transaction's invalidations. This currently won't be needed if
>> +        * we're just skipping over the transaction, since that currently 
>> only
>> +        * happens when starting decoding, after we invalidated all caches, 
>> but
>> +        * transactions in other databases might have touched shared 
>> relations.
>>
>> I'm having trouble understanding what this means, especially the part
>> after the "but".
>
> Let me rephrase, maybe that will help:
>
> This currently won't be needed if we're just skipping over the
> transaction because currenlty we only do so during startup, to get to
> the first transaction the client needs. As we have reset the catalog
> caches before starting to read WAL and we haven't yet touched any
> catalogs there can't be anything to invalidate.
>
> But if we're "forgetting" this commit because it's it happened in
> another database, the invalidations might be important, because they
> could be for shared catalogs and we might have loaded data into the
> relevant syscaches.
>
> Better?

Much!  Please include that text, or something like it.

>> +       if (IsTransactionState() &&
>> +               GetTopTransactionIdIfAny() != InvalidTransactionId)
>> +               ereport(ERROR,
>> +                               (errcode(ERRCODE_ACTIVE_SQL_TRANSACTION),
>> +                                errmsg("cannot perform changeset
>> extraction in transaction that has performed writes")));
>>
>> This is sort of an awkward restriction, as it makes it hard to compose
>> this feature with others.  What underlies the restriction, can we get
>> rid of it, and if not can we include a comment here explaining why it
>> exists?
>
> Well, you can write the result into a table if you're halfway
> careful... :)
>
> I think it should be fairly easy to relax the restriction to creating a
> slot, but not getting data from it. Do you think that would that be
> sufficient?

That would be a big improvement, for sure, and might be entirely sufficient.

>> +               /* register output plugin name with slot */
>> +               strncpy(NameStr(slot->data.plugin), plugin,
>> +                               NAMEDATALEN);
>> +               NameStr(slot->data.plugin)[NAMEDATALEN - 1] = '\0';
>>
>> If it's safe to do this without a lock, I don't know why.
>
> Well, the worst that could happen is that somebody else doing a SELECT *
> FROM pg_replication_slot gets a incomplete plugin name... But we
> certainly can hold the spinlock during it if you think that's better.

Isn't the worst thing that can happen that they copy garbage out of
the buffer, because the new name is longer than the old and only
partially written?

>> More broadly, I wonder why this is_init stuff is in here at all.
>> Maybe the stuff that happens in the is_init case should be done in the
>> caller, or another helper function.
>
> The reason I moved it in there is that after the walsender patch there
> are two callers and the stuff is sufficiently complex that I having it
> twice lead to bugs.
> The reason it's currenlty the same function is that sufficiently much of
> the code would have to be shared that I found it it ugly to split. I'll
> have a look whether I can figure something out.

I was thinking that the is_init portion could perhaps be done first,
in a *previous* function call, and adjusted in such a way that the
non-is-init path can be used for both case here.

>> I don't think this is a very good idea.  The problem with doing things
>> during error recovery that can themselves fail is that you'll lose the
>> original error, which is not cool, and maybe even blow out the error
>> stack.  Many people have confuse PG_TRY()/PG_CATCH() with an
>> exception-handling system, but it's not.  One way to fix this is to
>> put some of the initialization logic in ReplicationSlotCreate() just
>> prior to calling CreateSlotOnDisk().  If the work that needs to be
>> done is too complex or protracted to be done there, then I think that
>> it should be pulled out of the act of creating the replication slot
>> and made to happen as part of first use, or as a separate operation
>> like PrepareForLogicalDecoding.
>
> I think what should be done here is adding a drop_on_release flag. As
> soon as everything important is done, it gets unset.

That might be more elegant, but I don't think it really fixes
anything, because backing stuff out from on disk can fail.  AIUI, your
whole concern here is that you don't want the slot creation to fail
halfway through and leave behind the slot, but what you've got here
doesn't prevent that; it just makes it less likely.  The more I think
about it, the more I think you're trying to pack stuff into slot
creation that really ought to be happening on first use.

>> ReorderBufferGetTXN() should get a comment about the performance
>> impact of this.  There's a tiny bit there in ReorderBufferReturnTXN()
>> but it should be better called out.  Should these call the valgrind
>> macros to make the memory inaccessible while it's being held in cache?
>
> Hm, I think it does call the valgrind stuff?
>     VALGRIND_MAKE_MEM_UNDEFINED(txn, sizeof(ReorderBufferTXN));
>     VALGRIND_MAKE_MEM_DEFINED(&txn->node, sizeof(txn->node));

That's there in ReorderBufferReturnTXN, but don't you need something
in ReorderBufferGetTXN?  Maybe not.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Changeset Extraction v7.6.1

Reply via email to