Re: [HACKERS] Few observations in replication slots related code

2014-06-13 Thread Amit Kapila
On Thu, Jun 12, 2014 at 12:45 PM, Andres Freund and...@2ndquadrant.com
wrote:
 On 2014-06-12 08:55:59 +0530, Amit Kapila wrote:
  Function pg_create_logical_replication_slot() is trying to
  save slot twice once during CreateInitDecodingContext() and
  then in ReplicationSlotPersist(), isn't it better if we can make
  it do it just once?

 Doesn't work here. In the first save it's not yet marked as persistent -
 but we still need to safely reserve the xmin.

Okay, but if it crashes before saving the persistency to permanent
file and there remains a .tmp for this replication slot which it created
during save of this persistency information, then also xmin will get
lost, because during startup it will not consider such a slot.

 It's also not something
 that should happen very frequently, so I'm not worried about the
 overhead.

Right, just looking from the purpose of need for same.

  8. Is there a chance of inconsistency, if pg_restxlog resets the
  xlog and after restart, one of the slots contains xlog position
  older than what is resetted by pg_resetxlog?

 Yes. There's lots of ways to screw over your database by using
 pg_resetxlog.

Thats right, trying to think if there could be any thing which
won't even allow the server to get started due to replication slots
after pg_resetxlog.  As per my current understanding, I don't think
there is any such problem.


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Re: [HACKERS] Few observations in replication slots related code

2014-06-13 Thread Andres Freund
On 2014-06-13 13:01:59 +0530, Amit Kapila wrote:
 On Thu, Jun 12, 2014 at 12:45 PM, Andres Freund and...@2ndquadrant.com
 wrote:
  On 2014-06-12 08:55:59 +0530, Amit Kapila wrote:
   Function pg_create_logical_replication_slot() is trying to
   save slot twice once during CreateInitDecodingContext() and
   then in ReplicationSlotPersist(), isn't it better if we can make
   it do it just once?
 
  Doesn't work here. In the first save it's not yet marked as persistent -
  but we still need to safely reserve the xmin.
 
 Okay, but if it crashes before saving the persistency to permanent
 file and there remains a .tmp for this replication slot which it created
 during save of this persistency information, then also xmin will get
 lost, because during startup it will not consider such a slot.

I can't follow here. If it crashes before it's marked persistent it'll
get deleted at startup (c.f. RestoreSlotFromDisk). And .tmp slots are
cleaned up at restart.
It's fine if the entire slot is lost if the server crashes too early -
we haven't yet returned sucess...

  Yes. There's lots of ways to screw over your database by using
  pg_resetxlog.
 
 Thats right, trying to think if there could be any thing which
 won't even allow the server to get started due to replication slots
 after pg_resetxlog.  As per my current understanding, I don't think
 there is any such problem.

I'm not aware of any.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Few observations in replication slots related code

2014-06-13 Thread Amit Kapila
On Fri, Jun 13, 2014 at 1:45 PM, Andres Freund and...@2ndquadrant.com
wrote:
 On 2014-06-13 13:01:59 +0530, Amit Kapila wrote:
  Okay, but if it crashes before saving the persistency to permanent
  file and there remains a .tmp for this replication slot which it created
  during save of this persistency information, then also xmin will get
  lost, because during startup it will not consider such a slot.

 I can't follow here. If it crashes before it's marked persistent it'll
 get deleted at startup (c.f. RestoreSlotFromDisk). And .tmp slots are
 cleaned up at restart.

Yes that's fine, the point I wanted to say here is that the flush
of slot in CreateInitDecodingContext() wouldn't save xmin in all
cases which is I think what it want to achieve.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Re: [HACKERS] Few observations in replication slots related code

2014-06-13 Thread Andres Freund
On 2014-06-13 14:21:33 +0530, Amit Kapila wrote:
 On Fri, Jun 13, 2014 at 1:45 PM, Andres Freund and...@2ndquadrant.com
 wrote:
  On 2014-06-13 13:01:59 +0530, Amit Kapila wrote:
   Okay, but if it crashes before saving the persistency to permanent
   file and there remains a .tmp for this replication slot which it created
   during save of this persistency information, then also xmin will get
   lost, because during startup it will not consider such a slot.
 
  I can't follow here. If it crashes before it's marked persistent it'll
  get deleted at startup (c.f. RestoreSlotFromDisk). And .tmp slots are
  cleaned up at restart.
 
 Yes that's fine, the point I wanted to say here is that the flush
 of slot in CreateInitDecodingContext() wouldn't save xmin in all
 cases which is I think what it want to achieve.

It'll savely stored until a xact abort (will be deleted during cleanup),
PANIC crash (deleted during startup) or until the slot is marked as
persistent. Which is what we need.

Maybe this helps:
/*
 * For logical decoding, it's extremely important that we never remove 
any
 * data that's still needed for decoding purposes, even after a crash;
 * otherwise, decoding will produce wrong answers.  Ordinary streaming
 * replication also needs to prevent old row versions from being removed
 * too soon, but the worst consequence we might encounter there is
 * unwanted query cancellations on the standby.  Thus, for logical
 * decoding, this value represents the latest xmin that has actually 
been
 * written to disk, whereas for streaming replication, it's just the 
same
 * as the persistent value (data.xmin).
 */
TransactionId effective_xmin;
TransactionId effective_catalog_xmin;


Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Few observations in replication slots related code

2014-06-12 Thread Andres Freund
Hi Amit,

On 2014-06-12 08:55:59 +0530, Amit Kapila wrote:
 1. In function StartupReplicationSlots(XLogRecPtr checkPointRedo),
 parameter checkPointRedo is not used.

It used to be included in a debug message. Apparently the message was
removed at some point (don't remember it, but I have a memory like a
sieve).

 2. Few check are in different order in functions
 pg_create_physical_replication_slot() and
 pg_create_logical_replication_slot().
 
 if (get_call_result_type(fcinfo, NULL, tupdesc) != TYPEFUNC_COMPOSITE)
 elog(ERROR, return type must be a row type);
 check_permissions();
 
 CheckLogicalDecodingRequirements();
 
 I don't think it makes any difference, however having checks in similar
 order seems better unless there is a reason for not doing so.

Can change it.

 3. Currently there is any Assert (Assert(!MyReplicationSlot);) in
 pg_create_logical_replication_slot(), is there a need for similar
 Assert in pg_create_physical_replication_slot()?

Hm. Shouldn't really matter either way these days, but I guess it
doesn't hurt to add one.

 4.
 StartupDecodingContext()
 {
 ..
 context = AllocSetContextCreate(CurrentMemoryContext,
 Changeset Extraction Context,
 }
 
 To be consistent, shall we name this context as
 logical | logical decoding?

Heh. That apparently escaped when things were renamed. Yes.

 5.
 pg_create_logical_replication_slot()
 {
 ..
 CreateInitDecodingContext()
 
 ..
 ReplicationSlotPersist()
 }
 
 Function pg_create_logical_replication_slot() is trying to
 save slot twice once during CreateInitDecodingContext() and
 then in ReplicationSlotPersist(), isn't it better if we can make
 it do it just once?

Doesn't work here. In the first save it's not yet marked as persistent -
but we still need to safely reserve the xmin. It's also not something
that should happen very frequently, so I'm not worried about the
overhead.

 6.
 elog(ERROR, cannot handle changeset extraction yet);
 
 Shouldn't it be better to use logical replication instead
 of changeset extraction?

Will change.

 7.
 + * XXX: It might make sense to introduce ephemeral slots and always use
 + * the slot mechanism.
 
 Already there are RS_EPHEMERAL slots, might be this
 comment needs bit of rephrasing.

 8. Is there a chance of inconsistency, if pg_restxlog resets the
 xlog and after restart, one of the slots contains xlog position
 older than what is resetted by pg_resetxlog?

Yes. There's lots of ways to screw over your database by using
pg_resetxlog. I personally think there should be a required
--yes-i-am-breaking-my-database parameter for pg_resetxlog.

 9.
 void
 LogicalIncreaseXminForSlot(XLogRecPtr current_lsn, TransactionId xmin)
 {
 ..
 /*
  * don't overwrite if we already have a newer xmin. This can happen if we
  * restart decoding in a slot.
  */
 if (TransactionIdPrecedesOrEquals(xmin, slot-data.catalog_xmin))
 {
 }
 ..
 }
 
 Should we just release spinlock in this loop and just return,
 rather than keeping no action loop?

Don't think that'd make it any faster/simpler. We'd be stuck with
duplicate codepaths.

 10.
 * Iff ri_type = REPLICA_IDENTITY_INDEX, indexOid must be the Oid of a
 suitable
  * index. Otherwise, it should be InvalidOid.
  */
 static void
 relation_mark_replica_identity(Relation rel, char ri_type, Oid indexOid,
bool is_internal)
 
 typo - *Iff*

iff := if and only if.

Thanks for the look.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Few observations in replication slots related code

2014-06-12 Thread Andres Freund
Hi,

On 2014-06-12 09:15:08 +0200, Andres Freund wrote:
  6.
  elog(ERROR, cannot handle changeset extraction yet);
  
  Shouldn't it be better to use logical replication instead
  of changeset extraction?
 
 Will change.

I don't see that message anywhere in current code? All of those were
rephrased in b89e151054a05f0f6d356ca52e3b725dd0505e53 (where Robert
changed the term to logical decoding) and then implemented in
5a991ef8692ed0d170b44958a81a6bd70e90585c.

Pushed a fix for the rest of the ones I commented on earlier. Thanks!

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Few observations in replication slots related code

2014-06-12 Thread Amit Kapila
On Thu, Jun 12, 2014 at 5:02 PM, Andres Freund and...@2ndquadrant.com
wrote:
 On 2014-06-12 09:15:08 +0200, Andres Freund wrote:
   6.
   elog(ERROR, cannot handle changeset extraction yet);
  
   Shouldn't it be better to use logical replication instead
   of changeset extraction?
 
  Will change.

 I don't see that message anywhere in current code?

Right, actually I was reading code from Git History and also
referring latest code, so it seems I have seen that in original
commit and missed to verify it in latest code.

While checking latest code, I got usage of *changeset extraction*
in below comment:

/*

..

*

 * This is useful to initialize the cutoff xid after which a new *changeset*

 * *extraction* replication slot can start decoding changes.

 *

..

*/


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


[HACKERS] Few observations in replication slots related code

2014-06-11 Thread Amit Kapila
Few observations in Replication slots related code:

1. In function StartupReplicationSlots(XLogRecPtr checkPointRedo),
parameter checkPointRedo is not used.


2. Few check are in different order in functions
pg_create_physical_replication_slot() and
pg_create_logical_replication_slot().

if (get_call_result_type(fcinfo, NULL, tupdesc) != TYPEFUNC_COMPOSITE)
elog(ERROR, return type must be a row type);
check_permissions();

CheckLogicalDecodingRequirements();

I don't think it makes any difference, however having checks in similar
order seems better unless there is a reason for not doing so.

3. Currently there is any Assert (Assert(!MyReplicationSlot);) in
pg_create_logical_replication_slot(), is there a need for similar
Assert in pg_create_physical_replication_slot()?

4.
StartupDecodingContext()
{
..
context = AllocSetContextCreate(CurrentMemoryContext,
Changeset Extraction Context,
}

To be consistent, shall we name this context as
logical | logical decoding?

5.
pg_create_logical_replication_slot()
{
..
CreateInitDecodingContext()

..
ReplicationSlotPersist()
}

Function pg_create_logical_replication_slot() is trying to
save slot twice once during CreateInitDecodingContext() and
then in ReplicationSlotPersist(), isn't it better if we can make
it do it just once?

6.
elog(ERROR, cannot handle changeset extraction yet);

Shouldn't it be better to use logical replication instead
of changeset extraction?

7.
+ * XXX: It might make sense to introduce ephemeral slots and always use
+ * the slot mechanism.

Already there are RS_EPHEMERAL slots, might be this
comment needs bit of rephrasing.

8. Is there a chance of inconsistency, if pg_restxlog resets the
xlog and after restart, one of the slots contains xlog position
older than what is resetted by pg_resetxlog?

9.
void
LogicalIncreaseXminForSlot(XLogRecPtr current_lsn, TransactionId xmin)
{
..
/*
 * don't overwrite if we already have a newer xmin. This can happen if we
 * restart decoding in a slot.
 */
if (TransactionIdPrecedesOrEquals(xmin, slot-data.catalog_xmin))
{
}
..
}

Should we just release spinlock in this loop and just return,
rather than keeping no action loop?

10.
* Iff ri_type = REPLICA_IDENTITY_INDEX, indexOid must be the Oid of a
suitable
 * index. Otherwise, it should be InvalidOid.
 */
static void
relation_mark_replica_identity(Relation rel, char ri_type, Oid indexOid,
   bool is_internal)

typo - *Iff*

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com