Re: [HACKERS] Compact commits and changeset extraction

2013-10-14 Thread Robert Haas
On Fri, Oct 11, 2013 at 4:05 PM, Andres Freund and...@2ndquadrant.com wrote:
 Maybe.  The original reason we added compact commits was because we
 thought that making unlogged tables logged and vice versa was going
 to require adding still more stuff to the commit record.  I'm no
 longer sure that's the case and, in any case, nobody's worked out the
 design details.  But I can't help thinking more stuff's likely to come
 up in the future.  I'm certainly willing to entertain proposals for
 restructuring that, but I don't really want to just throw it out.

 Well, what I am thinking of - including reading data depending on a
 flag in ->xinfo - would give you extensibility without requiring
 different types of commits. And it would only blow up the size by
 whatever needs to be included.

Hard to comment without seeing the patch.  Sounds like it could be
reasonable, if it's not too ugly.

  Maybe you should just skip replay of transactions with no useful
  content.
 
  Yes, I have thought about that as well. But I dislike it - how do we
  define no useful content?

 The only action we detected for that XID was the commit itself.

 What if the transaction was intentionally done to get an xid + timestamp
 in a multimaster system? What if it includes DDL but no logged data? Do
 we replay a transaction because of the pg_shdepend entry when creating a
 table in another database?

None of these things seem particularly alarming to me.  I don't know
whether that represents a failure of imagination on my part or undue
alarm on your part.  :-)

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Compact commits and changeset extraction

2013-10-11 Thread Andres Freund
On 2013-10-01 10:12:13 -0400, Robert Haas wrote:
 On Tue, Oct 1, 2013 at 7:26 AM, Andres Freund and...@2ndquadrant.com wrote:
  On 2013-10-01 06:20:20 -0400, Robert Haas wrote:
  On Mon, Sep 30, 2013 at 5:34 PM, Andres Freund and...@2ndquadrant.com 
  wrote:
   What's wrong with #1?
  
   It seems confusing that a changeset stream in database #1 will contain
   commits (without corresponding changes) from database #2. Seems like a
   POLA violation to me.
 
  I don't really see the problem.  A transaction could be empty for lots
  of reasons; it may have obtained an XID without writing any data, or
  whatever it's changed may be outside the bounds of logical rep.
 
  Sure. But all of them will have had a corresponding action in the
  database. If your replication stream suddenly sees commits that you
  cannot connect to any application activity... And it would depend on the
  kind of commit, you won't see it if a non-compact commit was used.
  It also means we need to do work to handle that commit. If you have a
  busy and a less so database and you're only replicating the non-busy
  one, that might be noticeable.
 
 Maybe.  The original reason we added compact commits was because we
 thought that making unlogged tables logged and vice versa was going
 to require adding still more stuff to the commit record.  I'm no
 longer sure that's the case and, in any case, nobody's worked out the
 design details.  But I can't help thinking more stuff's likely to come
 up in the future.  I'm certainly willing to entertain proposals for
 restructuring that, but I don't really want to just throw it out.

Well, what I am thinking of - including reading data depending on a
flag in ->xinfo - would give you extensibility without requiring
different types of commits. And it would only blow up the size by
whatever needs to be included.
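
To make the idea concrete, here is a minimal sketch of a reader that pulls optional fields out of a commit record depending on flag bits in xinfo. It is not taken from any patch: only the flag name XACT_COMMIT_CONTAINS_SUBXACTS appears in this thread, while XACT_COMMIT_CONTAINS_DBID, the ParsedCommit struct, the parse_commit() function and the exact byte layout are assumptions made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Flag named in the thread; the bit value is an assumption. */
#define XACT_COMMIT_CONTAINS_SUBXACTS 0x01u
/* Hypothetical flag, invented for this sketch. */
#define XACT_COMMIT_CONTAINS_DBID     0x02u

/* Hypothetical in-memory view of a parsed commit record. */
typedef struct ParsedCommit
{
    int64_t  xact_time;   /* commit timestamp, always present */
    uint32_t xinfo;       /* flag bits, always present */
    uint32_t nsubxacts;   /* 0 unless the SUBXACTS flag is set */
    uint32_t dbid;        /* 0 unless the DBID flag is set */
} ParsedCommit;

/*
 * Parse a record laid out as: timestamp (8 bytes), xinfo (4 bytes), then
 * only those optional fields whose flag is set. Returns the number of
 * bytes consumed, or 0 if the buffer is too short for what xinfo promises.
 */
size_t
parse_commit(const unsigned char *buf, size_t len, ParsedCommit *out)
{
    size_t off = 0;

    assert(buf != NULL && out != NULL);
    memset(out, 0, sizeof(*out));

    if (len < 12)
        return 0;
    memcpy(&out->xact_time, buf + off, 8);
    off += 8;
    memcpy(&out->xinfo, buf + off, 4);
    off += 4;

    if (out->xinfo & XACT_COMMIT_CONTAINS_SUBXACTS)
    {
        if (len - off < 4)
            return 0;
        memcpy(&out->nsubxacts, buf + off, 4);
        off += 4;
        /* The subtransaction ids follow; this sketch only skips them. */
        if ((len - off) / 4 < out->nsubxacts)
            return 0;
        off += (size_t) out->nsubxacts * 4;
    }

    if (out->xinfo & XACT_COMMIT_CONTAINS_DBID)
    {
        if (len - off < 4)
            return 0;
        memcpy(&out->dbid, buf + off, 4);
        off += 4;
    }

    return off;
}
```

A record with no flags set costs only the 12 fixed bytes, and each new kind of payload needs one more flag bit and one more conditional block rather than a new commit record type.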

  Maybe you should just skip replay of transactions with no useful
  content.
 
  Yes, I have thought about that as well. But I dislike it - how do we
  define no useful content?
 
 The only action we detected for that XID was the commit itself.

What if the transaction was intentionally done to get an xid + timestamp
in a multimaster system? What if it includes DDL but no logged data? Do
we replay a transaction because of the pg_shdepend entry when creating a
table in another database?

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Compact commits and changeset extraction

2013-10-01 Thread Robert Haas
On Mon, Sep 30, 2013 at 5:34 PM, Andres Freund and...@2ndquadrant.com wrote:
 What's wrong with #1?

 It seems confusing that a changeset stream in database #1 will contain
 commits (without corresponding changes) from database #2. Seems like a
 POLA violation to me.

I don't really see the problem.  A transaction could be empty for lots
of reasons; it may have obtained an XID without writing any data, or
whatever it's changed may be outside the bounds of logical rep.  Maybe
you should just skip replay of transactions with no useful content.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Compact commits and changeset extraction

2013-10-01 Thread Andres Freund
On 2013-10-01 06:20:20 -0400, Robert Haas wrote:
 On Mon, Sep 30, 2013 at 5:34 PM, Andres Freund and...@2ndquadrant.com wrote:
  What's wrong with #1?
 
  It seems confusing that a changeset stream in database #1 will contain
  commits (without corresponding changes) from database #2. Seems like a
  POLA violation to me.
 
 I don't really see the problem.  A transaction could be empty for lots
 of reasons; it may have obtained an XID without writing any data, or
 whatever it's changed may be outside the bounds of logical rep.

Sure. But all of them will have had a corresponding action in the
database. If your replication stream suddenly sees commits that you
cannot connect to any application activity... And it would depend on the
kind of commit, you won't see it if a non-compact commit was used.
It also means we need to do work to handle that commit. If you have a
busy and a less so database and you're only replicating the non-busy
one, that might be noticeable.

 Maybe you should just skip replay of transactions with no useful
 content.

Yes, I have thought about that as well. But I dislike it - how do we
define no useful content? If the user did a SELECT * FROM foo FOR
UPDATE, maybe it was done to coordinate stuff with the standby and the
knowledge about that commit is required?
It doesn't really seem our responsibility to detect that.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Compact commits and changeset extraction

2013-10-01 Thread Robert Haas
On Tue, Oct 1, 2013 at 7:26 AM, Andres Freund and...@2ndquadrant.com wrote:
 On 2013-10-01 06:20:20 -0400, Robert Haas wrote:
 On Mon, Sep 30, 2013 at 5:34 PM, Andres Freund and...@2ndquadrant.com 
 wrote:
  What's wrong with #1?
 
  It seems confusing that a changeset stream in database #1 will contain
  commits (without corresponding changes) from database #2. Seems like a
  POLA violation to me.

 I don't really see the problem.  A transaction could be empty for lots
 of reasons; it may have obtained an XID without writing any data, or
 whatever it's changed may be outside the bounds of logical rep.

 Sure. But all of them will have had a corresponding action in the
 database. If your replication stream suddenly sees commits that you
 cannot connect to any application activity... And it would depend on the
 kind of commit, you won't see it if a non-compact commit was used.
 It also means we need to do work to handle that commit. If you have a
 busy and a less so database and you're only replicating the non-busy
 one, that might be noticeable.

Maybe.  The original reason we added compact commits was because we
thought that making unlogged tables logged and vice versa was going
to require adding still more stuff to the commit record.  I'm no
longer sure that's the case and, in any case, nobody's worked out the
design details.  But I can't help thinking more stuff's likely to come
up in the future.  I'm certainly willing to entertain proposals for
restructuring that, but I don't really want to just throw it out.

 Maybe you should just skip replay of transactions with no useful
 content.

 Yes, I have thought about that as well. But I dislike it - how do we
 define no useful content?

The only action we detected for that XID was the commit itself.
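
A minimal sketch of that rule, assuming the decoder keeps a small table of XIDs for which it has seen at least one change. The names TxnTable, record_change() and commit_should_replay() and the fixed-size table are hypothetical, chosen only to keep the example short; they are not taken from the changeset extraction patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_TRACKED 64          /* arbitrary limit for this sketch */

typedef struct TxnEntry
{
    uint32_t xid;
    uint32_t nchanges;          /* changes seen for this xid so far */
    bool     used;
} TxnEntry;

typedef struct TxnTable
{
    TxnEntry entries[MAX_TRACKED];
} TxnTable;

/* Find the entry for xid; optionally create it in a free slot. */
static TxnEntry *
lookup_txn(TxnTable *table, uint32_t xid, bool create)
{
    TxnEntry *free_slot = NULL;

    for (size_t i = 0; i < MAX_TRACKED; i++)
    {
        TxnEntry *e = &table->entries[i];

        if (e->used && e->xid == xid)
            return e;
        if (!e->used && free_slot == NULL)
            free_slot = e;
    }
    if (create && free_slot != NULL)
    {
        free_slot->used = true;
        free_slot->xid = xid;
        free_slot->nchanges = 0;
        return free_slot;
    }
    return NULL;
}

/* Called for every decoded change that belongs to the stream. */
void
record_change(TxnTable *table, uint32_t xid)
{
    TxnEntry *e = lookup_txn(table, xid, true);

    assert(e != NULL);          /* the sketch does not handle a full table */
    e->nchanges++;
}

/*
 * Called when a commit record for xid is seen. Returns false when the
 * commit itself is the only action detected for that xid, true otherwise.
 * The xid is forgotten either way.
 */
bool
commit_should_replay(TxnTable *table, uint32_t xid)
{
    TxnEntry *e = lookup_txn(table, xid, false);
    bool      replay;

    if (e == NULL)
        return false;
    replay = e->nchanges > 0;
    e->used = false;
    return replay;
}
```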

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Compact commits and changeset extraction

2013-09-30 Thread Robert Haas
On Mon, Sep 30, 2013 at 10:50 AM, Andres Freund and...@2ndquadrant.com wrote:
 Changeset extraction only works in the context of a single database but
 has to scan through xlog records from multiple databases. Most records
 are easy to skip because they contain the database in the relfilenode or
 are just not interesting for logical replication. The only exceptions are
 compact commits.
 So we have some alternatives:
 1) don't do anything, in that case empty transactions will get replayed
    since the changes themselves will get skipped.
 2) Don't use compact commits if wal_level=logical
 3) unify compact and non-compact commits, trying to get the normal one
    smaller.
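
For illustration, the "easy to skip" case could look roughly like the sketch below, which keeps only the changes whose relfilenode names the database being decoded. The RelFileNode struct is a local stand-in with the same three fields as the PostgreSQL struct of that name; change_belongs_to_db() and count_relevant() are hypothetical helpers, not existing functions. A compact commit carries no such database field, which is why this check cannot be applied to it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t Oid;           /* local stand-in for the PostgreSQL type */

/* Local stand-in for the physical relation identifier. */
typedef struct RelFileNode
{
    Oid spcNode;                /* tablespace */
    Oid dbNode;                 /* database */
    Oid relNode;                /* relation */
} RelFileNode;

/* True if the change touches a relation in the database being decoded. */
bool
change_belongs_to_db(const RelFileNode *node, Oid decoding_db)
{
    assert(node != NULL);
    return node->dbNode == decoding_db;
}

/* Count how many of the given changes would be kept rather than skipped. */
size_t
count_relevant(const RelFileNode *nodes, size_t n, Oid decoding_db)
{
    size_t kept = 0;

    for (size_t i = 0; i < n; i++)
    {
        if (change_belongs_to_db(&nodes[i], decoding_db))
            kept++;
    }
    return kept;
}
```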

 For 3) I am thinking of using 'xinfo' to store whether we have the other
 information or not. E.g. if there are subxacts in a compact commit we
 signal that by the flag 'XACT_COMMIT_CONTAINS_SUBXACTS' and store the
 number of subxacts after the xlog record. Similarly with relations,
 invalidation messages and the database id. That should leave compact
 commits without any subxacts at the former size, and those with at the
 former size + 4. Normal commits would get smaller in many cases since we
 don't store the empty fields.
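
The size claim can be checked with a little arithmetic. The sketch below assumes that the former compact commit record is an 8-byte timestamp, a 4-byte subxact count and 4 bytes per subxact id, and that the unified record is an 8-byte timestamp, a 4-byte xinfo and, only when XACT_COMMIT_CONTAINS_SUBXACTS is set, a 4-byte count plus the ids. The function names and the flag's bit value are made up for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Flag named in the thread; the bit value is an assumption. */
#define XACT_COMMIT_CONTAINS_SUBXACTS 0x01u

/* Assumed former compact layout: timestamp + nsubxacts + subxact ids. */
size_t
former_compact_size(uint32_t nsubxacts)
{
    return 8 + 4 + (size_t) nsubxacts * 4;
}

/* The flag is set only when there is something to store. */
uint32_t
commit_xinfo(uint32_t nsubxacts)
{
    return nsubxacts > 0 ? XACT_COMMIT_CONTAINS_SUBXACTS : 0;
}

/* Assumed unified layout: timestamp + xinfo, count and ids only if flagged. */
size_t
unified_size(uint32_t nsubxacts)
{
    size_t size = 8 + 4;

    if (commit_xinfo(nsubxacts) & XACT_COMMIT_CONTAINS_SUBXACTS)
        size += 4 + (size_t) nsubxacts * 4;
    return size;
}
```

Under these assumptions a commit without subxacts is 12 bytes in both layouts, and one with three subxacts grows from 24 to 28 bytes, i.e. the former size + 4.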

 I personally think 3) is the best solution, any other opinions?

What's wrong with #1?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Compact commits and changeset extraction

2013-09-30 Thread Andres Freund
On 2013-09-30 14:22:22 -0400, Robert Haas wrote:
 On Mon, Sep 30, 2013 at 10:50 AM, Andres Freund and...@2ndquadrant.com 
 wrote:
  Changeset extraction only works in the context of a single database but
  has to scan through xlog records from multiple databases. Most records
  are easy to skip because they contain the database in the relfilenode or
  are just not interesting for logical replication. The only exceptions are
  compact commits.
  So we have some alternatives:
  1) don't do anything, in that case empty transactions will get replayed
     since the changes themselves will get skipped.
  2) Don't use compact commits if wal_level=logical
  3) unify compact and non-compact commits, trying to get the normal one
     smaller.
 
  For 3) I am thinking of using 'xinfo' to store whether we have the other
  information or not. E.g. if there are subxacts in a compact commit we
  signal that by the flag 'XACT_COMMIT_CONTAINS_SUBXACTS' and store the
  number of subxacts after the xlog record. Similarly with relations,
  invalidation messages and the database id. That should leave compact
  commits without any subxacts at the former size, and those with at the
  former size + 4. Normal commits would get smaller in many cases since we
  don't store the empty fields.
 
  I personally think 3) is the best solution, any other opinions?
 
 What's wrong with #1?

It seems confusing that a changeset stream in database #1 will contain
commits (without corresponding changes) from database #2. Seems like a
POLA violation to me.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

