Re: [HACKERS] logical changeset generation v4 - Heikki's thoughts about the patch state

2013-02-04 Thread Robert Haas
On Sat, Feb 2, 2013 at 4:38 PM, Andres Freund and...@2ndquadrant.com wrote:
 On 2013-01-28 16:55:52 -0500, Steve Singer wrote:
 If you're using non-surrogate/natural primary keys this tends to come up
 occasionally due to data-entry errors or renames.  I'm looking at this from
 the point of view of what I need to use this as a source for a production
 replication system with fewer sharp edges compared to trigger-based slony.
 My standard is a bit higher than a 'first' version because I intend to use it
 in version 3.0 of slony, not 1.0.  If others feel I'm asking for too much
 they should speak up; maybe I am. Also, the way things will fail if someone
 were to try and update a primary key value is pretty nasty (it will leave
 them with inconsistent databases). We could install UPDATE triggers to
 try and detect this type of thing, but I'd rather see us just log the old
 values so we can use them during replay.

 I pushed support for this. I am not yet 100% happy with this due to two
 issues:

 * it increases the xlog size logged by heap_update by 2 bytes even with
   wal_level < logical as it uses a variant of xl_heap_header that
   includes its length. Conditionally using xl_heap_header would make the
   code even harder to read. Is that acceptable?

I think it's important to avoid adding to DML WAL volume when
wal_level < logical.  I am not positive that 2 bytes is noticeable,
but I'm not positive that it isn't either: heap insert/update must be
our most commonly-used WAL records.  On the other hand, we also need
to keep in mind that branches in hot code paths aren't free either.  I
would be concerned more about the increased run-time cost of
constructing the correct WAL record than with the related code
complexity.  None of that code is simple anyway.

 * multi_insert should be converted to use xl_heap_header_len as well,
   instead of using xl_multi_insert_tuple, that would also reduce the
   amount of multi-insert specific code in decode.c
 * both for update and delete we should denote more explicitly that
   ->oldtuple points to an index tuple, not to a full tuple

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] logical changeset generation v4 - Heikki's thoughts about the patch state

2013-02-02 Thread Andres Freund
On 2013-01-28 16:55:52 -0500, Steve Singer wrote:
 If you're using non-surrogate/natural primary keys this tends to come up
 occasionally due to data-entry errors or renames.  I'm looking at this from
 the point of view of what I need to use this as a source for a production
 replication system with fewer sharp edges compared to trigger-based slony.
 My standard is a bit higher than a 'first' version because I intend to use it
 in version 3.0 of slony, not 1.0.  If others feel I'm asking for too much
 they should speak up; maybe I am. Also, the way things will fail if someone
 were to try and update a primary key value is pretty nasty (it will leave
 them with inconsistent databases). We could install UPDATE triggers to
 try and detect this type of thing, but I'd rather see us just log the old
 values so we can use them during replay.

I pushed support for this. I am not yet 100% happy with this due to two
issues:

* it increases the xlog size logged by heap_update by 2 bytes even with
  wal_level < logical as it uses a variant of xl_heap_header that
  includes its length. Conditionally using xl_heap_header would make the
  code even harder to read. Is that acceptable?
* multi_insert should be converted to use xl_heap_header_len as well,
  instead of using xl_multi_insert_tuple, that would also reduce the
  amount of multi-insert specific code in decode.c
* both for update and delete we should denote more explicitly that
  ->oldtuple points to an index tuple, not to a full tuple

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] logical changeset generation v4 - Heikki's thoughts about the patch state

2013-01-28 Thread Andres Freund
On 2013-01-26 16:20:33 -0500, Steve Singer wrote:
 On 13-01-24 11:15 AM, Steve Singer wrote:
 On 13-01-24 06:40 AM, Andres Freund wrote:
 
 Fair enough. I am also working on a user of this infrastructure but that
 doesn't help you very much. Steve Singer seemed to make some stabs at
 writing an output plugin as well. Steve, how far did you get there?
 
 I was able to get something that generated output for INSERT statements in
 a format similar to what a modified slony apply trigger would want.  This
 was with the list of tables to replicate hard-coded in the plugin.  This
 was with the patchset from the last commitfest. I had gotten a bit hung up
 on the UPDATE and DELETE support because slony allows you to use an
 arbitrary user specified unique index as your key.  It looks like better
 support for tables with a unique non-primary key is in the most recent
 patch set.  I am hoping to have time this weekend to update my plugin to
 use parameters passed in on the init and other updates in the most recent
 version.  If I make some progress I will post a link to my progress at the
 end of the weekend.  My big issue is that I have limited time to spend on
 this.
 

 This isn't a complete review, just a few questions I've hit so far that I
 thought I'd ask to see whether I'm missing something related to updates.

 + extern void relationFindPrimaryKey(Relation pkrel, Oid *indexOid,
 +                                    int16 *nratts, int16 *attnums, Oid *atttypids,
 +                                    Oid *opclasses);
 +

 I don't see this defined anywhere; could it be left over from a previous
 version of the patch?

Yes, it's dead and now gone.

 In decode.c
 DecodeUpdate:
 +
 +   /*
 +* FIXME: need to get/save the old tuple as well if we want primary key
 +* changes to work.
 +*/
 +   change->newtuple = ReorderBufferGetTupleBuf(reorder);

 I also don't see any code in heap_update to find + save the old primary key
 values like you added to heap_delete.   You didn't list "Add ability to
 change the primary key on an UPDATE" in the TODO so I'm wondering if I'm
 missing something.  Is there another way I can get the primary key values
 for the old_tuple?

Nope, there isn't any right now. I had considered it something not all
that interesting for real-world usecases based on my experience, but
adding support shouldn't be that hard anymore, so I can just bite the
bullet...

 I think the name of the test contrib module was changed but you didn't
 update the make file. This fixes it

Yea, I had forgotten to add that hunk when committing. Fixed.

Thanks,

Andres

--
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] logical changeset generation v4 - Heikki's thoughts about the patch state

2013-01-28 Thread Andres Freund
Hi,

On 2013-01-27 23:07:51 -0500, Steve Singer wrote:
 A few more comments;
 
 In decode.c DecodeDelete
 
 +   if (r->xl_len <= (SizeOfHeapDelete + SizeOfHeapHeader))
 +   {
 +   elog(DEBUG2, "huh, no primary key for a delete on wal_level >= logical?");
 +   return;
 +   }
 +
 +
 
 I think we should be passing deletes with candidate key data logged to the
 plugin.  If the table isn't a replicated table then ignoring the delete is
 fine.  If the table is a replicated table but someone has deleted the unique
 index from the table then the plugin will receive INSERT changes on the
 table but not DELETE changes. If this happens the plugin wouldn't have any
 way of knowing that it is missing delete changes.  If my plugin gets passed a
 DELETE change record but with no key data then my plugin could do any of

I basically didn't do that because I thought people would forget to
check whether oldtuple is empty. I have no problem with adding support
for that, though.

 1.  Start screaming for help (ie log errors)

Yes.

 2.  Drop the table from replication

No, you can't write from an output plugin, and I don't immediately see
support for that coming. There are no fundamental blockers, it just makes
things more complicated.

 3.  Pass the delete (with no key values) onto the replication client and let
 it deal with it (see 1 and 2)

Hm.


While I agree that nicer behaviour would be good I think the real
enforcement should happen on a higher level, e.g. with event triggers or
somesuch. It seems way too late to do anything about it when we're
already decoding. The transaction will already have committed...

 Also, 'huh' isn't one of our standard log message phrases :)

You're right there ;). I basically wanted to remove the log message
almost instantly, but it was occasionally useful so I kept it around...

 How do you plan on dealing with sequences?
 I don't see my plugin being called on sequence changes and I don't see
 XLOG_SEQ_LOG listed in DecodeRecordIntoReorderBuffer.  Is there a reason why
 this can't be easily added?

I basically was hoping for Simon's sequence-am to get in before doing
anything real here. That didn't really happen yet.
I am not sure whether there's a real usecase in decoding normal
XLOG_SEQ_LOG records; their content isn't all that easy to interpret
unless you're rather familiar with pg's innards.

So, adding support wouldn't be hard from a technical pov, but it seems the
semantics are a bit hard to nail down.

 Also what do we want to do about TRUNCATE support.  I could always leave a
 TRUNCATE trigger in place that logged the truncate to a sl_truncates and
 have my replication daemon respond to the insert on a   sl_truncates table
 by actually truncating the data on the replica.

I have planned to add some generic table_rewrite handling, but I have
to admit I haven't thought too much about it yet. Currently if somebody
rewrites a table, e.g. with an ALTER ... ADD COLUMN .. DEFAULT .. or
ALTER COLUMN ... USING ..., you will see INSERTs into a temporary
table. That basically seems to be a good thing, but the user needs to be
told about that ;)

 I've spent some time this weekend updating my prototype plugin that
 generates slony 2.2 style COPY output.  I have attached my progress here
 (also https://github.com/ssinger/slony1-engine/tree/logical_repl).  I have
 not gotten as far as modifying slon to act as a logical log receiver, or
 made a version of the slony apply trigger that would process these
 changes.

I only gave it a quick look and have a couple of questions and
remarks. The way you used the options, it looks like you're thinking of
specifying all the tables as options? I would have thought those would
get stored & queried locally and only something like the 'replication
set' name or such would be set as an option.

Iterating over a list with
for (i = 0; i < options->length; i = i + 2)
{
    DefElem *def_schema = (DefElem *) list_nth(options, i);
is not a good idea btw, that's quadratic in complexity ;)

In the REORDER_BUFFER_CHANGE_UPDATE I suggest using
relation-rd_primary, just as in the DELETE case, that should always
give you a consistent candidate key in an efficient manner.

 I haven't looked into the details of what is involved in setting up a
 subscription with the snapshot exporting.

That hopefully shouldn't be too hard... At least that's the idea :P

 I couldn't get the options on the START REPLICATION command to parse so I
 just hard coded some list building code in the init method. I do plan on
 passing the list of tables to replicate from the replica to the plugin
 (because this list comes from the replica).   Passing what could be a few
 thousand table names as a list of arguments is a bit ugly and I admit my
 list processing code is rough.  Does this make us want to reconsider the
 format of the option_list ?

Yea, something's screwed up there, sorry. Will push a fix later today.

 I guess should provide an opinion on if I think that the patch in this CF,
 if 

Re: [HACKERS] logical changeset generation v4

2013-01-28 Thread Andres Freund
On 2013-01-27 12:28:21 -0500, Steve Singer wrote:
 On 13-01-22 11:30 AM, Andres Freund wrote:
 Hi,
 
 I pushed a new rebased version (the xlogreader commit made it annoying
 to merge).
 
 The main improvements are
 * way more coherent code internally for initializing logical rep
 * explicit control over slots
 * options for logical replication
 
 
 Exactly what is the syntax for using that?  My reading of your changes to
 repl_gram.y makes me think that any of the following should work (but they
 don't).
 
 START_LOGICAL_REPLICATION 'slon1' 0/0 ('opt1')
  ERROR:  syntax error: unexpected character "("
 
 START_LOGICAL_REPLICATION 'slon1' 0/0 ('opt1' 'val1')
  ERROR:  syntax error: unexpected character "("
 
 START_LOGICAL_REPLICATION 'slon1' 0/0 ('opt1','opt2')
 ERROR:  syntax error: unexpected character "("

The syntax is right, the grammar (or rather scanner) support is a bit
botched, will push a new version soon.
 
 I'm also attaching a patch to pg_receivellog that allows you to specify
 these options on the command line.  I'm not saying I think that it is
 appropriate to be adding more bells and whistles to the utilities  two weeks
 into the CF but I found this useful for testing so I'm sharing it.

The CF is also there to find UI warts and such, so something like this
seems perfectly fine. Even moreso as it doesn't look this will get into
9.3 anyway.

I wanted to add such an option, but I was too lazy^Wbusy to think about
the semantics. Your current syntax doesn't really allow arguments to be
specified in a nice way.
I was thinking of -o name=value and allowing multiple specifications of
-o to build the option string.

Any arguments against that?

   /* Initiate the replication stream at specified location */
 - snprintf(query, sizeof(query), "START_LOGICAL_REPLICATION '%s' %X/%X",
 -          slot, (uint32) (startpos >> 32), (uint32) startpos);
 + snprintf(query, sizeof(query), "START_LOGICAL_REPLICATION '%s' %X/%X (%s)",
 +          slot, (uint32) (startpos >> 32), (uint32) startpos, plugin_opts);

ISTM that (%s) shouldn't be specified when there are no options, but as
the options need to be pre-escaped anyway, that looks like a non-problem
in a bit more complete implementation.

Greetings,

Andres Freund




Re: [HACKERS] logical changeset generation v4 - Heikki's thoughts about the patch state

2013-01-28 Thread Steve Singer

On 13-01-28 06:17 AM, Andres Freund wrote:

Hi,

3.  Pass the delete (with no key values) onto the replication client and let
it deal with it (see 1 and 2)
Hm.


While I agree that nicer behaviour would be good I think the real
enforcement should happen on a higher level, e.g. with event triggers or
somesuch. It seems way too late to do anything about it when we're
already decoding. The transaction will already have committed...


Ideally the first line of enforcement would be with event triggers. The 
thing with user-level mechanisms for enforcing things is that they 
sometimes can be disabled or bypassed.  I don't have a lot of sympathy 
for people who do this, but I like the idea of at least having the option 
of coding defensively to detect the situation and whine to the user.



How do you plan on dealing with sequences?
I don't see my plugin being called on sequence changes and I don't see
XLOG_SEQ_LOG listed in DecodeRecordIntoReorderBuffer.  Is there a reason why
this can't be easily added?

I basically was hoping for Simon's sequence-am to get in before doing
anything real here. That didn't really happen yet.
I am not sure whether there's a real usecase in decoding normal
XLOG_SEQ_LOG records; their content isn't all that easy to interpret
unless you're rather familiar with pg's innards.

So, adding support wouldn't be hard from a technical pov, but it seems the
semantics are a bit hard to nail down.


Also what do we want to do about TRUNCATE support.  I could always leave a
TRUNCATE trigger in place that logged the truncate to a sl_truncates and
have my replication daemon respond to the insert on a   sl_truncates table
by actually truncating the data on the replica.

I have planned to add some generic table_rewrite handling, but I have
to admit I haven't thought too much about it yet. Currently if somebody
rewrites a table, e.g. with an ALTER ... ADD COLUMN .. DEFAULT .. or
ALTER COLUMN ... USING ..., you will see INSERTs into a temporary
table. That basically seems to be a good thing, but the user needs to be
told about that ;)


I've spent some time this weekend updating my prototype plugin that
generates slony 2.2 style COPY output.  I have attached my progress here
(also https://github.com/ssinger/slony1-engine/tree/logical_repl).  I have
not gotten as far as modifying slon to act as a logical log receiver, or
made a version of the slony apply trigger that would process these
changes.

I only gave it a quick look and have a couple of questions and
remarks. The way you used the options, it looks like you're thinking of
specifying all the tables as options? I would have thought those would
get stored & queried locally and only something like the 'replication
set' name or such would be set as an option.


The way slony works today is that the list of tables to pull for a SYNC 
comes from the subscriber because the subscriber might be behind the 
provider, where a table has been removed from the set in the meantime.  
The subscriber still needs to receive data from that table until it is 
caught up to the point where that removal happens.


Having a time-travelled version of a user table (sl_table) might fix 
that problem but I haven't yet figured out how that needs to work with 
cascading (since that is a feature of slony today I can't ignore the 
problem). I'm also not sure how that will work with table renames.  
Today if the user renames a table inside of an EXECUTE SCRIPT slony will 
update the name of the table in sl_table.   This type of change wouldn't 
be visible (yet) in the time-travelled catalog.  There might be a 
solution to this yet, but I haven't figured it out.  Sticking with what 
slony does today seemed easier as a first step.



Iterating over a list with
for (i = 0; i < options->length; i = i + 2)
{
    DefElem *def_schema = (DefElem *) list_nth(options, i);
is not a good idea btw, that's quadratic in complexity ;)


Thanks, I'll rewrite this to walk a list of ListCell objects with next.



In the REORDER_BUFFER_CHANGE_UPDATE I suggest using
relation-rd_primary, just as in the DELETE case, that should always
give you a consistent candidate key in an efficient manner.


I haven't looked into the details of what is involved in setting up a
subscription with the snapshot exporting.

That hopefully shouldn't be too hard... At least that's the idea :P


I couldn't get the options on the START REPLICATION command to parse so I
just hard coded some list building code in the init method. I do plan on
passing the list of tables to replicate from the replica to the plugin
(because this list comes from the replica).   Passing what could be a few
thousand table names as a list of arguments is a bit ugly and I admit my
list processing code is rough.  Does this make us want to reconsider the
format of the option_list ?

Yea, something's screwed up there, sorry. Will push a fix later today.


I guess should provide an opinion on if I think that the patch in this CF,
if committed 

Re: [HACKERS] logical changeset generation v4

2013-01-28 Thread Steve Singer

On 13-01-28 06:23 AM, Andres Freund wrote:
The CF is also there to find UI warts and such, so something like this 
seems perfectly fine. Even moreso as it doesn't look this will get 
into 9.3 anyway. I wanted to add such an option, but I was too 
lazy^Wbusy to think about the semantics. Your current syntax doesn't 
really allow arguments to be specified in a nice way. I was thinking 
of -o name=value and allowing multiple specifications of -o to build 
the option string. Any arguments against that?


Multiple -o options sound fine to me.



/* Initiate the replication stream at specified location */
-   snprintf(query, sizeof(query), "START_LOGICAL_REPLICATION '%s' %X/%X",
-            slot, (uint32) (startpos >> 32), (uint32) startpos);
+   snprintf(query, sizeof(query), "START_LOGICAL_REPLICATION '%s' %X/%X (%s)",
+            slot, (uint32) (startpos >> 32), (uint32) startpos, plugin_opts);

ISTM that (%s) shouldn't be specified when there are no options, but as
the options need to be pre-escaped anyway, that looks like a non-problem
in a bit more complete implementation.

Greetings,

Andres Freund








Re: [HACKERS] logical changeset generation v4

2013-01-28 Thread Andres Freund
On 2013-01-28 12:23:02 +0100, Andres Freund wrote:
 On 2013-01-27 12:28:21 -0500, Steve Singer wrote:
  On 13-01-22 11:30 AM, Andres Freund wrote:
  Hi,
  
  I pushed a new rebased version (the xlogreader commit made it annoying
  to merge).
  
  The main improvements are
  * way more coherent code internally for initializing logical rep
  * explicit control over slots
  * options for logical replication
  
  
  Exactly what is the syntax for using that?  My reading of your changes to
  repl_gram.y makes me think that any of the following should work (but they
  don't).
  
  START_LOGICAL_REPLICATION 'slon1' 0/0 ('opt1')
   ERROR:  syntax error: unexpected character "("
  
  START_LOGICAL_REPLICATION 'slon1' 0/0 ('opt1' 'val1')
   ERROR:  syntax error: unexpected character "("
  
  START_LOGICAL_REPLICATION 'slon1' 0/0 ('opt1','opt2')
  ERROR:  syntax error: unexpected character "("
 
 The syntax is right, the grammar (or rather scanner) support is a bit
 botched, will push a new version soon.

Pushed and rebased some minutes ago. I changed the syntax so that slot
names, plugins, and option names are identifiers and behave just as
normal SQL identifiers. That means ' needs to be changed to ".

The new version is rebased on top of fklocks, walsender et al, which was
a bit of work but actually makes more comprehensive logging in
heap_update easier. That will come tomorrow.

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] logical changeset generation v4

2013-01-27 Thread Steve Singer

On 13-01-22 11:30 AM, Andres Freund wrote:

Hi,

I pushed a new rebased version (the xlogreader commit made it annoying
to merge).

The main improvements are
* way more coherent code internally for initializing logical rep
* explicit control over slots
* options for logical replication



Exactly what is the syntax for using that?  My reading of your changes to 
repl_gram.y makes me think that any of the following should work (but 
they don't).


START_LOGICAL_REPLICATION 'slon1' 0/0 ('opt1')
 ERROR:  syntax error: unexpected character "("

START_LOGICAL_REPLICATION 'slon1' 0/0 ('opt1' 'val1')
 ERROR:  syntax error: unexpected character "("

START_LOGICAL_REPLICATION 'slon1' 0/0 ('opt1','opt2')
ERROR:  syntax error: unexpected character "("

I'm also attaching a patch to pg_receivellog that allows you to specify 
these options on the command line.  I'm not saying I think that it is 
appropriate to be adding more bells and whistles to the utilities  two 
weeks into the CF but I found this useful for testing so I'm sharing it.






Greetings,

Andres Freund

--
  Andres Freund    http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services





From 176087bacec6cbf0b86e4ffeb918f41b4a5b8d7a Mon Sep 17 00:00:00 2001
From: Steve Singer ssin...@ca.afilias.info
Date: Sun, 27 Jan 2013 12:24:33 -0500
Subject: [PATCH] allow pg_receivellog to pass plugin options from the command line to the plugin

---
 src/bin/pg_basebackup/pg_receivellog.c |   14 ++
 1 files changed, 10 insertions(+), 4 deletions(-)

diff --git a/src/bin/pg_basebackup/pg_receivellog.c b/src/bin/pg_basebackup/pg_receivellog.c
index 04bedbe..30b3cea 100644
--- a/src/bin/pg_basebackup/pg_receivellog.c
+++ b/src/bin/pg_basebackup/pg_receivellog.c
@@ -54,7 +54,7 @@ static XLogRecPtr	startpos;
 static bool do_init_slot = false;
 static bool do_start_slot = false;
 static bool do_stop_slot = false;
-
+static const char *plugin_opts = "";
 
 static void usage(void);
 static void StreamLog();
@@ -84,6 +84,7 @@ usage(void)
 	printf(_("  -s, --status-interval=INTERVAL\n"
 			 "         time between status packets sent to server (in seconds)\n"));
 	printf(_("  -S, --slot=SLOT    use existing replication slot SLOT instead of starting a new one\n"));
+	printf(_("  -o, --options=OPTIONS  a comma separated list of options to the plugin\n"));
 	printf(_("\nAction to be performed:\n"));
 	printf(_("  --init     initiate a new replication slot (for the slotname see --slot)\n"));
 	printf(_("  --start    start streaming in a replication slot (for the slotname see --slot)\n"));
@@ -264,8 +265,8 @@ StreamLog(void)
 slot);
 
 	/* Initiate the replication stream at specified location */
-	snprintf(query, sizeof(query), "START_LOGICAL_REPLICATION '%s' %X/%X",
-			 slot, (uint32) (startpos >> 32), (uint32) startpos);
+	snprintf(query, sizeof(query), "START_LOGICAL_REPLICATION '%s' %X/%X (%s)",
+			 slot, (uint32) (startpos >> 32), (uint32) startpos, plugin_opts);
 	res = PQexec(conn, query);
 	if (PQresultStatus(res) != PGRES_COPY_BOTH)
 	{
@@ -560,6 +561,7 @@ main(int argc, char **argv)
 		{"init", no_argument, NULL, 1},
 		{"start", no_argument, NULL, 2},
 		{"stop", no_argument, NULL, 3},
+		{"options", required_argument, NULL, 'o'},
 		{NULL, 0, NULL, 0}
 	};
 	int			c;
@@ -584,7 +586,7 @@ main(int argc, char **argv)
 		}
 	}
 
-	while ((c = getopt_long(argc, argv, "f:nvd:h:p:U:wWP:s:S:",
+	while ((c = getopt_long(argc, argv, "f:nvd:h:p:U:wWP:s:S:o:",
 			long_options, option_index)) != -1)
 	{
 		switch (c)
@@ -659,6 +661,10 @@ main(int argc, char **argv)
 			case 3:
 do_stop_slot = true;
 break;
+			case 'o':
+				if (optarg != NULL)
+					plugin_opts = pg_strdup(optarg);
+				break;
 /* action */
 
 			default:
-- 
1.7.0.4




Re: [HACKERS] logical changeset generation v4 - Heikki's thoughts about the patch state

2013-01-27 Thread Steve Singer

On 13-01-24 11:15 AM, Steve Singer wrote:

On 13-01-24 06:40 AM, Andres Freund wrote:


Fair enough. I am also working on a user of this infrastructure but that
doesn't help you very much. Steve Singer seemed to make some stabs at
writing an output plugin as well. Steve, how far did you get there?


I was able to get something that generated output for INSERT 
statements in a format similar to what a modified slony apply trigger 
would want.  This was with the list of tables to replicate hard-coded 
in the plugin.  This was with the patchset from the last commitfest. I 
had gotten a bit hung up on the UPDATE and DELETE support because 
slony allows you to use an arbitrary user specified unique index as 
your key.  It looks like better support for tables with a unique 
non-primary key is in the most recent patch set.  I am hoping to have 
time this weekend to update my plugin to use parameters passed in on 
the init and other updates in the most recent version.  If I make some 
progress I will post a link to my progress at the end of the weekend.  
My big issue is that I have limited time to spend on this.






A few more comments;

In decode.c DecodeDelete

+   if (r->xl_len <= (SizeOfHeapDelete + SizeOfHeapHeader))
+   {
+   elog(DEBUG2, "huh, no primary key for a delete on wal_level >= logical?");
+   return;
+   }
+

I think we should be passing deletes with candidate key data logged to 
the plugin.  If the table isn't a replicated table then ignoring the 
delete is fine.  If the table is a replicated table but someone has 
deleted the unique index from the table then the plugin will receive 
INSERT changes on the table but not DELETE changes. If this happens the 
plugin wouldn't have any way of knowing that it is missing delete 
changes.  If my plugin gets passed a DELETE change record but with no 
key data then my plugin could do any of

1.  Start screaming for help (ie log errors)
2.  Drop the table from replication
3.  Pass the delete (with no key values) onto the replication client and 
let it deal with it (see 1 and 2)


Also, 'huh' isn't one of our standard log message phrases :)


How do you plan on dealing with sequences?
I don't see my plugin being called on sequence changes and I don't see 
XLOG_SEQ_LOG listed in DecodeRecordIntoReorderBuffer.  Is there a reason 
why this can't be easily added?


Also what do we want to do about TRUNCATE support.  I could always leave 
a TRUNCATE trigger in place that logged the truncate to a sl_truncates 
and have my replication daemon respond to the insert on a   sl_truncates 
table by actually truncating the data on the replica.


I've spent some time this weekend updating my prototype plugin that 
generates slony 2.2 style COPY output.  I have attached my progress here 
(also https://github.com/ssinger/slony1-engine/tree/logical_repl).  I 
have not gotten as far as modifying slon to act as a logical log 
receiver, or made a version of the slony apply trigger that would 
process these changes.  I haven't looked into the details of what is 
involved in setting up a subscription with the snapshot exporting.


I couldn't get the options on the START REPLICATION command to parse so 
I just hard coded some list building code in the init method. I do plan 
on passing the list of tables to replicate from the replica to the plugin 
(because this list comes from the replica).   Passing what could be a 
few thousand table names as a list of arguments is a bit ugly and I 
admit my list processing code is rough.  Does this make us want to 
reconsider the format of the option_list ?


I guess I should provide an opinion on whether I think the patch in this 
CF, if committed, could be used to act as a source for slony instead of 
the log trigger.



The biggest missing piece is the one I mentioned in my email yesterday: we 
aren't logging the old primary key on row UPDATEs.  I don't see building 
a credible replication system where you don't allow users to update any 
column of a row.


The other issues I've raised (DecodeDelete hiding bad deletes, 
replication options not parsing for me) look like easy fixes


The lack of WAL decoding support for sequences or truncate is something I 
could work around by doing things much like slony does today.  The SYNC 
can still capture the sequence changes in a table (where the INSERTs 
would be logged) and I can have a trigger capture truncates.


I mostly did this review from the point of view of someone trying to use 
the feature, I haven't done a line-by-line review of the code.


I suspect Andres can address these issues and get an updated patch out 
during this CF. I think a more detailed code review by someone more 
familiar with postgres internals will reveal a handful of other issues 
that hopefully can be fixed without a lot of effort. If this were the 
only patch in the commitfest I would encourage Andres to push to get 
these changes done. If the standard for CF4 is that a patch needs to be 
basically in a committable state 

Re: [HACKERS] logical changeset generation v4 - Heikki's thoughts about the patch state

2013-01-26 Thread Steve Singer

On 13-01-24 11:15 AM, Steve Singer wrote:

On 13-01-24 06:40 AM, Andres Freund wrote:


Fair enough. I am also working on a user of this infrastructure but that
doesn't help you very much. Steve Singer seemed to make some stabs at
writing an output plugin as well. Steve, how far did you get there?


I was able to get something that generated output for INSERT 
statements in a format similar to what a modified slony apply trigger 
would want.  This was with the list of tables to replicate hard-coded 
in the plugin.  This was with the patchset from the last commitfest. I 
had gotten a bit hung up on the UPDATE and DELETE support because 
slony allows you to use an arbitrary user-specified unique index as 
your key.  It looks like better support for tables with a unique 
non-primary key is in the most recent patch set.  I am hoping to have 
time this weekend to update my plugin to use parameters passed in on 
the init and other updates in the most recent version.  If I make some 
progress I will post a link to my progress at the end of the weekend.  
My big issue is that I have limited time to spend on this.




This isn't a complete review, just a few questions I've hit so far that I 
thought I'd ask in case I'm missing something related to updates.



*** a/src/include/catalog/index.h
--- b/src/include/catalog/index.h
*** extern bool ReindexIsProcessingHeap(Oid
*** 114,117 
--- 114,121 
  extern bool ReindexIsProcessingIndex(Oid indexOid);
  extern Oid IndexGetRelation(Oid indexId, bool missing_ok);

+ extern void relationFindPrimaryKey(Relation pkrel, Oid *indexOid,
+                                    int16 *nratts, int16 *attnums,
+                                    Oid *atttypids, Oid *opclasses);
+
  #endif   /* INDEX_H */


I don't see this defined anywhere could it be left over from a previous 
version of the patch?



In decode.c
DecodeUpdate:
+
+   /*
+    * FIXME: need to get/save the old tuple as well if we want primary key
+    * changes to work.
+    */
+   change->newtuple = ReorderBufferGetTupleBuf(reorder);

I also don't see any code in heap_update to find and save the old primary 
key values like you added to heap_delete.   You didn't list "Add ability 
to change the primary key on an UPDATE" in the TODO, so I'm wondering if 
I'm missing something.  Is there another way I can get the primary key 
values for the old_tuple?


Also,

I think the name of the test contrib module was changed but you didn't 
update the makefile. This fixes it:


diff --git a/contrib/Makefile b/contrib/Makefile
index 1cc30fe..36e6bfe 100644
--- a/contrib/Makefile
+++ b/contrib/Makefile
@@ -50,7 +50,7 @@ SUBDIRS = \
tcn \
test_parser \
test_decoding   \
-   test_logical_replication \
+   test_logical_decoding \
tsearch2\
unaccent\
vacuumlo\







--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] logical changeset generation v4 - Heikki's thoughts about the patch state

2013-01-24 Thread Simon Riggs
On 24 January 2013 01:17, Robert Haas robertmh...@gmail.com wrote:

 I agree.  The thing that scares me about the logical replication stuff
 is not that it might be slow (and if your numbers are to be believed,
 it isn't), but that I suspect it's riddled with bugs and possibly some
 questionable design decisions.  If we commit it and release it, then
 we're going to be stuck maintaining it for a very, very long time.  If
 it turns out to have serious bugs that can't be fixed without a new
 major release, it's going to be a serious black eye for the project.

 Of course, I have no evidence that that will happen.


This is a generic argument against applying any invasive patch. I
agree 9.2 had major bugs on release, though that was because of the
invasive nature of some of the changes, even in seemingly minor
patches.

The most invasive and therefore risky changes in this release are
already committed - changes to the way WAL reading and timelines work.
If we don't apply a single additional patch in this CF, we will still
in my opinion have a major requirement for beta testing prior to
release.

The code executed here is isolated to users of the new feature and is
therefore low risk to non-users. Of course there will be bugs.
Everybody understands what "new feature" means, and we as a project
aren't exposed to risks from this. "New feature" also means
groundbreaking new capabilities, so the balance of high reward, low
risk means this gets my vote to apply. I'm just about to spend some
days giving it a final review to confirm/refute that opinion in
technical detail.

Code using these features is available and clearly marked as a full
copyright transfer to PGDG, TPL licensed. That code is external not by the
author's choice, but at the specific request of the project to make it
that way. I personally will be looking to add code to core over time.
It was useful for everybody that replication solutions started out of
core, but replication is now a core requirement for databases and we
must fully deliver on that thought.

I agree with your concern re: checksums and foreign key locks. FK
locks have had considerable review and support, so I expect that to be
a manageable issue.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] logical changeset generation v4 - Heikki's thoughts about the patch state

2013-01-24 Thread Heikki Linnakangas

On 24.01.2013 00:30, Andres Freund wrote:

Hi,

I decided to reply on the patches thread to be able to find this later.

On 2013-01-23 22:48:50 +0200, Heikki Linnakangas wrote:

logical changeset generation v4
This is a boatload of infrastructure for supporting logical replication, yet
we have no code actually implementing logical replication that would go with
this. The premise of logical replication over trigger-based was that it'd be
faster, yet we cannot assess that without a working implementation. I don't
think this can be committed in this state.


It's a fair point that this is a huge amount of code without an in-core
user.
But the reason it got no user included is because several people
explicitly didn't want a user in-core for now, but said the first part of
this would be to implement the changeset generation as a separate
piece. Didn't you actually prefer not to have any users of this in-core
yourself?


Yes, I certainly did. But we still need to see the other piece of the 
puzzle to see how this fits with it.


BTW, why does all the transaction reordering stuff have to be in core?

How much of this infrastructure is to support replicating DDL changes? 
IOW, if we drop that requirement, how much code can we slash? Any other 
features or requirements that could be dropped? I think it's clear at 
this stage that this patch is not going to be committed as it is. If you 
can reduce it to a fraction of what it is now, that fraction might have 
a chance. Otherwise, it's just going to be pushed to the next commitfest 
as whole, and we're going to be having the same doubts and discussions then.


- Heikki




Re: [HACKERS] logical changeset generation v4 - Heikki's thoughts about the patch state

2013-01-24 Thread Andres Freund
Hi Robert, Hi all,

On 2013-01-23 20:17:04 -0500, Robert Haas wrote:
 On Wed, Jan 23, 2013 at 5:30 PM, Andres Freund and...@2ndquadrant.com wrote:
  The only reason the submitted version of logical decoding is
  comparatively slow is that its xmin update policy is braindamaged,
  working on that right now.
 
 I agree.  The thing that scares me about the logical replication stuff
 is not that it might be slow (and if your numbers are to be believed,
 it isn't), but that I suspect it's riddled with bugs and possibly some
 questionable design decisions.  If we commit it and release it, then
 we're going to be stuck maintaining it for a very, very long time.  If
 it turns out to have serious bugs that can't be fixed without a new
 major release, it's going to be a serious black eye for the project.

That's much more along the lines of what I am afraid of than the
performance stuff - but Heikki cited those, so I replied to that.

Note that I didn't say this must, must go in - I just don't think
Heikki's reasoning about why not hit the nail on the head.

 Of course, I have no evidence that that will happen.  But it is a
 really big piece of code, and therefore unless you are superman, it's
 probably got a really large number of bugs.  The scary thing is that
 it is not as if we can say, well, this is a big hunk of code, but it
 doesn't really touch the core of the system, so if it's broken, it'll
 be broken itself, but it won't break anything else.  Rather, this code
 is deeply in bed with WAL, with MVCC, and with the on-disk format of
 tuples, and makes fundamental changes to the first two of those.  You
 agreed with Tom that 9.2 is the buggiest release in recent memory, but
 I think logical replication could easily be an order of magnitude
 worse.

I tried very, very hard to get the basics of the design & interface
solid. Which obviously doesn't mean I am succeeding - luckily not being
superhuman after all ;). And I think that's very much where input is
desperately needed and where I failed to raise enough attention. The
output plugin interface followed by the walsender interface is what
needs to be most closely vetted.
Those are the permanent, user/developer-exposed UI and the ones we should
try to keep as stable as possible.

The output plugin callbacks are defined here:
http://git.postgresql.org/gitweb/?p=users/andresfreund/postgres.git;a=blob;f=src/include/replication/output_plugin.h;hb=xlog-decoding-rebasing-cf4
To make it more agnostic of the technology to implement changeset
extraction we possibly should replace the ReorderBuffer(TXN|Change)
structs being passed by something more implementation agnostic.

walsender interface:
http://git.postgresql.org/gitweb/?p=users/andresfreund/postgres.git;a=blob;f=src/backend/replication/repl_gram.y;hb=xlog-decoding-rebasing-cf4
The interesting new commands are:
1) K_INIT_LOGICAL_REPLICATION NAME NAME
2) K_START_LOGICAL_REPLICATION NAME RECPTR plugin_options
3) K_FREE_LOGICAL_REPLICATION NAME

1 & 3 allocate (respectively free) the permanent state associated with
one changeset consumer whereas START_LOGICAL_REPLICATION streams out
changes starting at RECPTR.

Btw, there are currently *no* changes to the WAL format at all if
wal_level < logical, except that xl_running_xacts are logged more
frequently, which obviously could easily be made conditional. Barring bugs
of course.
The changes with wal_level=logical aren't that big either imo:
* heap_insert, heap_update prevent full page writes from removing their
  normal record by using a separate XLogRecData block for the buffer and
  the record
* heap_delete adds more data (the pkey of the tuple) after the unchanged
  xl_heap_delete struct
* On changes to catalog tables, (relfilenode, tid, cmin, cmax) are logged.

No changes to mvcc for normal backends at all, unless you count the very
slightly changed *Satisfies interface (getting passed a HeapTuple
instead of HeapTupleHeader).

I am not sure what you're concerned about WRT the on-disk format of the
tuples? We are pretty much nailed down on that due to pg_upgrade'ability
anyway, and it could be changed from this patch's POV without a problem;
the output plugin just sees normal HeapTuples. Or are you concerned
about the code extracting them from the xlog records?

So I think the "won't break anything else" argument can be made rather
fairly if the heapam.c changes, which aren't that complex, are vetted
closely.

Now, the discussion about all the code that's active *during* decoding is
something else entirely :/

 You
 agreed with Tom that 9.2 is the buggiest release in recent memory, but
 I think logical replication could easily be an order of magnitude
 worse.

I unfortunately think that not providing more builtin capabilities in
this area also has significant dangers. Imo this is one of the weakest,
or even the weakest, areas of postgres.

I personally have the impression that just about nobody did actual beta
testing of the latest releases, especially 9.2, and that is the 

Re: [HACKERS] logical changeset generation v4 - Heikki's thoughts about the patch state

2013-01-24 Thread Heikki Linnakangas
One random thing that caught my eye in the patch, I though I'd mention 
it while I still remember: In heap_delete, you call heap_form_tuple() in 
a critical section. That's a bad idea, because if it runs out of memory 
- PANIC.


- Heikki




Re: [HACKERS] logical changeset generation v4 - Heikki's thoughts about the patch state

2013-01-24 Thread Andres Freund
On 2013-01-24 12:38:25 +0200, Heikki Linnakangas wrote:
 On 24.01.2013 00:30, Andres Freund wrote:
 Hi,
 
 I decided to reply on the patches thread to be able to find this later.
 
 On 2013-01-23 22:48:50 +0200, Heikki Linnakangas wrote:
 logical changeset generation v4
 This is a boatload of infrastructure for supporting logical replication, yet
 we have no code actually implementing logical replication that would go with
 this. The premise of logical replication over trigger-based was that it'd be
 faster, yet we cannot assess that without a working implementation. I don't
 think this can be committed in this state.
 
 It's a fair point that this is a huge amount of code without an in-core
 user.
 But the reason it got no user included is because several people
 explicitly didn't want a user in-core for now, but said the first part of
 this would be to implement the changeset generation as a separate
 piece. Didn't you actually prefer not to have any users of this in-core
 yourself?

 Yes, I certainly did. But we still need to see the other piece of the puzzle
 to see how this fits with it.

Fair enough. I am also working on a user of this infrastructure but that
doesn't help you very much. Steve Singer seemed to make some stabs at
writing an output plugin as well. Steve, how far did you get there?

 BTW, why does all the transaction reordering stuff have to be in core?

It didn't use to, but people argued pretty damned hard that no undecoded
data should ever be allowed to leave the postgres cluster. And to be fair,
it makes writing an output plugin *way* easier. Check
http://git.postgresql.org/gitweb/?p=users/andresfreund/postgres.git;a=blob;f=contrib/test_decoding/test_decoding.c;hb=xlog-decoding-rebasing-cf4
If you skip over tuple_to_stringinfo(), which is just pretty generic
scaffolding for converting a whole tuple to a string, writing out the
changes in some format by now is pretty damn simple.

 How much of this infrastructure is to support replicating DDL changes? IOW,
 if we drop that requirement, how much code can we slash?

Unfortunately I don't think too much unless we add in other code that
allows us to check whether the current definition of a table is still
the same as it was back when the tuple was logged.

 Any other features or requirements that could be dropped? I think it's clear 
 at this stage that
 this patch is not going to be committed as it is. If you can reduce it to a
 fraction of what it is now, that fraction might have a chance. Otherwise,
 it's just going to be pushed to the next commitfest as whole, and we're
 going to be having the same doubts and discussions then.

One thing that reduces complexity is to declare the following as
unsupported:
- CREATE TABLE foo(data text);
- DECODE UP TO HERE;
- INSERT INTO foo(data) VALUES(very-long-to-be-externally-toasted-tuple);
- DROP TABLE foo;
- DECODE UP TO HERE;

but that's just a minor thing.

I think what we can do more realistically than to chop off required parts
of changeset extraction is to start applying some of the preliminary
patches independently:
- the relmapper/relfilenode changes + pg_relation_by_filenode(spc,
  relnode) should be independently committable, if a bit boring
- allowing walsenders to connect to a database possibly needs an interface
  change, but otherwise it should be fine to go in independently. It also has
  other potential use-cases, so I think that's fair.
- logging xl_running_xact's more frequently could also be committed
  independently and makes sense on its own, as it allows a standby to
  enter HS faster if the master is busy
- Introducing InvalidCommandId should be relatively uncontroversial. The
  fact that no invalid value for command ids exists is imo an oversight
- the *Satisfies changes could be applied and they are imo ready, but
  there's no use-case for them without the rest, so I am not sure whether
  there's a point
- currently not separately available, but we could add wal_level=logical
  independently. There would be no user of it, but it would be partial
  work. That includes the relcache support for keeping track of the
  primary key, which already is available separately.


Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] logical changeset generation v4 - Heikki's thoughts about the patch state

2013-01-24 Thread Steve Singer

On 13-01-24 06:40 AM, Andres Freund wrote:


Fair enough. I am also working on a user of this infrastructure but that
doesn't help you very much. Steve Singer seemed to make some stabs at
writing an output plugin as well. Steve, how far did you get there?


I was able to get something that generated output for INSERT statements 
in a format similar to what a modified slony apply trigger would want.  
This was with the list of tables to replicate hard-coded in the plugin.  
This was with the patchset from the last commitfest. I had gotten a bit 
hung up on the UPDATE and DELETE support because slony allows you to use 
an arbitrary user-specified unique index as your key.  It looks like 
better support for tables with a unique non-primary key is in the most 
recent patch set.  I am hoping to have time this weekend to update my 
plugin to use parameters passed in on the init and other updates in the 
most recent version.  If I make some progress I will post a link to my 
progress at the end of the weekend.  My big issue is that I have limited 
time to spend on this.




BTW, why does all the transaction reordering stuff have to be in core?

It didn't use to, but people argued pretty damned hard that no undecoded
data should ever be allowed to leave the postgres cluster. And to be fair,
it makes writing an output plugin *way* easier. Check
http://git.postgresql.org/gitweb/?p=users/andresfreund/postgres.git;a=blob;f=contrib/test_decoding/test_decoding.c;hb=xlog-decoding-rebasing-cf4
If you skip over tuple_to_stringinfo(), which is just pretty generic
scaffolding for converting a whole tuple to a string, writing out the
changes in some format by now is pretty damn simple.



I think we will find that replication systems won't be the only 
users of this feature.  I have often seen systems that have a logging 
requirement for auditing purposes, or that need to log and then reconstruct 
the sequence of changes made to a set of tables in order to feed a 
downstream application.  Triggers and a journaling table are the 
traditional way of doing this, but it should be pretty easy to write a 
plugin to accomplish the same thing with better 
performance.  If the reordering stuff weren't in core this would be much 
harder.




How much of this infrastructure is to support replicating DDL changes? IOW,
if we drop that requirement, how much code can we slash?

Unfortunately I don't think too much unless we add in other code that
allows us to check whether the current definition of a table is still
the same as it was back when the tuple was logged.


Any other features or requirements that could be dropped? I think it's clear at 
this stage that
this patch is not going to be committed as it is. If you can reduce it to a
fraction of what it is now, that fraction might have a chance. Otherwise,
it's just going to be pushed to the next commitfest as whole, and we're
going to be having the same doubts and discussions then.

One thing that reduces complexity is to declare the following as
unsupported:
- CREATE TABLE foo(data text);
- DECODE UP TO HERE;
- INSERT INTO foo(data) VALUES(very-long-to-be-externally-toasted-tuple);
- DROP TABLE foo;
- DECODE UP TO HERE;

but that's just a minor thing.

I think what we can do more realistically than to chop off required parts
of changeset extraction is to start applying some of the preliminary
patches independently:
- the relmapper/relfilenode changes + pg_relation_by_filenode(spc,
   relnode) should be independently committable, if a bit boring
- allowing walsenders to connect to a database possibly needs an interface
   change, but otherwise it should be fine to go in independently. It also has
   other potential use-cases, so I think that's fair.
- logging xl_running_xact's more frequently could also be committed
   independently and makes sense on its own, as it allows a standby to
   enter HS faster if the master is busy
- Introducing InvalidCommandId should be relatively uncontroversial. The
   fact that no invalid value for command ids exists is imo an oversight
- the *Satisfies changes could be applied and they are imo ready, but
   there's no use-case for them without the rest, so I am not sure whether
   there's a point
- currently not separately available, but we could add wal_level=logical
   independently. There would be no user of it, but it would be partial
   work. That includes the relcache support for keeping track of the
   primary key, which already is available separately.


Greetings,

Andres Freund







Re: [HACKERS] logical changeset generation v4 - Heikki's thoughts about the patch state

2013-01-24 Thread Robert Haas
On Thu, Jan 24, 2013 at 6:14 AM, Andres Freund and...@2ndquadrant.com wrote:
 That's much more along the lines of what I am afraid of than the
 performance stuff - but Heikki cited those, so I replied to that.

 Note that I didn't say this must, must go in - I just don't think
 Heikki's reasoning about why not hit the nail on the head.

Fair enough, no argument.

Before getting bogged down in technical commentary, let me say this
very clearly: I am enormously grateful for your work on this project.
Logical replication based on WAL decoding is a feature of enormous
value that PostgreSQL has needed for a long time, and your work has
made that look like an achievable goal.  Furthermore, it seems to me
that you have pursued the community process with all the vigor and
sincerity for which anyone could ask.  Serious design concerns were
raised early in the process and you made radical changes to the design
which I believe have improved it tremendously, and you've continued to
display an outstanding attitude at every phase of this process about
which I can't say enough good things.  There is no question in my mind
that this work is going to be the beginning of a process that
revolutionizes the way people think about replication and PostgreSQL,
and you deserve our sincere thanks for that.

Now, the bad news is, I don't think it's very reasonable to try to
commit this to 9.3.  I think it is just too much stuff too late in the
cycle.  I've reviewed some of the patches from time to time but there
is a lot more stuff and it's big and complicated and it's not really
clear that we have the interface quite right yet, even though I think
it's also clear that we are a lot of closer than we were.  I don't
want to be fixing that during beta, much less after release.

 I tried very, very hard to get the basics of the design & interface
 solid. Which obviously doesn't mean I am succeeding - luckily not being
 superhuman after all ;). And I think that's very much where input is
 desperately needed and where I failed to raise enough attention. The
 output plugin interface followed by the walsender interface is what
 needs to be most closely vetted.
 Those are the permanent, user/developer-exposed UI and the ones we should
 try to keep as stable as possible.

 The output plugin callbacks are defined here:
 http://git.postgresql.org/gitweb/?p=users/andresfreund/postgres.git;a=blob;f=src/include/replication/output_plugin.h;hb=xlog-decoding-rebasing-cf4
 To make it more agnostic of the technology to implement changeset
 extraction we possibly should replace the ReorderBuffer(TXN|Change)
 structs being passed by something more implementation agnostic.

 walsender interface:
 http://git.postgresql.org/gitweb/?p=users/andresfreund/postgres.git;a=blob;f=src/backend/replication/repl_gram.y;hb=xlog-decoding-rebasing-cf4
 The interesting new commands are:
 1) K_INIT_LOGICAL_REPLICATION NAME NAME
 2) K_START_LOGICAL_REPLICATION NAME RECPTR plugin_options
 3) K_FREE_LOGICAL_REPLICATION NAME

 1 & 3 allocate (respectively free) the permanent state associated with
 one changeset consumer whereas START_LOGICAL_REPLICATION streams out
 changes starting at RECPTR.

Forgive me for not having looked at the patch, but to what extent is
all this, ah, documented?

 Btw, there are currently *no* changes to the WAL format at all if
 wal_level < logical, except that xl_running_xacts are logged more
 frequently, which obviously could easily be made conditional. Barring bugs
 of course.
 The changes with wal_level=logical aren't that big either imo:
 * heap_insert, heap_update prevent full page writes from removing their
   normal record by using a separate XLogRecData block for the buffer and
   the record
 * heap_delete adds more data (the pkey of the tuple) after the unchanged
   xl_heap_delete struct
 * On changes to catalog tables, (relfilenode, tid, cmin, cmax) are logged.

 No changes to mvcc for normal backends at all, unless you count the very
 slightly changed *Satisfies interface (getting passed a HeapTuple
 instead of HeapTupleHeader).

 I am not sure what you're concerned about WRT the on-disk format of the
 tuples? We are pretty much nailed down on that due to pg_upgrade'ability
 anyway, and it could be changed from this patch's POV without a problem;
 the output plugin just sees normal HeapTuples. Or are you concerned
 about the code extracting them from the xlog records?

Mostly, my concern is that you've accidentally broken something, or
that your code will turn out to be flaky in ways we can't now predict.
 My only really specific concern at this point is about the special
treatment of catalog tables.  We've never done anything like that
before, and it feels like a bad idea.  In particular, the fact that
you have to WAL-log new information about cmin/cmax really suggests
that we're committing ourselves to the MVCC infrastructure in a way
that we weren't previously.  There's some category of stuff that our
MVCC implementation didn't previously 

Re: [HACKERS] logical changeset generation v4 - Heikki's thoughts about the patch state

2013-01-24 Thread Stephen Frost
* Robert Haas (robertmh...@gmail.com) wrote:
 Now, the bad news is, I don't think it's very reasonable to try to
 commit this to 9.3.  I think it is just too much stuff too late in the
 cycle.  I've reviewed some of the patches from time to time but there
 is a lot more stuff and it's big and complicated and it's not really
 clear that we have the interface quite right yet, even though I think
 it's also clear that we are a lot of closer than we were.  I don't
 want to be fixing that during beta, much less after release.

The only way to avoid this happening again and again, imv, is to get it
committed early in whatever cycle it's slated to release for.  We've got
some serious challenges there though because we want to encourage
everyone to focus on beta testing and going through the release process,
plus we don't want to tag/branch too early or we create more work for
ourselves.

It would have been nice to get this into 9.3, and I can certainly
understand needing to move it back, but can we get a slightly more
specific plan for getting it in then?

Thanks,

Stephen




Re: [HACKERS] logical changeset generation v4 - Heikki's thoughts about the patch state

2013-01-24 Thread Heikki Linnakangas

On 24.01.2013 20:27, Robert Haas wrote:

Before getting bogged down in technical commentary, let me say this
very clearly: I am enormously grateful for your work on this project.
Logical replication based on WAL decoding is a feature of enormous
value that PostgreSQL has needed for a long time, and your work has
made that look like an achievable goal.  Furthermore, it seems to me
that you have pursued the community process with all the vigor and
sincerity for which anyone could ask.  Serious design concerns were
raised early in the process and you made radical changes to the design
which I believe have improved it tremendously, and you've continued to
display an outstanding attitude at every phase of this process about
which I can't say enough good things.


+1. I really appreciate all the work you, Andres, have put into this. I've 
argued in the past myself that there should be a little tool that 
scrapes the WAL to do logical replication. Essentially, just what you've 
implemented.


That said (hah, you knew there would be a "but" ;-)), now that I see 
what that looks like, I'm feeling that maybe it wasn't such a good idea 
after all. It sounded like a fairly small patch that would greatly reduce the 
overhead in the master with existing replication systems like slony, but 
it turned out to be a huge patch with a lot of new concepts and interfaces.


- Heikki




Re: [HACKERS] logical changeset generation v4 - Heikki's thoughts about the patch state

2013-01-24 Thread Andres Freund
Hi!

On 2013-01-24 13:27:00 -0500, Robert Haas wrote:
 On Thu, Jan 24, 2013 at 6:14 AM, Andres Freund and...@2ndquadrant.com wrote:

 Before getting bogged down in technical commentary, let me say this
 very clearly: I am enormously grateful for your work on this project.
 Logical replication based on WAL decoding is a feature of enormous
 value that PostgreSQL has needed for a long time, and your work has
 made that look like an achievable goal.  Furthermore, it seems to me
 that you have pursued the community process with all the vigor and
 sincerity for which anyone could ask.  Serious design concerns were
 raised early in the process and you made radical changes to the design
 which I believe have improved it tremendously, and you've continued to
 display an outstanding attitude at every phase of this process about
 which I can't say enough good things.

Very much appreciated. Especially as I can echo your feeling of not only
having positive feelings about the process ;)

 Now, the bad news is, I don't think it's very reasonable to try to
 commit this to 9.3.  I think it is just too much stuff too late in the
 cycle.  I've reviewed some of the patches from time to time but there
 is a lot more stuff and it's big and complicated and it's not really
 clear that we have the interface quite right yet, even though I think
 it's also clear that we are a lot of closer than we were.  I don't
 want to be fixing that during beta, much less after release.

It pains me to admit that you have a point there.

What I am afraid though is that it basically goes on like this in the
next commitfests:
* 9.4-CF1: no serious reviewer comments because they are busy doing release 
work
* 9.4-CF2: all are relieved that the release is over and a bit tired
* 9.4-CF3: first deeper review, some more complex restructuring required
* 9.4-CF4: too many changes to commit.

If you look at the development of the feature, after the first prototype
and the resulting design changes nobody with decision power had a more
than cursory look at the proposed interfaces. That's very, very, very
understandable, you all are busy people and the patch & the interfaces
are complex so it takes noticeable amounts of time, but it unfortunately
doesn't help in getting an acceptable interface nailed down.

The problem with that is not only that it sucks huge amounts of energy
out of me and others but also that it's very hard to really build the
layers/users above changeset extraction without being able to rely on
the interface and semantics. So we never get to the actual benefits
:(, and we don't get the users people require for the feature to be
committed.

So far, the only really effective way of getting people to comment on
patches in this state & complexity is the threat of an upcoming commit
because of the last commitfest :(

I honestly don't know how to go about this...

  I tried very, very hard to get the basics of the design  interface
  solid. Which obviously doesn't mean I am succeeding - luckily not being
  superhuman after all ;). And I think that's very much where input is
  desperately needed and where I failed to raise enough attention. The
  output plugin interface followed by the walsender interface is what
  needs to be most closely vetted.
  Those are the permanent, user/developer exposed UI and the one we should
  try to keep as stable as possible.
 
  The output plugin callbacks are defined here:
  http://git.postgresql.org/gitweb/?p=users/andresfreund/postgres.git;a=blob;f=src/include/replication/output_plugin.h;hb=xlog-decoding-rebasing-cf4
  To make it more agnostic of the technology to implement changeset
  extraction we possibly should replace the ReorderBuffer(TXN|Change)
  structs being passed by something more implementation agnostic.
 
  walsender interface:
  http://git.postgresql.org/gitweb/?p=users/andresfreund/postgres.git;a=blob;f=src/backend/replication/repl_gram.y;hb=xlog-decoding-rebasing-cf4
  The interesting new commands are:
  1) K_INIT_LOGICAL_REPLICATION NAME NAME
  2) K_START_LOGICAL_REPLICATION NAME RECPTR plugin_options
  3) K_FREE_LOGICAL_REPLICATION NAME
 
  1 & 3 allocate (respectively free) the permanent state associated with
  one changeset consumer whereas START_LOGICAL_REPLICATION streams out
  changes starting at RECPTR.
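The lifecycle implied by those three commands can be pictured with a toy model. This is an illustrative sketch only, not the actual walsender implementation (which keeps slot state persistently on disk); all class and method names here are hypothetical:

```python
# Toy model of the INIT/START/FREE_LOGICAL_REPLICATION slot lifecycle.
# Illustrative only: the real walsender persists this state on disk.

class LogicalSlotManager:
    def __init__(self, max_logical_slots=10):
        self.max_slots = max_logical_slots
        self.slots = {}  # slot name -> {"plugin": ..., "confirmed": recptr}

    def init_logical_replication(self, name, plugin):
        """Allocate the permanent state for one changeset consumer."""
        if name in self.slots:
            raise ValueError("replication slot %r already exists" % name)
        if len(self.slots) >= self.max_slots:
            raise RuntimeError("all logical replication slots are in use")
        self.slots[name] = {"plugin": plugin, "confirmed": 0}

    def start_logical_replication(self, name, recptr):
        """Stream changes starting at recptr (here: just record the position)."""
        slot = self.slots[name]  # KeyError if the slot was never allocated
        slot["confirmed"] = recptr
        return (slot["plugin"], recptr)

    def free_logical_replication(self, name):
        """Release the permanent state again."""
        del self.slots[name]
```

Note how the model also reproduces the failure mode reported downthread: repeatedly allocating slots without freeing them exhausts max_logical_slots, and recovery requires knowing the slot names.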
 
 Forgive me for not having looked at the patch, but to what extent is
 all this, ah, documented?

There are several mails on -hackers where I ask for input on whether
that interface is what people want but all the comments have been from
non-core pg people, although mildly favorable.

I couldn't convince myself to write real low-level documentation
instead of just the example code I needed for testing anyway and some
more higher-level docs before I had input from that side. Perhaps that
was a mistake.

So, here's a slightly less quick overview of the walsender interface:

Whenever a new replication consumer wants to stream data we need to make
sure on the primary that the data can 

Re: [HACKERS] logical changeset generation v4 - Heikki's thoughts about the patch state

2013-01-24 Thread Bruce Momjian
On Fri, Jan 25, 2013 at 02:16:09AM +0100, Andres Freund wrote:
 What I am afraid though is that it basically goes on like this in the
 next commitfests:
 * 9.4-CF1: no serious reviewer comments because they are busy doing release 
 work
 * 9.4-CF2: all are relieved that the release is over and a bit tired
 * 9.4-CF3: first deeper review, some more complex restructuring required
 * 9.4-CF4: too many changes to commit.
 
 If you look at the development of the feature, after the first prototype
 and the resulting design changes nobody with decision power had a more
 than cursory look at the proposed interfaces. That's very, very, very
 understandable, you all are busy people and the patch & the interfaces
 are complex so it takes noticeable amounts of time, but it unfortunately
 doesn't help in getting an acceptable interface nailed down.
 
 The problem with that is not only that it sucks huge amounts of energy
 out of me and others but also that it's very hard to really build the
 layers/users above changeset extraction without being able to rely on
 the interface and semantics. So we never get to the actual benefits
 :(, and we don't get the users people require for the feature to be
 committed.
 
 So far, the only really effective way of getting people to comment on
 patches in this state & complexity is the threat of an upcoming commit
 because of the last commitfest :(
 
 I honestly don't know how to go about this...

This is very accurate and the big challenge of large, invasive patches. 
You almost need to hit it perfect the first time to get it committed in
less than a year.

-- 
  Bruce Momjian  br...@momjian.us    http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +




Re: [HACKERS] logical changeset generation v4

2013-01-23 Thread Andres Freund
On 2013-01-19 23:42:02 -0500, Steve Singer wrote:
 5) Currently it's only allowed to access catalog tables, it's fairly
 trivial to extend this to additional tables if you can accept some
 (noticeable but not too big) overhead for modifications on those tables.
 
 I was thinking of making that an option for tables, that would be useful
 for replication solutions configuration tables.
 
 I think this will make the life of anyone developing a new replication
 system easier.  Slony has a lot of infrastructure for allowing slonik
 scripts to wait for configuration changes to propagate everywhere before
 making other configuration changes because you can get race conditions.  If
 I were designing a new replication system and I had this feature then I
 would try to use it to come up with a simpler model of propagating
 configuration changes.

I pushed support for this, turned out to be a rather moderate patch (after a
cleanup patch that was required anyway):

 src/backend/access/common/reloptions.c | 10 ++
 src/backend/utils/cache/relcache.c |  9 -
 src/include/utils/rel.h|  9 +
 3 files changed, 27 insertions(+), 1 deletion(-)

With the (attached for convenience) patch applied you can do
# ALTER TABLE replication_metadata SET (treat_as_catalog_table = true);

to enable this.
What I wonder about is:
* does anybody have a better name for the reloption?
* Currently this can be set mid-transaction; timetravel access will only be
  possible in the next transaction, but it doesn't error out when set
  mid-transaction. I personally find that acceptable, other opinions?
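The decision the patch's RelationIsDoingTimetravelInternal makes can be paraphrased as a tiny model: relations with OIDs below FirstNormalObjectId (16384 in PostgreSQL, i.e. assigned at initdb) are system-provided and always get timetravel logging, while ordinary tables only participate if the reloption opted them in. A simplified sketch (the real check also requires wal_level >= logical):

```python
FIRST_NORMAL_OBJECT_ID = 16384  # PostgreSQL assigns OIDs below this at initdb

def relation_is_doing_timetravel(relid, treat_as_catalog_table):
    # System-provided relations always get the extra WAL logging needed
    # for timetravel access; comparing OIDs (rather than checking the
    # namespace) also classifies their toast tables correctly.
    if relid < FIRST_NORMAL_OBJECT_ID:
        return True
    # Ordinary tables participate only if the reloption opted them in.
    return treat_as_catalog_table
```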

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
From b535ba12fad667725247281c43be2ef81f7e40d7 Mon Sep 17 00:00:00 2001
From: Andres Freund and...@anarazel.de
Date: Wed, 23 Jan 2013 13:02:34 +0100
Subject: [PATCH] wal_decoding: mergeme: Support declaring normal tables as
 timetravelable

This is useful to be able to access tables used for replication metadata inside
an output plugin.

The storage option 'treat_as_catalog_table' is used for that purpose, so it can
be enabled for a table with
ALTER TABLE replication_metadata SET (treat_as_catalog_table = true);

It is currently possible to change that option mid-transaction although
timetravel access will only be possible in the next transaction.
---
 src/backend/access/common/reloptions.c |   10 ++
 src/backend/utils/cache/relcache.c |9 -
 src/include/utils/rel.h|9 +
 3 files changed, 27 insertions(+), 1 deletion(-)

diff --git a/src/backend/access/common/reloptions.c b/src/backend/access/common/reloptions.c
index 456d746..f2d3c8b 100644
--- a/src/backend/access/common/reloptions.c
+++ b/src/backend/access/common/reloptions.c
@@ -62,6 +62,14 @@ static relopt_bool boolRelOpts[] =
 	},
 	{
 		{
+			"treat_as_catalog_table",
+			"Treat table as a catalog table for the purpose of logical replication",
+			RELOPT_KIND_HEAP
+		},
+		false
+	},
+	{
+		{
 			"fastupdate",
 			"Enables \"fast update\" feature for this GIN index",
 			RELOPT_KIND_GIN
@@ -1151,6 +1159,8 @@ default_reloptions(Datum reloptions, bool validate, relopt_kind kind)
 		offsetof(StdRdOptions, autovacuum) +offsetof(AutoVacOpts, analyze_scale_factor)},
 		{"security_barrier", RELOPT_TYPE_BOOL,
 		offsetof(StdRdOptions, security_barrier)},
+		{"treat_as_catalog_table", RELOPT_TYPE_BOOL,
+		 offsetof(StdRdOptions, treat_as_catalog_table)},
 	};
 
 	options = parseRelOptions(reloptions, validate, kind, numoptions);
diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c
index 369a4d1..cc42ff4 100644
--- a/src/backend/utils/cache/relcache.c
+++ b/src/backend/utils/cache/relcache.c
@@ -4718,12 +4718,19 @@ RelationIsDoingTimetravelInternal(Relation relation)
 	Assert(wal_level >= WAL_LEVEL_LOGICAL);
 
 	/*
-	 * XXX: Doing this test instead of using IsSystemNamespace has the
+	 * XXX: Doing this test instead of using IsSystemNamespace has the frak
 	 * advantage of classifying toast tables correctly.
 	 */
 	if (RelationGetRelid(relation) < FirstNormalObjectId)
 		return true;
 
+	/*
+	 * also log relevant data if we want the table to behave as a catalog
+	 * table, although its not a system provided one.
+	 */
+	if (RelationIsTreatedAsCatalogTable(relation))
+		return true;
+
 	return false;
 }
 
diff --git a/src/include/utils/rel.h b/src/include/utils/rel.h
index e07ef3f..a026612 100644
--- a/src/include/utils/rel.h
+++ b/src/include/utils/rel.h
@@ -219,6 +219,7 @@ typedef struct StdRdOptions
 	int			fillfactor;		/* page fill factor in percent (0..100) */
 	AutoVacOpts autovacuum;		/* autovacuum-related options */
 	bool		security_barrier;		/* for views */
+	bool		treat_as_catalog_table; /* treat as timetravelable table */
 } StdRdOptions;
 
 #define HEAP_MIN_FILLFACTOR			10
@@ -255,6 +256,14 @@ typedef struct StdRdOptions
 	 

Re: [HACKERS] logical changeset generation v4

2013-01-23 Thread Robert Haas
On Wed, Jan 23, 2013 at 7:14 AM, Andres Freund and...@2ndquadrant.com wrote:
 With the (attached for convenience) patch applied you can do
 # ALTER TABLE replication_metadata SET (treat_as_catalog_table = true);

 to enable this.
 What I wonder about is:
 * does anybody have a better name for the reloption?

IMHO, it should somehow involve the words logical and replication.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] logical changeset generation v4

2013-01-23 Thread Andres Freund
On 2013-01-23 10:18:50 -0500, Robert Haas wrote:
 On Wed, Jan 23, 2013 at 7:14 AM, Andres Freund and...@2ndquadrant.com wrote:
  With the (attached for convenience) patch applied you can do
  # ALTER TABLE replication_metadata SET (treat_as_catalog_table = true);
 
  to enable this.
  What I wonder about is:
  * does anybody have a better name for the reloption?
 
 IMHO, it should somehow involve the words logical and replication.

Not a bad point. In the back of my mind I was thinking of reusing it to
do error checking when accessing the heap via index methods as a way of
making sure index support writers are aware of the complexities of doing
so (c.f. ALTER TYPE .. ADD VALUE only being usable outside
transactions).
But that's probably over the top.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] logical changeset generation v4 - Heikki's thoughts about the patch state

2013-01-23 Thread Andres Freund
Hi,

I decided to reply on the patches thread to be able to find this later.

On 2013-01-23 22:48:50 +0200, Heikki Linnakangas wrote:
 logical changeset generation v4
 This is a boatload of infrastructure for supporting logical replication, yet
 we have no code actually implementing logical replication that would go with
 this. The premise of logical replication over trigger-based was that it'd be
 faster, yet we cannot assess that without a working implementation. I don't
 think this can be committed in this state.

It's a fair point that this is a huge amount of code without a user
included in-core.
But the reason it got no user included is because several people
explicitly didn't want a user in-core for now but said the first part of
this would be to implement the changeset generation as a separate
piece. Didn't you actually prefer not to have any users of this in-core
yourself?

Also, while the apply side surely isn't benchmarkable without one being
submitted, the changeset generation can very well be benchmarked.

A very, very ad hoc benchmark:
 -c max_wal_senders=10
 -c max_logical_slots=10 --disabled for anything but logical
 -c wal_level=logical --hot_standby for anything but logical
 -c checkpoint_segments=100
 -c log_checkpoints=on
 -c shared_buffers=512MB
 -c autovacuum=on
 -c log_min_messages=notice
 -c log_line_prefix='[%p %t] '
 -c wal_keep_segments=100
 -c fsync=off
 -c synchronous_commit=off

pgbench -p 5440 -h /tmp -n -M prepared -c 16 -j 16 -T 30 

pgbench upstream:
tps: 22275.941409
space overhead: 0%
pgbench logical-submitted
tps: 16274.603046
space overhead: 2.1%
pgbench logical-HEAD (will submit updated version tomorrow or so):
tps: 20853.341551
space overhead: 2.3%
pgbench single plpgsql trigger (INSERT INTO log(data) VALUES(NEW::text))
tps: 14101.349535
space overhead: 369%

Note that in the single trigger case nobody consumed the queue while the
logical version streamed the changes out and stored them to disk.

Adding a default NOW() or similar to the tables immediately makes
logical decoding faster by a factor of about 3 in comparison to the
above trivial trigger.

The only reason the submitted version of logical decoding is
comparatively slow is that its xmin update policy is braindamaged,
working on that right now.
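The xmin update policy matters because every decoding slot pins an xmin, and the global horizon that limits catalog vacuum is the minimum over all running transactions and slot xmins; a policy that advances a slot's xmin rarely keeps the horizon far in the past. A toy horizon computation (illustrative only, not the actual implementation):

```python
def global_xmin_horizon(running_xids, slot_xmins):
    # Vacuum may only remove row versions that are invisible to
    # everything at or after this horizon, so one lagging slot xmin
    # blocks cleanup of old (catalog) row versions globally.
    candidates = list(running_xids) + list(slot_xmins)
    return min(candidates) if candidates else None

# An eager policy advances a slot's xmin as soon as its consumer confirms
# progress; a lazy one leaves old row versions pinned much longer.
```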

Greetings,

Andres

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] logical changeset generation v4 - Heikki's thoughts about the patch state

2013-01-23 Thread Robert Haas
On Wed, Jan 23, 2013 at 5:30 PM, Andres Freund and...@2ndquadrant.com wrote:
 pgbench upstream:
 tps: 22275.941409
 space overhead: 0%
 pgbench logical-submitted
 tps: 16274.603046
 space overhead: 2.1%
 pgbench logical-HEAD (will submit updated version tomorrow or so):
 tps: 20853.341551
 space overhead: 2.3%
 pgbench single plpgsql trigger (INSERT INTO log(data) VALUES(NEW::text))
 tps: 14101.349535
 space overhead: 369%

 Note that in the single trigger case nobody consumed the queue while the
 logical version streamed the changes out and stored them to disk.

 Adding a default NOW() or similar to the tables immediately makes
 logical decoding faster by a factor of about 3 in comparison to the
 above trivial trigger.

 The only reason the submitted version of logical decoding is
 comparatively slow is that its xmin update policy is braindamaged,
 working on that right now.

I agree.  The thing that scares me about the logical replication stuff
is not that it might be slow (and if your numbers are to be believed,
it isn't), but that I suspect it's riddled with bugs and possibly some
questionable design decisions.  If we commit it and release it, then
we're going to be stuck maintaining it for a very, very long time.  If
it turns out to have serious bugs that can't be fixed without a new
major release, it's going to be a serious black eye for the project.

Of course, I have no evidence that that will happen.  But it is a
really big piece of code, and therefore unless you are superman, it's
probably got a really large number of bugs.  The scary thing is that
it is not as if we can say, well, this is a big hunk of code, but it
doesn't really touch the core of the system, so if it's broken, it'll
be broken itself, but it won't break anything else.  Rather, this code
is deeply in bed with WAL, with MVCC, and with the on-disk format of
tuples, and makes fundamental changes to the first two of those.  You
agreed with Tom that 9.2 is the buggiest release in recent memory, but
I think logical replication could easily be an order of magnitude
worse.

I also have serious concerns about checksums and foreign key locks.
Any single one of those three patches could really inflict
unprecedented damage on our community's reputation for stability and
reliability if they turn out to be seriously buggy, and unfortunately
I don't consider that an unlikely outcome.  I don't know what to do
about it, either.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] logical changeset generation v4 - Heikki's thoughts about the patch state

2013-01-23 Thread Joshua D. Drake


On 01/23/2013 05:17 PM, Robert Haas wrote:


Of course, I have no evidence that that will happen.  But it is a
really big piece of code, and therefore unless you are superman, it's
probably got a really large number of bugs.  The scary thing is that
it is not as if we can say, well, this is a big hunk of code, but it
doesn't really touch the core of the system, so if it's broken, it'll
be broken itself, but it won't break anything else.  Rather, this code
is deeply in bed with WAL, with MVCC, and with the on-disk format of
tuples, and makes fundamental changes to the first two of those.  You
agreed with Tom that 9.2 is the buggiest release in recent memory, but
I think logical replication could easily be an order of magnitude
worse.


Command Prompt worked for YEARS to get logical replication right and we 
never got it to the point where I would have been happy submitting it to 
-core.


It behooves .Org to be extremely conservative about this feature. 
Granted, it is a feature we should have had years ago but still. It is 
not a simple thing, it is not an easy thing. It is complicated and 
complex to get correct.




JD




--
Command Prompt, Inc. - http://www.commandprompt.com/
PostgreSQL Support, Training, Professional Services and Development
High Availability, Oracle Conversion, Postgres-XC
@cmdpromptinc - 509-416-6579




Re: [HACKERS] logical changeset generation v4

2013-01-22 Thread Andres Freund
Hi,

I pushed a new rebased version (the xlogreader commit made it annoying
to merge).

The main improvements are
* much more coherent code internally for initializing logical rep
* explicit control over slots
* options for logical replication

On 2013-01-19 23:42:02 -0500, Steve Singer wrote:
 On 13-01-14 08:38 PM, Andres Freund wrote:
 2) Currently the logical replication infrastructure assigns a 'slot-id'
 when a new replica is setup. That slot id isn't really nice
 (e.g. id-321578-3). It also requires that [18] keeps state in a global
 variable to make writing regression tests easy.
 
 I think it would be better to make the user specify those replication
 slot ids, but I am not really sure about it.

 Shortly after trying out the latest version I hit the following scenario
 1. I started pg_receivellog but mistyped the name of my plugin
 2. It looped and used up all of my logical replication slots

 I killed pg_receivellog and restarted it with the correct plugin name but it
 won't do anything because I have no free slots.  I can't free the slots with
 -F because I have no clue what the names of the slots are. I can figure
 the names out by looking in pg_llog, but my replication program can't do
 that, so it won't be able to clean up from a failed attempt.

 I agree with you that we should make the user program specify a slot, we
 eventually might want to provide a view that shows the currently allocated
 slots. For a logical based slony I would just generate the slot name based
 on the remote node id.  If walsender generates the slot name then I would
 need to store a mapping between slot names and slons so when a slon
 restarted it would know which slot to resume using.   I'd have to use a
 table in the slony schema on the remote database for this.  There would
 always be a risk of losing track of a slot id if the slon crashed after
 getting the slot number but before committing the mapping on the remote
 database.

This is changed now, slotnames need to be provided and there also is a
pg_stat_logical_replication view (thanks Abhijit!).
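Client-chosen slot names make the failure scenario above recoverable: the consumer derives the name deterministically (e.g. from the remote node id, as Steve suggested), so startup can be idempotent and there is no server-assigned id to lose track of between allocating a slot and recording it. A hypothetical sketch (the naming scheme and helper names are invented for illustration):

```python
def slot_name_for_node(remote_node_id):
    # Deterministic, client-chosen name: nothing to lose track of if the
    # consumer crashes between allocating a slot and recording its name.
    return "slony_node_%d" % remote_node_id

def resume_or_create(existing_slots, remote_node_id):
    """Idempotent startup: reuse the node's slot if it exists, else create it."""
    name = slot_name_for_node(remote_node_id)
    if name in existing_slots:
        return name, "resumed"
    existing_slots.add(name)
    return name, "created"
```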

 3) Currently no options can be passed to an output plugin. I am thinking
 about making INIT_LOGICAL_REPLICATION 'plugin' accept the now widely
 used ('option' ['value'], ...) syntax and pass that to the output
 plugin's initialization function.

 I think we discussed this last CF, I like this idea.

Added to the extension and walsender interface. It's used in the few
tests we have to specify that xids should not be included in the tests
for reproducibility, so it's even tested ;)

I haven't added code for setting up options via pg_receivellog yet.

 5) Currently it's only allowed to access catalog tables, it's fairly
 trivial to extend this to additional tables if you can accept some
 (noticeable but not too big) overhead for modifications on those tables.
 
 I was thinking of making that an option for tables, that would be useful
 for replication solutions configuration tables.

 I think this will make the life of anyone developing a new replication
 system easier.  Slony has a lot of infrastructure for allowing slonik
 scripts to wait for configuration changes to propagate everywhere before
 making other configuration changes because you can get race conditions.  If
 I were designing a new replication system and I had this feature then I
 would try to use it to come up with a simpler model of propagating
 configuration changes.

Working on it now.

Greetings,

Andres Freund

--
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] logical changeset generation v4

2013-01-20 Thread Robert Haas
On Fri, Jan 18, 2013 at 12:32 PM, Andres Freund and...@2ndquadrant.com wrote:
 Makes sense?

Yes.  The catalog timetravel stuff still gives me heartburn.  The idea
of treating system catalogs in a special way has never sat well with
me and still doesn't - not that I am sure what I'd like better.  The
complexity of the whole system is also somewhat daunting.

But my question with relation to this specific patch was mostly
whether setting the table OID everywhere was worth worrying about from
a performance standpoint, or whether any of the other adjustments this
patch makes could have negative consequences there, since the
Satisfies functions can get very hot on some workloads.  It seems like
the consensus is no, that's not worth worrying about, at least as
far as the table OIDs are concerned.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] logical changeset generation v4

2013-01-20 Thread Andres Freund
On 2013-01-20 21:45:11 -0500, Robert Haas wrote:
 On Fri, Jan 18, 2013 at 12:32 PM, Andres Freund and...@2ndquadrant.com 
 wrote:
  Makes sense?
 
 Yes.  The catalog timetravel stuff still gives me heartburn.  The idea
 of treating system catalogs in a special way has never sat well with
 me and still doesn't - not that I am sure what I'd like better.  The
 complexity of the whole system is also somewhat daunting.

Understandable :(

Although it seems to me most parts of it have already existed someplace
else in the pg code, and the actual timetravel code is relatively small.

 But my question with relation to this specific patch was mostly
 whether setting the table OID everywhere was worth worrying about from
 a performance standpoint, or whether any of the other adjustments this
 patch makes could have negative consequences there, since the
 Satisfies functions can get very hot on some workloads.  It seems like
 the consensus is no, that's not worth worrying about, at least as
 far as the table OIDs are concerned.

I agree, I don't really see any such potential here; the effectively
changed amount of code is very minor since the interface mostly stayed
the same due to the HeapTupleSatisfiesVisibility wrapper.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] logical changeset generation v4

2013-01-19 Thread Steve Singer

On 13-01-14 08:38 PM, Andres Freund wrote:

Hi everyone,

Here is the newest version of logical changeset generation.



2) Currently the logical replication infrastructure assigns a 'slot-id'
when a new replica is setup. That slot id isn't really nice
(e.g. id-321578-3). It also requires that [18] keeps state in a global
variable to make writing regression tests easy.

I think it would be better to make the user specify those replication
slot ids, but I am not really sure about it.


Shortly after trying out the latest version I hit the following scenario
1. I started pg_receivellog but mistyped the name of my plugin
2. It looped and used up all of my logical replication slots

I killed pg_receivellog and restarted it with the correct plugin name 
but it won't do anything because I have no free slots.  I can't free the 
slots with -F because I have no clue what the names of the slots are.
I can figure the names out by looking in pg_llog but if my replication 
program can't do that so it won't be able to clean up from a failed attempt.


I agree with you that we should make the user program specify a slot, we 
eventually might want to provide a view that shows the currently 
allocated slots. For a logical based slony I would just generate the 
slot name based on the remote node id.  If walsender generates the slot 
name then I would need to store a mapping between slot names and slons 
so when a slon restarted it would know which slot to resume using.   I'd 
have to use a table in the slony schema on the remote database for 
this.  There would always be a risk of losing track of a slot id if the 
slon crashed after getting the slot number but before committing the 
mapping on the remote database.






3) Currently no options can be passed to an output plugin. I am thinking
about making INIT_LOGICAL_REPLICATION 'plugin' accept the now widely
used ('option' ['value'], ...) syntax and pass that to the output
plugin's initialization function.


I think we discussed this last CF, I like this idea.


4) Does anybody object to:
-- allocate a permanent replication slot
INIT_LOGICAL_REPLICATION 'plugin' 'slotname' (options);

-- stream data
START_LOGICAL_REPLICATION 'slotname' 'recptr';

-- deallocate a permanent replication slot
FREE_LOGICAL_REPLICATION 'slotname';


+1



5) Currently it's only allowed to access catalog tables, it's fairly
trivial to extend this to additional tables if you can accept some
(noticeable but not too big) overhead for modifications on those tables.

I was thinking of making that an option for tables, that would be useful
for replication solutions configuration tables.


I think this will make the life of anyone developing a new replication 
system easier.  Slony has a lot of infrastructure for allowing slonik 
scripts to wait for configuration changes to propagate everywhere before 
making other configuration changes because you can get race conditions.  
If I were designing a new replication system and I had this feature then 
I would try to use it to come up with a simpler model of propagating 
configuration changes.




Andres Freund




Steve





Re: [HACKERS] logical changeset generation v4

2013-01-18 Thread Alvaro Herrera
Andres Freund wrote:

 [09] Adjust all *Satisfies routines to take a HeapTuple instead of a 
 HeapTupleHeader
 
 For timetravel access to the catalog we need to be able to look up (cmin,
 cmax) pairs of catalog rows when we're 'inside' that TX. This patch just
 adapts the signature of the *Satisfies routines to expect a HeapTuple
 instead of a HeapTupleHeader. The amount of changes for that is fairly
 low as the HeapTupleSatisfiesVisibility macro already expected the
 former.
 
 It also makes sure the HeapTuple fields are setup in the few places that
 didn't already do so.

I had a look at this part.  Running the regression tests unveiled a case
where the tableOid wasn't being set (and thus caused an assertion to
fail), so I added that.  I also noticed that the additions to
pruneheap.c are sometimes filling a tuple before it's strictly
necessary, leading to wasted work.  Moved those too.

Looks good to me as attached.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
*** a/contrib/pgrowlocks/pgrowlocks.c
--- b/contrib/pgrowlocks/pgrowlocks.c
***
*** 120,126  pgrowlocks(PG_FUNCTION_ARGS)
  		/* must hold a buffer lock to call HeapTupleSatisfiesUpdate */
  		LockBuffer(scan->rs_cbuf, BUFFER_LOCK_SHARE);
  
! 		if (HeapTupleSatisfiesUpdate(tuple->t_data,
  	 GetCurrentCommandId(false),
  	 scan->rs_cbuf) == HeapTupleBeingUpdated)
  		{
--- 120,126 
  		/* must hold a buffer lock to call HeapTupleSatisfiesUpdate */
  		LockBuffer(scan->rs_cbuf, BUFFER_LOCK_SHARE);
  
! 		if (HeapTupleSatisfiesUpdate(tuple,
  	 GetCurrentCommandId(false),
  	 scan->rs_cbuf) == HeapTupleBeingUpdated)
  		{
*** a/src/backend/access/heap/heapam.c
--- b/src/backend/access/heap/heapam.c
***
*** 291,296  heapgetpage(HeapScanDesc scan, BlockNumber page)
--- 291,297 
  
  			loctup.t_data = (HeapTupleHeader) PageGetItem((Page) dp, lpp);
  			loctup.t_len = ItemIdGetLength(lpp);
+ 			loctup.t_tableOid = RelationGetRelid(scan->rs_rd);
  			ItemPointerSet(&(loctup.t_self), page, lineoff);
  
  			if (all_visible)
***
*** 1603,1609  heap_hot_search_buffer(ItemPointer tid, Relation relation, Buffer buffer,
  
  		heapTuple->t_data = (HeapTupleHeader) PageGetItem(dp, lp);
  		heapTuple->t_len = ItemIdGetLength(lp);
! 		heapTuple->t_tableOid = relation->rd_id;
  		heapTuple->t_self = *tid;
  
  		/*
--- 1604,1610 
  
  		heapTuple->t_data = (HeapTupleHeader) PageGetItem(dp, lp);
  		heapTuple->t_len = ItemIdGetLength(lp);
! 		heapTuple->t_tableOid = RelationGetRelid(relation);
  		heapTuple->t_self = *tid;
  
  		/*
***
*** 1651,1657  heap_hot_search_buffer(ItemPointer tid, Relation relation, Buffer buffer,
  		 * transactions.
  		 */
  		if (all_dead && *all_dead &&
! 			!HeapTupleIsSurelyDead(heapTuple->t_data, RecentGlobalXmin))
  			*all_dead = false;
  
  		/*
--- 1652,1658 
  		 * transactions.
  		 */
  		if (all_dead && *all_dead &&
! 			!HeapTupleIsSurelyDead(heapTuple, RecentGlobalXmin))
  			*all_dead = false;
  
  		/*
***
*** 1781,1786  heap_get_latest_tid(Relation relation,
--- 1782,1788 
  		tp.t_self = ctid;
  		tp.t_data = (HeapTupleHeader) PageGetItem(page, lp);
  		tp.t_len = ItemIdGetLength(lp);
+ 		tp.t_tableOid = RelationGetRelid(relation);
  
  		/*
  		 * After following a t_ctid link, we might arrive at an unrelated
***
*** 2447,2458  heap_delete(Relation relation, ItemPointer tid,
  	lp = PageGetItemId(page, ItemPointerGetOffsetNumber(tid));
  	Assert(ItemIdIsNormal(lp));
  
  	tp.t_data = (HeapTupleHeader) PageGetItem(page, lp);
  	tp.t_len = ItemIdGetLength(lp);
  	tp.t_self = *tid;
  
  l1:
! 	result = HeapTupleSatisfiesUpdate(tp.t_data, cid, buffer);
  
  	if (result == HeapTupleInvisible)
  	{
--- 2449,2461 
  	lp = PageGetItemId(page, ItemPointerGetOffsetNumber(tid));
  	Assert(ItemIdIsNormal(lp));
  
+ 	tp.t_tableOid = RelationGetRelid(relation);
  	tp.t_data = (HeapTupleHeader) PageGetItem(page, lp);
  	tp.t_len = ItemIdGetLength(lp);
  	tp.t_self = *tid;
  
  l1:
! 	result = HeapTupleSatisfiesUpdate(&tp, cid, buffer);
  
  	if (result == HeapTupleInvisible)
  	{
***
*** 2817,2822  heap_update(Relation relation, ItemPointer otid, HeapTuple newtup,
--- 2820,2826 
  	lp = PageGetItemId(page, ItemPointerGetOffsetNumber(otid));
  	Assert(ItemIdIsNormal(lp));
  
+ 	oldtup.t_tableOid = RelationGetRelid(relation);
  	oldtup.t_data = (HeapTupleHeader) PageGetItem(page, lp);
  	oldtup.t_len = ItemIdGetLength(lp);
  	oldtup.t_self = *otid;
***
*** 2829,2835  heap_update(Relation relation, ItemPointer otid, HeapTuple newtup,
  	 */
  
  l2:
! 	result = HeapTupleSatisfiesUpdate(oldtup.t_data, cid, buffer);
  
  	if (result == HeapTupleInvisible)
  	{
--- 2833,2839 
  	 */
  
  l2:
! 	result = HeapTupleSatisfiesUpdate(&oldtup, cid, buffer);
  
  	if (result == HeapTupleInvisible)
  	{

Re: [HACKERS] logical changeset generation v4

2013-01-18 Thread Alvaro Herrera
Alvaro Herrera wrote:

 I had a look at this part.  Running the regression tests unveiled a case
 where the tableOid wasn't being set (and thus caused an assertion to
 fail), so I added that.  I also noticed that the additions to
 pruneheap.c are sometimes filling a tuple before it's strictly
 necessary, leading to wasted work.  Moved those too.

Actually I missed that downthread there are some fixes to this part; I
had fixed one of these independently, but there's one I missed.  Added
that one too now (not attaching a new version).

(Also, it seems pointless to commit this unless we know for sure that
the downstream change that requires it is good; so I'm holding commit
until we've discussed the other stuff more thoroughly.  Per discussion
with Andres.)

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] logical changeset generation v4

2013-01-18 Thread Robert Haas
On Fri, Jan 18, 2013 at 11:33 AM, Alvaro Herrera
alvhe...@2ndquadrant.com wrote:
 Andres Freund wrote:

 [09] Adjust all *Satisfies routines to take a HeapTuple instead of a 
 HeapTupleHeader

 For timetravel access to the catalog we need to be able to look up (cmin,
 cmax) pairs of catalog rows when we're 'inside' that TX. This patch just
 adapts the signature of the *Satisfies routines to expect a HeapTuple
 instead of a HeapTupleHeader. The amount of changes for that is fairly
 low as the HeapTupleSatisfiesVisibility macro already expected the
 former.

 It also makes sure the HeapTuple fields are setup in the few places that
 didn't already do so.

 I had a look at this part.  Running the regression tests unveiled a case
 where the tableOid wasn't being set (and thus caused an assertion to
 fail), so I added that.  I also noticed that the additions to
 pruneheap.c are sometimes filling a tuple before it's strictly
 necessary, leading to wasted work.  Moved those too.

 Looks good to me as attached.

I took a quick look at this and am just curious why we're adding the
requirement that t_tableOid has to be initialized?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] logical changeset generation v4

2013-01-18 Thread Andres Freund
On 2013-01-18 11:48:43 -0500, Robert Haas wrote:
 On Fri, Jan 18, 2013 at 11:33 AM, Alvaro Herrera
 alvhe...@2ndquadrant.com wrote:
  Andres Freund wrote:
 
  [09] Adjust all *Satisfies routines to take a HeapTuple instead of a 
  HeapTupleHeader
 
  For timetravel access to the catalog we need to be able to look up (cmin,
  cmax) pairs of catalog rows when we're 'inside' that TX. This patch just
  adapts the signature of the *Satisfies routines to expect a HeapTuple
  instead of a HeapTupleHeader. The amount of changes for that is fairly
  low as the HeapTupleSatisfiesVisibility macro already expected the
  former.
 
  It also makes sure the HeapTuple fields are setup in the few places that
  didn't already do so.
 
  I had a look at this part.  Running the regression tests unveiled a case
  where the tableOid wasn't being set (and thus caused an assertion to
  fail), so I added that.  I also noticed that the additions to
  pruneheap.c are sometimes filling a tuple before it's strictly
  necessary, leading to wasted work.  Moved those too.
 
  Looks good to me as attached.

 I took a quick look at this and am just curious why we're adding the
 requirement that t_tableOid has to be initialized?

It's a stepping stone for catalog timetravel. I separated it into a different
patch because it seems to make the real patch easier to review, without having
to deal with all those unrelated hunks.

The reason why we need t_tableOid and a valid ItemPointer is that during
catalog timetravel (so we can decode the heaptuples in WAL) we need to
see tuples in the catalog that have been changed in the transaction we
travelled to. That means we need to look up cmin/cmax values, which
aren't stored separately anymore.

My first approach was to build support for logging allocated combocids
(only for catalog tables) and use the existing combocid infrastructure
to look them up.
Turns out that's not a correct solution; consider this:
* T100: INSERT (xmin: 100, xmax: Invalid, (cmin|cmax): 3)
* T101: UPDATE (xmin: 100, xmax: 101, (cmin|cmax): 10)

If you now travel to T100 and you want to decide whether that tuple is
visible when in CommandId = 5 you have the problem that the original
cmin value has been overwritten by the cmax from T101. Note that in this
scenario no ComboCids have been generated!
The problematic part is that the information about what happened is
only available in T101.
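To make the overwrite concrete, here is a hypothetical, heavily simplified model; FakeTupleHeader and the helper functions are invented for this sketch and are not PostgreSQL's actual structs:

```c
#include <stdint.h>

typedef uint32_t TransactionId;
typedef uint32_t CommandId;

/*
 * Simplified tuple header: like the real one, it has only ONE field
 * that stores either cmin or cmax, never both at the same time.
 */
typedef struct
{
	TransactionId xmin;
	TransactionId xmax;
	CommandId	t_cid;		/* holds cmin until an updater overwrites it */
} FakeTupleHeader;

/* T100 inserts the tuple: t_cid records the inserting command's cmin. */
static void
fake_insert(FakeTupleHeader *tup, TransactionId xid, CommandId cmin)
{
	tup->xmin = xid;
	tup->xmax = 0;
	tup->t_cid = cmin;
}

/*
 * T101 updates it: its cmax overwrites the shared field.  No ComboCid
 * is generated because inserter and updater are different transactions.
 */
static void
fake_update(FakeTupleHeader *tup, TransactionId xid, CommandId cmax)
{
	tup->xmax = xid;
	tup->t_cid = cmax;
}
```

After fake_update() the original cmin of 3 is unrecoverable from the tuple itself, so a visibility check against CommandId 5 inside T100 can no longer be answered without logging the values elsewhere.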

I resorted to doing something similar to what the heap rewrite code
uses to track update chains. Every time a catalog tuple is
inserted/updated/deleted, (filenode, ctid, cmin, cmax) is WAL-logged (if
wal_level=logical), and while traveling to a transaction all those are
put into a hash table so they can be looked up if we need the
respective cmin/cmax values. As we do that for all modifications of
catalog tuples in that transaction, we only ever need that mapping when
inspecting that specific transaction.
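For illustration only, a minimal sketch of such a per-transaction mapping; the names are invented, and a flat array stands in for the real hash table, so this claims no resemblance to the patch's actual data structures:

```c
#include <stdint.h>

typedef uint32_t Oid;
typedef uint32_t CommandId;

/*
 * One entry per WAL-logged catalog modification: which tuple
 * (relfilenode + item pointer) mapped to its cmin/cmax values.
 */
typedef struct
{
	Oid			relfilenode;
	uint32_t	block;			/* item pointer, part 1 */
	uint16_t	offset;			/* item pointer, part 2 */
	CommandId	cmin;
	CommandId	cmax;
} CidMappingEntry;

#define MAX_ENTRIES 64			/* flat array instead of a real hash table */
static CidMappingEntry mapping[MAX_ENTRIES];
static int	n_entries = 0;

/* Remember one mapping while replaying the transaction we travelled to. */
static void
record_cid_mapping(Oid relfilenode, uint32_t block, uint16_t offset,
				   CommandId cmin, CommandId cmax)
{
	if (n_entries >= MAX_ENTRIES)
		return;					/* sketch only: a real hash table would grow */
	mapping[n_entries++] = (CidMappingEntry) {relfilenode, block, offset,
											  cmin, cmax};
}

/* Consulted from a *Satisfies routine; returns 1 and fills cmin/cmax. */
static int
lookup_cid_mapping(Oid relfilenode, uint32_t block, uint16_t offset,
				   CommandId *cmin, CommandId *cmax)
{
	for (int i = 0; i < n_entries; i++)
	{
		CidMappingEntry *e = &mapping[i];

		if (e->relfilenode == relfilenode &&
			e->block == block && e->offset == offset)
		{
			*cmin = e->cmin;
			*cmax = e->cmax;
			return 1;
		}
	}
	return 0;
}
```

The key is effectively (relfilenode, ctid), which is why the *Satisfies routines need the whole HeapTuple: t_self supplies the item pointer, and the relfilenode comes from the buffer.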

Seems to work very nicely; I have run quite a few tests with it and
know of no failure cases.


To be able to make that lookup we need to get the relfilenode & item
pointer of the tuple we're just looking up. That's why I changed the
signature to pass a HeapTuple instead of a HeapTupleHeader. We get the
relfilenode from the buffer that has been passed, *not* from the passed
table oid.
So a valid table oid isn't strictly required as long as the item
pointer is valid, but it has made debugging noticeably easier.

Makes sense?

Greetings,

Andres Freund

--
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] logical changeset generation v4

2013-01-18 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 I took a quick look at this and am just curious why we're adding the
 requirement that t_tableOid has to be initialized?

I assume he meant it had been left at a random value, which is surely
bad practice even if a specific usage doesn't fall over today.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] logical changeset generation v4

2013-01-15 Thread Andres Freund
On 2013-01-15 17:41:50 +1300, Mark Kirkwood wrote:
 On 15/01/13 17:37, Mark Kirkwood wrote:
 On 15/01/13 14:38, Andres Freund wrote:
 Hi everyone,
 
 Here is the newest version of logical changeset generation.
 
 
 
 
 
 I'm quite interested in this feature - so tried applying the 19 patches to
 the latest 9.3 checkout. Patch and compile are good.

Thanks! Any input welcome.

The git tree might make it easier to follow development ;)

 However portals seem busted:
 
 bench=# BEGIN;
 BEGIN
 bench=# DECLARE c1 CURSOR FOR SELECT * FROM pgbench_accounts;
 DECLARE CURSOR
 bench=# FETCH 2 FROM c1;
  aid | bid | abalance | filler
 -----+-----+----------+--------
    1 |   1 |        0 |
    2 |   1 |        0 |
 (2 rows)
 
 bench=# DELETE FROM pgbench_accounts WHERE CURRENT OF c1;
 The connection to the server was lost. Attempting reset: Failed.
 
 
 Sorry - forgot to add: assert and debug build, and it is an assertion
 failure that is being picked up:
 
TRAP: FailedAssertion(!(htup->t_tableOid != ((Oid) 0)), File: tqual.c,
 Line: 940)

I unfortunately don't see the error here; I guess it's related to how
the stack is reused. But I think I found the error: check the attached
patch, which I also pushed to the git repository.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
From 25bd9aeefb03ec39ff1d1cbbac4d2507d533f6d1 Mon Sep 17 00:00:00 2001
From: Andres Freund and...@anarazel.de
Date: Tue, 15 Jan 2013 11:50:33 +0100
Subject: [PATCH] wal_decoding: mergeme *Satisfies: Setup a correct
 tup->t_tableOid in heap_get_latest_tid

Code review found one other case where tableOid potentially didn't get set, in
nodeBitmapHeapscan. That's fixed as well.

Found independently by Mark Kirkwood and Abhijit Menon-Sen
---
 src/backend/access/heap/heapam.c  | 1 +
 src/backend/executor/nodeBitmapHeapscan.c | 1 +
 2 files changed, 2 insertions(+)

diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index 1ff58a4..3346c8a 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -1789,6 +1789,7 @@ heap_get_latest_tid(Relation relation,
 		tp.t_self = ctid;
 		tp.t_data = (HeapTupleHeader) PageGetItem(page, lp);
 		tp.t_len = ItemIdGetLength(lp);
+		tp.t_tableOid = RelationGetRelid(relation);
 
 		/*
 		 * After following a t_ctid link, we might arrive at an unrelated
diff --git a/src/backend/executor/nodeBitmapHeapscan.c b/src/backend/executor/nodeBitmapHeapscan.c
index c83f972..eda1394 100644
--- a/src/backend/executor/nodeBitmapHeapscan.c
+++ b/src/backend/executor/nodeBitmapHeapscan.c
@@ -258,6 +258,7 @@ BitmapHeapNext(BitmapHeapScanState *node)
 
 		scan->rs_ctup.t_data = (HeapTupleHeader) PageGetItem((Page) dp, lp);
 		scan->rs_ctup.t_len = ItemIdGetLength(lp);
+		scan->rs_ctup.t_tableOid = scan->rs_rd->rd_id;
 		ItemPointerSet(&scan->rs_ctup.t_self, tbmres->blockno, targoffset);
 
 		pgstat_count_heap_fetch(scan->rs_rd);
-- 
1.7.12.289.g0ce9864.dirty


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] logical changeset generation v4

2013-01-15 Thread Andres Freund
On 2013-01-15 01:00:00 -0300, Alvaro Herrera wrote:
 Andres Freund wrote:
 
 I've been giving a couple of these parts a look.  In particular
 
  [03] Split out xlog reading into its own module called xlogreader
 
 Cleaned this one up a bit last week.  I will polish it some more,
 publish for some final comments, and commit.

I have some smaller bugfixes in my current version that you probably
don't have yet (on grounds of being fixed this weekend)... So we need to
be a bit careful not to lose those.
  Second, I don't think the test_logical_replication functions should live
  in core as they shouldn't be used for a production replication scenario
 (causes long-running transactions, requires polling), but I have failed
  to find a neat way to include a contrib extension in the plain
  regression tests.
 
 I think this would work if you make a stamp file in the contrib
 module, similar to how doc/src/sgml uses those.

I tried that, the problem is not the building itself but getting it
installed into the temporary installation...
But anyway, testing decoding requires a different wal_level so I was
hesitant to continue on grounds of that alone.
Should we just up that? It's probably problematic for tests around some
WAL optimizations and such?

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] logical changeset generation v4

2013-01-15 Thread Alvaro Herrera
Andres Freund wrote:
 On 2013-01-15 01:00:00 -0300, Alvaro Herrera wrote:
  Andres Freund wrote:
  
  I've been giving a couple of these parts a look.  In particular
  
   [03] Split out xlog reading into its own module called xlogreader
  
  Cleaned this one up a bit last week.  I will polish it some more,
  publish for some final comments, and commit.
 
 I have some smaller bugfixes in my current version that you probably
 don't have yet (on grounds of being fixed this weekend)... So we need to
 be a bit careful not to lose those.

Sure.  Do you have them as individual commits?  I'm assuming you rebased
the tree.  Maybe in your reflog?  IIRC I also have at least one minor
bug fix.

   Second, I don't think the test_logical_replication functions should live
   in core as they shouldn't be used for a production replication scenario
   (causes long-running transactions, requires polling), but I have failed
   to find a neat way to include a contrib extension in the plain
   regression tests.
  
  I think this would work if you make a stamp file in the contrib
  module, similar to how doc/src/sgml uses those.
 
 I tried that, the problem is not the building itself but getting it
 installed into the temporary installation...

Oh, hm.  Maybe the contrib module's make installcheck, then?

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] logical changeset generation v4

2013-01-15 Thread Andres Freund
On 2013-01-15 09:56:41 -0300, Alvaro Herrera wrote:
 Andres Freund wrote:
  On 2013-01-15 01:00:00 -0300, Alvaro Herrera wrote:
   Andres Freund wrote:
   
   I've been giving a couple of these parts a look.  In particular
   
[03] Split out xlog reading into its own module called xlogreader
   
   Cleaned this one up a bit last week.  I will polish it some more,
   publish for some final comments, and commit.
  
  I have some smaller bugfixes in my current version that you probably
  don't have yet (on grounds of being fixed this weekend)... So we need to
  be a bit careful not to lose those.
 
 Sure.  Do you have them as individual commits?  I'm assuming you rebased
 the tree.  Maybe in your reflog?  IIRC I also have at least one minor
 bug fix.

I can check, which commit did you base your modifications on?

Second, I don't think the test_logical_replication functions should live
in core as they shouldn't be used for a production replication scenario
(causes long-running transactions, requires polling), but I have failed
to find a neat way to include a contrib extension in the plain
regression tests.
   
   I think this would work if you make a stamp file in the contrib
   module, similar to how doc/src/sgml uses those.
  
  I tried that, the problem is not the building itself but getting it
  installed into the temporary installation...
 
 Oh, hm.  Maybe the contrib module's make installcheck, then?

That's what I do right now, but I really would prefer to have it checked
during normal make checks; installchecks aren't run all that commonly :(

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] logical changeset generation v4

2013-01-15 Thread Tom Lane
Andres Freund and...@2ndquadrant.com writes:
 On 2013-01-15 09:56:41 -0300, Alvaro Herrera wrote:
 Oh, hm.  Maybe the contrib module's make installcheck, then?

 That's what I do right now, but I really would prefer to have it checked
 during normal make checks; installchecks aren't run all that commonly :(

Sure they are, in every buildfarm cycle.  I don't see the problem.

(In the case of contrib, make installcheck is a whole lot faster than
make check, as well as being older.  So I don't really see why you
think it's less used.)

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] logical changeset generation v4

2013-01-15 Thread Andres Freund
On 2013-01-15 10:28:28 -0500, Tom Lane wrote:
 Andres Freund and...@2ndquadrant.com writes:
  On 2013-01-15 09:56:41 -0300, Alvaro Herrera wrote:
  Oh, hm.  Maybe the contrib module's make installcheck, then?
 
  That's what I do right now, but I really would prefer to have it checked
  during normal make checks; installchecks aren't run all that commonly :(
 
 Sure they are, in every buildfarm cycle.  I don't see the problem.
 
 (In the case of contrib, make installcheck is a whole lot faster than
 make check, as well as being older.  So I don't really see why you
 think it's less used.)

Oh. Because I was being dumb ;). And admittedly I have never run a
buildfarm animal so far.

But the other part of the problem is hiding in the unfortunately removed
part of the problem description - the tests require the non-default
options wal_level=logical and max_logical_slots=3+.
Is there a problem with making those the default in the buildfarm-created
config?

I guess there would need to be an alternative output file for wal_level
< logical? Afaics there is no way to make a test conditional?

I shortly thought something like
DO $$
BEGIN
IF current_setting('wal_level') != 'logical' THEN
   RAISE FATAL 'wal_level needs to be logical';
END IF;
END
$$;
could be used to avoid creating a huge second output file, but we can't
raise FATAL errors from plpgsql.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] logical changeset generation v4

2013-01-15 Thread Tom Lane
Andres Freund and...@2ndquadrant.com writes:
 But the other part of the problem is hiding in the unfortunately removed
 part of the problem description - the tests require the non-default
 options wal_level=logical and max_logical_slots=3+.

Oh.  Well, that's not going to work.

 Is there a problem with making those the default in the buildfarm-created
 config?

Even if we hacked the buildfarm script to do so, it'd be a nonstarter
because it would cause ordinary manual make installcheck to fail.

I think the only reasonable way to handle this would be to (1) make
make installcheck a no-op in this contrib module, and (2) make
make check work, being careful to start the test postmaster with
the necessary options.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] logical changeset generation v4

2013-01-15 Thread Andres Freund
On 2013-01-15 11:10:22 -0500, Tom Lane wrote:
 Andres Freund and...@2ndquadrant.com writes:
  But the other part of the problem is hiding in the unfortunately removed
  part of the problem description - the tests require the non-default
  options wal_level=logical and max_logical_slots=3+.
 
 Oh.  Well, that's not going to work.

An alternative would be to have max_logical_slots default to a low value
and make the amount of logged information a wal_level independent
GUC that can be changed on the fly.
ISTM that that would result in too complicated code to deal with other
backends not having the same notion of that value and such, but it's
possible.

  Is there a problem with making those the default in the buildfarm-created
  config?
 
 Even if we hacked the buildfarm script to do so, it'd be a nonstarter
 because it would cause ordinary manual make installcheck to fail.

I thought we could have a second expected file for that case. Not nice :(

 I think the only reasonable way to handle this would be to (1) make
 make installcheck a no-op in this contrib module, and (2) make
 make check work, being careful to start the test postmaster with
 the necessary options.

You're talking about adding a contrib-module-specific make check or
changing the normal make check's wal_level?

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] logical changeset generation v4

2013-01-15 Thread Tom Lane
Andres Freund and...@2ndquadrant.com writes:
 On 2013-01-15 11:10:22 -0500, Tom Lane wrote:
 I think the only reasonable way to handle this would be to (1) make
 make installcheck a no-op in this contrib module, and (2) make
 make check work, being careful to start the test postmaster with
 the necessary options.

 You're talking about adding a contrib-module-specific make check or
 changing the normal make check's wal_level?

This contrib module's make check would change the wal_level.  Global
change no good for any number of reasons, the most obvious being that
we want to be able to test other wal_levels too.

I'm not sure whether the make check infrastructure currently supports
passing arguments through to the test postmaster's command line, but it
shouldn't be terribly hard to add if not.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] logical changeset generation v4

2013-01-15 Thread Alvaro Herrera
Andres Freund wrote:
 On 2013-01-15 09:56:41 -0300, Alvaro Herrera wrote:
  Andres Freund wrote:
   On 2013-01-15 01:00:00 -0300, Alvaro Herrera wrote:
Andres Freund wrote:

I've been giving a couple of these parts a look.  In particular

 [03] Split out xlog reading into its own module called xlogreader

Cleaned this one up a bit last week.  I will polish it some more,
publish for some final comments, and commit.
   
   I have some smaller bugfixes in my current version that you probably
   don't have yet (on grounds of being fixed this weekend)... So we need to
   be a bit careful not to lose those.
  
  Sure.  Do you have them as individual commits?  I'm assuming you rebased
  the tree.  Maybe in your reflog?  IIRC I also have at least one minor
  bug fix.
 
 I can check, which commit did you base your modifications on?

Dunno.  It's probably easier to reverse-apply the version you submitted
to see what changed, and then forward-apply again.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] logical changeset generation v4

2013-01-15 Thread Andres Freund
On 2013-01-15 15:16:44 -0300, Alvaro Herrera wrote:
 Andres Freund wrote:
  On 2013-01-15 09:56:41 -0300, Alvaro Herrera wrote:
   Andres Freund wrote:
On 2013-01-15 01:00:00 -0300, Alvaro Herrera wrote:
 Andres Freund wrote:
 
 I've been giving a couple of these parts a look.  In particular
 
  [03] Split out xlog reading into its own module called xlogreader
 
 Cleaned this one up a bit last week.  I will polish it some more,
 publish for some final comments, and commit.

I have some smaller bugfixes in my current version that you probably
don't have yet (on grounds of being fixed this weekend)... So we need to
be a bit careful not to lose those.
   
   Sure.  Do you have them as individual commits?  I'm assuming you rebased
   the tree.  Maybe in your reflog?  IIRC I also have at least one minor
   bug fix.
  
  I can check, which commit did you base your modifications on?
 
 Dunno.  It's probably easier to reverse-apply the version you submitted
 to see what changed, and then forward-apply again.

There are at least the two attached patches...

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
From 5ca4b81f03bd7a4bf5101bd68811548023ac12fe Mon Sep 17 00:00:00 2001
From: Andres Freund and...@anarazel.de
Date: Mon, 14 Jan 2013 21:43:13 +0100
Subject: [PATCH] xlogreader: fix

---
 src/backend/access/transam/xlogreader.c | 11 ++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/src/backend/access/transam/xlogreader.c b/src/backend/access/transam/xlogreader.c
index 6a420e6..9439c05 100644
--- a/src/backend/access/transam/xlogreader.c
+++ b/src/backend/access/transam/xlogreader.c
@@ -350,7 +350,7 @@ XLogReadRecord(XLogReaderState *state, XLogRecPtr RecPtr, char **errormsg)
 
 			/* Wait for the next page to become available */
 			readOff = ReadPageInternal(state, targetPagePtr,
-	   Min(len, XLOG_BLCKSZ));
+	   Min(total_len - gotlen + SizeOfXLogShortPHD, XLOG_BLCKSZ));
 
  			if (readOff < 0)
  				goto err;
@@ -383,6 +383,11 @@ XLogReadRecord(XLogReaderState *state, XLogRecPtr RecPtr, char **errormsg)
 
 			/* Append the continuation from this page to the buffer */
 			pageHeaderSize = XLogPageHeaderSize(pageHeader);
+
+			if (readOff < pageHeaderSize)
+				readOff = ReadPageInternal(state, targetPagePtr,
+										   pageHeaderSize);
+
  			Assert(pageHeaderSize <= readOff);
 
  			contdata = (char *) state->readBuf + pageHeaderSize;
@@ -390,6 +395,10 @@ XLogReadRecord(XLogReaderState *state, XLogRecPtr RecPtr, char **errormsg)
  			if (pageHeader->xlp_rem_len < len)
  				len = pageHeader->xlp_rem_len;
 
+			if (readOff < (pageHeaderSize + len))
+				readOff = ReadPageInternal(state, targetPagePtr,
+										   pageHeaderSize + len);
+
 			memcpy(buffer, (char *) contdata, len);
 			buffer += len;
 			gotlen += len;
-- 
1.7.12.289.g0ce9864.dirty

From 995d723239df325b48412878fa818c94cb33f724 Mon Sep 17 00:00:00 2001
From: Andres Freund and...@anarazel.de
Date: Tue, 15 Jan 2013 00:58:49 +0100
Subject: [PATCH] xlogreader: use correct type

---
 src/backend/access/transam/xlogreader.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/backend/access/transam/xlogreader.c b/src/backend/access/transam/xlogreader.c
index 9439c05..f2b9355 100644
--- a/src/backend/access/transam/xlogreader.c
+++ b/src/backend/access/transam/xlogreader.c
@@ -927,7 +927,7 @@ XLogFindNextRecord(XLogReaderState *state, XLogRecPtr RecPtr)
uint32  pageHeaderSize;
XLogPageHeader header;
XLogRecord *record;
-   uint32 readLen;
+   int readLen;
char   *errormsg;
 
if (RecPtr == InvalidXLogRecPtr)
-- 
1.7.12.289.g0ce9864.dirty


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] logical changeset generation v4

2013-01-14 Thread Josh Berkus
Andreas,

Is there a git fork for logical replication somewhere?

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] logical changeset generation v4

2013-01-14 Thread anara...@anarazel.de


Josh Berkus j...@agliodbs.com schrieb:

Andreas,

Is there a git fork for logical replication somewhere?

Check the bottom of the email ;)

--- 
Please excuse brevity and formatting - I am writing this on my mobile phone.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] logical changeset generation v4

2013-01-14 Thread Abhijit Menon-Sen
At 2013-01-14 18:15:39 -0800, j...@agliodbs.com wrote:

 Is there a git fork for logical replication somewhere?

git://git.postgresql.org/git/users/andresfreund/postgres.git, branch
xlog-decoding-rebasing-cf4 (and xlogreader_v4).

-- Abhijit


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] logical changeset generation v4

2013-01-14 Thread Abhijit Menon-Sen
At 2013-01-15 02:38:45 +0100, and...@2ndquadrant.com wrote:

 2) Currently the logical replication infrastructure assigns a
 'slot-id' when a new replica is setup. That slot id isn't really
 nice (e.g. id-321578-3). It also requires that [18] keeps state
 in a global variable to make writing regression tests easy.
 
 I think it would be better to make the user specify those replication
 slot ids, but I am not really sure about it.

I agree, it would be better to let the user name the slot (and report an
error if the given name is already in use).

 3) Currently no options can be passed to an output plugin. I am
 thinking about making INIT_LOGICAL_REPLICATION 'plugin' accept the
 now widely used ('option' ['value'], ...) syntax and pass that to the
 output plugin's initialization function.

Sounds good.

 4) Does anybody object to:
 -- allocate a permanent replication slot
 INIT_LOGICAL_REPLICATION 'plugin' 'slotname' (options);
 
 -- stream data
 START_LOGICAL_REPLICATION 'slotname' 'recptr';
 
 -- deallocate a permanent replication slot
 FREE_LOGICAL_REPLICATION 'slotname';

That looks fine, but I think it should be:

INIT_LOGICAL_REPLICATION 'slotname' 'plugin' (options);

i.e., swap 'plugin' and 'slotname' in your proposal to make the slotname
come first for all three commands. Not important, but a wee bit nicer.
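Concretely, the reordered sequence would read like this (slot and plugin names here are made up for illustration):

```sql
-- allocate a permanent replication slot, slot name first
INIT_LOGICAL_REPLICATION 'myslot' 'myplugin' (opt1 'val1');

-- stream data from the named slot
START_LOGICAL_REPLICATION 'myslot' '0/12345678';

-- deallocate the slot
FREE_LOGICAL_REPLICATION 'myslot';
```

That way the slot name occupies the same argument position in all three commands.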

-- Abhijit




Re: [HACKERS] logical changeset generation v4

2013-01-14 Thread Alvaro Herrera
Andres Freund wrote:

I've been giving a couple of these parts a look.  In particular

 [03] Split out xlog reading into its own module called xlogreader

Cleaned this one up a bit last week.  I will polish it some more,
publish for some final comments, and commit.

 [08] wal_decoding: Introduce InvalidCommandId and declare that to be the new 
 maximum for CommandCounterIncrement

This seems reasonable.  Mainly it has the effect that a transaction can
have exactly one fewer command than before.  I don't think this is a
problem for anyone in practice.

 [09] Adjust all *Satisfies routines to take a HeapTuple instead of a 
 HeapTupleHeader

Seemed okay when I looked at it.


 Second, I don't think the test_logical_replication functions should live
 in core, as they shouldn't be used for a production replication scenario
 (causes long-running transactions, requires polling), but I have failed
 to find a neat way to include a contrib extension in the plain
 regression tests.

I think this would work if you make a stamp file in the contrib
module, similar to how doc/src/sgml uses those.



-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] logical changeset generation v4

2013-01-14 Thread Mark Kirkwood

On 15/01/13 14:38, Andres Freund wrote:

Hi everyone,

Here is the newest version of logical changeset generation.

I'm quite interested in this feature, so I tried applying the 19 patches 
to the latest 9.3 checkout. They apply and compile cleanly.


However portals seem busted:

bench=# BEGIN;
BEGIN
bench=# DECLARE c1 CURSOR FOR SELECT * FROM pgbench_accounts;
DECLARE CURSOR
bench=# FETCH 2 FROM c1;
 aid | bid | abalance | filler
-----+-----+----------+--------
   1 |   1 |        0 |
   2 |   1 |        0 |
(2 rows)

bench=# DELETE FROM pgbench_accounts WHERE CURRENT OF c1;
The connection to the server was lost. Attempting reset: Failed.





Re: [HACKERS] logical changeset generation v4

2013-01-14 Thread Mark Kirkwood

On 15/01/13 17:37, Mark Kirkwood wrote:

On 15/01/13 14:38, Andres Freund wrote:

Hi everyone,

Here is the newest version of logical changeset generation.

I'm quite interested in this feature, so I tried applying the 19 
patches to the latest 9.3 checkout. They apply and compile cleanly.


However portals seem busted:

bench=# BEGIN;
BEGIN
bench=# DECLARE c1 CURSOR FOR SELECT * FROM pgbench_accounts;
DECLARE CURSOR
bench=# FETCH 2 FROM c1;
 aid | bid | abalance | filler
-----+-----+----------+--------
   1 |   1 |        0 |
   2 |   1 |        0 |
(2 rows)

bench=# DELETE FROM pgbench_accounts WHERE CURRENT OF c1;
The connection to the server was lost. Attempting reset: Failed.



Sorry - I forgot to add: this is an assert-enabled debug build, and it is 
an assertion failure that is being triggered:


TRAP: FailedAssertion("!(htup->t_tableOid != ((Oid) 0))", File: "tqual.c", Line: 940)



