Re: [HACKERS] pgbench regression test failure

2017-11-11 Thread Steve Singer
The following review has been posted through the commitfest application:
make installcheck-world:  tested, failed
Implements feature:   not tested
Spec compliant:   not tested
Documentation:            not tested

This causes the pgbench tests to fail (consistently) with

not ok 194 - pgbench late throttling stdout /(?^:processed: [01]/10)/


When I run pgbench manually (pgbench -t 10 --rate=10 --latency-limit=1 -n -r) I get:

number of transactions actually processed: 10/10
number of transactions skipped: 10 (100.000 %)

Prior to the patch I was getting:

number of transactions actually processed: 0/10
number of transactions skipped: 10 (100.000 %)



@@ -3539,7 +3542,7 @@ printResults(TState *threads, StatsData *total, instr_time total_time,
{
    printf("number of transactions per client: %d\n", nxacts);
    printf("number of transactions actually processed: " INT64_FORMAT "/%d\n",
-          total->cnt - total->skipped, nxacts * nclients);
+          total->cnt, nxacts * nclients);

I think you want ntx instead of total->cnt here. 
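i.e. roughly (assuming ntx is the count of non-skipped transactions already
computed earlier in the patched printResults):

-          total->cnt, nxacts * nclients);
+          ntx, nxacts * nclients);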




The new status of this patch is: Waiting on Author



Re: [HACKERS] 10 beta docs: different replication solutions

2017-07-31 Thread Steve Singer

On Mon, 31 Jul 2017, Merlin Moncure wrote:


On Sun, Jul 30, 2017 at 8:34 PM, Steve Singer <st...@ssinger.info> wrote:


We don't seem to describe logical replication on

https://www.postgresql.org/docs/10/static/different-replication-solutions.html

The attached patch adds a section.


This is a good catch.  Two quick observations:

1) Super pedantic point. I don't like the 'repl.' abbreviation in the
'most common implementation' both for the existing hs/sr and for the
newly added logical.


Updated



2) This lingo:
+ Logical replication allows the data changes from individual tables
+ to be replicated. Logical replication doesn't require a particular server
+ to be designated as a master or a slave but allows data to flow
in multiple
+ directions. For more information on logical replication, see
.

Is good, but I would revise it just a bit to emphasize the
subscription nature of logical replication to link the concepts
expressed strongly in the main section.  For example:

Logical replication allows the data changes [remove: "from individual
tables to be replicated"] to be published to subscriber nodes.  Data
can flow in any direction between nodes on a per-table basis; there is
no concept of a master server.  Conflict resolution must be handled
completely by the application.  For more information on...

what do you think?


Sounds good.

I've incorporated these changes into an updated patch, but I changed the 
language around conflict resolution.  Conflict resolution could be handled 
by triggers or by adding extra columns to the primary key; that wouldn't be 
'completely by the application'.





merlin

diff --git a/doc/src/sgml/high-availability.sgml b/doc/src/sgml/high-availability.sgml
index 138bdf2..19b26f8 100644
--- a/doc/src/sgml/high-availability.sgml
+++ b/doc/src/sgml/high-availability.sgml
@@ -158,6 +158,27 @@ protocol to make nodes agree on a serializable transactional order.

   
 
+
+  
+   Logical Replication
+   
+
+
+  Logical replication allows a database server to send a stream of
+  data modifications to another server.
+  PostgreSQL logical replication constructs
+  a stream of logical data modifications from the WAL.
+
+
+  Logical replication allows the data changes to be published to subscriber
+  nodes. Data can flow in any direction between nodes on a per-table basis;
+  there is no concept of a master server. PostgreSQL
+  does not include support for conflict resolution.
+  For more information on logical replication, see .
+   
+   
+  
+ 
   
Trigger-Based Master-Standby Replication

@@ -290,6 +311,7 @@ protocol to make nodes agree on a serializable transactional order.
  Shared Disk Failover
  File System Replication
  Write-Ahead Log Shipping
+ Logical Replication
  Trigger-Based Master-Standby Replication
  Statement-Based Replication Middleware
  Asynchronous Multimaster Replication
@@ -303,7 +325,8 @@ protocol to make nodes agree on a serializable transactional order.
  Most Common Implementation
  NAS
  DRBD
- Streaming Repl.
+ Streaming Replication.
+ Logical Replication.
  Slony
  pgpool-II
  Bucardo
@@ -315,6 +338,7 @@ protocol to make nodes agree on a serializable transactional order.
  shared disk
  disk blocks
  WAL
+ Logical decoding
  table rows
  SQL
  table rows
@@ -330,6 +354,7 @@ protocol to make nodes agree on a serializable transactional order.
  
  
  
+ 
 
 
 
@@ -337,6 +362,7 @@ protocol to make nodes agree on a serializable transactional order.
  
  
  
+ 
  
  
  
@@ -349,6 +375,7 @@ protocol to make nodes agree on a serializable transactional order.
  
  
  
+ 
  
  
  
@@ -360,6 +387,7 @@ protocol to make nodes agree on a serializable transactional order.
  
  with sync off
  
+ 
  
  
  
@@ -371,6 +399,7 @@ protocol to make nodes agree on a serializable transactional order.
  
  with sync on
  
+ 
  
  
  
@@ -385,6 +414,7 @@ protocol to make nodes agree on a serializable transactional order.
  
  
  
+ 
 
 
 
@@ -393,6 +423,7 @@ protocol to make nodes agree on a serializable transactional order.
  
  
  
+ 
  
  
  
@@ -403,6 +434,7 @@ protocol to make nodes agree on a serializable transactional order.
  
  
  
+ 
  
  
  



[HACKERS] 10 beta docs: different replication solutions

2017-07-30 Thread Steve Singer


We don't seem to describe logical replication on

https://www.postgresql.org/docs/10/static/different-replication-solutions.html

The attached patch adds a section.

Steve


diff --git a/doc/src/sgml/high-availability.sgml b/doc/src/sgml/high-availability.sgml
index 138bdf2..1329d1f 100644
--- a/doc/src/sgml/high-availability.sgml
+++ b/doc/src/sgml/high-availability.sgml
@@ -158,6 +158,26 @@ protocol to make nodes agree on a serializable transactional order.

   
 
+
+  
+   Logical Replication
+   
+
+
+  Logical replication allows a database server to send a stream of
+  data modifications to another server.
+  PostgreSQL logical replication constructs
+  a stream of logical data modifications from the WAL.
+
+
+ Logical replication allows the data changes from individual tables
+ to be replicated. Logical replication doesn't require a particular server
+ to be designated as a master or a slave but allows data to flow in multiple
+ directions. For more information on logical replication, see .
+
+   
+  
+ 
   
Trigger-Based Master-Standby Replication

@@ -290,6 +310,7 @@ protocol to make nodes agree on a serializable transactional order.
  Shared Disk Failover
  File System Replication
  Write-Ahead Log Shipping
+ Logical Replication
  Trigger-Based Master-Standby Replication
  Statement-Based Replication Middleware
  Asynchronous Multimaster Replication
@@ -304,6 +325,7 @@ protocol to make nodes agree on a serializable transactional order.
  NAS
  DRBD
  Streaming Repl.
+ Logical Repl.
  Slony
  pgpool-II
  Bucardo
@@ -315,6 +337,7 @@ protocol to make nodes agree on a serializable transactional order.
  shared disk
  disk blocks
  WAL
+ Logical decoding
  table rows
  SQL
  table rows
@@ -330,6 +353,7 @@ protocol to make nodes agree on a serializable transactional order.
  
  
  
+ 
 
 
 
@@ -337,6 +361,7 @@ protocol to make nodes agree on a serializable transactional order.
  
  
  
+ 
  
  
  
@@ -349,6 +374,7 @@ protocol to make nodes agree on a serializable transactional order.
  
  
  
+ 
  
  
  
@@ -360,6 +386,7 @@ protocol to make nodes agree on a serializable transactional order.
  
  with sync off
  
+ 
  
  
  
@@ -371,6 +398,7 @@ protocol to make nodes agree on a serializable transactional order.
  
  with sync on
  
+ 
  
  
  
@@ -385,6 +413,7 @@ protocol to make nodes agree on a serializable transactional order.
  
  
  
+ 
 
 
 
@@ -393,6 +422,7 @@ protocol to make nodes agree on a serializable transactional order.
  
  
  
+ 
  
  
  
@@ -403,6 +433,7 @@ protocol to make nodes agree on a serializable transactional order.
  
  
  
+ 
  
  
  



Re: [HACKERS] tablesync patch broke the assumption that logical rep depends on?

2017-04-16 Thread Steve Singer

On 04/10/2017 01:28 PM, Fujii Masao wrote:

Hi,

  src/backend/replication/logical/launcher.c
  * Worker started and attached to our shmem. This check is safe
  * because only launcher ever starts the workers, so nobody can steal
  * the worker slot.

The tablesync patch enabled even worker to start another worker.
So the above assumption is not valid for now.

This issue seems to cause the corner case where the launcher picks up
the same worker slot that previously-started worker has already picked
up to start another worker.

Regards,



I'm not sure if what I am seeing is related to this or another issue.

If I create a subscription for a publication that doesn't exist (yet), I 
see 'publication "mypub" does not exist' in the subscriber's log.


If I then create the publication on the publisher, the error doesn't go 
away; the worker process keeps spinning with the same error.


If I then drop the subscription and recreate it, it works.
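
For reference, this is roughly the sequence I used (names and connection
string are illustrative):

-- on the subscriber, before the publication exists on the publisher
CREATE SUBSCRIPTION mysub
    CONNECTION 'host=localhost dbname=test port=5440'
    PUBLICATION mypub;
-- subscriber log then shows: publication "mypub" does not exist

-- on the publisher
CREATE PUBLICATION mypub FOR ALL TABLES;
-- the apply worker keeps reporting the same error until the
-- subscription is dropped and recreated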






Re: [HACKERS] sequence data type

2017-01-08 Thread Steve Singer

On 12/31/2016 01:27 AM, Peter Eisentraut wrote:
Another updated patch, with quite a bit of rebasing and some minor 
code polishing.




Patch applies cleanly and the tests pass
The feature seems to work as expected.

I've tried this out and it behaves as expected and desired. I also 
tested the pg_dump changes when dumping from both 8.3 and 9.5; tables 
with serial types and standalone sequences restore as I would expect.



The only concern I have with the functionality is that there isn't a way 
to change the type of a sequence.


For example

create table foo(id serial4);
-- hmm, I'm running out of id space
alter table foo alter column id type int8;
alter sequence foo_id_seq maxvalue 9223372036854775807;

2017-01-08 14:29:27.073 EST [4935] STATEMENT:  alter sequence foo_id_seq maxvalue 9223372036854775807;
ERROR:  MAXVALUE (9223372036854775807) is too large for sequence data type integer


Since we allow the min/max values to be changed, I feel we should also 
allow the type to be changed.
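
Something along these lines would feel natural (hypothetical syntax,
mirroring the AS clause this patch adds to CREATE SEQUENCE):

-- hypothetical: change the sequence's data type along with its bounds
alter sequence foo_id_seq as int8;
alter sequence foo_id_seq maxvalue 9223372036854775807;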





@@ -1236,7 +1239,15 @@ init_params(ParseState *pstate, List *options, bool isInit,

 {
     DefElem    *defel = (DefElem *) lfirst(option);

-    if (strcmp(defel->defname, "increment") == 0)
+    if (strcmp(defel->defname, "as") == 0)
+    {
+        if (as_type)
+            ereport(ERROR,
+                    (errcode(ERRCODE_SYNTAX_ERROR),
+                     errmsg("conflicting or redundant options")));
+        as_type = defel;
+    }
+    else if (strcmp(defel->defname, "increment") == 0)

Should we include parser_errposition(pstate, defel->location) here, like 
we do for the other errors below this?
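
i.e. a sketch of what I mean, matching the style of the neighbouring options:

+        if (as_type)
+            ereport(ERROR,
+                    (errcode(ERRCODE_SYNTAX_ERROR),
+                     errmsg("conflicting or redundant options"),
+                     parser_errposition(pstate, defel->location)));
+        as_type = defel;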



Other than that the patch looks good





Re: [HACKERS] Logical Replication WIP

2017-01-01 Thread Steve Singer

On 12/30/2016 05:53 AM, Petr Jelinek wrote:

Hi,

I rebased this for the changes made to inheritance and merged in the
fixes that I previously sent separately.






I'm not sure if the following is expected or not

I have 1 publisher and 1 subscriber.
I then do pg_dump on my subscriber
./pg_dump -h localhost --port 5441 --include-subscriptions 
--no-create-subscription-slot test|./psql --port 5441 test_b


I now can't do a drop database test_b, which is expected

but I can't drop the subscription either


test_b=# drop subscription mysub;
ERROR:  could not drop replication origin with OID 1, in use by PID 24996

 alter subscription mysub disable;
ALTER SUBSCRIPTION
drop subscription mysub;
ERROR:  could not drop replication origin with OID 1, in use by PID 24996

drop subscription mysub nodrop slot;

doesn't work either.  If I first drop the working/active subscription on 
the original 'test' database, it works, but I can't seem to drop the 
subscription record on test_b.








Re: [HACKERS] Logical Replication WIP

2016-12-19 Thread Steve Singer

On 12/18/2016 09:04 PM, Petr Jelinek wrote:

On 18/12/16 19:02, Steve Singer wrote:


pg_dump is also generating warnings

pg_dump: [archiver] WARNING: don't know how to set owner for object type
SUBSCRIPTION

I know that the plan is to add proper ACL's for publications and
subscriptions later. I don't know if we want to leave the warning in
until then or do something about it.


No, ACLs are separate from owner. This is thinko on my side. I was
thinking we can live without ALTER ... OWNER TO for now, but we actually
need it for pg_dump and for REASSIGN OWNED. So now I added the OWNER TO
for both PUBLICATION and SUBSCRIPTION.



When I try to restore my pg_dump with publications I get

./pg_dump  -h localhost --port 5440 test |./psql -h localhost --port 
5440 test2



ALTER TABLE
CREATE PUBLICATION
ERROR:  unexpected command tag "PUBLICATION

This comes from a
ALTER PUBLICATION mypub OWNER TO ssinger;


Does the OWNER TO clause need to be added to AlterPublicationStmt 
instead of AlterOwnerStmt?
Also, we should update the tab-complete for ALTER PUBLICATION to show the 
OWNER TO option, plus the \h help in psql and the reference SGML.
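
A rough, untested sketch of the kind of tab-complete rule I mean (the exact
keyword list would need to match whatever the grammar supports at this point):

	/* ALTER PUBLICATION <name> */
	else if (Matches3("ALTER", "PUBLICATION", MatchAny))
		COMPLETE_WITH_LIST3("WITH", "SET TABLE", "OWNER TO");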








Re: [HACKERS] Logical Replication WIP

2016-12-18 Thread Steve Singer

On 12/18/2016 05:28 AM, Petr Jelinek wrote:

On 17/12/16 18:34, Steve Singer wrote:

On 12/16/2016 07:49 AM, Petr Jelinek wrote:
Yeah subscriptions are per database. I don't want to make v14 just 
for these 2 changes as that would make life harder for anybody 
code-reviewing the v13 so attached is diff with above fixes that 
applies on top of v13. 





Thanks that fixes those issues.

A few more I've noticed


pg_dumping subscriptions doesn't seem to work

./pg_dump -h localhost --port 5441 --include-subscriptions test
pg_dump: [archiver (db)] query failed: ERROR:  missing FROM-clause entry for table "p"
LINE 1: ...LECT rolname FROM pg_catalog.pg_roles WHERE oid = p.subowner...
                                                             ^
pg_dump: [archiver (db)] query was: SELECT s.tableoid, s.oid, s.subname,
(SELECT rolname FROM pg_catalog.pg_roles WHERE oid = p.subowner) AS rolname,
s.subenabled, s.subconninfo, s.subslotname, s.subpublications
FROM pg_catalog.pg_subscription s
WHERE s.subdbid = (SELECT oid FROM pg_catalog.pg_database WHERE datname = current_database())


I have attached a patch that fixes this.

pg_dump is also generating warnings

pg_dump: [archiver] WARNING: don't know how to set owner for object type 
SUBSCRIPTION


I know that the plan is to add proper ACL's for publications and 
subscriptions later. I don't know if we want to leave the warning in 
until then or do something about it.



Also the tab-completion for create subscription doesn't seem to work as 
intended.
I've attached a patch that fixes it and patches to add tab completion 
for alter publication|subscription




From 3ed2ad471766490310b1102d8c9166a597ac11af Mon Sep 17 00:00:00 2001
From: Steve Singer <st...@ssinger.info>
Date: Sun, 18 Dec 2016 12:37:29 -0500
Subject: [PATCH 3/4] Fix tab complete for CREATE SUBSCRIPTION

---
 src/bin/psql/tab-complete.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 8816f6c..6dd47f6 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -2298,8 +2298,10 @@ psql_completion(const char *text, int start, int end)
 /* CREATE SUBSCRIPTION */
 	else if (Matches3("CREATE", "SUBSCRIPTION", MatchAny))
 		COMPLETE_WITH_CONST("CONNECTION");
+	else if (Matches5("CREATE", "SUBSCRIPTION", MatchAny, "CONNECTION",MatchAny))
+		COMPLETE_WITH_CONST("PUBLICATION");	
 	/* Complete "CREATE SUBSCRIPTION  ...  WITH ( " */
-	else if (Matches2("CREATE", "SUBSCRIPTION") && TailMatches2("WITH", "("))
+	else if (HeadMatches2("CREATE", "SUBSCRIPTION") && TailMatches2("WITH", "("))
 		COMPLETE_WITH_LIST5("ENABLED", "DISABLED", "CREATE SLOT",
 			"NOCREATE SLOT", "SLOT NAME");
 
-- 
2.1.4

From 36a0d4382f7ffd2b740298a2bafd49471766bdad Mon Sep 17 00:00:00 2001
From: Steve Singer <st...@ssinger.info>
Date: Sun, 18 Dec 2016 12:51:14 -0500
Subject: [PATCH 2/4] Add tab-complete for ALTER SUBSCRIPTION

---
 src/bin/psql/tab-complete.c | 11 ++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 4cbb848..8816f6c 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -1410,7 +1410,7 @@ psql_completion(const char *text, int start, int end)
 			"EVENT TRIGGER", "EXTENSION", "FOREIGN DATA WRAPPER", "FOREIGN TABLE", "FUNCTION",
 			"GROUP", "INDEX", "LANGUAGE", "LARGE OBJECT", "MATERIALIZED VIEW", "OPERATOR",
 			"POLICY", "PUBLICATION", "ROLE", "RULE", "SCHEMA", "SERVER", "SEQUENCE",
-			"SYSTEM", "TABLE", "TABLESPACE", "TEXT SEARCH", "TRIGGER", "TYPE",
+			"SUBSCRIPTION", "SYSTEM", "TABLE", "TABLESPACE", "TEXT SEARCH", "TRIGGER", "TYPE",
 		"USER", "USER MAPPING FOR", "VIEW", NULL};
 
 		COMPLETE_WITH_LIST(list_ALTER);
@@ -1446,6 +1446,15 @@ psql_completion(const char *text, int start, int end)
 		COMPLETE_WITH_LIST6("PUBLISH INSERT", "NOPUBLISH INSERT", "PUBLISH UPDATE",
 			"NOPUBLISH UPDATE", "PUBLISH DELETE", "NOPUBLISH DELETE");
 	}
+	/* ALTER SUBSCRIPTION  ... */
+	else if (Matches3("ALTER","SUBSCRIPTION",MatchAny))
+	{
+		COMPLETE_WITH_LIST5("WITH", "CONNECTION", "SET PUBLICATION", "ENABLE", "DISABLE");
+	}
+	else if (HeadMatches3("ALTER", &qu

Re: [HACKERS] Logical Replication WIP

2016-12-17 Thread Steve Singer

On 12/16/2016 07:49 AM, Petr Jelinek wrote:

Hi,

attached is version 13 of the patch.

I merged in changes from PeterE. And did following changes:
- fixed the ownership error messages for both provider and subscriber
- added ability to send invalidation message to invalidate whole
relcache and use it in publication code
- added the post creation/alter/drop hooks
- removed parts of docs that refer to initial sync (which does not exist
yet)
- added timeout handling/retry, etc to apply/launcher based on the GUCs
that exist for wal receiver (they could use renaming though)
- improved feedback behavior
- apply worker now uses owner of the subscription as connection user
- more tests
- check for max_replication_slots in launcher
- clarify the update 'K' sub-message description in protocol


A few things I've noticed so far

If I shutdown the publisher I see the following in the log

2016-12-17 11:33:49.548 EST [1891] LOG:  worker process: ?)G? (PID 1987) 
exited with exit code 1


but then if I shutdown the subscriber postmaster and restart it switches to
2016-12-17 11:43:09.628 EST [2373] LOG:  worker process:  (PID 2393) 
exited with exit code 1


Not sure where the 'G' was coming from (other times I have seen an 'I' 
here or other random characters)



I don't think we are cleaning up subscriptions on a drop database

If I do the following

1) Create a subscription in a new database
2) Stop the publisher
3) Drop the database on the subscriber

test=# create subscription mysuba connection 'host=localhost dbname=test 
port=5440' publication mypub;

test=# \c b
b=# drop database test;
DROP DATABASE
b=# select * FROM pg_subscription ;
 subdbid | subname | subowner | subenabled |              subconninfo              | subslotname | subpublications
---------+---------+----------+------------+---------------------------------------+-------------+-----------------
   16384 | mysuba  |       10 | t          | host=localhost dbname=test port=5440  | mysuba      | {mypub}


b=# select datname FROM pg_database where oid=16384;
 datname
---------
(0 rows)

Also I don't think I can now drop mysuba
b=# drop subscription mysuba;
ERROR:  subscription "mysuba" does not exist













Re: [HACKERS] Logical Replication WIP

2016-11-20 Thread Steve Singer

On Sun, 20 Nov 2016, Petr Jelinek wrote:


On 13/11/16 23:02, Steve Singer wrote:



There is one exception though:

*** 195,214 


  A conflict will produce an error and will stop the replication; it
! must be resolved manually by the user.


! The resolution can be done either by changing data on the subscriber
! so that it does not conflict with incoming change or by skipping the
! transaction that conflicts with the existing data. The transaction
! can be skipped by calling the
! 
! pg_replication_origin_advance() function
! with a node_name corresponding to the subscription name. The
! current position of origins can be seen in the
! 
! pg_replication_origin_status system view.
!   
  
  


I don't see why this needs to be removed? Maybe it could be improved but
certainly not removed?



Sorry, I was confused. I noticed that the function was missing in the patch 
and thought it was documentation for a function that you had removed from 
recent versions of the patch versus referencing a function that is already 
committed.






Re: [HACKERS] Logical Replication WIP

2016-11-13 Thread Steve Singer

On 10/31/2016 06:38 AM, Petr Jelinek wrote:

On 31/10/16 00:52, Steve Singer wrote:
There are some fundamental issues with initial sync that need to be 
discussed on list but this one is not known. I'll try to convert this 
to test case (seems like useful one) and fix it, thanks for the 
report. In meantime I realized I broke the last patch in the series 
during rebase so attached is the fixed version. It also contains the 
type info in the protocol.




Attached are some proposed documentation updates (to be applied on top of 
your 20161031 patch set).


Also


  Publication


+  
+The tables are matched using fully qualified table name. Renaming of
+tables or schemas is not supported.
+  

Is renaming of tables any less supported than other DDL operations?
For example:

alter table nokey2 rename to nokey3
select * FROM pg_publication_tables ;
 pubname | schemaname | tablename
---------+------------+-----------
 tpub    | public     | nokey3
(1 row)


If I then kill the postmaster on my subscriber and restart it, I get

2016-11-13 16:17:11.341 EST [29488] FATAL:  the logical replication 
target public.nokey3 not found
2016-11-13 16:17:11.342 EST [29272] LOG:  worker process: logical 
replication worker 41076 (PID 29488) exited with exit code 1
2016-11-13 16:17:16.350 EST [29496] LOG:  logical replication apply for 
subscription nokeysub started
2016-11-13 16:17:16.358 EST [29498] LOG:  logical replication sync for 
subscription nokeysub, table nokey2 started
2016-11-13 16:17:16.515 EST [29498] ERROR:  table public.nokey2 not 
found on publisher
2016-11-13 16:17:16.517 EST [29272] LOG:  worker process: logical 
replication worker 41076 sync 24688 (PID 29498) exited with exit code 1


but if I then rename the table on the subscriber, everything seems to work.

(I suspect the need to kill+restart is a bug; I've seen other instances 
where a hard restart of the subscriber following changes is required.)



I am also having issues adding a table to a publication; it doesn't seem 
to work.


P: create publication tpub for table a;
S: create subscription mysub connection 'host=localhost dbname=test 
port=5440' publication tpub;

P: insert into a(b) values ('1');
P: alter publication tpub add table b;
P: insert into b(b) values ('1');
P: insert into a(b) values ('2');


select * FROM pg_publication_tables ;
 pubname | schemaname | tablename
---------+------------+-----------
 tpub    | public     | a
 tpub    | public     | b


but


S:  select * FROM b;
 a | b
---+---
(0 rows)
S: select * FROM a;
 a | b
---+---
 5 | 1
 6 | 2
(2 rows)


diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml
new file mode 100644
index 63af1a5..89723c3
*** a/doc/src/sgml/logical-replication.sgml
--- b/doc/src/sgml/logical-replication.sgml
***
*** 95,110 
  how the table is accessed. Each table can be added to multiple
  publications if needed.  Publications may currently only contain
  tables. Objects must be added explicitly, except when a publication
! is created for ALL TABLES. There is no default name for
! a publication which specifies all tables.


  Publications can choose to limit the changes they produce to show
  any combination of INSERT, UPDATE and
  DELETE in a similar way to the way triggers are fired by
! particular event types. Only tables with a REPLICA IDENTITY
! index can be added to a publication which replicates UPDATE
! and DELETE operation.


  The definition of a publication object will be included within
--- 95,110 
  how the table is accessed. Each table can be added to multiple
  publications if needed.  Publications may currently only contain
  tables. Objects must be added explicitly, except when a publication
! is created for ALL TABLES.


  Publications can choose to limit the changes they produce to show
  any combination of INSERT, UPDATE and
  DELETE in a similar way to the way triggers are fired by
! particular event types. If a table without a REPLICA IDENTITY
! index is added to a publication which replicates UPDATE or
! DELETE operations then subsequent UPDATE
! or DELETE operations will fail on the publisher.


  The definition of a publication object will be included within
***
*** 195,214 


  A conflict will produce an error and will stop the replication; it
! must be resolved manually by the user.


! The resolution can be done either by changing data on the subscriber
! so that it does not conflict with incoming change or by skipping the
! transaction that conflicts with the existing data. The transaction
! can be skipped by calling the
! 
! pg_replication_origin_advance() function
! with a node_name corresponding to the subscription name. The
! current position of origins can be seen

Re: [HACKERS] Logical Replication WIP

2016-11-05 Thread Steve Singer

On 10/31/2016 06:38 AM, Petr Jelinek wrote:
There are some fundamental issues with initial sync that need to be 
discussed on list but this one is not known. I'll try to convert this 
to test case (seems like useful one) and fix it, thanks for the 
report. In meantime I realized I broke the last patch in the series 
during rebase so attached is the fixed version. It also contains the 
type info in the protocol.





I don't know if this is covered by the known initial_sync problems or not


If I have an 'all tables' publication and then create a new table, the 
data doesn't seem to replicate to the new table.



P: create table a(a serial4 primary key, b text);
S: create table a(a serial4 primary key, b text);
P: create publication mypub for all tables;
S: create subscription mysub connection 'host=localhost dbname=test 
port=5441' publication mypub;

P: create table b(a serial4 primary key, b text);
P: insert into b(b) values ('foo2');
P: insert into a(b) values ('foo3');

Then I check my subscriber

select * FROM a;
 a |  b
---+--
 1 | foo
 2 | foo3
(2 rows)

test=# select * FROM b;
 a | b
---+---
(0 rows)


However, if the table isn't on the subscriber I do get an error:

ie

P: create table c(a serial4 primary key, b text);
P: insert into c(b) values('foo');

2016-11-05 11:49:31.456 EDT [14938] FATAL:  the logical replication 
target public.c not found
2016-11-05 11:49:31.457 EDT [13703] LOG:  worker process: logical 
replication worker 16457 (PID 14938) exited with exit code 1


but if I then add the table:
S: create table c(a serial4 primary key, b text);
2016-11-05 11:51:08.583 EDT [15014] LOG:  logical replication apply for 
subscription mysub started


but the data doesn't replicate to table c either.






Re: [HACKERS] Logical Replication WIP

2016-10-30 Thread Steve Singer

On 10/24/2016 09:22 AM, Petr Jelinek wrote:

Hi,

attached is updated version of the patch.

There are quite a few improvements and restructuring, I fixed all the
bugs and basically everything that came up from the reviews and was
agreed on. There are still couple of things missing, ie column type
definition in protocol and some things related to existing data copy.


Here are a few things I've noticed so far.

+
+CREATE SUBSCRIPTION mysub WITH CONNECTION dbname=foo host=bar 
user=repuser PUBLICATION mypub;

+
+  
+  

The documentation above doesn't match the syntax; CONNECTION needs to be 
in single quotes, not double quotes.

I think you want
+
+CREATE SUBSCRIPTION mysub WITH CONNECTION 'dbname=foo host=bar 
user=repuser' PUBLICATION mypub;

+
+  
+  



I am not sure if this is a known issue covered by your comments about 
data copy but I am still having issues with error reporting on a failed 
subscription.


I created a subscription, dropped the subscription and created a second 
one.  The second subscription isn't active but shows no errors.



P: create publication mypub for table public.a;
S: create subscription mysub with connection 'dbname=test host=localhost 
port=5440' publication mypub;

P:  insert into a(b) values ('t');
S: select * FROM a;
 a | b
---+---
 1 | t
(1 row)

Everything is good
Then I do
S: drop subscription mysub;

S: create subscription mysub2 with connection 'dbname=test 
host=localhost port=5440' publication mypub;

P:  insert into a(b) values ('f');
S: select * FROM a;
 a | b
---+---
 1 | t

The data doesn't replicate


select * FROM pg_stat_subscription;
 subid | subname | pid | relid | received_lsn | last_msg_send_time | last_msg_receipt_time | latest_end_lsn | latest_end_time
-------+---------+-----+-------+--------------+--------------------+-----------------------+----------------+-----------------
 16398 | mysub2  |     |       |              |                    |                       |                |
(1 row)


The only thing in my log is

2016-10-30 15:27:27.038 EDT [6028] NOTICE:  dropped replication slot 
"mysub" on publisher
2016-10-30 15:27:36.072 EDT [6028] NOTICE:  created replication slot 
"mysub2" on publisher

2016-10-30 15:27:36.082 EDT [6028] NOTICE:  synchronized table states


I'd expect an error in the log or something.
However, if I delete everything from the table on the subscriber then 
the subscription proceeds


I think there are still problems with signal handling in the initial sync

If I try to drop mysub2 (while the subscription is stuck, instead of 
deleting the data), the drop hangs.
If I then try to kill the postmaster for the subscriber nothing happens; I 
have to send it a -9 to make it go away.


However once I do that and then restart the postmaster for the 
subscriber I start to see the duplicate key errors in the log


2016-10-30 16:00:54.635 EDT [7018] ERROR:  duplicate key value violates 
unique constraint "a_pkey"

2016-10-30 16:00:54.635 EDT [7018] DETAIL:  Key (a)=(1) already exists.
2016-10-30 16:00:54.635 EDT [7018] CONTEXT:  COPY a, line 1
2016-10-30 16:00:54.637 EDT [7007] LOG:  worker process: logical 
replication worker 16400 sync 16387 (PID 7018) exited with exit code 1


I'm not sure why I didn't get those until I restarted the postmaster but 
it seems to happen whenever I drop a subscription then create a new one. 
Creating the second subscription from the same psql session as I 
create/drop the first seems important  in reproducing this





I am also having issues dropping a second subscription from the same 
psql session


(table a is empty on both nodes to avoid duplicate key errors)
S: create subscription sub1 with connection  'host=localhost dbname=test 
port=5440' publication mypub;
S: create subscription sub2 with connection  'host=localhost dbname=test 
port=5440' publication mypub;

S: drop subscription sub1;
S: drop subscription sub2;

At this point the drop subscription hangs.








The biggest changes are:

I added one more prerequisite patch (the first one) which adds ephemeral
slots (or well implements UI on top of the code that was mostly already
there). The ephemeral slots are different in that they go away either on
error or when session is closed. This means the initial data sync does
not have to worry about cleaning up the slots after itself. I think this
will be useful in other places as well (for example basebackup). I
originally wanted to call them temporary slots in the UI but since the
behavior is bit different from temp tables I decided to go with what the
underlying code calls them in UI as well.

I also split out the libpqwalreceiver rewrite to separate patch which
does just the re-architecture and does not really add new functionality.
And I did the re-architecture bit differently based on the review.

There is now new executor module in execReplication.c, no new nodes but
several utility commands. I moved there the tuple lookup functions from
apply and also wrote new interfaces for doing inserts/updates/deletes to

Re: [HACKERS] Logical Replication WIP

2016-09-20 Thread Steve Singer

On Tue, 20 Sep 2016, Peter Eisentraut wrote:


On 9/18/16 4:17 PM, Steve Singer wrote:




I think if we want to prevent the creation of subscriptions that point
to self, then we need to create a magic token when the postmaster starts
and check for that when we connect.  So more of a running-instance
identifier instead of a instance-on-disk identifier.

The other option is that we just allow it and make it more robust.


I think we should go with the second option for now. I feel that the effort 
is better spent making sure that initial syncs that don't subscribe (for 
whatever reason) can be aborted, instead of trying to build a concept of 
node identity before we really need it.


Steve



--
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services







Re: [HACKERS] Logical Replication WIP

2016-09-18 Thread Steve Singer

On 09/08/2016 06:59 PM, Petr Jelinek wrote:

Hi,

Updated version, this should address most of the things in Peter's 
reviews so far, not all though as some of it needs more discussion.




Another bug report.

I had subscribed a subscriber database to a publication with 1 table
create table a (a serial4 primary key, b text);

* I then dropped column b on the subscriber
* inserted some rows on the publisher
* Noticed the expected error about column b not existing in the 
subscriber log

* Added column c on the subscriber, then added column b after column C

I now get the following stack trace

#1  0x007dc8f9 in cstring_to_text (
s=0x16f238af0 )
at varlena.c:152
#2  0x008046a3 in InputFunctionCall (
flinfo=flinfo@entry=0x7fffa02d0250,
str=str@entry=0x16f238af0 0x16f238af0>, typioparam=typioparam@entry=25, typmod=typmod@entry=-1) at 
fmgr.c:1909

#3  0x00804971 in OidInputFunctionCall (functionId=,
str=0x16f238af0 ,
typioparam=25, typmod=-1) at fmgr.c:2040
#4  0x006aa485 in SlotStoreCStrings (slot=0x2748670,
values=0x7fffa02d0330) at apply.c:569
#5  0x006ab45c in handle_insert (s=0x274d088) at apply.c:756
#6  0x006abcea in handle_message (s=0x7fffa02d3e20) at apply.c:978
#7  LogicalRepApplyLoop (last_received=117457680) at apply.c:1146
#8  0x006ac37e in ApplyWorkerMain (main_arg=)
at apply.c:1530


In SlotStoreCStrings values only has 2 elements but natts is 4




Changes:
- I moved the publication.c to pg_publication.c, subscription.c to 
pg_subscription.c.

- changed \drp and \drs to \dRp and \dRs
- fixed definitions of the catalogs (BKI_ROWTYPE_OID)
- changed some GetPublication calls to get_publication_name
- fixed getObjectIdentityParts for OCLASS_PUBLICATION_REL
- fixed get_object_address_publication_rel
- fixed the dependencies between pkeys and publications, for this I 
actually had to add new interface to depenency.c that allows dropping 
single dependency

- fixed the 'for all tables' and 'for tables all in schema' publications
- changed the alter publication from FOR to SET
- added more test cases for the publication DDL
- fixed compilation of subscription patch alone and docs
- changed subpublications to name[]
- added check for publication list duplicates
- made the subscriptions behave more like they are inside the database 
instead of shared catalog (even though the catalog is still shared)
- added options for for CREATE SUBSCRIPTION to optionally not create 
slot and not do the initial data sync - that should solve the 
complaint about CREATE SUBSCRIPTION always connecting
- the CREATE SUBSCRIPTION also tries to check if the specified 
connection connects back to same db (although that check is somewhat 
imperfect) and if it gets stuck on create slot it should be normally 
cancelable (that should solve the issue Steve Singer had)

- fixed the tests to work in any timezone
- added DDL regress tests for subscription
- added proper detection of missing schemas and tables on subscriber
- rebased on top of 19acee8 as the DefElem changes broke the patch

The table sync is still far from ready.









Re: [HACKERS] Logical Replication WIP

2016-09-18 Thread Steve Singer

On 09/08/2016 06:59 PM, Petr Jelinek wrote:
- the CREATE SUBSCRIPTION also tries to check if the specified 
connection connects back to same db (although that check is somewhat 
imperfect) and if it gets stuck on create slot it should be normally 
cancelable (that should solve the issue Steve Singer had) 




When I create my subscriber database by doing a physical backup of the 
publisher cluster (with cp, before I add any data), I am unable to 
subscribe.

ie
initdb ../data
cp -r ../data ../data2
./postgres -D ../data
./postgres -D ../data2


This makes sense when I look at your code, but it might not be what we want.

I had the same issue when I created my subscriber cluster with 
pg_basebackup (The timeline on the destination cluster still shows as 1)








Re: [HACKERS] Logical Replication WIP

2016-09-05 Thread Steve Singer

On 09/05/2016 03:58 PM, Steve Singer wrote:

On 08/31/2016 04:51 PM, Petr Jelinek wrote:

Hi,

and one more version with bug fixes, improved code docs and couple 
more tests, some general cleanup and also rebased on current master 
for the start of CF.








A few more things I noticed when playing with the patches

1. Creating a subscription to yourself ends pretty badly:
the 'CREATE SUBSCRIPTION' command seems to get stuck, and you can't kill 
it.  The background process seems to be waiting for a transaction to 
commit (I assume the create subscription command).  I had to kill -9 the 
various processes to get things to stop.  Getting confused about 
hostnames and ports is a common operator error.


2. Failures during the initial subscription  aren't recoverable

For example

on db1
  create table a(id serial4 primary key,b text);
  insert into a(b) values ('1');
  create publication testpub for table a;

on db2
  create table a(id serial4 primary key,b text);
  insert into a(b) values ('1');
  create subscription testsub connection 'host=localhost port=5440 
dbname=test' publication testpub;


I then get in my db2 log

ERROR:  duplicate key value violates unique constraint "a_pkey"
DETAIL:  Key (id)=(1) already exists.
LOG:  worker process: logical replication worker 16396 sync 16387 (PID 
10583) exited with exit code 1

LOG:  logical replication sync for subscription testsub, table a started
ERROR:  could not crate replication slot "testsub_sync_a": ERROR: 
replication slot "testsub_sync_a" already exists



LOG:  worker process: logical replication worker 16396 sync 16387 (PID 
10585) exited with exit code 1

LOG:  logical replication sync for subscription testsub, table a started
ERROR:  could not crate replication slot "testsub_sync_a": ERROR: 
replication slot "testsub_sync_a" already exists



and it keeps looping.
If I then truncate "a" on db2 it doesn't help. (I'd expect at that point 
the initial subscription to work)


If I then do on db2
 drop subscription testsub cascade;

I still see a slot in use on db1

select * FROM pg_replication_slots ;
   slot_name    |  plugin  | slot_type | datoid | database | active | active_pid | xmin | catalog_xmin | restart_lsn | confirmed_flush_lsn
----------------+----------+-----------+--------+----------+--------+------------+------+--------------+-------------+---------------------
 testsub_sync_a | pgoutput | logical   |  16384 | test     | f      |            |      |         1173 | 0/1566E08   | 0/1566E40








Re: [HACKERS] Logical Replication WIP

2016-09-05 Thread Steve Singer

On 08/31/2016 04:51 PM, Petr Jelinek wrote:

Hi,

and one more version with bug fixes, improved code docs and couple 
more tests, some general cleanup and also rebased on current master 
for the start of CF.






To get the 'subscription' TAP tests to pass I need to set

export PGTZ=+02

Shouldn't the expected output be with reference to PST8PDT?






Re: [HACKERS] Logical Replication WIP

2016-08-13 Thread Steve Singer

On 08/05/2016 11:00 AM, Petr Jelinek wrote:

Hi,

as promised here is WIP version of logical replication patch.



Thanks for keeping on this.  This is important work


Feedback is welcome.



+
+  Publication
+  
+A Publication object can be defined on any master node, owned by one
+user. A Publication is a set of changes generated from a group of
+tables, and might also be described as a Change Set or Replication Set.
+Each Publication exists in only one database.

'A publication object can be defined on *any master node*'.  I found 
this confusing the first time I read it because I thought it was 
circular (what makes a node a 'master' node? Having a publication object 
published from it?).   On reflection I realized that you mean ' any 
*physical replication master*'.  I think this might be better worded as 
'A publication object can be defined on any node other than a standby 
node'.  I think referring to 'master' in the context of logical 
replication might confuse people.


I am raising this in the context of the larger terminology that we want 
to use and potential confusion with the terminology we use for physical 
replication. I like the publication / subscription terminology you've 
gone with.



 
+Publications are different from table schema and do not affect
+how the table is accessed. Each table can be added to multiple
+Publications if needed.  Publications may include both tables
+and materialized views. Objects must be added explicitly, except
+when a Publication is created for "ALL TABLES". There is no
+default name for a Publication which specifies all tables.
+  
+  
+The Publication is different from table schema, it does not affect
+how the table is accessed and each table can be added to multiple

Those 2 paragraphs seem to start the same way.  I get the feeling that 
there is some point you're trying to express that I'm not catching onto. 
Of course a publication is different than a table's schema, or different 
than a function.


The definition of publication you have on the CREATE PUBLICATION page 
seems better and should be repeated here (A publication is essentially a 
group of tables intended for managing logical replication. See Section 
30.1  for details about how 
publications fit into logical replication setup. )



+  
+Conflicts happen when the replicated changes is breaking any
+specified constraints (with the exception of foreign keys which are
+not checked). Currently conflicts are not resolved automatically and
+cause replication to be stopped with an error until the conflict is
+manually resolved.

What options are there for manually resolving conflicts?  Is the only 
option to change the data on the subscriber to avoid the conflict?
I assume there isn't a way to flag a particular row coming from the 
publisher and say "ignore it".  I don't think this is something we need to 
support for the first version.
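
For what it's worth, the only other mechanism I'm aware of is advancing the
replication origin past the offending transaction, e.g. (origin name and LSN
purely illustrative):

SELECT pg_replication_origin_advance('mysub', '0/1566E40'::pg_lsn);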



+  Architecture
+  
+Logical replication starts by copying a snapshot of the data on
+the Provider database. Once that is done, the changes on Provider

I notice the use of 'Provider' above; do you intend to update that to 
'Publisher', or does provider mean something different? If we like the 
'publication' terminology then I think 'publishers' should publish them, 
not providers.



I'm trying to test a basic subscription, and I did the following:

cluster 1:
create database test1;
create table a(id serial8 primary key,b text);
create publication testpub1;
 alter publication testpub1 add table a;
insert into a(b) values ('1');

cluster2
create database test1;
create table a(id serial8 primary key,b text);
create subscription testsub2 publication testpub1 connection 
'host=localhost port=5440 dbname=test1';

NOTICE:  created replication slot "testsub2" on provider
NOTICE:  synchronized table states
CREATE SUBSCRIPTION

This resulted in
LOG:  logical decoding found consistent point at 0/15625E0
DETAIL:  There are no running transactions.
LOG:  exported logical decoding snapshot: "0494-1" with 0 
transaction IDs

LOG:  logical replication apply for subscription testsub2 started
LOG:  starting logical decoding for slot "testsub2"
DETAIL:  streaming transactions committing after 0/1562618, reading WAL 
from 0/15625E0

LOG:  logical decoding found consistent point at 0/15625E0
DETAIL:  There are no running transactions.
LOG:  logical replication sync for subscription testsub2, table a started
LOG:  logical decoding found consistent point at 0/1562640
DETAIL:  There are no running transactions.
LOG:  exported logical decoding snapshot: "0495-1" with 0 
transaction IDs

LOG:  logical replication synchronization worker finished processing


The initial sync completed okay, then I did

insert into a(b) values ('2');

but the second insert never replicated.

I had the following output

LOG:  terminating walsender process due to 

Re: [HACKERS] pglogical - logical replication contrib module

2016-02-02 Thread Steve Singer

On 01/26/2016 10:43 AM, Craig Ringer wrote:
On 23 January 2016 at 11:17, Steve Singer <st...@ssinger.info> wrote:





** Schema changes involving rewriting big tables

Sometimes you have a DDL change on a large table that will involve
a table rewrite, and the best way of deploying the change is to
make the DDL change
on a replica, then once it is finished promote the replica to the
origin in some controlled fashion.  This avoids having to lock the
table on the origin for hours.

pglogical seems to allow minor schema changes on the replica such
as changing a type, but it doesn't seem to allow a DO INSTEAD
trigger on the replica.  I don't think pglogical currently meets
this use case particularly well.


I'm not sure I fully understand that one.


Say you have a table A with column b,

and in the next version of your application you want to create a second 
table B that has column b.


create table B (b text);
insert into B select b from a;
alter table a drop column b;

but you want to do this on a replica because it is a very big table and 
you want to minimize downtime.


You could have a trigger on the replica that performed updates on B.b 
instead of A, except that triggers don't seem to get run on the replica.
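
A minimal sketch of the kind of redirect trigger I have in mind (the id key
is an assumption for illustration, the original example only gave B a single
text column, and the trigger would only help if the apply process actually
fires it, e.g. an ENABLE ALWAYS trigger for an apply worker running with
session_replication_role = replica):

create function redirect_b() returns trigger as $$
begin
    -- mirror the incoming change into B instead of relying on A.b
    insert into b (id, b) values (new.id, new.b)
        on conflict (id) do update set b = excluded.b;
    return new;
end;
$$ language plpgsql;

create trigger redirect_b before insert or update on a
    for each row execute procedure redirect_b();

alter table a enable always trigger redirect_b;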




 Craig Ringer http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Steve





Re: [HACKERS] pglogical - logical replication contrib module

2016-02-02 Thread Steve Singer
The following review has been posted through the commitfest application:
make installcheck-world:  tested, failed
Implements feature:   tested, failed
Spec compliant:   not tested
Documentation:            tested, failed

Here is some more review

+- `pglogical.replication_set_add_table(set_name name, table_name regclass, 
synchronize boolean)`
+  Adds a table to replication set.
+
+  Parameters:
+  - `set_name` - name of the existing replication set
+  - `table_name` - name or OID of the table to be added to the set
+  - `synchronize` - if true, the table data is synchronized on all subscribers
+which are subscribed to given replication set, default false
+

The argument to this function is actually named "relation", not "table_name", 
though we might want to update the function to name the argument table_name.

Also, we don't explain what 'synchronize' means. I first thought that a value of 
false would mean that existing data won't be copied but any new changes will be.
A value of false actually seems to mean that nothing will happen with the table 
until the synchronize function is manually called. We seem to be using the word 
'synchronize' in different senses in different places, which I find confusing (ie 
synchronize_data and synchronize_structure in create_subscription).
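
For example, a named-argument call based on the quoted docs and the naming
point above (set and table names are illustrative, and the exact parameter
names should be checked against the patch):

SELECT pglogical.replication_set_add_table(
    set_name    := 'default',
    relation    := 'public.my_table'::regclass,  -- the docs call this "table_name"
    synchronize := true);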





*** a/contrib/pglogical/pglogical_sync.c
--- b/contrib/pglogical/pglogical_sync.c
+ static void
+ dump_structure(PGLogicalSubscription *sub, const char *snapshot)
+ {
+   charpg_dump[MAXPGPATH];
+   uint32  version;
+   int res;
+   StringInfoData  command;
+
+   if (find_other_exec_version(my_exec_path, PGDUMP_BINARY, &version, pg_dump))
+   elog(ERROR, "pglogical subscriber init failed to find pg_dump relative 
to binary %s",
+my_exec_path);
+
+   if (version / 100 != PG_VERSION_NUM / 100)
+   elog(ERROR, "pglogical subscriber init found pg_dump with wrong major 
version %d.%d, expected %d.%d",
+version / 100 / 100, version / 100 % 100,
+PG_VERSION_NUM / 100 / 100, PG_VERSION_NUM / 100 % 100);
+
+   initStringInfo(&command);
+ #if PG_VERSION_NUM < 90500
+   appendStringInfo(&command, "%s --snapshot=\"%s\" -s -N %s -N pglogical_origin -F c -f \"/tmp/pglogical-%d.dump\" \"%s\"",
+ #else
+   appendStringInfo(&command, "%s --snapshot=\"%s\" -s -N %s -F c -f \"/tmp/pglogical-%d.dump\" \"%s\"",

1) I am not sure we can assume/require that the pg_dump binary be in the same 
location as the postgres binary.  I don't think we've ever required that 
client binaries (ie psql, pg_dump, pg_restore ...) be in the same directory as 
postgres.  pg_upgrade does require this, so maybe this isn't a problem in 
practice, but I thought I'd point it out. Ideally we wouldn't need to call an 
external program to get a schema dump, but turning pg_dump into a library is 
beyond the scope of this patch.


2) I don't think we can hard-code /tmp as the directory for the schema dump.  
I don't think it will work on most Windows systems, and even on a Unix system 
$TMPDIR might be set to something else.  Maybe writing this into pgsql_tmp 
would be a better choice.

Further down, in
pglogical_sync_subscription(PGLogicalSubscription *sub):
+   switch (status)
+   {
+   /* Already synced, nothing to do except cleanup. */
+   case SYNC_STATUS_READY:
+   MemoryContextDelete(myctx);
+   return;
+   /* We can recover from crashes during these. */
+   case SYNC_STATUS_INIT:
+   case SYNC_STATUS_CATCHUP:
+   break;
+   default:
+   elog(ERROR,
+"subscriber %s initialization failed during nonrecoverable 
step (%c), please try the setup again",
+sub->name, status);
+   break;
+   }

I think the default case needs to do something to unregister the background 
worker.  We already discussed trying to get the error message to a user in a 
better way; either way, there isn't any sense in this background worker being 
launched again if the error is nonrecoverable.


+
+   tables = copy_replication_sets_data(sub->origin_if->dsn,
+   sub->target_if->dsn,
+   snapshot,
+   sub->replication_sets);
+
+   /* Store info about all the synchronized tables. */
+   StartTransactionCommand();
+   foreach (lc, tables)

Shouldn't we be storing the info about the synchronized tables as part of the 
same transaction that does the sync?  


I'll keep going through the code as I have time.   I think it is appropriate 
to move this to the next CF since the CF is past the end date and the patch has 
received some review.   When you have an updated version of the patch, post it; 
don't wait until March.


Re: [HACKERS] pglogical - logical replication contrib module

2016-01-22 Thread Steve Singer
The following review has been posted through the commitfest application:
make installcheck-world:  not tested
Implements feature:   not tested
Spec compliant:   not tested
Documentation:            not tested


This reply covers a 10,000-foot review of the feature (some of my 
other replies to the thread cover specifics that came up in testing; 
a code-level review will come later).

1) Do we want logical replication in core/contrib

Ten years ago a popular feeling in the PostgreSQL project was that replication 
didn't belong in core because there were too many different styles.  People 
then went on to complain that there were too many replication projects to choose 
from and that they were hard to use and had lots of corner cases.   The 
evolution of WAL-based replication showed us how popular in-core replication 
is.  Users like being able to use in-core features, and our community process 
tends to produce better quality in-core solutions than external projects.

I am of the opinion that if we can come up with a solution that meets some 
common use cases then it would be good to have those features in core/contrib.  
At this stage I am not going to get into a discussion of a contrib extension 
versus something built in that is not an extension.   I don't think a single 
replication solution will ever meet all use-cases.   I feel that the extensible 
infrastructure we have so far built for logical replication means that people 
who want to develop solutions for use-cases not covered will be in a good 
position.  This doesn't mean we can't or shouldn't try to cover some use cases 
in core.


2) Does this patch provide a set of logical replication features that meet many 
popular use-cases

Below I will review some use-cases and try to assess how pglogical meets them.

 ** Streaming Postgresql Upgrade

pg_upgrade is great for many situations, but sometimes you don't want an 
in-place upgrade; you want a streaming upgrade.  Possibly because you don't 
want application downtime and instead just want to point your applications 
at the upgraded database server in a controlled manner.   Other times you 
might want to upgrade to a newer version of PG while keeping the 
option of rolling back to the older version if things go badly.

I think pglogical should be able to handle this use case pretty well (assuming 
the source version of PG is actually new enough to include pglogical). 
Support for replicating sequences would need to be added before this is as 
smooth, but once sequence support was added I think this would work well.
I also don't see any reason why you couldn't replicate from 9.7 -> 9.6, though, 
since the wire format is abstracted from the internal representation.  This is 
of course dependent on the application not doing anything that is inherently 
incompatible between the two versions.


** Query only replicas (with temp tables or additional indexes)

Sometimes you want a replica for long-running or heavy queries.  Requirements 
for temp tables, additional indexes, or maybe the effect on vacuum mean that 
our existing WAL-based replicas are unsuitable.

I think pglogical should be able to handle this use case pretty well, with the 
caveat that your replica is an asynchronous replica and will always lag 
the origin by some amount.

** Replicating a subset of tables into a different database

Sometimes you want to replicate a handful of tables from one database to another 
database.  Maybe the first database is the system of record for the data and 
the second database needs an up-to-date copy for querying.

Pglogical should meet this use case pretty well; it has flexible support for 
selecting which tables get replicated from which source.  Pglogical doesn't 
have any facilities to rename the tables between the origin and replica, but 
they could be added later.
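
A sketch of what this looks like with the functions exercised elsewhere in 
this review (table names and connection strings are invented for 
illustration):

-- on the origin: add the handful of tables to a replication set
SELECT pglogical.replication_set_add_table('default', 'public.orders', true);
SELECT pglogical.replication_set_add_table('default', 'public.customers', true);

-- on the querying database: subscribe to that origin
SELECT pglogical.create_subscription(
    subscription_name := 'ordersub',
    provider_dsn := 'host=origin port=5432 dbname=appdb');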

** Sharding

Systems that do application level sharding (or even sharding with a fdw) often 
have non-sharded tables that need to be available on all shards for relational 
integrity or joins.   Logical replication is one way to make sure that the 
replicated data gets to all the shards.  Sharding systems also sometimes want
to take the data from individual shards and replicate it to a consolidation 
server for reporting purposes.

Pglogical seems to meet this use case; I guess you would have a designated 
origin for the shared/global data that all shards would subscribe to, 
with a set containing the designated data.  For the consolidation use case you 
would have the consolidation server subscribe to all shards.

I am less clear about how someone would want DDL changes to work for these 
cases.  The DDL support in the patch is pretty limited so I am not going to 
think much now about how we would want DDL to work.


** Schema changes involving rewriting big tables

Sometimes you have a DDL change on a large table that will involve a table 
rewrite and the best way of deploying the change is to make the DDL 

Re: [HACKERS] pglogical - logical replication contrib module

2016-01-16 Thread Steve Singer

On 01/15/2016 12:07 PM, Petr Jelinek wrote:

That's bug, fixed.



Can you post an updated patch with whatever fixes you have so far made?


There are several statuses the table goes through, during the COPY 
it's in synchronizing status, so next logical step seemed to be 
synchronized. Maybe it should be renamed to 'replicating' instead as 
that's what it actually means (table has finished synchronization and 
is now replicating normally).




I agree 'replicating' is clearer





Re: [HACKERS] pglogical - logical replication contrib module

2016-01-10 Thread Steve Singer

On 01/09/2016 01:30 PM, Steve Singer wrote:

On 12/31/2015 06:34 PM, Petr Jelinek wrote:



I'm not really sure what to do to 'recover' my cluster at this point 
so I'll send this off and rebuild my cluster and start over.





I had a setup test1--->test2 (with 2 tables in the default set)

I then created a third database (all three hosted on the same PG cluster)

In the third database (test3)
test3=# create extension pglogical;
CREATE EXTENSION
test3=# select pglogical.create_node(node_name:='test3', 
dsn:='host=localhost dbname=test3 port=5436');

 create_node
-
  2001662995
(1 row)

test3=# select 
pglogical.create_subscription(subscription_name:='defaultsub',provider_dsn:='host=localhost 
dbname=test2 port=5436');

 create_subscription
-
  2974019075

It copied the schema over but not the data (if I use test2 as the 
provider_dsn then it does copy the data).


I then tried inserting a row into a table on test1.  Things crashed and 
after crash recovery I keep getting


2016-01-10 13:03:15 EST LOG:  database system is ready to accept connections
2016-01-10 13:03:15 EST LOG:  autovacuum launcher started
2016-01-10 13:03:15 EST LOG:  starting apply for subscription defaultsub
2016-01-10 13:03:15 EST LOG:  starting apply for subscription defaultsub
2016-01-10 13:03:15 EST test2LOG:  starting logical decoding for slot 
"pgl_test3_test2_defaultsub"
2016-01-10 13:03:15 EST test2DETAIL:  streaming transactions committing 
after 0/18292D8, reading WAL from 0/18292D8
2016-01-10 13:03:15 EST test2LOG:  logical decoding found consistent 
point at 0/18292D8
2016-01-10 13:03:15 EST test2DETAIL:  Logical decoding will begin using 
saved snapshot.
TRAP: FailedAssertion("!(IsTransactionState())", File: "catcache.c", 
Line: 1127)

2016-01-10 13:03:15 EST test2LOG:  unexpected EOF on standby connection
2016-01-10 13:03:15 EST LOG:  worker process: pglogical apply 
17016:2974019075 (PID 24746) was terminated by signal 6: Aborted

The stack trace is

#3  0x007b83af in SearchCatCache (cache=0xe27d18, v1=15015784,
v2=v2@entry=0, v3=v3@entry=0, v4=v4@entry=0) at catcache.c:1127
#4  0x007c503e in SearchSysCache (cacheId=cacheId@entry=47,
key1=, key2=key2@entry=0, key3=key3@entry=0,
key4=key4@entry=0) at syscache.c:981
#5  0x006996d4 in replorigin_by_name (
roname=0xe51f30 "pgl_test2_test1_defaultsub",
missing_ok=missing_ok@entry=0 '\000') at origin.c:216
#6  0x7fdb54a908d3 in handle_origin (s=0x7ffd873f6da0)
at pglogical_apply.c:235
#7  replication_handler (s=0x7ffd873f6da0) at pglogical_apply.c:1031
#8  apply_work (streamConn=streamConn@entry=0xe84fb0) at 
pglogical_apply.c:1309

#9  0x7fdb54a911cc in pglogical_apply_main (main_arg=<optimized out>)
at pglogical_apply.c:1691
#10 0x00674912 in StartBackgroundWorker () at bgworker.c:726
#11 0x0067f7e2 in do_start_bgworker (rw=0xe03890) at 
postmaster.c:5501
#12 maybe_start_bgworker () at postmaster.c:5676
#13 0x00680206 in sigusr1_handler 
(postgres_signal_arg=<optimized out>)
    at postmaster.c:4937
#14 <signal handler called>
#15 0x7fdb54fa2293 in __select_nocancel ()
    at ../sysdeps/unix/syscall-template.S:81
#16 0x00468285 in ServerLoop () at postmaster.c:1648
#17 0x0068161e in PostmasterMain (argc=argc@entry=3,
argv=argv@entry=0xddede0) at postmaster.c:1292
#18 0x0046979d in main (argc=3, argv=0xddede0) at main.c:223



I tried dropping the subscription and re-adding it.  I keep getting

2016-01-10 13:21:48 EST test1LOG:  logical decoding found consistent 
point at 0/1830080

2016-01-10 13:21:48 EST test1DETAIL:  There are no running transactions.
2016-01-10 13:21:48 EST test1LOG:  exported logical decoding snapshot: 
"04DE-1" with 0 transaction IDs

2016-01-10 13:21:48 EST test3ERROR:  relation "a" already exists
2016-01-10 13:21:48 EST test3STATEMENT:  CREATE TABLE a (
a integer NOT NULL,
b integer
);



pg_restore: [archiver (db)] Error while PROCESSING TOC:
pg_restore: [archiver (db)] Error from TOC entry 182; 1259 16700 TABLE a 
ssinger
pg_restore: [archiver (db)] could not execute query: ERROR: relation "a" 
already exists

Command was: CREATE TABLE a (
a integer NOT NULL,
b integer
);



2016-01-10 13:21:48 EST ERROR:  could not execute command 
"/usr/local/pgsql96gitlogical/bin/pg_restore --section="pre-data" 
--exit-on-error -1 -d "host=localhost dbname=test3 port=5436" 
"/tmp/pglogical-28079.dump""
2016-01-10 13:21:48 EST test1LOG:  unexpected EOF on client connection 
with an open transaction
2016-01-10 13:21:48 EST LOG:  worker process: pglogical apply 
17016:844915593 (PID 28079) exited with exit code 1
2016-01-10 13:21:48 EST ERROR:  subscriber defaultsub4 initialization 
failed during nonrecoverable step (s), please try the setup again


Which is 

Re: [HACKERS] pglogical - logical replication contrib module

2016-01-09 Thread Steve Singer

On 12/31/2015 06:34 PM, Petr Jelinek wrote:

Hi,

I'd like to submit the replication solution which is based on the 
pglogical_output [1] module (which is obviously needed for this to 
compile).


The pglogical contrib module provides extension which does the 
master-slave logical replication based on the logical decoding.


The basic documentation is in README.md, I didn't bother making sgml 
docs yet since I expect that there will be ongoing changes happening 
and it's easier for me to update the markdown docs than sgml. I will 
do the conversion once we start approaching committable state.
I am going to send my comments/issues out in batches as I find them 
instead of waiting till I look over everything.



I find this part of the documentation a bit unclear


+Once the provider node is setup, subscribers can be subscribed to it. 
First the

+subscriber node must be created:
+
+SELECT pglogical.create_node(
+node_name := 'subscriber1',
+dsn := 'host=thishost port=5432 dbname=db'
+);
+

My initial reading was that I should execute this on the provider node.  
Perhaps instead

-
Once the provider node is set up you can then create subscriber nodes.  
Create the subscriber nodes and then execute the following commands on 
each subscriber node

create extension pglogical

select pglogical.create_node(node_name:='subscriberX',dsn:='host=thishost 
dbname=db port=5432');


---

Also the documentation for create_subscription talks about

+  - `synchronize_structure` - specifies if to synchronize structure from
+provider to the subscriber, default true



I did the following

test2=# select pglogical.create_subscription(subscription_name:='default 
sub',provider_dsn:='host=localhost dbname=test1 port=5436');

 create_subscription
-
   247109879


Which then resulted in the following showing up in my  PG log

LOG:  worker process: pglogical apply 16542:247109879 (PID 4079) exited 
with exit code 1
ERROR:  replication slot name "pgl_test2_test1_default sub" contains 
invalid character
HINT:  Replication slot names may only contain lower case letters, 
numbers, and the underscore character.
FATAL:  could not send replication command "CREATE_REPLICATION_SLOT 
"pgl_test2_test1_default sub" LOGICAL pglogical_output": status 
PGRES_FATAL_ERROR: ERROR:  replication slot name 
"pgl_test2_test1_default sub" contains invalid character
HINT:  Replication slot names may only contain lower case letters, 
numbers, and the underscore character.



The create_subscription command should check if the subscription name is 
valid (meets the rules that will be applied against the slot command).
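
Until such a check exists, a crude way to test a proposed name against the 
slot-name rules from the HINT above is a quick regular-expression check 
(the prefix below just mirrors the pgl_<subscriber>_<provider>_<subscription> 
pattern visible in the error message):

select ('pgl_test2_test1_' || 'default sub') ~ '^[a-z0-9_]+$' AS name_ok;
-- returns false, because of the embedded space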


I wondered how I could fix my mistake.

The docs say

+- `pglogical.pglogical_drop_subscription(subscription_name name, 
ifexists bool)`

+  Disconnects the subscription and removes it from the catalog.
+

test2=# select pglogical.pglogical_drop_subscription('default sub', true);
ERROR:  function pglogical.pglogical_drop_subscription(unknown, boolean) 
does not exist



The command is actually called pglogical.drop_subscription; the docs 
should be fixed to show the actual command name.



I then wanted to add a second table ('b') to my database.

select pglogical.replication_set_add_table('default','public.b',true);
 replication_set_add_table
---
 t
(1 row)

In my pglog I then got

LOG:  starting sync of table public.b for subscriber defaultsub
ERROR:  replication slot name "pgl_test2_test1_defaultsub_public.b" 
contains invalid character
HINT:  Replication slot names may only contain lower case letters, 
numbers, and the underscore character.
FATAL:  could not send replication command "CREATE_REPLICATION_SLOT 
"pgl_test2_test1_defaultsub_public.b" LOGICAL pglogical_output": status 
PGRES_FATAL_ERROR: ERROR:  replication slot name 
"pgl_test2_test1_defaultsub_public.b" contains invalid character
HINT:  Replication slot names may only contain lower case letters, 
numbers, and the underscore character.



I then did

test1=# select pglogical.replication_set_remove_table('default','public.b');
 replication_set_remove_table
--
 t
(1 row)


but my log still kept repeating the error, so I tried connecting to the 
replica and did the same


test2=# select pglogical.replication_set_remove_table('default','public.b');
ERROR:  replication set mapping -303842815:16726 not found

Is there any way to recover from this situation?

The documentation says I can drop a replication set; maybe that will let 
replication continue.


+- `pglogical.delete_replication_set(set_name text)`
+  Removes the replication set.
+

select pglogical.delete_replication_set('default');
ERROR:  function pglogical.delete_replication_set(unknown) does not exist
LINE 1: select pglogical.delete_replication_set('default');
   ^
HINT:  No function matches the given name and argument types. You might 
need to add explicit type casts.

Re: [HACKERS] pglogical - logical replication contrib module

2016-01-08 Thread Steve Singer

On 12/31/2015 06:34 PM, Petr Jelinek wrote:

Hi,

I'd like to submit the replication solution which is based on the 
pglogical_output [1] module (which is obviously needed for this to 
compile).




Hi,

make check  gives me


for extra in ../../contrib/pglogical_output contrib/pglogical; do make 
-C '../..'/$extra DESTDIR='/usr/local/src/postgresql'/tmp_install 
install >>'/usr/local/src/postgresql'/tmp_install/log/install.log || 
exit; done
make[1]: *** ../../../../contrib/pglogical_output: No such file or 
directory.  Stop.

../../src/Makefile.global:325: recipe for target 'temp-install' failed
make: *** [temp-install] Error 2
ssinger@ssinger-laptop:/usr/local/src/postgresql/contrib/pglogical$

The attached patch fixes that, but it then creates the test database 
'contrib_regression', not 'regression'.
Changing pglogical.provider_dsn = 'contrib_regression' still leaves me 
with a lot of failures.



diff --git a/contrib/pglogical/Makefile b/contrib/pglogical/Makefile
new file mode 100644
index 1640f63..a4dab88
*** a/contrib/pglogical/Makefile
--- b/contrib/pglogical/Makefile
*** include $(top_srcdir)/contrib/contrib-gl
*** 27,34 
  # typical installcheck users do not have (e.g. buildfarm clients).
  @installcheck: ;
  
! EXTRA_INSTALL += $(top_srcdir)/contrib/pglogical_output
! EXTRA_REGRESS_OPTS += $(top_srcdir)/contrib/pglogical/regress-postgresql.conf
  
  override CPPFLAGS := $(CPPFLAGS) -I$(top_srcdir)/contrib/pglogical_output
  
--- 27,34 
  # typical installcheck users do not have (e.g. buildfarm clients).
  @installcheck: ;
  
! EXTRA_INSTALL += contrib/pglogical_output
! EXTRA_REGRESS_OPTS += --temp-config=regress-postgresql.conf
  
  override CPPFLAGS := $(CPPFLAGS) -I$(top_srcdir)/contrib/pglogical_output
  



Re: [HACKERS] snapshot too old, configured by time

2015-11-08 Thread Steve Singer

On 10/15/2015 05:47 PM, Kevin Grittner wrote:
All other issues raised by Álvaro and Steve have been addressed, 
except for this one, which I will argue against: 


I've been looking through the updated patch


In snapmgr.c


+ * XXX: If we can trust a read of an int64 value to be atomic, we can skip the
+ * spinlock here.
+ */
+int64
+GetOldSnapshotThresholdTimestamp(void)


Was your intent with the XXX for it to be a TODO to only acquire the lock 
on platforms without atomic 64-bit operations?


On a more general note:

I've tried various manual tests of this feature and it sometimes works 
as expected and sometimes doesn't.
I'm getting the feeling that how I expect it to work isn't quite in sync 
with how it does work.


I'd expect the following to be sufficient to trigger the error (a concrete 
sketch follows the list):

T1: Obtains a snapshot that can see some rows
T2: Waits 60 seconds and performs an update on those rows
T2: Performs a vacuum
T1: Waits 60 seconds, tries to select from the table.  The snapshot 
should be too old
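
In SQL terms, a minimal sketch (assuming a table t with an integer column 
val and old_snapshot_threshold = '1min' in postgresql.conf):

-- session T1
BEGIN ISOLATION LEVEL REPEATABLE READ;
SELECT count(*) FROM t;   -- the snapshot is taken here and can see the rows

-- session T2, more than a minute later
UPDATE t SET val = val + 1;
VACUUM t;

-- session T1, after waiting another minute
SELECT count(*) FROM t;   -- expected: SQLSTATE 72000, "snapshot too old"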



For example it seems that in test 002 the select_ok on conn1 following 
the vacuum, but right before the final sleep, is critical to the "snapshot 
too old" error showing up (i.e. if I remove that select_ok but leave in the 
sleep I don't get the error).


Is this intended and if so is there a better way we can explain how 
things work?



Also is 0 intended to be an acceptable value for old_snapshot_threshold 
and if so what should we expect to see then?






So if I understand correctly, every call to BufferGetPage needs to have
a TestForOldSnapshot immediately afterwards?  It seems easy to later
introduce places that fail to test for old snapshots.  What happens if
they do?  Does vacuum remove tuples anyway and then the query returns
wrong results?  That seems pretty dangerous.  Maybe the snapshot could
be an argument of BufferGetPage?

There are 486 occurences of BufferGetPage in the source code, and
this patch follows 36 of them with TestForOldSnapshot.  This only
needs to be done when a page is going to be used for a scan which
produces user-visible results.  That is, it is *not* needed for
positioning within indexes to add or vacuum away entries, for heap
inserts or deletions (assuming the row to be deleted has already
been found).  It seems wrong to modify about 450 BufferGetPage
references to add a NULL parameter; and if we do want to make that
noop change as insurance, it seems like it should be a separate
patch, since the substance of this patch would be buried under the
volume of that.

I will add this to the November CF.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company








Re: [HACKERS] snapshot too old, configured by time

2015-09-08 Thread Steve Singer

On 08/31/2015 10:07 AM, Kevin Grittner wrote:


Kevin,

I've started to do a review on this patch but I am a bit confused with 
some of what I am seeing.


The attached testcase fails if I replace the cursor in your test case with 
direct selects from the table.
I would have expected this to generate the "snapshot too old" error as 
well, but it doesn't.




#   Failed test 'expect "snapshot too old" error'
#   at t/002_snapshot_too_old_select.pl line 64.
#  got: ''
# expected: '72000'
# Looks like you failed 1 test of 9.
Dubious, test returned 1 (wstat 256, 0x100)
Failed 1/9 subtests

Am I misunderstanding something or is the patch not working as expected?





As discussed when the "proof of concept" patch was submitted during
9.5 development, here is a version intended to be considered for
commit to 9.6, with the following changes:

1. It is configured using time rather than number of transactions.
Not only was there unanimous agreement here that this was better,
but the EDB customer who had requested this feature and who had
been testing it independently made the same request.

2. The "proof of concept" patch only supported heap and btree
checking; this supports all index types.

3. Documentation has been added.

4. Tests have been added.  They are currently somewhat minimal,
since this is using a whole new technique for testing from any
existing committed tests -- I wanted to make sure that this
approach to testing was OK with everyone before expanding it.  If
it is, I assume we will want to move some of the more generic
portions to a .pm file to make it available for other tests.

Basically, this patch aims to limit bloat when there are snapshots
that are kept registered for prolonged periods.  The immediate
reason for this is a customer application that keeps read-only
cursors against fairly static data open for prolonged periods, and
automatically fields SQLSTATE 72000 to re-open them if necessary.
When used, it should also provide some protections against extreme
bloat from forgotten "idle in transaction" connections which are
left holding a snapshot.

Once a snapshot reaches the age threshold, it can be terminated if
reads data modified after the snapshot was built.  It is expected
that useful ranges will normally be somewhere from a few hours to a
few days.

By default old_snapshot_threshold is set to -1, which disables the
new behavior.

The customer has been testing a preliminary version of this
time-based patch for several weeks, and is happy with the results
-- it is preventing bloat for them and not generating "snapshot too
old" errors at unexpected times.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company






002_snapshot_too_old_select.pl
Description: Perl program



Re: [HACKERS] [PATCH] libpq: Allow specifying multiple host names to try to connect to

2015-07-08 Thread Steve Singer

On 04/19/2015 11:18 AM, Mikko Tiihonen wrote:


Hi,


I would like allow specifying multiple host names for libpq to try to 
connecting to. This is currently only supported if the host name 
resolves to multiple addresses. Having the support for it without 
complex dns setup would be much easier.



Example:

psql -h dbslave,dbmaster -p 5432 dbname

psql 'postgresql://dbslave,dbmaster:5432/dbname'


Here the idea is that without any added complexity of pgbouncer or 
similar tool I can get any libpq client to try connecting to multiple 
nodes until one answers. I have added the similar functionality to the 
jdbc driver few years ago.



Because libpq almost supported the feature already the patch is very 
simple. I just split the given host name and do a dns lookup on each 
separately, and link the results.



If you configure a port that does not exist you can see the libpq 
trying to connect to multiple hosts.



psql -h 127.0.0.2,127.0.0.3, -p 

psql: could not connect to server: Connection refused
Is the server running on host 127.0.0.2,127.0.0.3 (127.0.0.2) 
and accepting

TCP/IP connections on port ?
could not connect to server: Connection refused
Is the server running on host 127.0.0.2,127.0.0.3 (127.0.0.3) 
and accepting

TCP/IP connections on port ?

Further improvement would be to add a connection parameter to limit 
connection only to master (writable) or to slave (read only).






I like the idea of allowing multiple hosts to be specified where if it 
can't connect to the server libpq will try the next host.



psql -h dns-fail-name,localhost
psql: could not translate host name dns-fail-name,localhost to 
address: System error



If a name in the list doesn't resolve, it fails to try the next name. I 
think it should treat this the same as connection refused.


In the error messages, when it can't connect to a host, you print the 
entire host string, not the actual host being connected to, i.e.
Is the server running on host 127.0.0.2,127.0.0.3 (127.0.0.3) and 
accepting


It should print just the host that had the failed connection.

We also need to decide how we want this feature to behave if libpq can 
contact the postmaster but can't establish a connection (user/password 
failure, the database is in recovery mode, etc.): do we want to try the 
next host or stop?


My thinking is that the times you would actually use this feature are

1) To connect to a group of replica systems (either read-only streaming 
replicas or FDW proxies or BDR machines)
2) To connect to a cluster of pgbouncer or plproxy systems so the proxy 
isn't a single point of failure
3) To connect to a set of servers master1, standby-server1, 
standby-server2  where you would want it to try the next server in the list.


In all of these cases I think you would want to try the next machine in 
the list if you can't actually establish a usable connection.
I also don't think the patch is enough to be helpful with  case 3 since 
you don't actually want a connection to a standby-server unless that 
server has been promoted to the master.


Another concern I have is that the server needs to be listening on the 
same port for all hosts; this means that in a development environment 
we can't fully test this feature using just a single server.  I can't 
think of anything else we have in core that couldn't be tested on a 
single server (all the replication stuff works fine if you set up two 
separate clusters on different ports on one server).


You update the documentation just for psql, but your change affects any 
libpq application; if we go forward with this patch we should update the 
documentation for libpq as well.


This approach seems to work with the URL style of conninfo.

For example
 postgres://some-down-host.info,some-other-host.org:5435/test1

seems to work as expected, but I don't like that syntax; I would rather see
postgres://some-down-host.info:5435/test1,postgres://some-other-host.org:5435/test1

This would be a more invasive change but I think the syntax is more usable.






-Mikko









Re: [HACKERS] [PATCH] HINT: pg_hba.conf changed since last config reload

2014-12-20 Thread Steve Singer

On 12/19/2014 10:41 AM, Alex Shulgin wrote:

I don't think so.  The scenario this patch relies on assumes that the
DBA will remember to look in the log if something goes wrong, and in
your case there would be a message like the following:

WARNING:  pg_hba.conf not reloaded

So an extra hint about file timestamp is unneeded.


That makes sense to me.
I haven't found any new issues with this patch.
I think it is ready for a committer.




--
Alex








Re: [HACKERS] [PATCH] HINT: pg_hba.conf changed since last config reload

2014-12-19 Thread Steve Singer

On 12/15/2014 11:38 AM, Alex Shulgin wrote:

These are all valid concerns IMHO. Attached is the modified version of 
the original patch by Craig, addressing the handling of the new 
hint_log error data field and removing the client-side HINT. I'm also 
moving this to the current CF. -- Alex






The updated patch removes the client message. I feel this addresses 
Peter's concern.  The updated patch also addresses the missing free I 
mentioned in my original review.


The patch applies cleanly to head.

One thing I noticed while testing is that if you do the following

1. Edit pg_hba.conf to allow a connection from somewhere
2. Attempt to connect; you get an error + the hint in the server log
3. Make another change to pg_hba.conf and introduce an error in the file
4. Attempt to reload the config files; the reload fails because of the 
above error in pg_hba.conf
5. Attempt to connect; you still can't connect since we still have the 
original pg_hba.conf loaded


You don't get the warning in step 5 since we update PgReloadTime even if 
the reload didn't work.


Is this enough of a concern to justify the extra complexity of tracking 
the reload times of pg_hba.conf and pg_ident.conf directly?
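
(As an aside, a DBA can already approximate this check from SQL; a sketch, 
assuming superuser access for pg_stat_file() and pg_hba.conf living in the 
data directory:

select (pg_stat_file('pg_hba.conf')).modification > pg_conf_load_time()
       AS hba_changed_since_load;

but that only says the file is newer than the last config load, not whether 
the last reload actually succeeded.)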











Re: [HACKERS] tracking commit timestamps

2014-11-19 Thread Steve Singer

On 11/19/2014 08:22 AM, Alvaro Herrera wrote:


I think we're overblowing the pg_upgrade issue.  Surely we don't need to
preserve commit_ts data when upgrading across major versions; and
pg_upgrade is perfectly prepared to remove old data when upgrading
(actually it just doesn't copy it; consider pg_subtrans or pg_serial,
for instance.)  If we need to change binary representation in a future
major release, we can do so without any trouble.



That sounds reasonable. I am okay with Petr removing the LSN portion of 
this patch.






Re: [HACKERS] logical decoding - reading a user catalog table

2014-11-17 Thread Steve Singer

On 11/16/2014 04:49 PM, Steve Singer wrote:



I installed things following the above steps on a different system 
than my usual development laptop and I have been unable to reproduce 
the error so for (on that system).  But I am still able to reproduce 
it on occasion on my normal development laptop.






After continuously running the test I was eventually able to reproduce 
the error on the other system as well.




Greetings,

Andres Freund











Re: [HACKERS] logical decoding - reading a user catalog table

2014-11-17 Thread Steve Singer

On 11/17/2014 10:37 AM, Andres Freund wrote:

On 2014-11-13 22:23:02 -0500, Steve Singer wrote:


Also since updating (to 2c267e47afa4f9a7c) I've seen a assertion failure in
a normal client connection, not the walsender

#3  0x006b4978 in GetSerializableTransactionSnapshotInt (
 snapshot=snapshot@entry=0xbfa8a0 CurrentSnapshotData,
 sourcexid=sourcexid@entry=0) at predicate.c:1738
#4  0x006b66c3 in GetSafeSnapshot (origSnapshot=optimized out)
 at predicate.c:1517
#5  GetSerializableTransactionSnapshot (
 snapshot=0xbfa8a0 CurrentSnapshotData) at predicate.c:1598
#6  0x007d16dd in GetTransactionSnapshot () at snapmgr.c:200
#7  0x006c0e35 in exec_simple_query (
 query_string=0x1fd01b8 select ev_origin, ev_seqno, ev_timestamp,
ev_snapshot, \pg_catalog\.txid_snapshot_xmin(ev_snapshot),
\pg_catalog\.txid_snapshot_xmax(ev_snapshot),
coalesce(ev_provider_xid,\...)
 at postgres.c:959
#8  PostgresMain (argc=optimized out, argv=argv@entry=0x1f5ab50,


I have no idea if this has anything to do with your recent changes or some
other change. I haven't so far been able to replicate that since the first
time I saw it.
That crash is decidedly odd. Any chance you still have the full
backtrace around?


Yes I still have the core file



This is in the SSI code... I'm not immediately seeing how this could be
related to logical decoding. It can't even be a imported snapshot,
because the exported snapshot is marked repeatable read.

Are you actually using serializable transactions? If so, how and why?


Yes, the test client that performs the 'simulated workload' does some 
serializable transactions.  It connects as a normal client via JDBC and 
does a prepareStatement("SET TRANSACTION ISOLATION LEVEL SERIALIZABLE") 
and then does some DML. Why? Because it seemed like a good thing to 
include in the test suite.


You're right, this might have nothing to do with logical decoding.  I 
haven't been able to reproduce it again either; I can't even say 
if this problem was introduced to 9.4 in the past month or if it has 
been around much longer and I just haven't hit it before.







Greetings,

Andres Freund







Re: [HACKERS] logical decoding - reading a user catalog table

2014-11-17 Thread Steve Singer

On 11/17/2014 11:34 AM, Andres Freund wrote:

Hi,

Kevin: CCed you, because it doesn't really look like a logical decoding
related issue.

On 2014-11-17 11:25:40 -0500, Steve Singer wrote:

On 11/17/2014 10:37 AM, Andres Freund wrote:

On 2014-11-13 22:23:02 -0500, Steve Singer wrote:


Also since updating (to 2c267e47afa4f9a7c) I've seen a assertion failure in
a normal client connection, not the walsender

#3  0x006b4978 in GetSerializableTransactionSnapshotInt (
 snapshot=snapshot@entry=0xbfa8a0 CurrentSnapshotData,
 sourcexid=sourcexid@entry=0) at predicate.c:1738
#4  0x006b66c3 in GetSafeSnapshot (origSnapshot=optimized out)
 at predicate.c:1517
#5  GetSerializableTransactionSnapshot (
 snapshot=0xbfa8a0 CurrentSnapshotData) at predicate.c:1598
#6  0x007d16dd in GetTransactionSnapshot () at snapmgr.c:200
#7  0x006c0e35 in exec_simple_query (
 query_string=0x1fd01b8 select ev_origin, ev_seqno, ev_timestamp,
ev_snapshot, \pg_catalog\.txid_snapshot_xmin(ev_snapshot),
\pg_catalog\.txid_snapshot_xmax(ev_snapshot),
coalesce(ev_provider_xid,\...)
 at postgres.c:959
#8  PostgresMain (argc=optimized out, argv=argv@entry=0x1f5ab50,


I have no idea if this has anything to do with your recent changes or some
other change. I haven't so far been able to replicate that since the first
time I saw it.
That crash is decidedly odd. Any chance you still have the full
backtrace around?

Yes I still have the core file

Cool, could you show the full thing? Or did you just snip it because
it's just the Assert/ExceptionalCondition thing?


I just snipped the exception stuff. Here is the full thing

(gdb) backtrace
#0  0x7f6fb22e8295 in __GI_raise (sig=sig@entry=6)
    at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
#1  0x7f6fb22eb438 in __GI_abort () at abort.c:90
#2  0x007a08e7 in ExceptionalCondition (
    conditionName=conditionName@entry=0x918e88 
    "!(TransactionIdFollows(snapshot->xmin, PredXact->SxactGlobalXmin))",
    errorType=errorType@entry=0x7da390 "FailedAssertion",
    fileName=fileName@entry=0x9182c3 "predicate.c",
    lineNumber=lineNumber@entry=1738) at assert.c:54
#3  0x006b4978 in GetSerializableTransactionSnapshotInt (
    snapshot=snapshot@entry=0xbfa8a0 <CurrentSnapshotData>,
    sourcexid=sourcexid@entry=0) at predicate.c:1738
#4  0x006b66c3 in GetSafeSnapshot (origSnapshot=<optimized out>)
    at predicate.c:1517
#5  GetSerializableTransactionSnapshot (
    snapshot=0xbfa8a0 <CurrentSnapshotData>) at predicate.c:1598
#6  0x007d16dd in GetTransactionSnapshot () at snapmgr.c:200
#7  0x006c0e35 in exec_simple_query (
    query_string=0x1fd01b8 "select ev_origin, ev_seqno, ev_timestamp,
    ev_snapshot, \"pg_catalog\".txid_snapshot_xmin(ev_snapshot),
    \"pg_catalog\".txid_snapshot_xmax(ev_snapshot),
    coalesce(ev_provider_xid,\"...)
    at postgres.c:959
#8  PostgresMain (argc=<optimized out>, argv=argv@entry=0x1f5ab50,
    dbname=0x1f5ab30 "test2", username=<optimized out>) at postgres.c:4016
#9  0x0046a08e in BackendRun (port=0x1f83010) at postmaster.c:4123
#10 BackendStartup (port=0x1f83010) at postmaster.c:3797
#11 ServerLoop () at postmaster.c:1576
#12 0x00665eae in PostmasterMain (argc=argc@entry=3,
    argv=argv@entry=0x1f59d00) at postmaster.c:1223
#13 0x0046ab58 in main (argc=3, argv=0x1f59d00) at main.c:227




Could you print *snapshot in frame #3?


(gdb) p *snapshot
$1 = {satisfies = 0x7d0330 HeapTupleSatisfiesMVCC, xmin = 412635,
  xmax = 412637, xip = 0x1f86e40, xcnt = 1, subxcnt = 0, subxip = 
0x1fd2190,
  suboverflowed = 0 '\000', takenDuringRecovery = 0 '\000', copied = 0 
'\000',

  curcid = 0, active_count = 0, regd_count = 0}
(gdb)



This is in the SSI code... I'm not immediately seeing how this could be
related to logical decoding. It can't even be a imported snapshot,
because the exported snapshot is marked repeatable read.

Are you actually using serializable transactions? If so, how and why?

Yes, the test client that performs the 'simulated workload' does some
serializable transactions.  It connects as a normal client via JDBC and does
a prepareStatement(SET TRANSACTION ISOLATION LEVEL SERIALIZABLE) and then
does some DML. Why? because it seemed like a good thing to include in the
test suite.

Yes, it's a good reason as the above backtrace proves ;). I'm just
trying to understand uner which circumstances it happens.

The above backtrace looks like it can only be triggered if you do a
BEGIN TRANSACTION SERIALIZABLE DEFERRABLE READ ONLY; Is that something
you do?


The slony remote listener connection does this when it selects from 
sl_event.  We switched to this shortly after 9.1 came out.
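
For reference, the pattern on that connection is roughly the following 
(the column list is taken from the query string in the backtrace above; 
the exact WHERE clause and values here are just illustrative):

BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE READ ONLY DEFERRABLE;
SELECT ev_origin, ev_seqno, ev_timestamp, ev_snapshot
  FROM sl_event
 WHERE ev_origin = 1 AND ev_seqno > 12345;  -- placeholder values
COMMIT;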






Re: [HACKERS] logical decoding - reading a user catalog table

2014-11-16 Thread Steve Singer

On 11/13/2014 02:44 PM, Andres Freund wrote:

Hi Steve,



If it still happens, could you send me instructions of how to reproduce
the problem after cloning the necessary source repositories? It's quite
hard to validate a possible fix otherwise.


1. Install PG 9.4

2. Perform an initdb and set in postgresql.conf:
   max_connections = 200
   wal_level = logical
   max_wal_senders = 50
   wal_keep_segments = 100
   wal_sender_timeout = 600s
   max_replication_slots = 120



Create a user slon with superuser
Create a user test
(set passwords for them; you'll use them later)

Configure pg_hba.conf to allow slon to connect as a replication user


3. Download slony from github (https://github.com/ssinger/slony1-engine.git)
   checkout  the branch logical_remote_helper
./configure --with-pgconfigdir=/path_to_your_pgcbindir
make
make install

4. Download and compile clustertest from 
(https://github.com/clustertest/clustertest-framework).

5. In the slony source tree cd to the clustertest directory
6. cp conf/disorder.properties.sample to conf/disorder.properties
Edit the file to have the correct paths, ports, users, passwords.

7 cp conf/java.properties.sample to conf/java.properties edit the file 
to point at the JAR you downloaded earlier.


I run with all 5 databases on the same PG cluster. Performance will be 
much better with 5 different clusters, but I recommend trying to emulate 
my configuration as much as possible to replicate this


To run the tests then do
./run_all_disorder_tests.sh


At the moment (commit df5acfd1a3) it is configured to just run the Failover 
node test cases where I am seeing this, not the entire suite.


It typically takes between 30 minutes to an hour to run though.

I installed things following the above steps on a different system than 
my usual development laptop and I have been unable to reproduce the 
error so far (on that system).  But I am still able to reproduce it on 
occasion on my normal development laptop.





Greetings,

Andres Freund







Re: [HACKERS] tracking commit timestamps

2014-11-14 Thread Steve Singer

On 11/14/2014 08:21 PM, Simon Riggs wrote:

The requested information is already available, as discussed. Logical
decoding adds commit ordering for *exactly* the purpose of using it
for replication, available to all solutions. This often requested
feature has now been added and doesn't need to be added twice.

So what we are discussing is adding a completely superfluous piece of
information.

Not including the LSN info does nothing to trigger based replication,
which will no doubt live on happily for many years. But adding LSN
will slow down logical replication, for no purpose at all.



Simon,
The use cases I'm talking about aren't really replication related. Often 
I have come across systems that want to do something such as 'select * 
from orders where X > the_last_row_I_saw order by X' and then do further 
processing on the orders.


This is kind of awkward to do today because you don't have a good 
candidate for 'X' to order on.   Using either a sequence or an insert-row 
timestamp doesn't work well because a transaction with a lower value for 
X might end up committing after a higher value has already appeared in a 
query result.
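
A sketch of the failure mode (table and column names invented): a poller 
that remembers the largest id it has seen can silently skip rows, because 
a transaction that obtained a lower id may commit after a higher id has 
already been read.

-- orders(id bigserial, ...); the poller repeatedly runs:
SELECT * FROM orders WHERE id > :last_seen_id ORDER BY id;

-- timeline that loses a row:
--   T1: INSERT ... RETURNING id  => 100   (does not commit yet)
--   T2: INSERT ... RETURNING id  => 101, COMMIT
--   poller runs, sees id 101, sets last_seen_id = 101
--   T1: COMMIT   -- id 100 becomes visible but is never selected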


Yes, you could set up a logical WAL slot and listen on the stream of 
inserts into your order table, but that's a lot of administration overhead 
compared to just issuing an SQL query for what really is a query-type 
operation.


Using the commit timestamp for my X sounded very tempting but could 
allow duplicates.


One could argue that this patch is about replication features, and that 
providing commit ordering for query purposes should be a separate patch 
on top of this infrastructure. I see merit to smaller, more 
focused patches, but that requires leaving the door open to easily 
extending things later.


It could also be that I'm the only one who wants to order and filter 
queries in this manner (but that would surprise me).  If the commit LSN 
has limited appeal and we decide we don't want it at all, then we 
shouldn't add it.  I've seen this type of requirement in a number of 
different systems at a number of different companies.  I've generally 
seen it dealt with either by selecting rows behind the last now() 
timestamp seen and then filtering out already-processed rows, or by 
tracking the 'processed' state of each row individually (i.e. performing 
an update on each row once it's been processed), which performs poorly.


Steve









Re: [HACKERS] logical decoding - reading a user catalog table

2014-11-13 Thread Steve Singer

On 11/13/2014 02:44 PM, Andres Freund wrote:

H



I've pushed a fix for a bug that could possibly also cause
this. Although it'd be odd that it always hits the user catalog
table. Except if your tests mostly modify the slony tables, but do not
do much DDL otherwise?


The test I was running doesn't do DDL, so only the user catalog tables 
would have changes being tracked.


I still sometimes get the error. When I get some time on the weekend I'll 
work on some detailed instructions on how to grab and set up the various 
components to replicate this test.


Also since updating (to 2c267e47afa4f9a7c) I've seen an assertion failure 
in a normal client connection, not the walsender.


#3  0x006b4978 in GetSerializableTransactionSnapshotInt (
    snapshot=snapshot@entry=0xbfa8a0 <CurrentSnapshotData>,
    sourcexid=sourcexid@entry=0) at predicate.c:1738
#4  0x006b66c3 in GetSafeSnapshot (origSnapshot=<optimized out>)
    at predicate.c:1517
#5  GetSerializableTransactionSnapshot (
    snapshot=0xbfa8a0 <CurrentSnapshotData>) at predicate.c:1598
#6  0x007d16dd in GetTransactionSnapshot () at snapmgr.c:200
#7  0x006c0e35 in exec_simple_query (
    query_string=0x1fd01b8 "select ev_origin, ev_seqno, ev_timestamp,
    ev_snapshot, \"pg_catalog\".txid_snapshot_xmin(ev_snapshot),
    \"pg_catalog\".txid_snapshot_xmax(ev_snapshot),
    coalesce(ev_provider_xid,\"...)
    at postgres.c:959
#8  PostgresMain (argc=<optimized out>, argv=argv@entry=0x1f5ab50,


I have no idea if this has anything to do with your recent changes or 
some other change. I haven't so far been able to replicate that since 
the first time I saw it.





I'll send you tar of the data directory off list with things in this state.


Do you have a testcase that would allow me to easily reproduce the
problem?

I don't have a isolated test case that does this.  The test that I'm hitting
this with does lots of stuff and doesn't even always hit this.

If it still happens, could you send me instructions of how to reproduce
the problem after cloning the necessary source repositories? It's quite
hard to validate a possible fix otherwise.

Greetings,

Andres Freund







Re: [HACKERS] tracking commit timestamps

2014-11-10 Thread Steve Singer

On 11/10/2014 08:39 AM, Petr Jelinek wrote:

On 09/11/14 17:57, Steve Singer wrote:

On 11/07/2014 07:07 PM, Petr Jelinek wrote:

The list of what is useful might be long, but we can't have everything
there as there are space constraints, and LSN is another 8 bytes and I
still want to have some bytes for storing the origin or whatever you
want to call it there, as that's the one I personally have biggest
use-case for.
So this would be ~24bytes per txid already, hmm I wonder if we can
pull some tricks to lower that a bit.



The reason why Jim and myself are asking for the LSN and not just the
timestamp is that I want to be able to order the transactions. Jim
pointed out earlier in the thread that just ordering on timestamp allows
for multiple transactions with the same timestamp.

Maybe we don't need the entire LSN to solve that.  If you already have
the commit timestamp maybe you only need another byte or two of
granularity to order transactions that are within the same microsecond.



Hmm maybe just one part of LSN, but I don't really like that either, 
if we want to store LSN we should probably store it as is as somebody 
might want to map it to txid for other reasons.


I did the calculation above wrong btw, it's actually 20 bytes not 24 
bytes per record, I am inclined to just say we can live with that.


Since we agreed that the (B) case is not really feasible and we are 
doing the (C), I also wonder if extradata should be renamed to nodeid 
(even if it's not used at this point as nodeid). And then there is 
question about the size of it, since the nodeid itself can live with 2 
bytes probably (64k of nodes ought to be enough for everybody ;) ).
Or leave the extradata as is but use as reserved space for future use 
and not expose it at this time on SQL level at all?


I guess Andres could answer what suits him better here.



I am happy with renaming extradata to nodeid and not exposing it at this 
time.


If we feel that commit order (i.e. LSN or something equivalent) is really 
a different patch/feature than commit timestamps, then I am okay with that 
also, but we should make sure to warn users of the commit timestamp in 
the documentation that two transactions might have the same timestamp 
and that the commit order might not be the same as ordering by the 
commit timestamp.








Re: [HACKERS] tracking commit timestamps

2014-11-09 Thread Steve Singer

On 11/07/2014 07:07 PM, Petr Jelinek wrote:


The list of what is useful might be long, but we can't have everything 
there as there are space constraints, and LSN is another 8 bytes and I 
still want to have some bytes for storing the origin or whatever you 
want to call it there, as that's the one I personally have biggest 
use-case for.
So this would be ~24bytes per txid already, hmm I wonder if we can 
pull some tricks to lower that a bit.




The reason why Jim and I are asking for the LSN and not just the 
timestamp is that I want to be able to order the transactions. Jim 
pointed out earlier in the thread that just ordering on timestamp allows 
for multiple transactions with the same timestamp.


Maybe we don't need the entire LSN to solve that.  If you already have 
the commit timestamp maybe you only need another byte or two of 
granularity to order transactions that are within the same microsecond.






Re: [HACKERS] tracking commit timestamps

2014-11-05 Thread Steve Singer

On 11/05/2014 11:23 AM, Jim Nasby wrote:



Except that commit time is not guaranteed unique *even on a single 
system*. That's my whole point. If we're going to bother with all the 
commit time machinery it seems really silly to provide a way to 
uniquely order every commit.


Clearly trying to uniquely order commits across multiple systems is a 
far larger problem, and I'm not suggesting we attempt that. But for a 
single system AIUI all we need to do is expose the LSN of each commit 
record and that will give you the exact and unique order in which 
transactions committed.


This isn't a hypothetical feature either; if we had this, logical 
replication systems wouldn't have to try and fake this via batches. 
You could actually recreate exactly what data was visible at what time 
to all transactions, not just repeatable read ones (as long as you 
kept snapshot data as well, which isn't hard).


As for how much data to keep, if you have a process that's doing 
something to record this information permanently all it needs to do is 
keep an old enough snapshot around. That's not that hard to do, even 
from user space.


+1 for this.

It isn't just 'replication' systems that have a need for getting the 
commit order of transactions on a single system.  I have an application 
(not slony) where we want to query a table but order the output based on 
the transaction commit order of when the insert into the table was done 
(think of a queue). I'm not replicating the output but passing the data 
to other applications for further processing.  If I just had the commit 
timestamp I would need to put in some other condition to break ties in a 
consistent way.  I think being able to get an ordering by commit LSN is 
what I really want in this case, not the timestamp.


Logical decoding is one solution to this (that I was considering) but 
being able to do something like

select * FROM event_log order by commit_id would be a lot simpler.
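
As a sketch of the kind of query I mean, assuming the commit timestamp is 
exposed through a function like pg_xact_commit_timestamp(xid) (name assumed 
here for illustration):

-- timestamp-based approximation; ties between transactions are possible,
-- and xmin only works while rows are not updated or frozen
SELECT e.*
  FROM event_log e
 WHERE pg_xact_commit_timestamp(e.xmin) > :last_seen_commit_time
 ORDER BY pg_xact_commit_timestamp(e.xmin);

-- what I am really after, conceptually (commit_lsn_of is hypothetical):
-- SELECT * FROM event_log ORDER BY commit_lsn_of(xmin);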








Re: [HACKERS] tracking commit timestamps

2014-11-05 Thread Steve Singer

On 11/05/2014 05:43 PM, Andres Freund wrote:

On 2014-11-05 17:17:05 -0500, Steve Singer wrote:
Imo that's essentially a different feature. What you essentially would
need here is a 'commit sequence number' - but no timestamps. And
probably to be useful that number has to be 8 bytes in itself.


I think this gets to the heart of some of the differing views people 
have expressed on this patch


Is this patch supposed to:

A)  Add commit timestamp tracking but nothing more

B) Add infrastructure to store commit timestamps and provide a facility 
for storing additional bits of data extensions might want to be 
associated with the commit


C).  Add commit timestamps and node identifiers to commits

If the answer is (A) then I can see why some people are objecting to 
adding extradata.  If the answer is (B) then it's fair to ask how well 
this patch handles the various types of things people might want to 
attach to the commit record (such as the LSN).   I think the problem is 
that once you start providing a facility extensions can use to store 
data along with the commit record, then being restricted to 4 or 8 bytes 
is very limiting.  It also doesn't allow you to load two extensions at 
once on a system.   You wouldn't be able to have both the 
'steve_commit_order' extension and BDR installed at the same time.  I 
don't think this patch does a very good job at (B), but it wasn't 
intended to.


If what we are really doing is (C), and we are just calling the space 
'extradata' until we get the logical identifier stuff in, and then we are 
going to rename extradata to nodeid, then we should say so.  If we are 
really concerned about the pg_upgrade impact of expanding the record later 
then maybe we should add 4 bytes of padding to the CommitTimeStampEntry 
now but leave manipulating the node id until later.


Steve





Greetings,

Andres Freund







Re: [HACKERS] snapshot too large error when initializing logical replication (9.4)

2014-10-30 Thread Steve Singer

On 10/28/2014 01:27 PM, Andres Freund wrote:

Hi,

On 2014-10-25 18:09:36 -0400, Steve Singer wrote:

I sometimes get the error snapshot too large from my logical replication
walsender process when in response to a CREATE_REPLICATION_SLOT.

Yes. That's possible if 'too much' was going on until a consistent point
was reached.  I think we can just use a much larger size for the array
if necessary.

I've attached patch for this. Could you try whether that helps? I don't
have a testcase handy that reproduces the problem.


This patch seems to fix things.
I've done numerous runs of the test I was doing before with your 
patch applied and don't seem to be having this issue anymore.




This is in SnapBuildExportSnapshot in snapbuild.c

newxcnt is 212 at that point

I have max_connections = 200

procArray->maxProcs = 212

Should we be testing
newxcnt > GetMaxSnapshotXidCount()

instead of
newxcnt >= GetMaxSnapshotXidCount()

It actually looks correct to me new - newxcnt is used as an offset into
an array of size GetMaxSnapshotXidCount().

Greetings,

Andres Freund









Re: [HACKERS] logical decoding - reading a user catalog table

2014-10-28 Thread Steve Singer

On 10/28/2014 01:31 PM, Andres Freund wrote:

On 2014-10-25 18:18:07 -0400, Steve Singer wrote:

My logical decoding plugin is occasionally getting this  error

could not resolve cmin/cmax of catalog tuple

I get this when my output plugin is trying to read one of the user defined
catalog tables (user_catalog_table=true)

Hm. That should obviously not happen.

Could you describe how that table is modified? Does that bug happen
initially, or only after a while?


It doesn't happen right away; in this case it was maybe 4 minutes after 
creating the slot.
The error also doesn't always happen when I run this test workload, 
but it is reproducible with some trying.
I don't do anything special to that table; it gets created and then I do 
inserts on it.  I don't do an alter table or anything fancy like that.  
I was running the slony failover test (all nodes under the same 
postmaster) which involves the occasional dropping and recreating of 
databases along with normal query load + replication.


I'll send you tar of the data directory off list with things in this state.


Do you have a testcase that would allow me to easily reproduce the
problem?


I don't have an isolated test case that does this.  The test that I'm 
hitting this with does lots of stuff and doesn't even always hit this.



I am not sure if this is a bug in the time-travel support in the logical
decoding support of if I'm just using it wrong (ie not getting a sufficient
lock on the relation or something).

I don't know yet...


This is the interesting part of the stack trace

#4  0x0091bbc8 in HeapTupleSatisfiesHistoricMVCC
(htup=0x7fffcf42a900,
 snapshot=0x7f786ffe92d8, buffer=10568) at tqual.c:1631
#5  0x004aedf3 in heapgetpage (scan=0x28d7080, page=0) at
heapam.c:399
#6  0x004b0182 in heapgettup_pagemode (scan=0x28d7080,
 dir=ForwardScanDirection, nkeys=0, key=0x0) at heapam.c:747
#7  0x004b1ba6 in heap_getnext (scan=0x28d7080,
 direction=ForwardScanDirection) at heapam.c:1475
#8  0x7f787002dbfb in lookupSlonyInfo (tableOid=91754, ctx=0x2826118,
 origin_id=0x7fffcf42ab8c, table_id=0x7fffcf42ab88,
set_id=0x7fffcf42ab84)
 at slony_logical.c:663
#9  0x7f787002b7a3 in pg_decode_change (ctx=0x2826118, txn=0x28cbec0,
 relation=0x7f787a3446a8, change=0x7f786ffe3268) at slony_logical.c:237
#10 0x007497d4 in change_cb_wrapper (cache=0x28cbda8, txn=0x28cbec0,
 relation=0x7f787a3446a8, change=0x7f786ffe3268) at logical.c:704



Here is what the code in lookupSlonyInfo is doing
--

   sltable_oid = get_relname_relid("sl_table",slony_namespace);

   sltable_rel = relation_open(sltable_oid,AccessShareLock);
   tupdesc=RelationGetDescr(sltable_rel);
   scandesc=heap_beginscan(sltable_rel,
GetCatalogSnapshot(sltable_oid),0,NULL);
   reloid_attnum = get_attnum(sltable_oid,"tab_reloid");

   if(reloid_attnum == InvalidAttrNumber)
  elog(ERROR,"sl_table does not have a tab_reloid column");
   set_attnum = get_attnum(sltable_oid,"tab_set");

   if(set_attnum == InvalidAttrNumber)
  elog(ERROR,"sl_table does not have a tab_set column");
   tableid_attnum = get_attnum(sltable_oid, "tab_id");

   if(tableid_attnum == InvalidAttrNumber)
  elog(ERROR,"sl_table does not have a tab_id column");

   while( (tuple = heap_getnext(scandesc,ForwardScanDirection) ))

(Except missing spaces ;)) I don't see anything obviously wrong with
this.

Greetings,

Andres Freund







[HACKERS] snapshot too large error when initializing logical replication (9.4)

2014-10-25 Thread Steve Singer
I sometimes get the error "snapshot too large" from my logical 
replication walsender process in response to a CREATE_REPLICATION_SLOT.


This is in SnapBuildExportSnapshot in snapbuild.c

newxcnt is 212 at that point

I have max_connections = 200

procArray->maxProcs = 212

Should we be testing
newxcnt > GetMaxSnapshotXidCount()

instead of
newxcnt >= GetMaxSnapshotXidCount()








[HACKERS] logical decoding - reading a user catalog table

2014-10-25 Thread Steve Singer

My logical decoding plugin is occasionally getting this error

"could not resolve cmin/cmax of catalog tuple"

I get this when my output plugin is trying to read one of the 
user-defined catalog tables (user_catalog_table=true).



I am not sure if this is a bug in the time-travel support in the logical 
decoding code or if I'm just using it wrong (i.e. not getting a 
sufficient lock on the relation or something).




This is the interesting part of the stack trace

#4  0x0091bbc8 in HeapTupleSatisfiesHistoricMVCC 
(htup=0x7fffcf42a900,

snapshot=0x7f786ffe92d8, buffer=10568) at tqual.c:1631
#5  0x004aedf3 in heapgetpage (scan=0x28d7080, page=0) at 
heapam.c:399

#6  0x004b0182 in heapgettup_pagemode (scan=0x28d7080,
dir=ForwardScanDirection, nkeys=0, key=0x0) at heapam.c:747
#7  0x004b1ba6 in heap_getnext (scan=0x28d7080,
direction=ForwardScanDirection) at heapam.c:1475
#8  0x7f787002dbfb in lookupSlonyInfo (tableOid=91754, ctx=0x2826118,
origin_id=0x7fffcf42ab8c, table_id=0x7fffcf42ab88, 
set_id=0x7fffcf42ab84)

at slony_logical.c:663
#9  0x7f787002b7a3 in pg_decode_change (ctx=0x2826118, txn=0x28cbec0,
relation=0x7f787a3446a8, change=0x7f786ffe3268) at slony_logical.c:237
#10 0x007497d4 in change_cb_wrapper (cache=0x28cbda8, 
txn=0x28cbec0,

relation=0x7f787a3446a8, change=0x7f786ffe3268) at logical.c:704



Here is what the code in lookupSlonyInfo is doing
--

  sltable_oid = get_relname_relid("sl_table",slony_namespace);

  sltable_rel = relation_open(sltable_oid,AccessShareLock);
  tupdesc=RelationGetDescr(sltable_rel);
  scandesc=heap_beginscan(sltable_rel, 
GetCatalogSnapshot(sltable_oid),0,NULL);

  reloid_attnum = get_attnum(sltable_oid,"tab_reloid");

  if(reloid_attnum == InvalidAttrNumber)
  elog(ERROR,"sl_table does not have a tab_reloid column");
  set_attnum = get_attnum(sltable_oid,"tab_set");

  if(set_attnum == InvalidAttrNumber)
  elog(ERROR,"sl_table does not have a tab_set column");
  tableid_attnum = get_attnum(sltable_oid, "tab_id");

  if(tableid_attnum == InvalidAttrNumber)
  elog(ERROR,"sl_table does not have a tab_id column");

  while( (tuple = heap_getnext(scandesc,ForwardScanDirection) ))





--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Moving of INT64_FORMAT to c.h

2014-10-22 Thread Steve Singer

On 10/16/2014 08:08 AM, Andres Freund wrote:

On 2014-10-16 08:04:17 -0400, Jan Wieck wrote:

Hi,

PostgreSQL has for ages defined INT64_FORMAT and UINT64_FORMAT in
pg_config.h. This commit

http://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=ce486056ecd28050

moved those definitions to c.h, which breaks compilation of all released
Slony-I versions against current master. Can those be moved back to where
they used to be?

Well, you could add additional configure stuff to also emit what you
want.


Slony uses the definitions in external tools, like slon and slonik, to
format sequence numbers in log output.

Then it should include c.h/postgres_fe.h?


So the header of c.h says "Note that the definitions here are not 
intended to be exposed to clients"

but
postgres_fe.h says "This should be the first file included by PostgreSQL 
client libraries" and


Should client programs that live outside the postgres source tree be 
including postgres_fe.h ?  I have a feeling the answer is no. If the 
answer is no, then why does a make install install postgres_fe.h ?


Slonik used to include postgre_fe.h but back in 2011 or so we stopped 
doing so because it was causing issues (I think on win32 builds)


Maybe slony client programs shouldn't be trying to steal portability 
definitions from postgres headers, but I doubt we are the only ones 
doing that.  It isn't a big deal for slony to define its own 
INT64_FORMAT for 9.5+ but third party projects that have been including 
pg_config.h will hit similar issues.  If there was a good reason for the 
change then fine (Postgres isn't intended to be a general purpose C 
portability layer).

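For illustration only (this is not existing slon code, and SLON_INT64_FORMAT
is a made-up name): a client program can get an equivalent format string from
C99 <inttypes.h> without borrowing anything from pg_config.h or c.h.

#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* hypothetical client-side macro, independent of the server headers */
#define SLON_INT64_FORMAT "%" PRId64

int main(void)
{
	int64_t seqno = 1234567890123LL;

	printf("processed up to event " SLON_INT64_FORMAT "\n", seqno);
	return 0;
}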



Steve


Greetings,

Andres Freund





--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Moving of INT64_FORMAT to c.h

2014-10-22 Thread Steve Singer

On 10/22/2014 04:17 PM, Robert Haas wrote:



Slonik used to include postgre_fe.h but back in 2011 or so we stopped doing
so because it was causing issues (I think on win32 builds)

That seems like something we ought to consider fixing, but obviously
we'd need more details.

I'll try to find some time to get a windows environment running and 
try to figure out what the problems were.
It's also possible that I removed the postgres_fe include at the same 
time as I was fixing other win32 issues and the postgres_fe include 
wasn't causing a problem; it was just removed because it was unnecessary.


At the moment slony only includes the server include dir if we are 
building with pgport (which we normally do on windows).
We have had reports of issues like described 
(http://www.slony.info/bugzilla/show_bug.cgi?id=315) where some installs 
make us pick up server and client includes from different versions of PG.




--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] HINT: pg_hba.conf changed since last config reload

2014-10-19 Thread Steve Singer

On 10/16/2014 11:34 PM, Craig Ringer wrote:



Given the generally positive reception to this, here's a patch.

The first patch adds an errhint_log , akin to the current errdetail_log,
so we can send a different HINT to the server log than we do to the client.


The patch behaves as you describe.  I feel that this feature would be 
useful, and you implemented the suggestions that were given: the reload 
notice is not sent to the client, which instead just gets a hint about 
checking the server log.


You follow the pattern set with detail_log which makes sense.  The 
variable name hint_log doesn't make it obvious to me that
the hint goes to the server log, but not the client.  The comment for 
errhint_log should maybe explicitly say that.

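To make the intended split concrete, here is a rough sketch of how an
authentication failure site might combine the two (based only on the
behaviour described in this thread, not on the actual call sites in the
patch; username, line_number and line_raw are placeholder names):

ereport(FATAL,
		(errcode(ERRCODE_INVALID_AUTHORIZATION_SPECIFICATION),
		 errmsg("Peer authentication failed for user \"%s\"", username),
		 /* goes to the server log only, as errdetail_log() already does */
		 errdetail_log("Connection matched pg_hba.conf line %d: \"%s\"",
					   line_number, line_raw),
		 /* client-visible hint */
		 errhint("See the server error log for additional information."),
		 /* the proposed errhint_log(): server log only */
		 errhint_log("pg_hba.conf has been changed since last server "
					 "configuration reload. Reload the server configuration "
					 "to apply the changes.")));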


One question about the code:

Does errfinish (elog.c at around line 505) need to free hint_log  ? (I 
would assume it does)


Other than that the patch looks good to me.


-

Something else I noticed while testing.  This isn't introduced by your 
patch but I am wondering if it is an existing bug. If I set up my 
configuration like this:



#data_directory = 'ConfigDir'   # use data in another directory
# (change requires restart)
hba_file = 'ConfigDir/pg_hba2.conf' # host-based authentication file

and start postgres like

./postgres -D ../data

it looks for pg_hba2.conf at bin/ConfigDir/pg_hba2.conf (relative to 
the bin directory I started it from)


Then if I change my pg_hba.conf and do a reload I get the following in 
the log


LOG:  parameter hba_file cannot be changed without restarting the server
LOG:  configuration file 
/usr/local/pgsql95git/bin/../data/postgresql.conf contains errors; 
unaffected changes were applied



set_config_option is comparing the relative path with the absolute path.

Steve



(Even if DETAIL was appropriate for this info, which it isn't, I can't
use errdetail_log because it's already used for other information in
some of the same error sites.)

The second patch adds a test during errors to report if pg_hba.conf is
stale, or if pg_ident.conf is stale.


Typical output, client:

psql: FATAL:  Peer authentication failed for user fred
HINT:  See the server error log for additional information.


Typical output, server:

LOG:  provided user name (fred) and authenticated user name (craig) do
not match
FATAL:  Peer authentication failed for user fred
DETAIL:  Connection matched pg_hba.conf line 84: local   all
   all peer
HINT:  pg_hba.conf has been changed since last server configuration
reload. Reload the server configuration to apply the changes.



I've added this to the next CF.








--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Replication identifiers, take 3

2014-09-27 Thread Steve Singer

On 09/26/2014 06:05 PM, Andres Freund wrote:

On 2014-09-26 14:57:12 -0400, Robert Haas wrote:
Sure, it'll possibly not be trivial to move them elsewhere. On the other
hand, the padding bytes have been unused for 8+ years without somebody
laying claim on them but me. I don't think it's a good idea to leave
them there unused when nobody even has proposed another use for a long
while. That'll just end up with them continuing to be unused.  And
there's actually four more consecutive bytes on 64bit systems that are
unused.

Should there really be a dire need after that, we can simply bump the
record size. WAL volume wise it'd not be too bad to make the record a
tiny bit bigger - the header is only a relatively small fraction of the
entire content.


If we were now increasing the WAL record size anyway for some unrelated 
reason, would we be willing to increase it by a further 2 bytes for the 
node identifier?
If the answer is 'no' then I don't think we can justify using the 2 
padding bytes just because they are there and have been unused for 
years.  But if the answer is yes, and we feel this is important enough to 
justify a slightly (2 byte) larger WAL record header, then we shouldn't 
use the excuse of maybe needing those 2 bytes for something else.   When 
something else comes along that needs the WAL space we'll have to 
increase the record size.


To say that if some other patch comes along that needs the space we'll 
redo this feature to use the method Robert describes is unrealistic.  If 
we think that the replication identifier  isn't 
general/important/necessary to justify 2 bytes of WAL header space then 
we should start out with something that doesn't use the WAL header.




Steve


I've also wondered about that. Perhaps we simply should have an
additional 'name' column indicating the replication solution?

Yeah, maybe, but there's still the question of substructure within the
non-replication-solution part of the name.  Not sure if we can assume
that a bipartite identifier, specifically, is right, or whether some
solutions will end up with different numbers of components.

Ah. I thought you only wanted to suggest a separator between the
replication solution and it's internal dat. But you actually want to
suggest an internal separator to be used in the solution's namespace?
I'm fine with that. I don't think we can suggest much beyond that -
different solutions will have fundamentally differing requirements about
which information to store.

Agreed.

So, let's recommend underscores as that separator?

Greetings,

Andres Freund





--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Replication identifiers, take 3

2014-09-27 Thread Steve Singer

On 09/26/2014 10:21 AM, Andres Freund wrote:

On 2014-09-26 09:53:09 -0400, Robert Haas wrote:

On Fri, Sep 26, 2014 at 5:05 AM, Andres Freund and...@2ndquadrant.com wrote:

Let me try to summarize the information requirements for each of these
things.  For #1, you need to know, after crash recovery, for each
standby, the last commit LSN which the client has confirmed via a
feedback message.

I'm not sure I understand what you mean here? This is all happening on
the *standby*. The standby needs to know, after crash recovery, the
latest commit LSN from the primary that it has successfully replayed.

Ah, sorry, you're right: so, you need to know, after crash recovery,
for each machine you are replicating *from*, the last transaction (in
terms of LSN) from that server that you successfully replayed.

Precisely.


I don't think a solution which logs the change of origin will be
simpler. When the origin is in every record, you can filter without keeping
track of any state. That's different if you can switch the origin per
tx. At the very least you need an in-memory entry for the origin.

But again, that complexity pertains only to logical decoding.
Somebody who wants to tweak the WAL format for an UPDATE in the future
doesn't need to understand how this works, or care.

I agree that that's a worthy goal. But I don't see how this isn't the
case with what I propose? This isn't happening on the level of
individual rmgrs/ams - there've been two padding bytes after 'xl_rmid'
in struct XLogRecord for a long time.

There's also the significant advantage that not basing this on the xid
allows it to work correctly with records not tied to a
transaction. There's not that much of that happening yet, but I've
several features in mind:

* separately replicate 2PC commits. 2PC commits don't have an xid
   anymore... With some tooling on top, replicating 2PC in two phases
   allows for very cool stuff. Like optionally synchronous multimaster
   replication.
* I have a pending patch that allows to send 'messages' through logical
   decoding - yielding a really fast and persistent queue. For that it's
   useful to have transactional *and* nontransactional messages.
* Sanely replicating CONCURRENTLY stuff gets harder if you tie things to
   the xid.


At what point in the decoding stream should something related to a 
CONCURRENTLY command show up?
Also, for a logical message queue why couldn't you have an xid associated 
with the message that had nothing else in the transaction?




--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 9.4 logical decoding assertion

2014-08-15 Thread Steve Singer

On 08/15/2014 09:42 AM, Andres Freund wrote:

On 2014-08-14 16:03:08 -0400, Steve Singer wrote:

I hit the following on 9.4 testing logical decoding.


TRAP: FailedAssertion("!(prev_first_lsn < cur_txn->first_lsn)", File:
"reorderbuffer.c", Line: 618)
LOG:  server process (PID 3801) was terminated by signal 6: Aborted

I saw that recently while hacking around, but I thought it was because
of stuff I'd added. But apparently not.

Hm. I think I see how that might happen. It might be possible (and
harmless) if two subxacts of the same toplevel xact have the same
first_lsn. But if it's not just <= vs < it'd be worse.


Unfortunately I don't have a core file and I haven't been able to reproduce
this.

Any information about the workload? Any chance you still have the data
directory around?


I was running the slony regression tests, but I ran the same test 
script a number of times afterwards and the problem didn't reproduce 
itself.


The last thing the tests did before the crash was part of the slony 
failover process.


I am doing my testing running with all 5 nodes/databases under the same 
postmaster (giving something like 20 replication slots open)


A few milliseconds before, one of the connections had just done a
START_REPLICATION SLOT slon_4_2 LOGICAL 0/32721A58

and then that connection reported the socket being closed,

but because so much was going on concurrently I can't say for sure if 
that connection experienced the assert or was closed because another 
backend asserted.



I haven't done an initdb since, so I have the data directory but I've 
dropped and recreated all of my slots many times since so the wal files 
are long gone.




Greetings,

Andres Freund





--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] 9.4 logical decoding assertion

2014-08-14 Thread Steve Singer

I hit the following on 9.4 testing logical decoding.


TRAP: FailedAssertion("!(prev_first_lsn < cur_txn->first_lsn)", File: 
"reorderbuffer.c", Line: 618)

LOG:  server process (PID 3801) was terminated by signal 6: Aborted

Unfortunately I don't have a core file and I haven't been able to 
reproduce this.










--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 9.4 logical replication - walsender keepalive replies

2014-08-11 Thread Steve Singer

On 07/14/2014 01:19 PM, Steve Singer wrote:

On 07/06/2014 10:11 AM, Andres Freund wrote:

Hi Steve,



Right. I thought about this for a while, and I think we should change
two things. For one, don't request replies here. It's simply not needed,
as this isn't dealing with timeouts. For another don't just check
->flush < sentPtr but also ->write < sentPtr. The reason we're sending these
feedback messages is to inform the 'logical standby' that there's been
WAL activity which it can't see because they don't correspond to
anything that's logically decoded (e.g. vacuum stuff).
Would that suit your needs?

Greetings,


Yes I think that will work for me.
I tested with the attached patch that I think  does what you describe 
and it seems okay.






Any feedback on this?  Do we want that change for 9.4, or do we want 
something else?






Andres Freund









--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 9.4 logical replication - walsender keepalive replies

2014-07-14 Thread Steve Singer

On 07/06/2014 10:11 AM, Andres Freund wrote:

Hi Steve,



Right. I thought about this for a while, and I think we should change
two things. For one, don't request replies here. It's simply not needed,
as this isn't dealing with timeouts. For another don't just check ->flush
< sentPtr but also ->write < sentPtr. The reason we're sending these
feedback messages is to inform the 'logical standby' that there's been
WAL activity which it can't see because they don't correspond to
anything that's logically decoded (e.g. vacuum stuff).
Would that suit your needs?

Greetings,


Yes I think that will work for me.
I tested with the attached patch that I think  does what you describe 
and it seems okay.





Andres Freund



diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
new file mode 100644
index 3189793..844a5de
*** a/src/backend/replication/walsender.c
--- b/src/backend/replication/walsender.c
*** WalSndWaitForWal(XLogRecPtr loc)
*** 1203,1211 
  		 * possibly are waiting for a later location. So we send pings
  		 * containing the flush location every now and then.
  		 */
! 		if (MyWalSnd->flush < sentPtr && !waiting_for_ping_response)
  		{
! 			WalSndKeepalive(true);
  			waiting_for_ping_response = true;
  		}
  
--- 1203,1213 
  		 * possibly are waiting for a later location. So we send pings
  		 * containing the flush location every now and then.
  		 */
! 		if (MyWalSnd->flush < sentPtr &&
! 			MyWalSnd->write < sentPtr &&
! 			!waiting_for_ping_response)
  		{
! 			WalSndKeepalive(false);
  			waiting_for_ping_response = true;
  		}
  

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] 9.4 pg_control corruption

2014-07-08 Thread Steve Singer
I've encountered a corrupt pg_control  file on my 9.4 development 
cluster.  I've mostly been using the cluster for changeset extraction / 
slony testing.


This is a 9.4 (currently commit 6ad903d70a440e  + a walsender change 
discussed in another thread) but would have had the initdb done with an 
earlier 9.4 snapshot.




/usr/local/pgsql94wal/bin$ ./pg_controldata ../data
WARNING: Calculated CRC checksum does not match value stored in file.
Either the file is corrupt, or it has a different layout than this program
is expecting.  The results below are untrustworthy.

pg_control version number:937
Catalog version number:   201405111
Database system identifier:   6014096177254975326
Database cluster state:   in production
pg_control last modified: Tue 08 Jul 2014 06:15:58 PM EDT
Latest checkpoint location:   5/44DC5FC8
Prior checkpoint location:5/44C58B88
Latest checkpoint's REDO location:5/44DC5FC8
Latest checkpoint's REDO WAL file:000100050044
Latest checkpoint's TimeLineID:   1
Latest checkpoint's PrevTimeLineID:   1
Latest checkpoint's full_page_writes: on
Latest checkpoint's NextXID:  0/1558590
Latest checkpoint's NextOID:  505898
Latest checkpoint's NextMultiXactId:  3285
Latest checkpoint's NextMultiOffset:  6569
Latest checkpoint's oldestXID:1281
Latest checkpoint's oldestXID's DB:   1
Latest checkpoint's oldestActiveXID:  0
Latest checkpoint's oldestMultiXid:   1
Latest checkpoint's oldestMulti's DB: 1
Time of latest checkpoint:Tue 08 Jul 2014 06:15:23 PM EDT
Fake LSN counter for unlogged rels:   0/1
Minimum recovery ending location: 0/0
Min recovery ending loc's timeline:   0
Backup start location:0/0
Backup end location:  0/0
End-of-backup record required:no
Current wal_level setting:logical
Current wal_log_hints setting:off
Current max_connections setting:  200
Current max_worker_processes setting: 8
Current max_prepared_xacts setting:   0
Current max_locks_per_xact setting:   64
Maximum data alignment:   8
Database block size:  8192
Blocks per segment of large relation: 131072
WAL block size:   8192
Bytes per WAL segment:16777216
Maximum length of identifiers:64
Maximum columns in an index:  32
Maximum size of a TOAST chunk:1996
Size of a large-object chunk: 65793
Date/time type storage:   floating-point numbers
Float4 argument passing:  by reference
Float8 argument passing:  by reference
Data page checksum version:   2602751502
ssinger@ssinger-laptop:/usr/local/pgsql94wal/bin$

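For reference, the warning at the top of that output comes from a check along
these lines (a sketch patterned after pg_controldata's CRC validation; details
may differ between versions). The CRC is computed over ControlFileData up to
the crc field as laid out by the reading binary, so a layout or version
mismatch between the file and the binary also surfaces as a CRC failure.

pg_crc32	crc;

/* ControlFile has already been read from global/pg_control */
INIT_CRC32(crc);
COMP_CRC32(crc, (char *) &ControlFile, offsetof(ControlFileData, crc));
FIN_CRC32(crc);

if (!EQ_CRC32(crc, ControlFile.crc))
	printf("WARNING: Calculated CRC checksum does not match value stored in file.\n");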

Before this, postgres crashed and seemed to have problems recovering. I 
might have hit CTRL-C but I didn't do anything drastic like issuing a kill -9.



test1 [unknown] 2014-07-08 18:15:18.986 EDTFATAL:  the database system 
is in recovery mode
test1 [unknown] 2014-07-08 18:15:20.482 EDTWARNING:  terminating 
connection because of crash of another server process
test1 [unknown] 2014-07-08 18:15:20.482 EDTDETAIL:  The postmaster has 
commanded this server process to roll back the current transaction and 
exit, because another server process exited abnormally and possibly 
corrupted shared memory.
test1 [unknown] 2014-07-08 18:15:20.482 EDTHINT:  In a moment you should 
be able to reconnect to the database and repeat your command.
  2014-07-08 18:15:20.483 EDTLOG:  all server processes terminated; 
reinitializing
  2014-07-08 18:15:20.720 EDTLOG:  database system was interrupted; 
last known up at 2014-07-08 18:15:15 EDT
  2014-07-08 18:15:20.865 EDTLOG:  database system was not properly 
shut down; automatic recovery in progress

  2014-07-08 18:15:20.954 EDTLOG:  redo starts at 5/41023848
  2014-07-08 18:15:23.153 EDTLOG:  unexpected pageaddr 4/D8DC6000 in 
log segment 000100050044, offset 14442496

  2014-07-08 18:15:23.153 EDTLOG:  redo done at 5/44DC5F60
  2014-07-08 18:15:23.153 EDTLOG:  last completed transaction was at 
log time 2014-07-08 18:15:17.874937-04
test2 [unknown] 2014-07-08 18:15:24.247 EDTFATAL:  the database system 
is in recovery mode
test2 [unknown] 2014-07-08 18:15:24.772 EDTFATAL:  the database system 
is in recovery mode
test2 [unknown] 2014-07-08 18:15:25.281 EDTFATAL:  the database system 
is in recovery mode
test1 [unknown] 2014-07-08 18:15:25.547 EDTFATAL:  the database system 
is in recovery mode
test2 [unknown] 2014-07-08 18:15:25.548 EDTFATAL:  the database system 
is in recovery mode
test3 [unknown] 2014-07-08 18:15:25.549 EDTFATAL:  the database system 
is in recovery mode
test4 [unknown] 2014-07-08 18:15:25.557 EDTFATAL:  the database system 
is in recovery mode
test5 [unknown] 2014-07-08 18:15:25.582 EDTFATAL:  the database system 
is in recovery mode
test2 [unknown] 2014-07-08 18:15:25.584 EDTFATAL:  the database system 
is in 

Re: [HACKERS] 9.4 pg_control corruption

2014-07-08 Thread Steve Singer

On 07/08/2014 10:14 PM, Tom Lane wrote:

Steve Singer st...@ssinger.info writes:

I've encountered a corrupt pg_control  file on my 9.4 development
cluster.  I've mostly been using the cluster for changeset extraction /
slony testing.
This is a 9.4 (currently commit 6ad903d70a440e  + a walsender change
discussed in another thread) but would have had the initdb done with an
earlier 9.4 snapshot.

Somehow or other you missed the update to pg_control version number 942.
There's no obvious reason to think that this pg_control file is corrupt
on its own terms, but the pg_controldata version you're using expects
the 942 layout.  The fact that the server wasn't complaining about this
suggests that you've not recompiled the server, or at least not xlog.c.
Possibly the odd failure to restart indicates that you have a partially
updated server executable?



The server is complaining about that; it started to after the crash 
(which is why I ran pg_controldata).


ssinger@ssinger-laptop:/usr/local/pgsql94wal/bin$ ./postgres -D ../data
  2014-07-08 22:28:57.796 EDTFATAL:  database files are incompatible 
with server
  2014-07-08 22:28:57.796 EDTDETAIL:  The database cluster was 
initialized with PG_CONTROL_VERSION 937, but the server was compiled 
with PG_CONTROL_VERSION 942.

  2014-07-08 22:28:57.796 EDTHINT:  It looks like you need to initdb.
ssinger@ssinger-laptop:/usr/local/pgsql94wal/bin$


The server seemed fine (and it was 9.4 because I was using 9.4 features)
The server crashed
The server performed crash recovery
The server server wouldn't start and pg_controldata shows the attached 
output


I wasn't recompiling or reinstalling around this time either.




regards, tom lane






--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] 9.4 logical replication - walsender keepalive replies

2014-06-30 Thread Steve Singer

In 9.4 we've the below  block of code to walsender.c as

/*
 * We only send regular messages to the client for full decoded
 * transactions, but a synchronous replication and walsender shutdown
 * possibly are waiting for a later location. So we send pings
 * containing the flush location every now and then.
*/
if (MyWalSnd->flush < sentPtr && !waiting_for_ping_response)
{
WalSndKeepalive(true);
waiting_for_ping_response = true;
}


I am finding that my logical replication reader is spending a tremendous 
amount of time sending feedback to the server because a keep alive reply 
was requested.  My flush pointer is smaller than sentPtr, which I see as 
the normal case (the client hasn't confirmed all the WAL it has been 
sent).   My client queues the records it receives and only confirms when 
it actually processes the record.


So the sequence looks something like

Server Sends LSN 0/1000
Server Sends LSN 0/2000
Server Sends LSN 0/3000
Client confirms LSN 0/2000


I don't see why all these keep alive replies are needed in this case 
(the timeout value is bumped way up, it's the above block that is 
triggering the reply request not something related to timeout)

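As an illustration of the client behaviour described above (a sketch only,
not the actual slon code), the receiver really only needs to send feedback
when the server explicitly requests a reply or when its locally processed
position has advanced:

#include <stdbool.h>
#include <stdint.h>

typedef uint64_t XLogRecPtr;			/* stand-in for the server-side type */

static XLogRecPtr processed_upto = 0;	/* confirmed after actually processing */
static XLogRecPtr last_reported = 0;	/* last position sent to the walsender */

/* decide whether a standby status update needs to go out now */
static bool
need_feedback(bool server_requested_reply)
{
	if (server_requested_reply)
		return true;					/* must answer, or the timeout fires */
	return processed_upto > last_reported;
}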

Steve






--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] CREATE REPLICATION SLOT fails on a timeout

2014-05-30 Thread Steve Singer

On 05/28/2014 06:41 PM, Andres Freund wrote:

Hi,


Pushed a fix for it. I am pretty sure it will, but could you still test
that it fixes your problem?

Thanks!


The fix seems to work (I am no longer getting the timeout on slot creation)

Thanks




Andres Freund





--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] CREATE REPLICATION SLOT fails on a timeout

2014-05-16 Thread Steve Singer
I am finding that my logical walsender connections are being terminated 
due to a timeout on the CREATE REPLICATION SLOT command, with 
"terminating walsender process due to replication timeout".


Below is the stack trace when this happens

#3  0x0067df28 in WalSndCheckTimeOut 
(now=now@entry=453585463823871) at walsender.c:1748
#4  0x0067eedc in WalSndWaitForWal (loc=691727888) at 
walsender.c:1216
#5  logical_read_xlog_page (state=<optimized out>, 
targetPagePtr=691724288, reqLen=<optimized out>,
targetRecPtr=<optimized out>, cur_page=0x18476e0 "}\320\005", 
pageTLI=<optimized out>) at walsender.c:741
#6  0x004f41bf in ReadPageInternal (state=state@entry=0x17aa140, 
pageptr=pageptr@entry=691724288,

reqLen=reqLen@entry=3600) at xlogreader.c:525
#7  0x004f454a in XLogReadRecord (state=0x17aa140, 
RecPtr=691727856, RecPtr@entry=0,

errormsg=errormsg@entry=0x7fff7f668c28) at xlogreader.c:228
#8  0x00675e5c in DecodingContextFindStartpoint 
(ctx=ctx@entry=0x17a0358) at logical.c:460
#9  0x00680f16 in CreateReplicationSlot (cmd=0x1798b50) at 
walsender.c:800

#10 exec_replication_command (
cmd_string=cmd_string@entry=0x17f1478 "CREATE_REPLICATION_SLOT 
\"slon_4_1\" LOGICAL \"slony1_funcs.2.2.0\"")

at walsender.c:1291
#11 0x006bf4a1 in PostgresMain (argc=<optimized out>, 
argv=argv@entry=0x177db50, dbname=0x177db30 "test1",


(gdb) p last_reply_timestamp
$1 = 0


I propose the attached patch, which sets last_reply_timestamp to now() when 
it starts processing a command, since receiving a command is hearing 
something from the client.



Steve

diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
new file mode 100644
index 5c11d68..56a2f10
*** a/src/backend/replication/walsender.c
--- b/src/backend/replication/walsender.c
*** exec_replication_command(const char *cmd
*** 1276,1281 
--- 1276,1282 
    parse_rc;
  
  	cmd_node = replication_parse_result;
+ 	last_reply_timestamp = GetCurrentTimestamp();
  
  	switch (cmd_node-type)
  	{

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] CREATE REPLICATION SLOT fails on a timeout

2014-05-16 Thread Steve Singer

On 05/16/2014 04:43 PM, Andres Freund wrote:

Hi,

I don't think that's going to cut it though. The creation can take
longer than whatever wal_sender_timeout is set to (when there's lots of
longrunning transactions). I think checking whether last_reply_timestamp
<= 0 during timeout checking is more robust.

Greetings,

Andres Freund





That makes sense.
A patch that does that is attached.

Steve
diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
new file mode 100644
index 5c11d68..d23f06b
*** a/src/backend/replication/walsender.c
--- b/src/backend/replication/walsender.c
*** WalSndCheckTimeOut(TimestampTz now)
*** 1738,1744 
  	timeout = TimestampTzPlusMilliseconds(last_reply_timestamp,
  		  wal_sender_timeout);
  
! 	if (wal_sender_timeout > 0 && now >= timeout)
  	{
  		/*
  		 * Since typically expiration of replication timeout means
--- 1738,1744 
  	timeout = TimestampTzPlusMilliseconds(last_reply_timestamp,
  		  wal_sender_timeout);
  
! 	if (last_reply_timestamp > 0 && wal_sender_timeout > 0 && now >= timeout)
  	{
  		/*
  		 * Since typically expiration of replication timeout means

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] assertion in 9.4 with wal_level=logical

2014-04-17 Thread Steve Singer

With master/9.4 from today (52e757420fa98a76015c2c88432db94269f3e8f4)

I am getting an assertion when doing a truncate via SPI when I have 
wal_level=logical.


Stack trace is below.

I am just replicating a table with normal slony (2.2) I don't need to 
establish any replication slots to get this.





(gdb) where
#0  0x7fc9b4f58295 in __GI_raise (sig=sig@entry=6)
at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
#1  0x7fc9b4f5b438 in __GI_abort () at abort.c:90
#2  0x007a10f7 in ExceptionalCondition (
conditionName=conditionName@entry=0x955d90 "!(CritSectionCount == 0 
|| (CurrentMemoryContext) == ErrorContext || (MyAuxProcType == 
CheckpointerProcess))",

errorType=errorType@entry=0x7da7b0 "FailedAssertion",
fileName=fileName@entry=0x955a2e "mcxt.c", 
lineNumber=lineNumber@entry=670)

at assert.c:54
#3  0x007c3090 in palloc (size=16) at mcxt.c:670
#4  0x004dd83f in mXactCacheGetById (members=0x7fff679a3d18, 
multi=58)

at multixact.c:1411
#5  GetMultiXactIdMembers (multi=58, members=members@entry=0x7fff679a3d18,
allow_old=allow_old@entry=0 '\000') at multixact.c:1080
#6  0x0049e43f in MultiXactIdGetUpdateXid (xmax=<optimized out>,
t_infomask=<optimized out>) at heapam.c:6042
#7  0x004a1ccc in HeapTupleGetUpdateXid (tuple=<optimized out>)
at heapam.c:6083
#8  0x007cf7d9 in HeapTupleHeaderGetCmax 
(tup=tup@entry=0x7fc9ac838e38)

at combocid.c:122
#9  0x0049eb98 in log_heap_new_cid (
relation=relation@entry=0x7fc9b5a67dc0, tup=tup@entry=0x7fff679a3ea0)
at heapam.c:7047
#10 0x004a48a5 in heap_update 
(relation=relation@entry=0x7fc9b5a67dc0,

otid=otid@entry=0x2678c6c, newtup=newtup@entry=0x2678c68, cid=26,
crosscheck=crosscheck@entry=0x0, wait=wait@entry=1 '\001',
hufd=hufd@entry=0x7fff679a4080, lockmode=lockmode@entry=0x7fff679a407c)
at heapam.c:3734
#11 0x004a5842 in simple_heap_update (
relation=relation@entry=0x7fc9b5a67dc0, otid=otid@entry=0x2678c6c,
tup=tup@entry=0x2678c68) at heapam.c:4010
#12 0x00797cf7 in RelationSetNewRelfilenode (
relation=relation@entry=0x7fc9ab270b68, freezeXid=19459,
minmulti=minmulti@entry=58) at relcache.c:2956
#13 0x0059ddde in ExecuteTruncate (stmt=0x3a, stmt@entry=0x2678a58)
at tablecmds.c:1187
#14 0x006c3870 in standard_ProcessUtility (parsetree=0x2678a58,
queryString=<optimized out>, context=<optimized out>, params=0x0,
dest=<optimized out>, completionTag=<optimized out>) at utility.c:515
#15 0x005e79d9 in _SPI_execute_plan 
(plan=plan@entry=0x7fff679a4320,

paramLI=paramLI@entry=0x0, snapshot=snapshot@entry=0x0,
crosscheck_snapshot=crosscheck_snapshot@entry=0x0,
read_only=read_only@entry=0 '\000',
fire_triggers=fire_triggers@entry=1 '\001', tcount=tcount@entry=0)
at spi.c:2171
#16 0x005e83c1 in SPI_execute (
src=src@entry=0x25bde90 "truncate only \"disorder\".\"do_restock\"",
read_only=0 '\000', tcount=tcount@entry=0) at spi.c:386


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] assertion in 9.4 with wal_level=logical

2014-04-17 Thread Steve Singer

On 04/17/2014 04:33 PM, Andres Freund wrote:

Hi,

On 2014-04-17 16:23:54 -0400, Steve Singer wrote:

With master/9.4 from today (52e757420fa98a76015c2c88432db94269f3e8f4)

I am getting an assertion when doing a truncate via SPI when I have
wal_level=logical.

Stack trace is below.

I am just replicating a table with normal slony (2.2) I don't need to
establish any replication slots to get this.

Uh, that's somewhat nasty... You probably only get that because of
slony's habit of share locking catalogs. Could that be?


Yes slony does a select from pg_catalog and pg_namespace  in  the stored 
function just before doing the truncate.




For now, to circumvent the problem you could just revert
4a170ee9e0ebd7021cb1190fabd5b0cbe2effb8e for now.

I'll look into fixing it properly.

Greetings,

Andres Freund






--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] In-core regression tests for replication, cascading, archiving, PITR, etc. Michael Paquier

2014-01-06 Thread Steve Singer

On 01/05/2014 09:13 PM, Michael Paquier wrote:



On Mon, Jan 6, 2014 at 4:51 AM, Mark Dilger markdil...@yahoo.com wrote:

 I am building a regression test system for replication and came across
 this email thread.  I have gotten pretty far into my implementation, but
 would be happy to make modifications if folks have improvements to
 suggest.  If the community likes my design, or a modified version based
 on your feedback, I'd be happy to submit a patch.
Yeah, this would be nice to look at, core code definitely needs to 
have some more infrastructure for such a test suite. I didn't get the 
time to go back to it since I began this thread though :)


 Currently I am cannibalizing src/test/pg_regress.c, but that could 
instead
 be copied to src/test/pg_regress_replication.c or whatever.  The 
regression

 test creates and configures multiple database clusters, sets up the
 replication configuration for them, runs them each in nonprivileged mode
 and bound to different ports, feeds all the existing 141 regression 
tests
 into the master database with the usual checking that all the right 
results

 are obtained, and then checks that the standbys have the expected
 data.  This is possible all on one system because the database clusters
 are chroot'ed to see their own /data directory and not the /data 
directory
 of the other chroot'ed clusters, although the rest of the system, 
like /bin

 and /etc and /dev are all bind mounted and visible to each cluster.
Having vanilla regressions run in a cluster with multiple nodes and 
check the results on a standby is the tip of the iceberg though. What 
I had in mind when I began this thread was to have more than a 
copy/paste of pg_regress, but an infrastructure that people could use 
to create and customize tests by having an additional control layer on 
the cluster itself. For example, testing replication is not only a 
matter of creating and setting up the nodes, but you might want to be 
able to initialize, add, remove nodes during the tests. Node addition 
would be either a new fresh master (this would be damn useful for a 
test suite for logical replication I think), or a slave node with 
custom recovery parameters to test replication, as well as PITR, 
archiving, etc. Then you need to be able to run SQL commands on top of 
that to check if the results are consistent with what you want.




I'd encourage anyone looking at implementing a testing suite for 
replication to look at the stuff we did for Slony at least to get some 
ideas.


We wrote a test driver framework (clustertest - 
https://github.com/clustertest/clustertest-framework) then some 
Javascript base classes for common types of operations.  An individual 
test is then written in Javascript that invokes methods either in the 
framework or base-class to do most of the interesting work.
http://git.postgresql.org/gitweb/?p=slony1-engine.git;a=blob;f=clustertest/disorder/tests/EmptySet.js;h=7b4850c1d24036067f5a659b990c7f05415ed967;hb=HEAD 
as an example





A possible input for a test that users could provide would be 
something like that:

# Node information for tests
nodes
{
{node1, postgresql.conf params, recovery.conf params}
{node2, postgresql.conf params, recovery.conf params, slave of node1}
}
# Run test
init node1
run_sql node1 file1.sql
# Check output
init node2
run_sql node2 file2.sql
# Check that results are fine
# Process

The main problem is actually how to do that. Having some smart shell 
infrastructure would be simple and would facilitate (?) the 
maintenance of code used to run the tests. On the contrary having a C 
program would make the maintenance of code to run the tests more 
difficult (?) for a trade with more readable test suite input like the 
one I wrote above. This might also make the test input more readable 
for a human eye, in the shape of what is already available in 
src/test/isolation.


Another possibility could be also to integrate directly a 
recovery/backup manager in PG core, and have some tests for it, or 
even include those tests directly with pg_basebackup or an upper layer 
of it.


 There of course is room to add as many replication tests as you like,
 and the main 141 tests fed into the master could be extended to feed
 more data and such.

 The main drawbacks that I don't care for are:

 1) 'make check' becomes 'sudo make check' because it needs permission
 to run chroot.
-1 for that; developers should not need to use root to run the regression 
suite.


 2) I have no win32 version of the logic
For a first shot I am not sure that it matters much.

 The main advantages that I like about this design are:

 1) Only one system is required.  The developer does not need network
 access to a second replication system.  Moreover, multiple database
 clusters can be established with interesting replication hierarchies 
between

 them, and the cost of each additional cluster is just another chroot
 environment
An assumption of the test 

Re: [HACKERS] plpgsql_check_function - rebase for 9.3

2013-12-07 Thread Steve Singer

On 08/22/2013 02:02 AM, Pavel Stehule wrote:

rebased

Regards

Pavel

This patch again needs a rebase, it no longer applies cleanly. 
plpgsql_estate_setup now takes a shared estate parameter and the call in 
pl_check needs that.   I passed NULL in and things seem to work.




I think this is another example where our processes aren't working as 
we'd like.


The last time this patch got  a review was in March when Tom said 
http://www.postgresql.org/message-id/27661.1364267...@sss.pgh.pa.us


Shortly after Pavel responded with a new patch that changes the output 
format. 
http://www.postgresql.org/message-id/CAFj8pRAqpAEWqL465xkTmwe-DaZ8+7N-oEQ=uv_jjop21fj...@mail.gmail.com


I don't think much has progressed on this patch in the following 9 months.

In Tom's review he raises the issue of code duplication and asks:
"do we want to accept that this is the implementation pathway we're 
going to settle for, or are we going to hold out for a better one? I'd 
be the first in line to hold out if I had a clue how to move towards a 
better implementation, but I have none."


This question is still outstanding.

Here are my comments on this version of the patch:



This version of the patch gives output as a table with the structure:

 functionid | lineno | statement | sqlstate | message | detail | hint | level | position | query | context
------------+--------+-----------+----------+---------+--------+------+-------+----------+-------+---------

I like this much better than the previous format.

I think the patch needs more documentation.

The extent of the documentation is

 <title>Checking of embedded SQL</title>
+<para>
+ The SQL statements inside <application>PL/pgSQL</> functions are
+ checked by validator for semantic errors. These errors
+ can be found by <function>plpgsql_check_function</function>:



I don't think that this adequately describes the function.  It isn't clear to 
me what we mean by 'embedded' SQL
and 'semantic errors'.


I think this would read better as

  <sect2 id="plpgsql-check-function">
   <title>Checking SQL Inside Functions</title>
   <para>
 The SQL Statements inside of <application>PL/pgsql</> functions can be
 checked by a validation function for semantic errors. You can check
 <application>PL/pgsql</application> functions by passing the name of the
 function to <function>plpgsql_check_function</function>.
   </para>
   <para>
   The <function>plpgsql_check_function</function> will check the SQL
   inside of a <application>PL/pgsql</application> function for certain
   types of errors and warnings.
   </para>

but that needs to be expanded on.

I was expecting something like:

create function x() returns void as $$ declare a int4; begin a='f'; end $$ 
language plpgsql;

to return an error but it doesn't

but

create temp table x(a int4);
create or replace function x() returns void as $$ declare a int4; begin insert 
into x values('h'); end $$ language plpgsql;

does give an error when I pass it to the validator.   Is this the intended 
behavior of the patch? If so we need to explain why the first function doesn't 
generate an error.


I feel we need to document what each of the columns in the output means.  What 
is the difference between statement, query and context?

Some random comments on the messages you return:

Line 816:

if (is_expression && tupdesc->natts != 1)
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
 errmsg(qw,
query->query,
tupdesc->natts)));

Should this be "query \"%s\" returned %d columns but only 1 is allowed" or 
something similar?

 Line 837

else
/* XXX: should be a warning? */
ereport(ERROR,
(errmsg("cannot to identify real type for 
record type variable")));

In what situation does this come  up?  Should it be a warning or error?  
Do we mean the real type for record variable ?


Line 1000
ereport(ERROR,
+ (errcode(ERRCODE_DATATYPE_MISMATCH),
+  errmsg("function does not return 
composite type, is not possible to identify composite type")));


Should this be "function does not return a composite type, it is not 
possible to identify the composite type"?


Line 1109:

appendStringInfo(str, "parameter \"%s\" is overlapped",
+  refname);

gives warnings like:

select message,detail FROM  plpgsql_check_function('b(int)');
           message            |                   detail
------------------------------+--------------------------------------------
 parameter "a" is overlapped  | Local variable overlap function parameter.


How about instead
parameter "a" is duplicated | Local variable is named the same as the 
function parameter



Line 1278:

+ checker_error(cstate,
+   

Re: [HACKERS] Replication Node Identifiers and crashsafe Apply Progress

2013-11-19 Thread Steve Singer

On 11/14/2013 12:26 PM, Andres Freund wrote:

Hello,

As you know, the reason we are working changeset extraction is that we
want to build logical unidirection and bidirectional replication
ontop. To use changeset extraction effectively, I think one set of
related features ontop is very useful:

When extracting changes using the changeset extraction patchset (latest
version at [1]) the START_LOGICAL_REPLICATION command is used to stream
changes from a source system. When started it will continue to send
changes as long as the connection is up or it is aborted. For obvious
performance reasons it will *not* wait for an ACK for each transaction
commit it streams out.
Instead it relies on the receiver, exactly as in physical replication,
sending feedback messages containing the LSN up to which data has safely
been received.
That means frequently something like:
walsender: = COMMIT 0/1000
walsender: = COMMIT 0/1200
walsender: = COMMIT 0/1400
walsender: = COMMIT 0/1600
receiver:  = ACKNOWLEDGE 0/1270
walsender: = COMMIT 0/1800
is possible and important for performance. I.e. the server has streamed
out more changes than it got confirmation for.

So, when the the replication connection goes down, e.g. because the
receiving side has crashed, we need to tell the server from where to
start. Every position between the last ACKed and the end of WAL is
legal.
The receiver then can ask the source to start replication from the last
replayed commit using START_LOGICAL_REPLICATION 'slot_name'
'0/1600' which would then re-stream all the changes in the
transaction that committe at 0/1600 and all that follow.

But for that the receiving side needs to know up to where changes have
been applied. One relatively easy solution for that is that the
receiving side does something like:
UPDATE replication_progress SET lsn = '0/1600' WHERE source_id = ...;
before the end of every replayed transaction. But that obviously will
quickly cause bloat.


I don't see how this is going to cause any more bloat than what 
trigger-based slony does today with sl_confirm and I don't hear a lot of 
complaints about that being a big problem.   This might be because slony 
doesn't do a commit on the replica for every transaction but groups the 
transactions together; logical slony will behave the same way where we 
would only commit on SYNC transactions.






Our solution to that is that a replaying process can tell the backend
that it is currently doing so and setup three variables for every
transaction:
1) an identifier for the the source database
2) the LSN at which the replayed transaction has committed remotely
3) the time at which the replayed transaction has committed remotely

When the transaction then commits the commit record will set the
XACT_CONTAINS_ORIGIN flag to ->xinfo and will add that data to the end
of the commit record. During crash recovery the startup process will
remember the newest LSN for each remote database in shared memory.

This way, after a crash, restart, disconnect the replay process can look
into shared memory and check how far it has already replayed and restart
seamlessly. With minimal effort.

We previously discussed the topic and some were very adverse to using
any sort of numeric node identifiers across systems and suggested that
those should only be used internally. So what the attached patch does is
to add a new shared system catalog called 'pg_replication_identifier'
(suggestions for a better name welcome) which translates a number of
identifying traits into a numeric identifier.
The set of identifiers currently are:
* the sysid of the remote system, combined with the remote TLI
* the oid of the local database
* the oid of the remote database
* an optional name
but that's just what we needed in our multimaster prototype, and not
what I necessarily think is correct.

The added API (which surely need some work, I am not particularly happy
with the naming of functions for one) basically consists of two parts:
1) functions to query/create replication identifiers:
* GetReplicationIdentifier(identifying traits) - search for a numeric 
replication identifier
* CreateReplicationIdentifier(identifying traits) - creates a numeric 
replication identifier
* GetReplicationInfoByIdentifier(numeric identifier) - returns identifying 
traits

2) functions to query/manipulate replication progress:
* AdvanceReplicationIdentifier(node, local_lsn, remote_lsn)
* XLogRecPtr RemoteCommitFromReplicationIdentifier(node)

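As a rough illustration of how a replay process might use this (a sketch
only; the RepNodeId/InvalidRepNodeId names and exact argument lists are
assumptions, and traits_of_remote() is purely hypothetical):

RepNodeId	node;
XLogRecPtr	resume_at;

/* look up, or create on first contact, the numeric identifier */
node = GetReplicationIdentifier(traits_of_remote());
if (node == InvalidRepNodeId)
	node = CreateReplicationIdentifier(traits_of_remote());

/* after a crash/restart/disconnect: where do we ask the source to resume? */
resume_at = RemoteCommitFromReplicationIdentifier(node);

/* ... and per replayed transaction, before the local commit ... */
AdvanceReplicationIdentifier(node, local_commit_lsn, remote_commit_lsn);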
Internally the code also maintains some on-disk data which is updated
during checkpoints to store the replication progress, otherwise it'd
vanish if we shutdown gracefully ;).

The attached code also integrates with the commit timestamp module
that Alvaro submitted ([2]). Everytime a remote transaction is committed
we store a) the remote commit's timestamp, b) the origin node id in it.
That allows to relatively easily build multimaster systems with conflict

Re: [HACKERS] logical changeset generation v6.5

2013-11-13 Thread Steve Singer

On 11/11/2013 02:06 PM, Andres Freund wrote:

On 2013-11-10 14:45:17 -0500, Steve Singer wrote:

Not really keen - that'd be a noticeable overhead. Note that in the
cases where DEFAULT|INDEX is used, you can just use the new tuple to
extract what you need for the pkey lookup since they now have the same
format and since it's guaranteed that the relevant columns haven't
changed if oldtup is null and there's a key.

What are you actually doing with those columns? Populating a WHERE
clause?


Yup building a WHERE clause


Greetings,

Andres Freund





--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] logical changeset generation v6.5

2013-11-10 Thread Steve Singer

On 11/10/2013 09:41 AM, Andres Freund wrote:

Still give me the following:
update  disorder.do_inventory set ii_in_stock=2 where ii_id=251;
UPDATE 1
test1=# LOG:  tuple in table with oid: 35122 without primary key
Hm. Could it be that you still have an older test_decoding plugin
lying around? The current one doesn't contain that string
anymore. That'd explain the problems.
In v6.4 the output plugin API was changed so that plain heaptuples are
passed for the old key, although with non-key columns set to
NULL. Earlier it was an index tuple as defined by the index's
TupleDesc.


Grrr, yah that was the problem I had compiled but not installed the 
newer plugin. Sorry.




a) The table does have a primary key
b) I don't get anything in the old key when I was expecting all the rows
c)  If I change the table to use the pkey index with
alter table disorder.do_inventory  replica identity using index
do_inventory_pkey;

The LOG message on the update goes away but the output of the test decoder
plugin goes back to

table do_inventory: UPDATE: ii_id[int8]:251 ii_in_stock[int8]:5
ii_reserved[int8]:144 ii_total_sold[int8]:911

Which I suspect means oldtuple is back to null

Which is legitimate though, if you don't update the primary (or
explicitly chosen candidate) key. Those only get logged if there's
actual changes in those columns.
Makes sense?

Is the expectation that plugin writers will call
RelationGetIndexAttrBitmap(relation, INDEX_ATTR_BITMAP_IDENTITY_KEY)
to figure out what the identity key is?

How do we feel about having the decoder logic populate change.oldtuple 
with the identity
on UPDATE statements when it is null?  The logic I have now is to use 
oldtuple if it is not null, otherwise go figure out which columns from 
the identity key we should be using.   I think most plugins that do 
anything useful with an update will need to duplicate that.

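In sketch form (illustration only; the append_* helpers are hypothetical
and the field names follow the patchset discussed in this thread), the
logic each plugin would end up duplicating looks roughly like this:

static void
build_update_where_clause(StringInfo s, Relation relation,
						  ReorderBufferChange *change)
{
	Bitmapset  *idattrs;

	if (change->tp.oldtuple != NULL)
	{
		/* the old key (or, with FULL, the whole old row) was logged */
		append_conditions_from_tuple(s, relation,
									 &change->tp.oldtuple->tuple);	/* hypothetical */
		return;
	}

	/* no old tuple logged: key columns are unchanged, take them from newtuple */
	idattrs = RelationGetIndexAttrBitmap(relation,
										 INDEX_ATTR_BITMAP_IDENTITY_KEY);
	append_conditions_for_attrs(s, relation,
								&change->tp.newtuple->tuple, idattrs);	/* hypothetical */
}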








Greetings,

Andres Freund

--
  Andres Freund                    http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services






--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] logical changeset generation v6.5

2013-11-09 Thread Steve Singer

On 11/05/2013 10:21 AM, Andres Freund wrote:

Hi,

Attached to this mail and in the xlog-decoding-rebasing-remapping branch
in my git[1] repository you can find the next version of the patchset that:
* Fixes full table rewrites of catalog tables using the method Robert
   prefers (which is to log rewrite mappings to disk)
* Extract the REPLICA IDENTITY as configured with ALTER TABLE for the
   old tuple for UPDATEs and DELETEs
* Much better support for synchronous replication
* Better resource cleanup (as in we need less local WAL available)
* Lots of smaller fixes
The change around REPLICA IDENTITY is *incompatible* to older output
plugins since we now log tuples using the table's TupleDesc, not the
indexes.


My updated plugin is getting rows with
change->tp.oldtuple as NULL on updates either with the default PRIMARY 
KEY identity or with a FULL identity.


When I try the test_decoding plugin on UPDATE I get rows like:

table do_inventory: UPDATE: ii_id[int8]:251 ii_in_stock[int8]:1 
ii_reserved[int8]:144 ii_total_sold[int8]:911


which I think is only data from the new tuple.  The lack of old-key 
in the output makes me think the test decoding plugin also isn't getting 
the old tuple.


(This is with your patch-set rebased ontop of 
ac4ab97ec05ea900db0f14d428cae2e79832e02d which includes the patches 
Robert committed the other day, I can't rule out that I didn't break 
something in the rebase).







Robert, I'd be very grateful if you could have a look at patch 0007
implementing what we've discussed. I kept it separate to make it easier
to look at it in isolation, but I think in the end it partially should
be merged into the wal_level=logical patch.
I still think the wide cmin/cmax solution is more elegant and has
wider applicability, but this works as well although it's about 5 times
the code.

Comments?

[1]: 
http://git.postgresql.org/gitweb/?p=users/andresfreund/postgres.git;a=summary
Greetings,

Andres Freund







--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] logical changeset generation v6.5

2013-11-09 Thread Steve Singer

On 11/09/2013 05:42 PM, Andres Freund wrote:

On 2013-11-09 17:36:49 -0500, Steve Singer wrote:

On 11/05/2013 10:21 AM, Andres Freund wrote:

Hi,

Attached to this mail and in the xlog-decoding-rebasing-remapping branch
in my git[1] repository you can find the next version of the patchset that:
* Fixes full table rewrites of catalog tables using the method Robert
   prefers (which is to log rewrite mappings to disk)
* Extract the REPLICA IDENTITY as configured with ALTER TABLE for the
   old tuple for UPDATEs and DELETEs
* Much better support for synchronous replication
* Better resource cleanup (as in we need less local WAL available)
* Lots of smaller fixes
The change around REPLICA IDENTITY is *incompatible* to older output
plugins since we now log tuples using the table's TupleDesc, not the
indexes.

My updated plugin is getting rows with
change->tp.oldtuple as NULL on updates either with the default PRIMARY KEY
identity or with a FULL identity.

When I try the test_decoding plugin on UPDATE I get rows like:

table do_inventory: UPDATE: ii_id[int8]:251 ii_in_stock[int8]:1
ii_reserved[int8]:144 ii_total_sold[int8]:911

which I think is only data from the new tuple.  The lack of old-key in
the output makes me think the test decoding plugin also isn't getting the
old tuple.

(This is with your patch-set rebased ontop of
ac4ab97ec05ea900db0f14d428cae2e79832e02d which includes the patches Robert
committed the other day, I can't rule out that I didn't break something in
the rebase).

I've pushed an updated tree to git, that contains that
http://git.postgresql.org/gitweb/?p=users/andresfreund/postgres.git;a=shortlog;h=refs/heads/xlog-decoding-rebasing-remapping
git://git.postgresql.org/git/users/andresfreund/postgres.git

and some more fixes. I'll send out an email with details sometime soon.


93c5c2a171455763995cef0afa907bcfaa405db4

Still give me the following:
update  disorder.do_inventory set ii_in_stock=2 where ii_id=251;
UPDATE 1
test1=# LOG:  tuple in table with oid: 35122 without primary key

\d disorder.do_inventory
   Table disorder.do_inventory
Column |  Type  | Modifiers
---++---
 ii_id | bigint | not null
 ii_in_stock   | bigint |
 ii_reserved   | bigint |
 ii_total_sold | bigint |
Indexes:
do_inventory_pkey PRIMARY KEY, btree (ii_id)
Foreign-key constraints:
do_inventory_item_ref FOREIGN KEY (ii_id) REFERENCES 
disorder.do_item(i_id) ON DELETE CASCADE

Referenced by:
TABLE disorder.do_item CONSTRAINT do_item_inventory_ref FOREIGN 
KEY (i_id) REFERENCES disorder.do_inventory(ii_id) DEFERRABLE INITIALLY 
DEFERRED
TABLE disorder.do_restock CONSTRAINT do_restock_inventory_ref 
FOREIGN KEY (r_i_id) REFERENCES disorder.do_inventory(ii_id) ON DELETE 
CASCADE

Triggers:
_disorder_replica_truncatetrigger BEFORE TRUNCATE ON 
disorder.do_inventory FOR EACH STATEMENT EXECUTE PROCEDURE 
_disorder_replica.log_truncate('3')

Disabled triggers:
_disorder_replica_denyaccess BEFORE INSERT OR DELETE OR UPDATE ON 
disorder.do_inventory FOR EACH ROW EXECUTE PROCEDURE 
_disorder_replica.denyaccess('_disorder_replica')
_disorder_replica_truncatedeny BEFORE TRUNCATE ON 
disorder.do_inventory FOR EACH STATEMENT EXECUTE PROCEDURE 
_disorder_replica.deny_truncate()

Replica Identity: FULL


The test decoder plugin gives me:

table do_inventory: UPDATE: old-pkey:


a) The table does have a primary key
b) I don't get anything in the old key when I was expecting all the rows
c)  If I change the table to use the pkey index with
alter table disorder.do_inventory  replica identity using index 
do_inventory_pkey;


The LOG message on the update goes away but the output of the test 
decoder plugin goes back to


table do_inventory: UPDATE: ii_id[int8]:251 ii_in_stock[int8]:5 
ii_reserved[int8]:144 ii_total_sold[int8]:911


Which I suspect means oldtuple is back to null




Greetings,

Andres Freund





--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] logical changeset generation v6.1

2013-10-08 Thread Steve Singer

On 10/07/2013 09:32 AM, Andres Freund wrote:

Todo:
* rename treat_as_catalog_table, after agreeing on the new name
* rename remaining timetravel function names
* restrict SuspendDecodingSnapshots usage to RelationInitPhysicalAddr,
   that ought to be enough.
* add InLogicalDecoding() function.
* throw away older data when reading xl_running_xacts records, to deal
   with immediate shutdowns/crashes


What is your current plan for decoding sequence updates?  Is this 
something that you were going to hold off on supporting until a future 
version?  (I know this was discussed a while ago but I don't remember 
where it stands now.)


From a Slony point of view this isn't a big deal, I can continue to 
capture sequence changes in sl_seqlog when I create each SYNC event and 
then just replicate the INSERT statements in sl_seqlog via logical 
decoding.  I can see why someone building a replication system not based 
on the concept of a SYNC would have a harder time with this.


I am guessing we would want to pass sequence operations to the plugins 
as we encounter the WAL for them out-of-band of any transaction.   This 
would mean that a set of operations like


begin;
insert into a (id) values(4);
insert into a (id) values(nextval('some_seq'));
commit

would be replayed on the replicas as
setval('some_seq',100);
begin;
insert into a (id) values (4);
insert into a (id) values (100);
commit;




--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] logical changeset generation v6.2

2013-10-07 Thread Steve Singer

On 10/03/2013 04:00 PM, Andres Freund wrote:
Ok, there were a couple of bugs because I thought mxacts wouldn't need 
to be supported. So far your testcase doesn't crash the database 
anymore - it spews some internal errors though, so I am not sure if 
it's entirely fixed for you. Thanks for testing and helping! I've 
pushed the changes to the git tree, they aren't squashed yet and 
there's some further outstanding stuff, so I won't repost the series 
yet. Greetings, Andres Freund 
When I run your updated version (from Friday, not what you posted today) 
against a more recent version of my slony changes I can get the test 
case to pass two thirds of the time.  The failures are due to an issue in 
slon itself that I need to fix.


I see lots of
0LOG:  tx with subtxn 58836

but they seem harmless.

Thanks



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] record identical operator - Review

2013-10-03 Thread Steve Singer

On 09/30/2013 09:08 AM, Kevin Grittner wrote:

Steve Singer st...@ssinger.info wrote:


How about

To support matching of rows which include elements without a default
  B-tree operator class, the following operators are defined for composite
  type comparison:
  <literal>*=</literal>,
  <literal>*&lt;&gt;</literal>,
  <literal>*&lt;</literal>,
  <literal>*&lt;=</literal>,
  <literal>*&gt;</literal>, and
  <literal>*&gt;=</literal>.

These operators compare the internal binary representation of the two
rows.  Two rows might have a different binary representation even
though comparing the two rows with the equality operator returns true.
The ordering of rows under these comparison operators is deterministic
but not otherwise meaningful.  These operators are used internally for
materialized views and might be useful for other specialized purposes
such as replication but are not intended to be generally useful for
writing queries.

I agree that's an improvement.  Thanks!



Are there any outstanding issues on this patch preventing it from being 
committed?
I think we have discussed this patch enough such that we now have 
consensus on proceeding with adding a record identical operator to SQL.

No one has objected to the latest names of the operators.

You haven't adjusted the patch to reduce the duplication between the 
equality and comparison functions; if you disagree with me and feel that 
doing so would increase the code complexity and be inconsistent with how 
we do things elsewhere, that is fine.


Steve




--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company






--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] logical changeset generation v6.2

2013-10-03 Thread Steve Singer

On 10/03/2013 12:38 PM, Andres Freund wrote:
Does your code use SELECT FOR UPDATE/SHARE on system or 
treat_as_catalog tables? Greetings, Andres Freund 


Yes.
It declares sl_table and sl_sequence and sl_set as catalog.

It does a
SELECT ..
from @NAMESPACE@.sl_table T, @NAMESPACE@.sl_set S,
pg_catalog.pg_class PGC, pg_catalog.pg_namespace PGN,
pg_catalog.pg_index PGX, pg_catalog.pg_class PGXC
where ... for update

in the code being executed by the 'set add table'.

(We also do select for update commands in many other places during 
cluster configuration commands)




--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] logical changeset generation v6.2

2013-10-01 Thread Steve Singer

On 09/30/2013 06:44 PM, Andres Freund wrote:

Hi,

The series from friday was a bit too buggy - obviously I was too
tired. So here's a new one:


With this series I've also noticed
#2  0x007741a7 in ExceptionalCondition (
conditionName=conditionName@entry=0x7c2908 "!(!(tuple->t_infomask & 
0x1000))", errorType=errorType@entry=0x7acc70 "FailedAssertion",

fileName=fileName@entry=0x91767e tqual.c,
lineNumber=lineNumber@entry=1608) at assert.c:54
54abort();


 0x007a4432 in HeapTupleSatisfiesMVCCDuringDecoding (
htup=0x10bfe48, snapshot=0x108b3d8, buffer=310) at tqual.c:1608
#4  0x0049d6b7 in heap_hot_search_buffer (tid=tid@entry=0x10bfe4c,
relation=0x7fbebbcd89c0, buffer=310, snapshot=0x10bfda0,
heapTuple=heapTuple@entry=0x10bfe48,
all_dead=all_dead@entry=0x7fff4aa3866f \001\370\375\v\001,
first_call=1 '\001') at heapam.c:1756
#5  0x004a8174 in index_fetch_heap (scan=scan@entry=0x10bfdf8)
at indexam.c:539
#6  0x004a82a8 in index_getnext (scan=0x10bfdf8,
direction=direction@entry=ForwardScanDirection) at indexam.c:622
#7  0x004a6fa9 in systable_getnext (sysscan=sysscan@entry=0x10bfd48)
at genam.c:343
#8  0x0076df40 in RelidByRelfilenode (reltablespace=0,
relfilenode=529775) at relfilenodemap.c:214
---Type return to continue, or q return to quit---
#9  0x00664ad7 in ReorderBufferCommit (rb=0x1082d98,
xid=optimized out, commit_lsn=4638756800, end_lsn=optimized out,
commit_time=commit_time@entry=433970378426176) at reorderbuffer.c:1320

In addition to some of the other ones I've posted about.



* fix pg_recvlogical makefile (Thanks Steve)
* fix two commits not compiling properly without later changes (Thanks Kevin)
* keep track of commit timestamps
* fix bugs with option passing in test_logical_decoding
* actually parse option values in test_decoding instead of just using the
   option name
* don't use anonymous structs in unions. That's compiler specific (msvc
   and gcc) before C11 on which we can't rely. That unfortunately will
   break output plugins because ReorderBufferChange need to qualify
   old/new tuples now
* improve error handling/cleanup in test_logical_decoding
* some minor cleanups

Patches attached, git tree updated.

Greetings,

Andres Freund







--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] record identical operator - Review

2013-09-29 Thread Steve Singer

On 09/28/2013 03:03 PM, Kevin Grittner wrote:

<para>
+To support matching of rows which include elements without a default
+B-tree operator class, the following operators are defined for composite
+type comparison:
+<literal>*=</literal>,
+<literal>*&lt;&gt;</literal>,
+<literal>*&lt;</literal>,
+<literal>*&lt;=</literal>,
+<literal>*&gt;</literal>, and
+<literal>*&gt;=</literal>.
+These operators are also used internally to maintain materialized views,
+and may be useful to replication software.  To ensure that all user
+visible changes are detected, even when the equality operator for the
+type treats old and new values as equal, the byte images of the stored
+data are compared.  While ordering is deterministic, it is not generally
+useful except to facilitate merge joins.  Ordering may differ between
+system architectures and major releases of
+<productname>PostgreSQL</productname>.
+   </para>


How about

 To support matching of rows which include elements without a default
B-tree operator class, the following operators are defined for composite
type comparison:
<literal>*=</literal>,
<literal>*&lt;&gt;</literal>,
<literal>*&lt;</literal>,
<literal>*&lt;=</literal>,
<literal>*&gt;</literal>, and
   <literal>*&gt;=</literal>.

These operators compare the internal binary representation of the two 
rows.   Two rows might have a different binary representation even 
though comparing the two rows with the equality operator returns true.   
The ordering of rows under these comparison operators is deterministic 
but not otherwise meaningful.  These operators are used internally for 
materialized views and might be useful for other specialized purposes 
such as replication but are not intended to be generally useful for 
writing queries.
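
For what it's worth, here is a concrete illustration of the binary-versus-
equality distinction the paragraph is trying to convey (a sketch only, and it
assumes the operators keep the *= / *<> spellings from the patch; numeric
display scale is part of the stored representation):

SELECT 1.0::numeric = 1.00::numeric;              -- true: equal as numbers
SELECT ROW(1.0::numeric) *= ROW(1.00::numeric);   -- false: byte images differ
SELECT ROW(1.0::numeric) *<> ROW(1.00::numeric);  -- true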





--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] logical changeset generation v6.1

2013-09-28 Thread Steve Singer

On 09/27/2013 11:44 AM, Andres Freund wrote:

I'm encountering a make error:
Gah. Lastminute changes. Always the same... Updated patch attached.

Greetings,

Andres Freund




I'm still encountering an error in the make.

make clean
.
.make[3]: Entering directory 
`/usr/local/src/postgresql/src/bin/pg_basebackup'

rm -f pg_basebackup pg_receivexlog pg_recvlogical(X) \
pg_basebackup.o pg_receivexlog.o pg_recvlogical.o \
receivelog.o streamutil.o
/bin/sh: 1: Syntax error: ( unexpected
make[3]: *** [clean] Error 2

I had to add quotes to the clean commands to make it work.









--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] logical changeset generation v6

2013-09-28 Thread Steve Singer

On 09/27/2013 05:18 PM, Andres Freund wrote:

Hi Steve,

On 2013-09-27 17:06:59 -0400, Steve Singer wrote:

I've determined that in this test the walsender seems to be hitting
this when it is decoding the transactions that are behind the slonik
commands to add tables to replication (set add table, set add sequence).
This is before the SUBSCRIBE SET is submitted.

I've also noticed something else that is strange (but might be unrelated).
If I stop my slon process and restart it I get messages like:

WARNING:  Starting logical replication from 0/a9321360
ERROR:  cannot stream from 0/A9321360, minimum is 0/A9320B00

Where 0/A9321360 was sent in the last packet my slon received from the
walsender before the restart.

Uh, that looks like I fumbled some comparison. Let me check.


I've further narrowed this down to something (or some combination of things)
that the _disorder_replica.altertableaddTriggers(1) stored function does
(i.e. @SLONYNAMESPACE@.altertableaddTriggers(int)).

Which is essentially
* Get an exclusive lock on sl_config_lock
* Get an exclusive lock on the user table in question
* create a trigger (the deny access trigger)
* create a truncate trigger
* create a deny truncate trigger

I am not yet able to replicate the error by issuing the same SQL commands
from psql, but I must be missing something.

I can replicate this when just using the test_decoding plugin.

Thanks. That should get me started with debugging. Unless it's possibly
fixed in the latest version, one bug fixed there might cause something
like this if the moon stands exactly right?


The latest version has NOT fixed the problem.

Also, I was a bit inaccurate in my previous descriptions. To clarify:

1.   I sometimes am getting that 'unexpected duplicate' error
2.   The 'set add table ' which triggers those functions that create and 
configure triggers is actually causing the walsender to hit the 
following assertion

#2  0x00773d47 in ExceptionalCondition (
conditionName=conditionName@entry=0x8cf400 "!(ent->cmin == 
change->tuplecid.cmin)", errorType=errorType@entry=0x7ab830 
"FailedAssertion",

fileName=fileName@entry=0x8cecc3 reorderbuffer.c,
lineNumber=lineNumber@entry=1162) at assert.c:54
#3  0x00665480 in ReorderBufferBuildTupleCidHash (txn=0x1b6e610,
rb=optimized out) at reorderbuffer.c:1162
#4  ReorderBufferCommit (rb=0x1b6e4f8, xid=optimized out,
commit_lsn=3461001952, end_lsn=optimized out) at reorderbuffer.c:1285
#5  0x0065f0f7 in DecodeCommit (xid=optimized out,
nsubxacts=optimized out, sub_xids=optimized out, ninval_msgs=16,
msgs=0x1b637c0, buf=0x7fff54d01530, buf=0x7fff54d01530, ctx=0x1adb928,
ctx=0x1adb928) at decode.c:477


I had added an assert(false) to the code where the 'unknown duplicate' 
error was logged to make spotting this easier but yesterday I didn't 
double check that I was hitting the assertion I added versus this other 
one.   I can't yet say if this is two unrelated issues or if I'd get to 
the 'unknown duplicate' message immediately after.






Greetings,

Andres Freund





--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] logical changeset generation v6

2013-09-27 Thread Steve Singer

On 09/26/2013 02:47 PM, Steve Singer wrote:



I've determined that in this test the walsender seems to be 
hitting this when it is decoding the transactions that are behind the 
slonik commands to add tables to replication (set add table, set add 
sequence).  This is before the SUBSCRIBE SET is submitted.


I've also noticed something else that is strange (but might be 
unrelated).  If I stop my slon process and restart it I get messages 
like:


WARNING:  Starting logical replication from 0/a9321360
ERROR:  cannot stream from 0/A9321360, minimum is 0/A9320B00

Where 0/A9321360 was sent in the last packet my slon received from the 
walsender before the restart.


If I force it to restart replication from 0/A9320B00 I see data rows that 
I appear to have already seen before the restart.
I think this is happening when I process the data for 0/A9320B00 but my 
slon is killed before it sends the feedback message. Is this expected?





I've further narrowed this down to something (or some combination of things) 
that the _disorder_replica.altertableaddTriggers(1) stored function does 
(i.e. @SLONYNAMESPACE@.altertableaddTriggers(int)).

Which is essentially
* Get an exclusive lock on sl_config_lock
* Get an exclusive lock on the user table in question
* create a trigger (the deny access trigger)
* create a truncate trigger
* create a deny truncate trigger

I am not yet able to replicate the error by issuing the same SQL 
commands from psql, but I must be missing something.
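
For reference, a rough SQL approximation of those steps, pieced together from
the trigger definitions shown earlier in this thread (the real work happens
inside Slony's altertableaddTriggers() function, so treat this only as a
sketch; on the origin the deny access trigger ends up disabled):

BEGIN;
-- the function takes these locks before touching the table
LOCK TABLE _disorder_replica.sl_config_lock IN ACCESS EXCLUSIVE MODE;
LOCK TABLE disorder.do_inventory IN ACCESS EXCLUSIVE MODE;

-- deny access trigger
CREATE TRIGGER _disorder_replica_denyaccess
    BEFORE INSERT OR UPDATE OR DELETE ON disorder.do_inventory
    FOR EACH ROW EXECUTE PROCEDURE
    _disorder_replica.denyaccess('_disorder_replica');

-- truncate logging trigger
CREATE TRIGGER _disorder_replica_truncatetrigger
    BEFORE TRUNCATE ON disorder.do_inventory
    FOR EACH STATEMENT EXECUTE PROCEDURE _disorder_replica.log_truncate('3');

-- deny truncate trigger
CREATE TRIGGER _disorder_replica_truncatedeny
    BEFORE TRUNCATE ON disorder.do_inventory
    FOR EACH STATEMENT EXECUTE PROCEDURE _disorder_replica.deny_truncate();
COMMIT;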


I can replicate this when just using the test_decoding plugin.











Greetings,

Andres Freund













--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] logical changeset generation v6

2013-09-26 Thread Steve Singer

On 09/25/2013 01:20 PM, Steve Singer wrote:

On 09/25/2013 11:08 AM, Andres Freund wrote:

On 2013-09-25 11:01:44 -0400, Steve Singer wrote:

On 09/17/2013 10:31 AM, Andres Freund wrote:
This patch set now fails to apply because of the commit Rename 
various

freeze multixact variables.
And I am even partially guilty for that patch...

Rebased patches attached.
While testing the logical replication changes against my WIP logical 
slony I

am sometimes getting error messages from the WAL sender of the form:
unexpected duplicate for tablespace  X relfilenode  X

Any chance you could provide a setup to reproduce the error?



The steps to build a setup that should reproduce this error are:

1.  I had to apply the attached patch on top of your logical replication 
branch so my pg_decode_init would know if it was being called as part 
of an INIT_REPLICATION or a START_REPLICATION.
Unless I have misunderstood something, you probably will want to merge 
this fix in.


2.  Get my WIP for adding logical support to slony from: 
g...@github.com:ssinger/slony1-engine.git branch logical_repl 
(4af1917f8418a)
(My code changes to slony are more prototype level code quality than 
production code quality)


3.
cd slony1-engine
./configure --with-pgconfigdir=/usr/local/pg94wal/bin (or whatever)
make
make install

4.   Grab the clustertest framework JAR from 
https://github.com/clustertest/clustertest-framework and build up a 
clustertest jar file


5.  Create a file
slony1-engine/clustertest/conf/java.conf
that contains the path to the above JAR file  as a shell variable 
assignment: ie
CLUSTERTESTJAR=/home/ssinger/src/clustertest/clustertest_git/build/jar/clustertest-coordinator.jar 



6.
cp clustertest/conf/disorder.properties.sample 
clustertest/conf/disorder.properties



edit disorder.properties to have the proper values for your 
environment.  All 6 databases can point at the same postgres instance; 
this test will only actually use 2 of them (so far).


7. Run the test
cd clustertest
./run_all_disorder_tests.sh

This involves having the slon connect to the walsender on the database 
test1 and replicate the data into test2 (which is a different database 
on the same postmaster)


If this setup seems like too much effort I can request one of the 
commitfest VMs from Josh and get everything set up there for you.


Steve


Any ideas?

I'll look into it. Could you provide any context to what you're doing
that's being decoded?




I've determined that in this test the walsender seems to be hitting 
this when it is decoding the transactions that are behind the slonik 
commands to add tables to replication (set add table, set add 
sequence).  This is before the SUBSCRIBE SET is submitted.


I've also noticed something else that is strange (but might be 
unrelated).  If I stop my slon process and restart it I get messages like:


WARNING:  Starting logical replication from 0/a9321360
ERROR:  cannot stream from 0/A9321360, minimum is 0/A9320B00

Where 0/A9321360 was sent in the last packet my slon received from the 
walsender before the restart.


If I force it to restart replication from 0/A9320B00 I see data rows that I 
appear to have already seen before the restart.
I think this is happening when I process the data for 0/A9320B00 but my 
slon is killed before it sends the feedback message. Is this expected?





Greetings,

Andres Freund









--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] logical changeset generation v6

2013-09-25 Thread Steve Singer

On 09/17/2013 10:31 AM, Andres Freund wrote:

This patch set now fails to apply because of the commit Rename various
freeze multixact variables.
And I am even partially guilty for that patch...

Rebased patches attached.


While testing the logical replication changes against my WIP logical 
slony I am sometimes getting error messages from the WAL sender of the form:

unexpected duplicate for tablespace  X relfilenode  X

The call stack is

HeapTupleSatisfiesMVCCDuringDecoding
heap_hot_search_buffer
index_fetch_heap
index_getnext
systable_getnext
RelidByRelfilenode
ReorderBufferCommit
 DecodeCommit
.
.
.


I am working off something based on your version 
e0acfeace6d695c229efd5d78041a1b734583431



Any ideas?


Greetings,

Andres Freund







--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] logical changeset generation v6

2013-09-25 Thread Steve Singer

On 09/25/2013 11:08 AM, Andres Freund wrote:

On 2013-09-25 11:01:44 -0400, Steve Singer wrote:

On 09/17/2013 10:31 AM, Andres Freund wrote:

This patch set now fails to apply because of the commit Rename various
freeze multixact variables.
And I am even partially guilty for that patch...

Rebased patches attached.

While testing the logical replication changes against my WIP logical slony I
am sometimes getting error messages from the WAL sender of the form:
unexpected duplicate for tablespace  X relfilenode  X

Any chance you could provide a setup to reproduce the error?



The steps to build a setup that should reproduce this error are:

1.  I had to apply the attached patch on top of your logical replication 
branch so my pg_decode_init would know if it was being called as part of 
an INIT_REPLICATION or a START_REPLICATION.
Unless I have misunderstood something, you probably will want to merge 
this fix in.


2.  Get my WIP for adding logical support to slony from: 
g...@github.com:ssinger/slony1-engine.git branch logical_repl  
(4af1917f8418a)
(My code changes to slony are more prototype level code quality than 
production code quality)


3.
cd slony1-engine
./configure --with-pgconfigdir=/usr/local/pg94wal/bin (or whatever)
make
make install

4.   Grab the clustertest framework JAR from 
https://github.com/clustertest/clustertest-framework and build up a 
clustertest jar file


5.  Create a file
slony1-engine/clustertest/conf/java.conf
that contains the path to the above JAR file  as a shell variable 
assignment: ie

CLUSTERTESTJAR=/home/ssinger/src/clustertest/clustertest_git/build/jar/clustertest-coordinator.jar

6.
cp clustertest/conf/disorder.properties.sample 
clustertest/conf/disorder.properties



edit disorder.properties to have the proper values for your 
environment.  All 6 databases can point at the same postgres instance; 
this test will only actually use 2 of them (so far).


7. Run the test
cd clustertest
./run_all_disorder_tests.sh

This involves having the slon connect to the walsender on the database 
test1 and replicate the data into test2 (which is a different database 
on the same postmaster)


If this setup seems like too much effort I can request one of the 
commitfest VMs from Josh and get everything set up there for you.


Steve


Any ideas?

I'll look into it. Could you provide any context to what you're doing
that's being decoded?

Greetings,

Andres Freund



From 9efae980c357d3b75c6d98204ed75b8d29ed6385 Mon Sep 17 00:00:00 2001
From: Steve Singer ssin...@ca.afilias.info
Date: Mon, 16 Sep 2013 10:21:39 -0400
Subject: [PATCH] set the init parameter to true when performing an INIT
 REPLICATION

---
 src/backend/replication/walsender.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index 2187d96..1b8f289 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -743,7 +743,7 @@ InitLogicalReplication(InitLogicalReplicationCmd *cmd)
 	sendTimeLine = ThisTimeLineID;
 
 	initStringInfo(output_message);
-	ctx = CreateLogicalDecodingContext(MyLogicalDecodingSlot, false, InvalidXLogRecPtr,
+	ctx = CreateLogicalDecodingContext(MyLogicalDecodingSlot, true, InvalidXLogRecPtr,
 	   NIL,	replay_read_page,
 	   WalSndPrepareWrite, WalSndWriteData);
 
-- 
1.7.10.4


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] logical changeset generation v6

2013-09-24 Thread Steve Singer

On 09/24/2013 11:21 AM, Andres Freund wrote:

Not having a consumer of the walsender interface included sounds like a
bad idea to me, even if it were only useful for testing. Now, you could
argue it should be in /contrib - and I wouldn't argue against that
except it shares code with the rest of src/bin/pg_basebackup.


+1 on pg_receivellog (or whatever better name we pick) being somewhere.
I found the pg_receivellog code very useful as an example and for 
debugging/development purposes.
It isn't something that I see as useful for the average user, and I think 
the use-cases it meets are closer to other things we usually put in /contrib.


Steve



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] logical changeset generation v6

2013-09-21 Thread Steve Singer

On 09/20/2013 06:33 AM, Andres Freund wrote:

Hi,




The points I find daunting are the semantics, like:
* How do we control whether a standby is allowed prevent WAL file
   removal. What if archiving is configured?
* How do we control whether a standby is allowed to peg xmin?
* How long do we peg an xmin/wal file removal if the standby is gone
* How does the userinterface look to remove a slot if a standby is gone
* How do we decide/control which commands use a slot in which cases?


I think we are going to want to be flexible enough to support users with 
a couple of different use-cases.
* Some people will want to keep xmin pegged and prevent WAL removal so a 
standby with a slot can always catch up, and wi
* Most people will want to say keep X megabytes of WAL (if needed by a 
slot that is behind) and keep xmin pegged so that the WAL can be consumed by a 
logical plugin.


I can see us also implementing a restore_command that the walsender 
could use to get archived segments but for logical replication xmin 
would still need to be low enough


I don't think the current patch set is incompatible with us later 
implementing any of the above. I'd rather see us focus on getting the 
core functionality committed and worry about a good interface for 
managing slots later.



Greetings, Andres Freund 


Steve



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] record identical operator - Review

2013-09-20 Thread Steve Singer

On 09/20/2013 08:38 AM, Kevin Grittner wrote:
Did you look at the record_eq and record_cmp functions which exist 
before this patch?  Is there a reason we should do it one way for the 
default operators and another way for this alternative? 
Do you think we should change record_eq and record_cmp to do things 
the way you suggest? 


I think the record_eq and record_cmp functions would be better if they 
shared code as well, but I don't think changing that should be part of 
this patch; you aren't otherwise touching those functions. I know some 
people dislike switch-based code and prefer duplication; my preference 
is to avoid duplication.


This seems like a bad idea to me.  I don't like that I get a type comparison
error only sometimes based on the values of the data in the column.  If I'm
a user testing code that compares two records with this operator and I test my
query with 1 value pair and it works, then I'd naively expect to get a
true or false on all other value sets of the same record type.


Again, this is a result of following the precedent set by the
existing record comparison operators.

test=# create table r1 (c1 int, c2 int);
CREATE TABLE
test=# create table r2 (c1 int, c2 int, c3 int);
CREATE TABLE
test=# insert into r1 values (1, 2);
INSERT 0 1
test=# insert into r2 values (3, 4, 5);
INSERT 0 1
test=# select * from r1 join r2 on r1 <> r2;
  c1 | c2 | c1 | c2 | c3
++++
   1 |  2 |  3 |  4 |  5
(1 row)

test=# update r1 set c1 = 3, c2 = 4;
UPDATE 1
test=# select * from r1 join r2 on r1 <> r2;
ERROR:  cannot compare record types with different numbers of columns

Would you be in favor of forcing a change to the record comparison
operators which have been in use for years?  If not, is there a
good reason to have them behave differently in this regard?

--
I hadn't picked up on the fact that you copied that part of the behaviour 
from the existing comparison operators.
No, we would need a really good reason for changing the behaviour of the 
comparison operators and I don't think we have that.  I agree that the 
binary identical operators should behave similarly to the existing 
comparison operators and error out on the column number mismatch after 
comparing the columns that are present in both.


Steve


Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company






--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] record identical operator - Review

2013-09-19 Thread Steve Singer

On 09/12/2013 06:27 PM, Kevin Grittner wrote:

Attached is a patch for a bit of infrastructure I believe to be
necessary for correct behavior of REFRESH MATERIALIZED VIEW
CONCURRENTLY as well as incremental maintenance of matviews.


Here is attempt at a review of the v1 patch.

There has been a heated discussion on list about whether we even want this 
type of operator. I will try to summarize it here.


The arguments against it are
* that a record_image_identical call on two records that returns false 
can still return true under the equality operator, and the records might 
(or might not) appear to be the same to users
* Once we expose this as an operator via SQL someone will find it and 
use it and then complain when we change things such as the internal 
representation of a datatype, which might break their queries
* The differences between = and record_image_identical might not be 
understood by everyone who finds the operator, leading to confusion
* Using a criterion other than equality (the = operator provided by the 
datatype) might cause problems if we later provide 'on change' triggers 
for materialized views


The arguments for this patch are
* We want the materialized view to return the same value as would be 
returned if the query were executed directly.  This not only means that 
the values should be the same according to a datatype's = operator but 
that they should appear the same 'to the eyeball'.
* Supporting the materialized view refresh check via SQL makes a lot of 
sense but doing that requires exposing something via SQL
* If we adequately document what we mean by record_image_identical and 
the operator we pick for this then users shouldn't be surprised at what 
they get if they use this


I think there is agreement that better (as in more obscure) operators 
than === and !== need to be picked, and we need to find a place in the 
user-docs to warn users of the behaviour of these operators.   Hannu has 
proposed


*==*   binary equal, surely very equal by any other definition as well
!==?  maybe not equal -- binary inequal, may still be equal by
other comparison methods

and no one has yet objected to this.  My vote would be to update the patch to 
use those operator names and then figure out where we can document them.  If it 
means we have to add a section describing the equal operator for records as well, then 
maybe we should do that so we can get on with things.


Code Review


The record_image_eq and record_image_cmp functions are VERY similar.  It seems 
to me that you could merge the meat of these functions into a common function 
(that isn't exposed via SQL) that then behaves differently in 2 or 3 spots 
based on a parameter that controls whether it detoasts the tuple for the compare or 
returns false on the equals.

Beyond that I don't see any issues with the code.

You call out a question about comparing two records that have a different 
number of columns: should it always be an error, or only an error when they 
match on the columns they do have in common?

The current behaviour is

 select * FROM r1,r2 where r1==r2;
 a | b | a | b | c
---+---+---+---+---
 1 | 1 | 1 | 2 | 1
(1 row)

update r2 set b=1;
UPDATE 1
test=# select * FROM r1,r2 where r2==r1;
ERROR:  cannot compare record types with different numbers of columns

This seems like a bad idea to me.  I don't like that I get a type comparison 
error only sometimes based on the values of the data in the column.  If I'm a 
user testing code that compares two records with this operator and I test my 
query with 1 value pair and it works, then I'd naively expect to get a true 
or false on all other value sets of the same record type.





--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] record identical operator

2013-09-18 Thread Steve Singer

On 09/18/2013 11:39 AM, Stephen Frost wrote:

* Kevin Grittner (kgri...@ymail.com) wrote:

= and <> aren't listed above even though they do a byte-for-byte
comparison because, well, I guess because we have chosen to treat
two UTF8 strings which produce the same set of glyphs using
different bytes as unequal.  :-/

I tend to side with Andres on this case actually- we're being asked to
store specific UTF8 bytes by the end user.  That is not the same as
treating two different numerics which are the same *number* as
different because they have different binary representations, which is
entirely an internal-to-postgres consideration.



The problem is that there are datatypes (citext, postgis, ...) that have 
defined = to return true when comparing two values that are different, 
not just stored differently.  Are you saying that matviews should 
update only when the = operator of the datatype returns false, and if 
people don't like this behaviour they should fix the datatypes?
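
A small example of the sort of case I mean, using citext (just an
illustration, not something from the patch):

CREATE EXTENSION citext;

-- citext's = is case-insensitive, so these compare equal even though the
-- stored, user-visible values differ
SELECT 'Foo'::citext = 'foo'::citext;             -- true
SELECT 'Foo'::citext::text, 'foo'::citext::text;  -- Foo | foo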




Thanks,

Stephen




--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] changeset generation v5-01 - Patches git tree

2013-07-05 Thread Steve Singer

On 07/05/2013 08:03 AM, Andres Freund wrote:

On 2013-06-27 21:52:03 -0700, Kevin Grittner wrote:
Tried that, too, and problem persists.  The log shows the last commit 
on your branch as 022c2da1873de2fbc93ae524819932719ca41bdb. 

Ok. I think I have a slight idea what's going on. Could you check
whether recompiling with -O0 fixes the issue?

There's something strange going on here, not sure whether it's just a
bug that's hidden, by either not doing optimizations or by adding more
elog()s, or wheter it's a compiler bug.


I am getting the same test failure Kevin is seeing.
This is on an x64 Debian wheezy machine with
gcc (Debian 4.7.2-5) 4.7.2

Building with -O0 results in passing tests.




Greetings,

Andres Freund





--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] changeset generation v5-01 - Patches git tree

2013-07-05 Thread Steve Singer

On 07/05/2013 09:34 AM, Andres Freund wrote:

On 2013-07-05 09:28:45 -0400, Steve Singer wrote:

On 07/05/2013 08:03 AM, Andres Freund wrote:

On 2013-06-27 21:52:03 -0700, Kevin Grittner wrote:

Tried that, too, and problem persists.  The log shows the last commit on
your branch as 022c2da1873de2fbc93ae524819932719ca41bdb.

Ok. I think I have a slight idea what's going on. Could you check
whether recompiling with -O0 fixes the issue?

There's something strange going on here, not sure whether it's just a
bug that's hidden, by either not doing optimizations or by adding more
elog()s, or wheter it's a compiler bug.

I am getting the same test failure Kevin is seeing.
This is on an x64 Debian wheezy machine with
gcc (Debian 4.7.2-5) 4.7.2

Building with -O0 results in passing tests.

Does the patch from
http://archives.postgresql.org/message-id/20130705132513.GB11640%40awork2.anarazel.de
or the git tree (which is rebased ontop of the mvcc catalog commit from
robert which needs some changes) fix it, even with optimizations?



Yes, with your latest git tree the tests pass with -O2.



Greetings,

Andres Freund





--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] changeset generation v5-01 - Patches git tree

2013-07-05 Thread Steve Singer

On 06/14/2013 06:51 PM, Andres Freund wrote:

The git tree is at:
git://git.postgresql.org/git/users/andresfreund/postgres.git branch 
xlog-decoding-rebasing-cf4
http://git.postgresql.org/gitweb/?p=users/andresfreund/postgres.git;a=shortlog;h=refs/heads/xlog-decoding-rebasing-cf4



We discussed issues related to passing options to the plugins a number 
of months ago ( 
http://www.postgresql.org/message-id/20130129015732.ga24...@awork2.anarazel.de)


I'm still having issues with the syntax you describe there.


START_LOGICAL_REPLICATION 1 0/0 (foo,bar)
 unexpected termination of replication stream: ERROR:  foo requires a 
parameter


START_LOGICAL_REPLICATION 1 0/0 (foo bar)
START_LOGICAL_REPLICATION 1 0/0 (foo bar): ERROR:  syntax error

START_LOGICAL_REPLICATION 1 0/0 (foo)
works okay



Steve





On 2013-06-15 00:48:17 +0200, Andres Freund wrote:

Overview of the attached patches:
0001: indirect toast tuples; required but submitted independently
0002: functions for testing; not required,
0003: (tablespace, filenode) syscache; required
0004: RelationMapFilenodeToOid: required, simple
0005: pg_relation_by_filenode() function; not required but useful
0006: Introduce InvalidCommandId: required, simple
0007: Adjust Satisfies* interface: required, mechanical,
0008: Allow walsender to attach to a database: required, needs review
0009: New GetOldestXmin() parameter; required, pretty boring
0010: Log xl_running_xact regularly in the bgwriter: required
0011: make fsync_fname() public; required, needs to be in a different file
0012: Relcache support for an Relation's primary key: required
0013: Actual changeset extraction; required
0014: Output plugin demo; not required (except for testing) but useful
0015: Add pg_receivellog program: not required but useful
0016: Add test_logical_decoding extension; not required, but contains
   the tests for the feature. Uses 0014
0017: Snapshot building docs; not required

Version v5-01 attached

Greetings,

Andres Freund







--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Fix conversion for Decimal arguments in plpython functions

2013-06-28 Thread Steve Singer

On 06/27/2013 05:04 AM, Szymon Guz wrote:
On 27 June 2013 05:21, Steve Singer st...@ssinger.info wrote:


On 06/26/2013 04:47 PM, Szymon Guz wrote:






Hi Steve,
thanks for the changes.

Your idea about common code for decimal and cdecimal is good, 
however not good enough. I like the idea of common code for decimal 
and cdecimal. But we need class name, not the value.


I've changed the code from str(x) to x.__class__.__name__ so the 
function prints class name (which is Decimal for both packages), not 
the value. We need to have the class name check. The value is returned 
by the function and is a couple of lines lower in the file.


patch is attached.



I think the value is more important than the name; I want the tests 
to make sure that the conversion is actually converting properly.  With 
your method of getting the class name without the module we can have both.


The attached patch should print the value and the class name but not the 
module name.


Steve



thanks,
Szymon








diff --git a/doc/src/sgml/plpython.sgml b/doc/src/sgml/plpython.sgml
new file mode 100644
index aaf758d..da27874
*** a/doc/src/sgml/plpython.sgml
--- b/doc/src/sgml/plpython.sgml
*** $$ LANGUAGE plpythonu;
*** 308,321 
/para
   /listitem
  
   listitem
para
!PostgreSQL typereal/type, typedouble/type,
!and typenumeric/type are converted to
!Python typefloat/type.  Note that for
!the typenumeric/type this loses information and can lead to
!incorrect results.  This might be fixed in a future
!release.
/para
   /listitem
  
--- 308,326 
/para
   /listitem
  
+ 	 listitem
+   para
+PostgreSQL typereal/type and typedouble/type are converted to
+Python typefloat/type.
+   /para
+  /listitem
+ 
   listitem
para
!PostgreSQL typenumeric/type is converted to
!Python typeDecimal/type. This type is imported from 
! 	   literalcdecimal/literal package if it is available. If cdecimal
! 	   cannot be used, then literaldecimal.Decimal/literal will be used.
/para
   /listitem
  
diff --git a/src/pl/plpython/expected/plpython_types.out b/src/pl/plpython/expected/plpython_types.out
new file mode 100644
index 4641345..e602336
*** a/src/pl/plpython/expected/plpython_types.out
--- b/src/pl/plpython/expected/plpython_types.out
*** CONTEXT:  PL/Python function test_type_
*** 213,248 
  (1 row)
  
  CREATE FUNCTION test_type_conversion_numeric(x numeric) RETURNS numeric AS $$
! plpy.info(x, type(x))
  return x
  $$ LANGUAGE plpythonu;
! /* The current implementation converts numeric to float. */
  SELECT * FROM test_type_conversion_numeric(100);
! INFO:  (100.0, type 'float')
  CONTEXT:  PL/Python function test_type_conversion_numeric
   test_type_conversion_numeric 
  --
! 100.0
  (1 row)
  
  SELECT * FROM test_type_conversion_numeric(-100);
! INFO:  (-100.0, type 'float')
  CONTEXT:  PL/Python function test_type_conversion_numeric
   test_type_conversion_numeric 
  --
!-100.0
  (1 row)
  
  SELECT * FROM test_type_conversion_numeric(50.5);
! INFO:  (50.5, type 'float')
  CONTEXT:  PL/Python function test_type_conversion_numeric
   test_type_conversion_numeric 
  --
   50.5
  (1 row)
  
  SELECT * FROM test_type_conversion_numeric(null);
! INFO:  (None, type 'NoneType')
  CONTEXT:  PL/Python function test_type_conversion_numeric
   test_type_conversion_numeric 
  --
--- 213,264 
  (1 row)
  
  CREATE FUNCTION test_type_conversion_numeric(x numeric) RETURNS numeric AS $$
! plpy.info(x,x.__class__.__name__)
  return x
  $$ LANGUAGE plpythonu;
! /* The current implementation converts numeric to Decimal. */
  SELECT * FROM test_type_conversion_numeric(100);
! INFO:  (Decimal('100'), 'Decimal')
  CONTEXT:  PL/Python function test_type_conversion_numeric
   test_type_conversion_numeric 
  --
!   100
  (1 row)
  
  SELECT * FROM test_type_conversion_numeric(-100);
! INFO:  (Decimal('-100'), 'Decimal')
  CONTEXT:  PL/Python function test_type_conversion_numeric
   test_type_conversion_numeric 
  --
!  -100
  (1 row)
  
  SELECT * FROM test_type_conversion_numeric(50.5);
! INFO:  (Decimal('50.5'), 'Decimal')
  CONTEXT:  PL/Python function test_type_conversion_numeric
   test_type_conversion_numeric 
  --
   50.5
  (1 row)
  
+ SELECT * FROM test_type_conversion_numeric(1234567890.0987654321);
+ INFO:  (Decimal('1234567890.0987654321'), 'Decimal')
+ CONTEXT:  PL/Python function test_type_conversion_numeric
+  test_type_conversion_numeric

Re: [HACKERS] [PATCH] Fix conversion for Decimal arguments in plpython functions

2013-06-26 Thread Steve Singer

On 06/26/2013 04:47 PM, Szymon Guz wrote:





Attached patch has all changes against trunk code.

There is added a function for conversion from Postgres numeric to 
Python Decimal. The Decimal type is taken from cdecimal.Decimal, if it 
is available. It is an external library, quite fast, but may be not 
available. If it is not available, then decimal.Decimal will be used. 
It is in standard Python library, however it is rather slow.


The initialization is done in the conversion function, the pointer to 
a proper Decimal constructor is stored as static variable inside the 
function and is lazy initialized.


The documentation is updated.



I've tested this version with Python 2.7 with and without cdecimal and 
also with 3.3, which has the faster decimal performance. It seems fine.


The v5 version of the patch makes only white-space changes to 
plpy_main.c; you should exclude those from the patch if you're making a new 
version (I have done this in the v6 version I'm attaching).



Tests for python 2 and 3 have been added. They work only with standard 
decimal.Decimal, as the type is printed in the *.out files. I think 
there is nothing we can do with that now.





I think we should make test_type_conversion_numeric do something 
that generates the same output in both cases, i.e.
plpy.info(str(x)).  The downside of having the test fail on installs with 
cdecimal installed is much greater than any benefit we get by ensuring 
that the type is really decimal.
I've attached a v6 version of the patch that does this; do you agree 
with my thinking?





Steve







regards,
Szymon




diff --git a/doc/src/sgml/plpython.sgml b/doc/src/sgml/plpython.sgml
new file mode 100644
index aaf758d..da27874
*** a/doc/src/sgml/plpython.sgml
--- b/doc/src/sgml/plpython.sgml
*** $$ LANGUAGE plpythonu;
*** 308,321 
/para
   /listitem
  
   listitem
para
!PostgreSQL typereal/type, typedouble/type,
!and typenumeric/type are converted to
!Python typefloat/type.  Note that for
!the typenumeric/type this loses information and can lead to
!incorrect results.  This might be fixed in a future
!release.
/para
   /listitem
  
--- 308,326 
/para
   /listitem
  
+ 	 listitem
+   para
+PostgreSQL typereal/type and typedouble/type are converted to
+Python typefloat/type.
+   /para
+  /listitem
+ 
   listitem
para
!PostgreSQL typenumeric/type is converted to
!Python typeDecimal/type. This type is imported from 
! 	   literalcdecimal/literal package if it is available. If cdecimal
! 	   cannot be used, then literaldecimal.Decimal/literal will be used.
/para
   /listitem
  
diff --git a/src/pl/plpython/expected/plpython_types.out b/src/pl/plpython/expected/plpython_types.out
new file mode 100644
index 4641345..46308ed
*** a/src/pl/plpython/expected/plpython_types.out
--- b/src/pl/plpython/expected/plpython_types.out
*** CONTEXT:  PL/Python function test_type_
*** 213,248 
  (1 row)
  
  CREATE FUNCTION test_type_conversion_numeric(x numeric) RETURNS numeric AS $$
! plpy.info(x, type(x))
  return x
  $$ LANGUAGE plpythonu;
! /* The current implementation converts numeric to float. */
  SELECT * FROM test_type_conversion_numeric(100);
! INFO:  (100.0, type 'float')
  CONTEXT:  PL/Python function test_type_conversion_numeric
   test_type_conversion_numeric 
  --
! 100.0
  (1 row)
  
  SELECT * FROM test_type_conversion_numeric(-100);
! INFO:  (-100.0, type 'float')
  CONTEXT:  PL/Python function test_type_conversion_numeric
   test_type_conversion_numeric 
  --
!-100.0
  (1 row)
  
  SELECT * FROM test_type_conversion_numeric(50.5);
! INFO:  (50.5, type 'float')
  CONTEXT:  PL/Python function test_type_conversion_numeric
   test_type_conversion_numeric 
  --
   50.5
  (1 row)
  
  SELECT * FROM test_type_conversion_numeric(null);
! INFO:  (None, type 'NoneType')
  CONTEXT:  PL/Python function test_type_conversion_numeric
   test_type_conversion_numeric 
  --
--- 213,264 
  (1 row)
  
  CREATE FUNCTION test_type_conversion_numeric(x numeric) RETURNS numeric AS $$
! plpy.info(str(x))
  return x
  $$ LANGUAGE plpythonu;
! /* The current implementation converts numeric to decimal.Decimal. */
  SELECT * FROM test_type_conversion_numeric(100);
! INFO:  100
  CONTEXT:  PL/Python function test_type_conversion_numeric
   test_type_conversion_numeric 
  --
!   100
  (1 row)
  
  SELECT * FROM test_type_conversion_numeric(-100);
! INFO:  -100
  CONTEXT:  PL/Python function test_type_conversion_numeric
   test_type_conversion_numeric 
  --
!  -100
  (1 row)
  
 

Re: [HACKERS] [PATCH] Fix conversion for Decimal arguments in plpython functions

2013-06-25 Thread Steve Singer

On 06/25/2013 06:42 AM, Szymon Guz wrote:




Hi,
I've attached a new patch. I've fixed all the problems you've found, 
except for the efficiency problem, which has been described in 
previous email.


thanks,
Szymon



This version of the patch addresses the issues I mentioned.  Thanks for 
looking into whether the performance issue is with our conversions to 
strings or inherent in the Python decimal type.  I guess we 
(PostgreSQL) can't do much about it.   A runtime switch to use cdecimal 
if it is available is a good idea, but I agree with you that could be a 
different patch.

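Just to spell out the fallback behaviour being discussed, here is what it
looks like from the Python side (a sketch only; the patch does the
equivalent lazily in its C conversion code):

DO $$
# prefer the fast C implementation when it is installed, otherwise fall
# back to the standard library module
try:
    from cdecimal import Decimal
except ImportError:
    from decimal import Decimal
plpy.info(Decimal('50.5'), Decimal('50.5').__class__.__name__)
$$ LANGUAGE plpythonu;
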

One minor thing I noticed in this round,

 PLy_elog(ERROR, "could not import module 'decimal'");

I think that should have decimal in double-quotes.

I think this patch is ready for a committer to look at it.

Steve







--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Fix conversion for Decimal arguments in plpython functions

2013-06-24 Thread Steve Singer

On 05/28/2013 04:41 PM, Szymon Guz wrote:

Hi,
I've got a patch.

This is for a plpython enhancement.

There is an item at the TODO list 
http://wiki.postgresql.org/wiki/Todo#Server-Side_Languages
Fix loss of information during conversion of numeric type to Python 
float


This patch uses a decimal.Decimal type from Python standard library 
for the plpthon function numeric argument instead of float.


Patch contains changes in code, documentation and tests.

Most probably there is something wrong, as this is my first Postgres 
patch :)




Thanks for contributing.

This patch applies cleanly against master and compiles with warnings

plpy_main.c: In function ‘PLy_init_interp’:
plpy_main.c:157:2: warning: ISO C90 forbids mixed declarations and code 
[-Wdeclaration-after-statement]
plpy_main.c:161:2: warning: ISO C90 forbids mixed declarations and code 
[-Wdeclaration-after-statement]


You can avoid this by moving the declaration of decimal and decimal_dict 
to be at the top of the function where mainmod is declared.


Also in this function you've introduced places where it returns with an 
error (the PLy_elog(ERROR...) calls) before decrementing the reference to 
mainmod. I think you can decrement the mainmod reference after the call 
to SetItemString, before your changes that import the Decimal module.



The patch works as expected; I am able to write Python functions that 
take numerics as arguments and work with them.  I can adjust the decimal 
context precision inside my function.


One concern I have is that this patch makes pl/python functions 
involving numerics more than 3 times as slow as before.



create temp table b(a numeric);
insert into b select generate_series(1,1);

create or replace function x(a numeric,b numeric) returns numeric as $$
if a==None:
  return b
return a+b
$$ language plpythonu;
create aggregate sm(basetype=numeric, sfunc=x,stype=numeric);


test=# select sm(a) from b;
sm
--
 50005000
(1 row)

Time: 565.650 ms

versus before the patch this was taking in the range of 80ms.

Would it be faster to call numeric_send instead of numeric_out and then 
convert the sequence of Int16's to a tuple of digits that can be passed 
into the Decimal constructor? I think this is worth trying and testing.
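
For what it's worth, Decimal can also be built straight from a
(sign, digits, exponent) tuple, so a binary path would not need an
intermediate string.  The digits numeric_send returns are base-10000,
though, so they would still have to be split into individual decimal
digits first.  A quick sketch of the constructor form:

DO $$
# illustration only: building a Decimal from a digit tuple instead of a string
from decimal import Decimal
d = Decimal((0, (1, 2, 3, 4), -2))   # sign, digits, exponent -> Decimal('12.34')
plpy.info(d)
$$ LANGUAGE plpythonu;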



Documentation
=
Your patched version of the docs say

  PostgreSQL <type>real</type>, <type>double</type>, and 
<type>numeric</type> are converted to
   Python <type>Decimal</type>. This type is imported 
from <literal>decimal.Decimal</literal>.



I don't think this is correct; as far as I can tell you're not changing 
the behaviour for PostgreSQL real and double types, they continue to use 
floating point.




<listitem>
<para>
   PostgreSQL <type>real</type> and <type>double</type> are converted to
   Python <type>float</type>.
</para>
</listitem>

<listitem>
<para>
   PostgreSQL <type>numeric</type> is converted to
   Python <type>Decimal</type>. This type is imported from
<literal>decimal.Decimal</literal>.
</para>
</listitem>


Maybe?

Steve



thanks,
Szymon







--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: [GENERAL] pg_upgrade fails, mismatch of relation OID - 9.1.9 to 9.2.4

2013-05-13 Thread Steve Singer

On 05/11/2013 04:58 PM, Bruce Momjian wrote:

On Fri, May 10, 2013 at 08:03:38PM -0400, Bruce Momjian wrote:

OK, this verifies that the table had a lot of DDL churn.  I have no idea
how to pursue this further because I am unsure how we are going to
replicate the operations performed on this table in the past, as you
mentioned much of this was before your time on the job.

Evan, I suggest you force a toast table on the table by doing:

ALTER TABLE bpm.setupinfo ADD COLUMN dummy TEXT;

Then drop the column.  That will create a toast table and will allow
pg_upgrade to succeed.

FYI, I did test adding a TEXT column and altering a column to TEXT on
Postgres 9.1, and both created a toast table.  I am still have no clues
about what would have caused the missing toast table.



I once saw a case where a varchar(x) column was changed to something 
larger by manually updating the catalog with an UPDATE statement on 
pg_attribute.atttypmod. Everything was fine until they tried pg_upgrade, 
which failed because the DDL to create the table from pg_dump with the 
larger column created a table that has a toast table, while the original 
table in the 8.3 cluster did not have a toast table.
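
Roughly what that (unsupported) catalog hack looks like; the table and column
names are made up, and for varchar atttypmod stores the declared length plus
VARHDRSZ (4), so varchar(200) becomes 204:

-- do not do this: widening a varchar by editing the catalog directly
-- bypasses the toast-table logic that ALTER TABLE would have run
UPDATE pg_attribute
   SET atttypmod = 200 + 4
 WHERE attrelid = 'public.some_table'::regclass
   AND attname  = 'some_column';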


Steve




--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Performance with the new security release?

2013-04-23 Thread Steve Singer

On 13-04-22 11:46 PM, Anne Rosset wrote:

Thanks Steve.
I have read that a fix has been put in release 9.2.3 for this issue. Is that 
right?
Thanks,
Anne


No, this issue is present in 9.0.13, 9.1.9 and 9.2.4 (as well as 9.2.3).

There is talk about fixing this for the next set of minor releases but I 
haven't yet seen a patch.



-Original Message-
From: Steve Singer [mailto:st...@ssinger.info]
Sent: Monday, April 22, 2013 4:35 PM
To: Anne Rosset
Cc: pgsql-hackers@postgresql.org
Subject: Re: [HACKERS] Performance with the new security release?

On 13-04-22 04:41 PM, Anne Rosset wrote:

Hi Steve,
Yes I see these messages in our log. Is there a solution to this?
Thanks,
Anne

A manual analyze of the affected tables should work and give you updated 
statistics.  If your problem is just statistics then that should help.
A manual vacuum will , unfortunately, behave like the auto-vacuum. The only way 
to get vacuum past this (until this issue is fixed) is for
vacuum to be able to get that exclusive lock.   If there are times of
the day your database is less busy you might have some luck turning off 
auto-vacuum on these tables  and doing manual vacuums during those times.





-Original Message-
From: Steve Singer [mailto:st...@ssinger.info]
Sent: Monday, April 22, 2013 1:26 PM
To: Anne Rosset
Cc: pgsql-hackers@postgresql.org
Subject: Re: [HACKERS] Performance with the new security release?

On 13-04-22 04:15 PM, Anne Rosset wrote:

Hi Steve,
Thanks for your reply.
We are now running  9.0.13. Before it was 9.0.7.
How can I find out if we are running into this issue: ie if
statistics are no longer being updated because analyze can't get the exclusive lock 
for truncation?

This issue is discussed in the thread
http://www.postgresql.org/message-id/CAMkU=1xYXOJp=jLAASPdSAqab-HwhA_tnRhy+JUe=4=b=v3...@mail.gmail.com

If you're seeing messages in your logs of the form:

automatic vacuum of table XXX.YYY cannot (re)acquire exclusive lock for truncate 
scan

then you might be hitting this issue.



I will dig into our logs to check the query times.
Thanks,
Anne

-Original Message-
From: Steve Singer [mailto:st...@ssinger.info]
Sent: Monday, April 22, 2013 12:59 PM
To: Anne Rosset
Cc: pgsql-hackers@postgresql.org
Subject: Re: [HACKERS] Performance with the new security release?

On 13-04-22 01:38 PM, Anne Rosset wrote:

Hi,
We are seeing some overall performance degradation in our application
 since we installed the security release. Other commits were also
done at the same time in the application so we don't know yet if the
 degradation has any relationship with the security release.

 While we are digging into this, I would like to know if it is possible
 that the release has some impact on performance. After reading this
 "It was created as a side effect of a refactoring effort to make
 establishing new connections to a PostgreSQL server faster, and the
 associated code more maintainable.", I am thinking it is quite possible.

 Please let me know. Thanks,

Exactly which version of PostgreSQL are you running? (we released security 
update releases for multiple PG versions).  Also which version were you running 
before?

There were some changes to analyze/vacuum in the previous set of minor releases 
that could cause performance issues in some cases (ie if statistics are no 
longer being updated because analyze can't get the
exclusive lock for truncation).   There might be other unintended
performance related changes.

Are all queries taking longer or only some?  Can you find any sort of pattern 
that might help narrow the issue?

Steve


 Anne











--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Performance with the new security release?

2013-04-23 Thread Steve Singer

On 13-04-23 10:04 AM, Anne Rosset wrote:

Thanks Steve.
I found this: http://www.postgresql.org/docs/current/static/release-9-2-3.html

Fix performance problems with autovacuum truncation in busy workloads (Jan 
Wieck)
Truncation of empty pages at the end of a table requires exclusive lock, but 
autovacuum was coded to fail (and release the table lock) when there are conflicting 
lock requests. Under load, it is easily possible that truncation would never occur, 
resulting in table bloat. Fix by performing a partial truncation, releasing the 
lock, then attempting to re-acquire the lock and continue. This fix also greatly 
reduces the average time before autovacuum releases the lock after a conflicting 
request arrives.

So that is not the fix?
No, that is the change that caused this problem.   That fix addresses a 
slightly different set of symptoms, where the truncate done as part of an 
auto-vacuum doesn't happen because the lock gets pre-empted.  An 
unintended/undesirable consequence of that fix is that if vacuum can't do 
the truncate stage because it can't obtain the lock in the first place, 
statistics don't get updated.





(Sorry to ask a second time but I really need to make sure).
Thanks,
Anne



-Original Message-


From: Steve Singer [mailto:st...@ssinger.info]
Sent: Tuesday, April 23, 2013 6:33 AM
To: Anne Rosset
Cc: pgsql-hackers@postgresql.org
Subject: Re: [HACKERS] Performance with the new security release?

On 13-04-22 11:46 PM, Anne Rosset wrote:

Thanks Steve.
I have read that a fix has been put in release 9.2.3 for this issue. Is that 
right?
Thanks,
Anne

No, this issue is present in 9.0.13, 9.1.9 and 9.2.4 (as well as 9.2.3).

There is talk about fixing this for the next set of minor releases but I 
haven't yet seen a patch.


-Original Message-
From: Steve Singer [mailto:st...@ssinger.info]
Sent: Monday, April 22, 2013 4:35 PM
To: Anne Rosset
Cc: pgsql-hackers@postgresql.org
Subject: Re: [HACKERS] Performance with the new security release?

On 13-04-22 04:41 PM, Anne Rosset wrote:

Hi Steve,
Yes I see these messages in our log. Is there a solution to this?
Thanks,
Anne

A manual analyze of the affected tables should work and give you updated 
statistics.  If your problem is just statistics then that should help.
A manual vacuum will, unfortunately, behave like the auto-vacuum. The only way 
to get vacuum past this (until this issue is fixed) is for vacuum to be able
to get that exclusive lock.   If there are times of the day your database is
less busy you might have some luck turning off auto-vacuum on these tables
and doing manual vacuums during those times.





-Original Message-
From: Steve Singer [mailto:st...@ssinger.info]
Sent: Monday, April 22, 2013 1:26 PM
To: Anne Rosset
Cc: pgsql-hackers@postgresql.org
Subject: Re: [HACKERS] Performance with the new security release?

On 13-04-22 04:15 PM, Anne Rosset wrote:

Hi Steve,
Thanks for your reply.
We are now running  9.0.13. Before it was 9.0.7.
How can I find out if we are running into this issue: ie if
statistics are no longer being updated because analyze can't get the exclusive lock 
for truncation?

This issue is discussed in the thread
http://www.postgresql.org/message-id/CAMkU=1xYXOJp=jLAASPdSAqab-HwhA_tnRhy+JUe=4=b=v3...@mail.gmail.com

If you're seeing messages in your logs of the form:

automatic vacuum of table XXX.YYY cannot (re)acquire exclusive lock for truncate 
scan

then you might be hitting this issue.



I will dig into our logs to check the query times.
Thanks,
Anne

-Original Message-
From: Steve Singer [mailto:st...@ssinger.info]
Sent: Monday, April 22, 2013 12:59 PM
To: Anne Rosset
Cc: pgsql-hackers@postgresql.org
Subject: Re: [HACKERS] Performance with the new security release?

On 13-04-22 01:38 PM, Anne Rosset wrote:

Hi,
We are seeing some overall performance degradation in our application
  since we installed the security release. Other commits were
also done at the same time in the application so we don't know yet if the
  degradation has any relationship with the security release.

  While we are digging into this, I would like to know if it is possible
  that the release has some impact on performance. After reading this
  "It was created as a side effect of a refactoring effort to make
  establishing new connections to a PostgreSQL server faster, and the
  associated code more maintainable.", I am thinking it is quite possible.

  Please let me know. Thanks,

Exactly which version of PostgreSQL are you running? (we released security 
update releases for multiple PG versions).  Also which version were you running 
before?

There were some changes to analyze/vacuum in the previous set of minor releases 
that could cause performance issues in some cases (ie if statistics are no 
longer being updated because analyze can't get the
exclusive lock for truncation).   There might be other unintended
performance related changes

Re: [HACKERS] Performance with the new security release?

2013-04-22 Thread Steve Singer

On 13-04-22 01:38 PM, Anne Rosset wrote:

Hi,
We are seeing some overall performance degradation in our application
  since we installed the security release. Other commits were also done
at the same time in the application so we don't know yet if the
  degradation has any relationship with the security release.

  While we are digging into this, I would like to know if it is possible
  that the release has some impact on performance. After reading this
 "It was created as a side effect of a refactoring effort to make
 establishing new connections to a PostgreSQL server faster, and the
 associated code more maintainable.", I am thinking it is quite possible.

  Please let me know. Thanks,


Exactly which version of PostgreSQL are you running? (we released 
security update releases for multiple PG versions).  Also which version 
were you running before?


There were some changes to analyze/vacuum in the previous set of minor 
releases that could cause performance issues in some cases (ie if 
statistics are no longer being updated because analyze can't get the 
exclusive lock for truncation).   There might be other unintended 
performance related changes.


Are all queries taking longer or only some?  Can you find any sort of 
pattern that might help narrow the issue?


Steve


  Anne






--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Performance with the new security release?

2013-04-22 Thread Steve Singer

On 13-04-22 04:15 PM, Anne Rosset wrote:

Hi Steve,
Thanks for your reply.
We are now running  9.0.13. Before it was 9.0.7.
How can I find out if we are running into this issue: ie if statistics are no 
longer being updated because analyze can't get the
exclusive lock for truncation?


This issue is discussed in the thread 
http://www.postgresql.org/message-id/CAMkU=1xYXOJp=jLAASPdSAqab-HwhA_tnRhy+JUe=4=b=v3...@mail.gmail.com


If you're seeing messages in your logs of the form:

automatic vacuum of table XXX.YYY cannot (re)acquire exclusive lock for 
truncate scan


then you might be hitting this issue.
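
Another hedged way to check (my_table is just a placeholder for one of your
busy tables) is to look at when statistics were last refreshed:

SELECT relname, last_analyze, last_autoanalyze, last_vacuum, last_autovacuum
FROM pg_stat_user_tables
WHERE relname = 'my_table';

If last_autoanalyze and last_autovacuum stay stuck at old timestamps on a
table that is clearly getting traffic, that would be consistent with this issue.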



I will dig into our logs to check the query times.
Thanks,
Anne

-Original Message-
From: Steve Singer [mailto:st...@ssinger.info]
Sent: Monday, April 22, 2013 12:59 PM
To: Anne Rosset
Cc: pgsql-hackers@postgresql.org
Subject: Re: [HACKERS] Performance with the new security release?

On 13-04-22 01:38 PM, Anne Rosset wrote:

Hi,
We are seeing some overall performance degradation in our application
   since we installed the security release. Other commits were also
done at the same time in the application so we don't know yet if the
   degradation has any relationship with the security release.

   While we are digging into this, I would like to know if it is possible
   that the release has some impact on performance. After reading this
   "It was created as a side effect of a refactoring effort to make
   establishing new connections to a PostgreSQL server faster, and the
   associated code more maintainable.", I am thinking it is quite possible.

   Please let me know. Thanks,

Exactly which version of PostgreSQL are you running? (we released security 
update releases for multiple PG versions).  Also which version were you running 
before?

There were some changes to analyze/vacuum in the previous set of minor releases 
that could cause performance issues in some cases (ie if statistics are no 
longer being updated because analyze can't get the
exclusive lock for truncation).   There might be other unintended
performance related changes.

Are all queries taking longer or only some?  Can you find any sort of pattern 
that might help narrow the issue?

Steve


   Anne









--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Performance with the new security release?

2013-04-22 Thread Steve Singer

On 13-04-22 04:41 PM, Anne Rosset wrote:

Hi Steve,
Yes I see these messages in our log. Is there a solution to this?
Thanks,
Anne


A manual analyze of the affected tables should work and give you updated 
statistics.  If your problem is just statistics then that should help.
A manual vacuum will, unfortunately, behave like the auto-vacuum. The 
only way to get vacuum past this (until this issue is fixed) is for 
vacuum to be able to get that exclusive lock.   If there are times of 
the day your database is less busy you might have some luck turning off 
auto-vacuum on these tables and doing manual vacuums during those times.
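
Very roughly, that workaround could look like this (my_table is a placeholder;
run the manual pass during a quiet window):

-- refresh planner statistics; this does not need the truncate lock
ANALYZE my_table;

-- optionally keep autovacuum away from the table, vacuum it manually
-- off-peak, then re-enable autovacuum afterwards
ALTER TABLE my_table SET (autovacuum_enabled = false);
VACUUM ANALYZE my_table;
ALTER TABLE my_table SET (autovacuum_enabled = true);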






-Original Message-
From: Steve Singer [mailto:st...@ssinger.info]
Sent: Monday, April 22, 2013 1:26 PM
To: Anne Rosset
Cc: pgsql-hackers@postgresql.org
Subject: Re: [HACKERS] Performance with the new security release?

On 13-04-22 04:15 PM, Anne Rosset wrote:

Hi Steve,
Thanks for your reply.
We are now running  9.0.13. Before it was 9.0.7.
How can I find out if we are running into this issue: ie if
statistics are no longer being updated because analyze can't get the exclusive lock 
for truncation?

This issue is discussed in the thread
http://www.postgresql.org/message-id/CAMkU=1xYXOJp=jLAASPdSAqab-HwhA_tnRhy+JUe=4=b=v3...@mail.gmail.com

If you're seeing messages in your logs of the form:

automatic vacuum of table XXX.YYY cannot (re)acquire exclusive lock for truncate 
scan

then you might be hitting this issue.



I will dig into our logs to check the query times.
Thanks,
Anne

-Original Message-
From: Steve Singer [mailto:st...@ssinger.info]
Sent: Monday, April 22, 2013 12:59 PM
To: Anne Rosset
Cc: pgsql-hackers@postgresql.org
Subject: Re: [HACKERS] Performance with the new security release?

On 13-04-22 01:38 PM, Anne Rosset wrote:

Hi,
We are seeing some overall performance degradation in our application
since we installed the security release. Other commits were also
done at the same time in the application so we don't know yet if the
degradation has any relationship with the security release.

While we are digging into this, I would like to know if it is possible
that the release has some impact on performance. After reading this
"It was created as a side effect of a refactoring effort to make
establishing new connections to a PostgreSQL server faster, and the
associated code more maintainable.", I am thinking it is quite possible.

Please let me know. Thanks,

Exactly which version of PostgreSQL are you running? (we released security 
update releases for multiple PG versions).  Also which version were you running 
before?

There were some changes to analyze/vacuum in the previous set of minor releases 
that could cause performance issues in some cases (ie if statistics are no 
longer being updated because analyze can't get the
exclusive lock for truncation).   There might be other unintended
performance related changes.

Are all queries taking longer or only some?  Can you find any sort of pattern 
that might help narrow the issue?

Steve


Anne











--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

