Re: [HACKERS] Should pg_current_wal_location() become pg_current_wal_lsn()

2017-04-16 Thread Kyotaro HORIGUCHI
At Sat, 15 Apr 2017 12:56:32 -0400, Robert Haas  wrote 
in 
> On Fri, Apr 14, 2017 at 6:52 PM, Tom Lane  wrote:
> > Peter Eisentraut  writes:
> >> If we're talking about making things easier to understand, wouldn't a
> >> random user rather know what a WAL "location" is instead of a WAL "LSN"?
> >
> > I wouldn't object to standardizing on "location" instead of "lsn" in the
> > related function and column names.  What I don't like is using different
> > words for the same thing.
> 
> The case mentioned in the subject of this thread has been using the
> word "location" since time immemorial.  It's true that we've already
> renamed it (xlog -> wal) in this release, so if we want to standardize
> on lsn, now's certainly the time to do it.  I'm worried that
> pg_current_wal_lsn() is an identifier composed almost entirely of
> abbreviations and therefore possibly just as impenetrable as
> qx_current_pfq_dnr(), but maybe we should assume that LSN is a term of
> art with which knowledgeable users are required to be familiar, much
> as we are doing for "WAL".
> 
> It appears, from grepping the 9.6 version of pg_proc.h, that both lsn
> and location have some historical precedent.

I'd better to have replied here. The detail is in my reply on
another brandh of this thread.

https://www.postgresql.org/message-id/20170417.143937.232025253.horiguchi.kyot...@lab.ntt.co.jp

After all, "location" seems better to me in user interface. We
could rename almost all of %lsn% names into %location% except
pg_lsn oprators as long as it doesn't become too long or complex.

One annoyance is the historical function pg_xlog_location_diff(),
which is currently just an alias of pg_lsn_mi. It is
substantially an operator, but 'pg_wal_lsn_diff()' is so far from
the historical name that it becomes totally useless.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] pgbench tap tests & minor fixes

2017-04-16 Thread Fabien COELHO


Hello,

When developping new features for pgbench, I usually write some tests 
which are lost when the feature is committed. Given that I have submitted 
some more features and that part of pgbench code may be considered for 
sharing with pgsql, I think that improving the abysmal state of tests 
would be a good move.


The attached patch:

(1) extends the existing perl tap test infrastructure with methods to test 
pgbench, i.e. "pgbench" which runs a pgbench test and "pgbench_likes" 
which allows to check for expectations.


(2) reuse this infrastructure for the prior existing test about concurrent 
inserts.


(3) add a lot of new very small tests so that coverage jumps from very low 
to over 90% for source files. I think that derived files (exprparse.c, 
exprscan.c) should be removed from coverage analysis.


Previous coverage status:

 exprparse.y0.0 % 0 / 770.0 %0 / 8
 exprscan.l 0.0 % 0 / 102   0.0 %0 / 8
 pgbench.c  28.3 %  485 / 1716  43.1 %  28 / 65

New status:

 exprparse.y96.1 %73 / 76   100.0 %  8 / 8
 exprscan.l 92.8 %90 / 97   100.0 %  8 / 8
 pgbench.c  90.4 %  1542 / 1705  96.9 % 63 / 65

The test runtime is about doubled on my laptop, which is not too bad given 
the coverage achieved.


(4) fixes a two minor issues. These fixes may be considered for 
backpatching to 10, although I doubt anyone will complain, so I would not 
bother. Namely:


 - the -t/-R/-L combination was not working properly, fix that
   by moving client statistics in processXactStats, adjust some
   formula, and add a few comments for details I had to discover.

 - add a check that --progress-timestamp => --progress

I'm unsure of the portability of some of the tests (\shell and \setshell), 
especially on Windows. If there is an issue, these test will have to be 
skipped on this platform.


Some of the tests may fail with a very low probability (eg 1/2**84?).

--
Fabien.diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index ae36247..2128418 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -229,7 +229,7 @@ typedef struct SimpleStats
 typedef struct StatsData
 {
 	time_t		start_time;		/* interval start time, for aggregates */
-	int64		cnt;			/* number of transactions */
+	int64		cnt;			/* number of transactions, including skipped */
 	int64		skipped;		/* number of transactions skipped under --rate
  * and --latency-limit */
 	SimpleStats latency;
@@ -329,7 +329,7 @@ typedef struct
 	bool		prepared[MAX_SCRIPTS];	/* whether client prepared the script */
 
 	/* per client collected stats */
-	int64		cnt;			/* transaction count */
+	int64		cnt;			/* client transaction count, for -t */
 	int			ecnt;			/* error count */
 } CState;
 
@@ -2045,7 +2045,9 @@ doCustom(TState *thread, CState *st, StatsData *agg)
 	if (INSTR_TIME_IS_ZERO(now))
 		INSTR_TIME_SET_CURRENT(now);
 	now_us = INSTR_TIME_GET_MICROSEC(now);
-	while (thread->throttle_trigger < now_us - latency_limit)
+	while (thread->throttle_trigger < now_us - latency_limit &&
+		   /* with -t, do not overshoot */
+		   (nxacts <= 0 || st->cnt < nxacts))
 	{
 		processXactStats(thread, st, , true, agg);
 		/* next rendez-vous */
@@ -2053,6 +2055,12 @@ doCustom(TState *thread, CState *st, StatsData *agg)
 		thread->throttle_trigger += wait;
 		st->txn_scheduled = thread->throttle_trigger;
 	}
+
+	if (nxacts > 0 && st->cnt >= nxacts)
+	{
+		st->state = CSTATE_FINISHED;
+		break;
+	}
 }
 
 st->state = CSTATE_THROTTLE;
@@ -2366,13 +2374,13 @@ doCustom(TState *thread, CState *st, StatsData *agg)
 
 /*
  * transaction finished: calculate latency and log the
- * transaction
+ * transaction.
  */
 if (progress || throttle_delay || latency_limit ||
 	per_script_stats || use_log)
 	processXactStats(thread, st, , false, agg);
 else
-	thread->stats.cnt++;
+	thread->stats.cnt++, st->cnt++;
 
 if (is_connect)
 {
@@ -2381,7 +2389,6 @@ doCustom(TState *thread, CState *st, StatsData *agg)
 	INSTR_TIME_SET_ZERO(now);
 }
 
-++st->cnt;
 if ((st->cnt >= nxacts && duration <= 0) || timer_exceeded)
 {
 	/* exit success */
@@ -2530,6 +2537,7 @@ processXactStats(TState *thread, CState *st, instr_time *now,
 		lag = INSTR_TIME_GET_MICROSEC(st->txn_begin) - st->txn_scheduled;
 	}
 
+	/* thread stats */
 	if (progress || throttle_delay || latency_limit)
 	{
 		accumStats(>stats, skipped, latency, lag);
@@ -2539,8 +2547,12 @@ processXactStats(TState *thread, CState *st, instr_time *now,
 			thread->latency_late++;
 	}
 	else
+		/* just count */
 		thread->stats.cnt++;
 
+	/* client stat is just counting */
+	st->cnt ++;
+
 	if (use_log)
 		doLog(thread, st, agg, skipped, latency, lag);
 
@@ -3118,7 +3130,7 @@ process_backslash_command(PsqlScanState sstate, const char *source)
 	/* Save line */
 	

Re: [HACKERS] Shouldn't duplicate addition to publication be a no-op?

2017-04-16 Thread Robert Haas
On Sun, Apr 16, 2017 at 11:58 PM, Amit Langote
 wrote:
> By the way, Petr said in the other thread that it could be made a no-op
> (presumably without requiring IF NOT EXISTS) on the grounds that
> membership of table in publication is "soft object" or "property" rather
> than real object.

I don't find that argument terribly convincing.

The nearest parallel that we have for this is probably:

ALTER EXTENSION name ADD member_object;
ALTER EXTENSION name DROP member_object;

I would guess this ought to work similarly.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Should pg_current_wal_location() become pg_current_wal_lsn()

2017-04-16 Thread Kyotaro HORIGUCHI
At Fri, 14 Apr 2017 18:26:37 -0400, Peter Eisentraut 
 wrote in 
<052f4ce0-159a-1666-f136-91977d326...@2ndquadrant.com>
> On 4/14/17 04:28, Kyotaro HORIGUCHI wrote:
> > =# select distinct attname from pg_attribute where attname like '%lsn%';
> >attname   
> > -
> >  confirmed_flush_lsn
> >  latest_end_lsn
> >  local_lsn
> >  receive_start_lsn
> >  received_lsn
> >  remote_lsn
> >  restart_lsn
> >  srsublsn
> > (8 rows)
> > 
> > 
> > Feature is already frozen, but this seems inconsistent a bit..
> 
> I think these are all recently added for logical replication.  We could
> rename them to _location.
> 
> I'm not a fan of renaming everything the opposite way.

I don't particulary care for either. What is most unpleasant here
for me is the inconsistency among several replication-related
tables. Logical replication stuff is using LSN and physical sutff
has been using location, but pg_stat_wal_receiver is using
LSN. pg_replication_slots as the common stuff is using LSN.

"Location" fits attribute names since the table name implies that
the location is "LSN".

On the other hand nothing suggests such implication on function
names. So only "wal_location" or "lsn" can be used in function
names. pg_current_wal_* requires to be "wal_lsn" even using LSN
since "LSN" itself doesn't imply WAL files being
written. "wal_lsn" looks somewhat too-much, though.

Columns:
=# select attrelid::regclass || '.' || attname from pg_attribute where attname 
like '%location%' or attname like '%lsn%';
 ?column? 
--
 pg_subscription_rel.srsublsn
 pg_stat_replication.sent_location
 pg_stat_replication.write_location
 pg_stat_replication.flush_location
 pg_stat_replication.replay_location
 pg_stat_wal_receiver.receive_start_lsn
 pg_stat_wal_receiver.received_lsn
 pg_stat_wal_receiver.latest_end_lsn
 pg_stat_subscription.received_lsn
 pg_stat_subscription.latest_end_lsn
 pg_replication_slots.restart_lsn
 pg_replication_slots.confirmed_flush_lsn
 pg_replication_origin_status.remote_lsn
 pg_replication_origin_status.local_lsn


pg_subscription_rel has a bit different naming convention from
others. But I'm not sure that involving it in the unification is
good since it doesn't seem to be explicitly exposed to users.


=# select proname from pg_proc where proname like '%location%' or proname like 
'%lsn%';
proname 

 pg_tablespace_location ## This is irrelevant
 pg_current_wal_location
 pg_current_wal_insert_location
 pg_current_wal_flush_location
 pg_wal_location_diff
 pg_last_wal_receive_location
 pg_last_wal_replay_location
 pg_lsn_mi
 pg_lsn_in
 pg_lsn_out
 pg_lsn_lt
 pg_lsn_le
 pg_lsn_eq
 pg_lsn_ge
 pg_lsn_gt
 pg_lsn_ne
 pg_lsn_recv
 pg_lsn_send
 pg_lsn_cmp
 pg_lsn_hash

I think we can use "location" for all attributes and functions
except pg_lsn operators.

The last annoyance would be pg_wal_location_diff(). This exists
only for backward compatibility but the name 'pg_wal_lsn_diff' is
already so far from the original name that it becomes totally
useless.

Any more thoughts?

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Inadequate parallel-safety check for SubPlans

2017-04-16 Thread Robert Haas
On Mon, Apr 17, 2017 at 1:24 AM, Tom Lane  wrote:
> Amit Kapila  writes:
>> On Sun, Apr 16, 2017 at 9:30 PM, Tom Lane  wrote:
>>> FYI, I have this on my to-look-at list, and expect to fix it before Robert
>>> returns from vacation.
>
>> Let me know if an initial patch by someone else can be helpful?
>
> Sure, have a go at it.  I won't get to this for a day or so ...

Thanks to both of you.  I appreciate the help.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Inadequate parallel-safety check for SubPlans

2017-04-16 Thread Tom Lane
Amit Kapila  writes:
> On Sun, Apr 16, 2017 at 9:30 PM, Tom Lane  wrote:
>> FYI, I have this on my to-look-at list, and expect to fix it before Robert
>> returns from vacation.

> Let me know if an initial patch by someone else can be helpful?

Sure, have a go at it.  I won't get to this for a day or so ...

> I was not completely sure whether we need to pass any sort of whitelist
> of params for parallel safety, but if you think that is the better way
> to go, I can give it a try.

There's certainly more than one way to do it.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Allowing extended stats on foreign and partitioned tables

2017-04-16 Thread Ashutosh Bapat
On Sat, Apr 15, 2017 at 5:27 AM, Peter Eisentraut
 wrote:
> On 4/10/17 06:18, Ashutosh Bapat wrote:
>> This isn't exactly about this particular thread. But I noticed, that
>> after we introduced RELKIND_PARTITIONED_TABLE, we required to change a
>> number of conditions to include this relkind. We missed some places in
>> initial commits and fixed those later. I am wondering whether we
>> should creates macros clubbing relevant relkinds together based on the
>> purpose of the tests e.g. IS_RELKIND_HAS_STORAGE(). When a new relkind
>> is added, one can examine these macros to check whether the new
>> relkind fits in the given macro. If all those macros are placed
>> together, there is a high chance that we will not miss any place in
>> the initial commit itself.
>
> I think this is worth a try.

Thanks, will try to come up with something ASAP.

-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Inadequate parallel-safety check for SubPlans

2017-04-16 Thread Amit Kapila
On Sun, Apr 16, 2017 at 9:30 PM, Tom Lane  wrote:
> Noah Misch  writes:
>> On Wed, Apr 12, 2017 at 02:41:09PM -0400, Tom Lane wrote:
>>> This is 100% wrong.  It's failing to recurse into the subexpressions of
>>> the SubPlan, which could very easily include parallel-unsafe function
>>> calls.  Example:
>
>> The above-described topic is currently a PostgreSQL 10 open item.  Robert,
>> since you committed the patch believed to have created it, you own this open
>> item.  If some other commit is more relevant or if this does not belong as a
>> v10 open item, please let us know.
>
> FYI, I have this on my to-look-at list, and expect to fix it before Robert
> returns from vacation.
>

Let me know if an initial patch by someone else can be helpful?  I was
not completely sure whether we need to pass any sort of whitelist of
params for parallel safety, but if you think that is the better way to
go, I can give it a try.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Variable substitution in psql backtick expansion

2017-04-16 Thread Pavel Stehule
2017-04-17 1:00 GMT+02:00 Fabien COELHO :

>
> I checked the pgbench expr related code.
>>
>
> 2. move pgbench expressions to separate module
>>
>
> Probably already existing "fe_utils".
>
> 3. teach pgbench expressions booleans
>>
>
> See https://commitfest.postgresql.org/14/985/


so some work is done :)

Regards

Pavel


>
>
> --
> Fabien.
>


Re: [HACKERS] Shouldn't duplicate addition to publication be a no-op?

2017-04-16 Thread Amit Langote
On 2017/04/15 8:53, Peter Eisentraut wrote:
> On 4/13/17 06:23, Amit Langote wrote:
>> create table bar (a int);
>> create publication mypub for table bar;
>> alter publication mypub add table bar;
>> ERROR:  relation "bar" is already member of publication "mypub"
>>
>> 2nd command should be a no-op, IMHO.
> 
> We generally require a IF NOT EXISTS in those situations.

Hmm, okay.  So I guess the grammar support will be added later.

By the way, Petr said in the other thread that it could be made a no-op
(presumably without requiring IF NOT EXISTS) on the grounds that
membership of table in publication is "soft object" or "property" rather
than real object.

Thanks,
Amit



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Logical replication and inheritance

2017-04-16 Thread Amit Langote
On 2017/04/15 3:53, Peter Eisentraut wrote:
> On 4/13/17 06:48, Amit Langote wrote:
>> That is an important consideration because of pg_dump.  See below:
>>
>> create table foo (a int);
>> create table bar () inherits (foo);
>> create publication mypub for table foo;  -- bar is added too.
>>
>> $ pg_dump -s | psql -e test
>> 
>> ALTER PUBLICATION mypub ADD TABLE foo;
>> ERROR:  relation "bar" is already member of publication "mypub"
> 
> To fix this, pg_dump should emit ADD TABLE ONLY foo.

Yeah, that's one way.  Attached is a tiny patch for that.

By the way, I noticed that although grammar accepts ONLY and * against a
table name to affect whether descendant tables are included, the same is
not mentioned in the CREATE PUBLICATION and ALTER PUBLICATION reference
pages.  I suspect it was probably not intentional, so attached a doc patch
for that too.

Thanks,
Amit
>From da6335c1727be7d2a12c225842baa64b9bc4a24a Mon Sep 17 00:00:00 2001
From: amit 
Date: Mon, 17 Apr 2017 11:35:02 +0900
Subject: [PATCH 1/2] Make pg_dump emit ONLY before table added to publication

---
 src/bin/pg_dump/pg_dump.c|  2 +-
 src/bin/pg_dump/t/002_pg_dump.pl | 12 ++--
 2 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 3eccfa626b..22b5f784dc 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -3627,7 +3627,7 @@ dumpPublicationTable(Archive *fout, PublicationRelInfo *pubrinfo)
 
 	query = createPQExpBuffer();
 
-	appendPQExpBuffer(query, "ALTER PUBLICATION %s ADD TABLE",
+	appendPQExpBuffer(query, "ALTER PUBLICATION %s ADD TABLE ONLY",
 	  fmtId(pubrinfo->pubname));
 	appendPQExpBuffer(query, " %s;",
 	  fmtId(tbinfo->dobj.name));
diff --git a/src/bin/pg_dump/t/002_pg_dump.pl b/src/bin/pg_dump/t/002_pg_dump.pl
index 4dd208e8e1..b2f4ab6baf 100644
--- a/src/bin/pg_dump/t/002_pg_dump.pl
+++ b/src/bin/pg_dump/t/002_pg_dump.pl
@@ -4417,13 +4417,13 @@ qr/CREATE TRANSFORM FOR integer LANGUAGE sql \(FROM SQL WITH FUNCTION pg_catalog
 			section_pre_data => 1,
 			test_schema_plus_blobs   => 1, }, },
 
-	'ALTER PUBLICATION pub1 ADD TABLE test_table' => {
+	'ALTER PUBLICATION pub1 ADD TABLE ONLY test_table' => {
 		all_runs => 1,
 		create_order => 51,
 		create_sql =>
-		  'ALTER PUBLICATION pub1 ADD TABLE dump_test.test_table;',
+		  'ALTER PUBLICATION pub1 ADD TABLE ONLY dump_test.test_table;',
 		regexp => qr/^
-			\QALTER PUBLICATION pub1 ADD TABLE test_table;\E
+			\QALTER PUBLICATION pub1 ADD TABLE ONLY test_table;\E
 			/xm,
 		like => {
 			binary_upgrade  => 1,
@@ -4452,12 +4452,12 @@ qr/CREATE TRANSFORM FOR integer LANGUAGE sql \(FROM SQL WITH FUNCTION pg_catalog
 			pg_dumpall_globals   => 1,
 			pg_dumpall_globals_clean => 1,
 			test_schema_plus_blobs   => 1, }, },
-	'ALTER PUBLICATION pub1 ADD TABLE test_second_table' => {
+	'ALTER PUBLICATION pub1 ADD TABLE ONLY test_second_table' => {
 		create_order => 52,
 		create_sql =>
-		  'ALTER PUBLICATION pub1 ADD TABLE dump_test.test_second_table;',
+		  'ALTER PUBLICATION pub1 ADD TABLE ONLY dump_test.test_second_table;',
 		regexp => qr/^
-			\QALTER PUBLICATION pub1 ADD TABLE test_second_table;\E
+			\QALTER PUBLICATION pub1 ADD TABLE ONLY test_second_table;\E
 			/xm,
 		like => {
 			binary_upgrade  => 1,
-- 
2.11.0

>From d65be70d9a8515d8e7d6d0fe11362e9f4cdc8c2b Mon Sep 17 00:00:00 2001
From: amit 
Date: Mon, 17 Apr 2017 11:45:45 +0900
Subject: [PATCH 2/2] Document that ONLY can be specified in publication
 commands

---
 doc/src/sgml/ref/alter_publication.sgml  | 12 
 doc/src/sgml/ref/create_publication.sgml |  9 +++--
 2 files changed, 15 insertions(+), 6 deletions(-)

diff --git a/doc/src/sgml/ref/alter_publication.sgml b/doc/src/sgml/ref/alter_publication.sgml
index 0a965b3bbf..858231fbcb 100644
--- a/doc/src/sgml/ref/alter_publication.sgml
+++ b/doc/src/sgml/ref/alter_publication.sgml
@@ -31,9 +31,9 @@ ALTER PUBLICATION name WITH ( name OWNER TO { new_owner | CURRENT_USER | SESSION_USER }
 ALTER PUBLICATION name RENAME TO new_name
-ALTER PUBLICATION name ADD TABLE table_name [, ...]
-ALTER PUBLICATION name SET TABLE table_name [, ...]
-ALTER PUBLICATION name DROP TABLE table_name [, ...]
+ALTER PUBLICATION name ADD TABLE [ ONLY ] table_name [ * ] [, ...]
+ALTER PUBLICATION name SET TABLE [ ONLY ] table_name [ * ] [, ...]
+ALTER PUBLICATION name DROP TABLE [ ONLY ] table_name [ * ] [, ...]
 
  
 
@@ -116,7 +116,11 @@ ALTER PUBLICATION name DROP TABLE <
 table_name
 
  
-  Name of an existing table.
+  Name of an existing table.  If ONLY is specified before the
+  table name, only that table is affected.  If ONLY is not
+  specified, the table and all its descendant tables (if any) are
+  affected.  Optionally, * can be specified after the table
+  name to explicitly indicate that descendant tables are included.
  
 
   

Re: [HACKERS] Logical replication launcher uses wal_retrieve_retry_interval

2017-04-16 Thread Masahiko Sawada
On Fri, Apr 14, 2017 at 9:59 PM, Petr Jelinek
 wrote:
> On 14/04/17 14:30, Michael Paquier wrote:
>> On Fri, Apr 14, 2017 at 9:19 PM, Petr Jelinek
>>  wrote:
>>> I am not quite sure adding more GUCs is all that great option. When
>>> writing the patches I was wondering if we should perhaps rename the
>>> wal_receiver_timeout and wal_retrieve_retry_interval to something that
>>> makes more sense for both physical and logical replication though.
>>
>> It seems to me that you should really have a different GUC,
>> wal_retrieve_retry_interval has been designed to work in the startup
>> process, and I think that it should still only behave as originally
>> designed.
>
> Ah yeah I am actually confusing it with wal_receiver_timeout which
> behaves same for wal_receiver and subscription worker. So yeah it makes
> sense to have separate GUC

Attached two patches add new GUCs apply_worker_timeout and
apply_worker_launch_interval which are used instead of
wal_receiver_timeout and wal_retrieve_retry_timeout. These new
parameters are not settable at worker-level so far.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


0001-Add-a-GUC-parameter-apply_worker_timeout.patch
Description: Binary data


0002-Add-a-new-GUC-parameter-apply_worker_launch_interval.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] snapbuild woes

2017-04-16 Thread Noah Misch
On Thu, Apr 13, 2017 at 12:58:12AM -0400, Noah Misch wrote:
> On Wed, Apr 12, 2017 at 10:21:51AM -0700, Andres Freund wrote:
> > On 2017-04-12 11:03:57 -0400, Peter Eisentraut wrote:
> > > On 4/12/17 02:31, Noah Misch wrote:
> > > >>> But I hope you mean to commit these snapbuild patches before the 
> > > >>> postgres 10
> > > >>> release?  As far as I know, logical replication is still very broken 
> > > >>> without
> > > >>> them (or at least some of that set of 5 patches - I don't know which 
> > > >>> ones
> > > >>> are essential and which may not be).
> > > >>
> > > >> Yes, these should go into 10 *and* earlier releases, and I don't plan 
> > > >> to
> > > >> wait for 2017-07.
> > > > 
> > > > [Action required within three days.  This is a generic notification.]
> > > 
> > > I'm hoping for a word from Andres on this.
> > 
> > Feel free to reassign to me.
> 
> Thanks for volunteering; I'll do that shortly.  Please observe the policy on
> open item ownership[1] and send a status update within three calendar days of
> this message.  Include a date for your subsequent status update.
> 
> [1] 
> https://www.postgresql.org/message-id/20170404140717.GA2675809%40tornado.leadboat.com

This PostgreSQL 10 open item is past due for your status update.  Kindly send
a status update within 24 hours, and include a date for your subsequent status
update.  Refer to the policy on open item ownership:
https://www.postgresql.org/message-id/20170404140717.GA2675809%40tornado.leadboat.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] tablesync patch broke the assumption that logical rep depends on?

2017-04-16 Thread Noah Misch
On Thu, Apr 13, 2017 at 04:56:05AM +, Noah Misch wrote:
> On Tue, Apr 11, 2017 at 02:28:44AM +0900, Fujii Masao wrote:
> >  src/backend/replication/logical/launcher.c
> >  * Worker started and attached to our shmem. This check is safe
> >  * because only launcher ever starts the workers, so nobody can 
> > steal
> >  * the worker slot.
> > 
> > The tablesync patch enabled even worker to start another worker.
> > So the above assumption is not valid for now.
> > 
> > This issue seems to cause the corner case where the launcher picks up
> > the same worker slot that previously-started worker has already picked
> > up to start another worker.
> 
> [Action required within three days.  This is a generic notification.]
> 
> The above-described topic is currently a PostgreSQL 10 open item.  Peter,
> since you committed the patch believed to have created it, you own this open
> item.  If some other commit is more relevant or if this does not belong as a
> v10 open item, please let us know.  Otherwise, please observe the policy on
> open item ownership[1] and send a status update within three calendar days of
> this message.  Include a date for your subsequent status update.  Testers may
> discover new open items at any time, and I want to plan to get them all fixed
> well in advance of shipping v10.  Consequently, I will appreciate your efforts
> toward speedy resolution.  Thanks.
> 
> [1] 
> https://www.postgresql.org/message-id/20170404140717.GA2675809%40tornado.leadboat.com

This PostgreSQL 10 open item is past due for your status update.  Kindly send
a status update within 24 hours, and include a date for your subsequent status
update.  Refer to the policy on open item ownership:
https://www.postgresql.org/message-id/20170404140717.GA2675809%40tornado.leadboat.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PANIC in pg_commit_ts slru after crashes

2017-04-16 Thread Michael Paquier
On Sun, Apr 16, 2017 at 6:02 PM, Simon Riggs  wrote:
> On 15 April 2017 at 21:30, Jeff Janes  wrote:
>> On Fri, Apr 14, 2017 at 9:33 PM, Pavan Deolasee 
>> wrote:
>>
>>>
>>> Since all those offsets fall on a page boundary, my guess is that we're
>>> somehow failing to handle a new page correctly.
>>>
>>> Looking at the patch itself, my feeling is that the following code in
>>> src/backend/access/transam/twophase.c might be causing the problem.
>>>
>>> 1841
>>> 1842 /* update nextXid if needed */
>>> 1843 if (TransactionIdFollowsOrEquals(maxsubxid,
>>> ShmemVariableCache->nextXid))
>>> 1844 {
>>> 1845 LWLockAcquire(XidGenLock, LW_EXCLUSIVE);
>>> 1846 ShmemVariableCache->nextXid = maxsubxid;
>>> 1847 TransactionIdAdvance(ShmemVariableCache->nextXid);
>>> 1848 LWLockRelease(XidGenLock);
>>> 1849 }
>>>
>>> The function PrescanPreparedTransactions() gets called at the start of the
>>> redo recovery and this specific block will get exercised irrespective of
>>> whether there are any prepared transactions or not. What I find particularly
>>> wrong here is that we are initialising maxsubxid to current value of
>>> ShmemVariableCache->nextXid when the function enters, but this block would
>>> then again increment ShmemVariableCache->nextXid, when there are no prepared
>>> transactions in the system.
>>>
>>> I wonder if we should do as in attached patch.
>>
>>
>> That solves it for me.
>
> Thanks for patching and testing. I'll review and probably commit tomorrow.

Actually I think that the fix proposed is not correct as we should
just never take the risk to call TransactionIdAdvance() if there are
no 2PC files to scan, and it adds more difficulty in understanding
something that the 2PC restore code has introduced and already made
harder to catch (2nd thoughts and regrets here). If you look closely
at the code, ProcessTwoPhaseBuffer() uses *result and *maxsubid only
for PrescanPreparedTransactions() so there is a link between both.
Attached is a patch to rework ProcessTwoPhaseBuffer() by removing
those two arguments and replace them by a boolean to decide if the
next cached transaction ID should be updated or not. The cached next
TXID gets updated only if caller of ProcessTwoPhaseBuffer() wants to
do so and if the subxid scanned follows the XID of the scanned 2PC
file. This looks safer to me in the long run.

Jeff, does this patch make the situation better? The fix is rather
simple as it just makes sure that the next XID never gets updated if
there are no 2PC files.
-- 
Michael


2pc-restore-fix.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Cutting initdb's runtime (Perl question embedded)

2017-04-16 Thread Andrew Dunstan


On 04/16/2017 07:43 PM, Peter Eisentraut wrote:
> On 4/15/17 12:33, Andrew Dunstan wrote:
>> Sure. Just means putting this code a bit later in the file. "make check"
>> is only one initdb, so it won't cost much. I'm still inclined to force a
>> TAP test for initdb with no TZ set, though.
> How much is this going to buy overall?  Is it worth the complications?
>


I don't know, but it's a very small change.

I am going to spend some time instrumenting where the time goes in
various tests.

cheers

andrew

-- 
Andrew Dunstanhttps://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Comment typo in xlogutils.c

2017-04-16 Thread Peter Eisentraut
On 4/15/17 11:26, Masahiko Sawada wrote:
> Attached patch for $subject.
> 
> s/apruptly/abruptly

Fixed, thanks.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Cutting initdb's runtime (Perl question embedded)

2017-04-16 Thread Peter Eisentraut
On 4/15/17 12:33, Andrew Dunstan wrote:
> Sure. Just means putting this code a bit later in the file. "make check"
> is only one initdb, so it won't cost much. I'm still inclined to force a
> TAP test for initdb with no TZ set, though.

How much is this going to buy overall?  Is it worth the complications?

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Variable substitution in psql backtick expansion

2017-04-16 Thread Fabien COELHO



I checked the pgbench expr related code.



2. move pgbench expressions to separate module


Probably already existing "fe_utils".


3. teach pgbench expressions booleans


See https://commitfest.postgresql.org/14/985/

--
Fabien.


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] OpenSSL 1.1 breaks configure and more

2017-04-16 Thread Tom Lane
Andreas Karlsson  writes:
> On 04/16/2017 03:14 AM, Tom Lane wrote:
>> 1. Back-patch that patch, probably also including the followup adjustments
>> in 86029b31e and 36a3be654.

> Given that I cannot recall seeing any complaints about the behavior of 
> 9.4 compared to 9.3 I am leaning towards #1. That way there are fewer 
> different versions of our OpenSSL code.

Yeah, I was thinking about that point too.  Barring objections I'll
do #1 and then move forward with the openssl 1.1 backport.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] OpenSSL 1.1 breaks configure and more

2017-04-16 Thread Andreas Karlsson

On 04/16/2017 03:14 AM, Tom Lane wrote:

1. Back-patch that patch, probably also including the followup adjustments
  in 86029b31e and 36a3be654.

2. Add #if's to use 31cf1a1a4's coding with OpenSSL >= 1.1, while keeping
   the older code for use when built against older OpenSSLs.

3. Conditionally disable renegotiation altogether with OpenSSL >= 1.1,
   thus adopting 9.5 not 9.4 behavior when using newer OpenSSL.

[...]

Thoughts?


Given that I cannot recall seeing any complaints about the behavior of 
9.4 compared to 9.3 I am leaning towards #1. That way there are fewer 
different versions of our OpenSSL code.


Andreas


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] tablesync patch broke the assumption that logical rep depends on?

2017-04-16 Thread Petr Jelinek
On 16/04/17 21:27, Steve Singer wrote:
> On 04/10/2017 01:28 PM, Fujii Masao wrote:
>> Hi,
>>
>>   src/backend/replication/logical/launcher.c
>>   * Worker started and attached to our shmem. This check is safe
>>   * because only launcher ever starts the workers, so nobody
>> can steal
>>   * the worker slot.
>>
>> The tablesync patch enabled even worker to start another worker.
>> So the above assumption is not valid for now.
>>
>> This issue seems to cause the corner case where the launcher picks up
>> the same worker slot that previously-started worker has already picked
>> up to start another worker.
>>
>> Regards,
>>
> 
> I'm not sure if what I am seeing is related to this or another issue.
> 
> If I create a subscription for a publication that doesn't exist (yet) I
> see 'publication mypub does not exist" in the subscribers log
> 
> If I then create the publication on the publisher the error doesn't go
> away, the worker process keeps spinning with the same error.
> 
> If I then drop the subscription and recreate it it works.
> 

No that's a result of how logical decoding/slots work. We see catalogs
as what they looked like at the specific point in time. If you create
slot (by creating subscription) it will not see publication that was
created after until decoding on that slot reaches point in time when it
was actually created.

-- 
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Why does logical replication launcher set application_name?

2017-04-16 Thread Petr Jelinek
On 16/04/17 18:47, Tom Lane wrote:
> Craig Ringer  writes:
>> On 12 April 2017 at 13:34, Kuntal Ghosh  wrote:
>>> For backend_type=background worker, application_name shows the name of
>>> the background worker (BackgroundWorker->bgw_name). I think we need
>>> some way to distinguish among different background workers. But,
>>> application_name may not be the correct field to show the information.
> 
>> It's better than (ab)using 'query' IMO.
> 
>> I'd rather an abbreviated entry to address Tom's concerns about
>> format. 'lrlaunch' or whatever.
> 
> Basically the problem I've got with the LR launcher is that it looks
> utterly unlike any other background process in pg_stat_activity.
> Leaving out all-null columns to make my point:
> 
> regression=# select 
> pid,usesysid,usename,application_name,backend_start,wait_event_type,wait_event,backend_type
>  from pg_stat_activity where application_name != 'psql';
>   pid  | usesysid | usename  |   application_name   | 
> backend_start | wait_event_type | wait_event  |
> backend_type 
> ---+--+--+--+---+-+-+-
>  25416 |  |  |  | 2017-04-16 
> 12:32:46.987076-04 | Activity| AutoVacuumMain  | autovacuum 
> launcher
>  25418 |   10 | postgres | logical replication launcher | 2017-04-16 
> 12:32:46.988859-04 | Activity| LogicalLauncherMain | background worker
>  25414 |  |  |  | 2017-04-16 
> 12:32:46.986745-04 | Activity| BgWriterHibernate   | background writer
>  25413 |  |  |  | 2017-04-16 
> 12:32:46.986885-04 | Activity| CheckpointerMain| checkpointer
>  25415 |  |  |  | 2017-04-16 
> 12:32:46.9871-04   | Activity| WalWriterMain   | walwriter
> (5 rows)
> 
> Why has it got non-null entries for usesysid and usename, never mind
> application_name?  Why does it not follow the well-established convention
> that backend_type is what identifies background processes?
> 
> I'm sure the answer to those questions is "it's an implementation artifact
> from using the generic bgworker infrastructure", but that does not make it
> look any less like sloppy, half-finished work.
> 
> If it is a limitation of the bgworker infrastructure that we can't make
> the LR processes look more like the other kinds of built-in processes,
> then I think we need to fix that limitation.  And I further assert that
> we need to do it for v10, because once we ship v10 people will adjust
> their tools for this bogus output, and we'll face complaints about
> backwards compatibility if we fix it later.
> 

It's indeed how bgworker infrastructure is reporting it. That being
said, since LR processes are in-core, we can add exception for them in
pgstat_bestart() so that they are reported more like other builtin
processes. We could also try to add api for bgworker processes to change
how they are reported so that any future workers (and all the external
workers) can be reported properly as well, but that seems better fit for
v11 at this point since it would be good if we had some discussion for
how that should look like.

-- 
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Logical replication - TRAP: FailedAssertion in pgstat.c

2017-04-16 Thread Petr Jelinek
On 16/04/17 20:41, Andres Freund wrote:
> On 2017-04-16 10:46:21 +0200, Erik Rijkers wrote:
>> On 2017-04-15 04:47, Erik Rijkers wrote:
>>>
>>> 0001-Reserve-global-xmin-for-create-slot-snasphot-export.patch +
>>> 0002-Don-t-use-on-disk-snapshots-for-snapshot-export-in-l.patch+
>>> 0003-Prevent-snapshot-builder-xmin-from-going-backwards.patch  +
>>> 0004-Fix-xl_running_xacts-usage-in-snapshot-builder.patch  +
>>> 0005-Skip-unnecessary-snapshot-builds.patch
>>
>> I am now using these newer patches:
>> https://www.postgresql.org/message-id/30242bc6-eca4-b7bb-670e-8d0458753a8c%402ndquadrant.com
>>
>>> It builds fine, but when I run the old pbench-over-logical-replication
>>> test I get:
>>>
>>> TRAP: FailedAssertion("!(entry->trans == ((void *)0))", File:
>>> "pgstat.c", Line: 828)
>>
>>
>> To get that error:
> 
> I presume this is the fault of
> http://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=139eb9673cb84c76f493af7e68301ae204199746
> if you git revert that individual commit, do things work again?
> 
> - Andres
> 

Yeah it is, it needs to be fenced to happen only after commit, which is
not guaranteed at the point of code, we probably need to put the
pgstat_report_stat() inside the if above after the
CommitTransactionCommand() (that will make it report stats for changes
apply did to pg_subscription_rel after next transaction though) or put
it into it's own if that checks if tx was indeed committed (which is bit
ugly).

-- 
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] tablesync patch broke the assumption that logical rep depends on?

2017-04-16 Thread Steve Singer

On 04/10/2017 01:28 PM, Fujii Masao wrote:

Hi,

  src/backend/replication/logical/launcher.c
  * Worker started and attached to our shmem. This check is safe
  * because only launcher ever starts the workers, so nobody can steal
  * the worker slot.

The tablesync patch enabled even worker to start another worker.
So the above assumption is not valid for now.

This issue seems to cause the corner case where the launcher picks up
the same worker slot that previously-started worker has already picked
up to start another worker.

Regards,



I'm not sure if what I am seeing is related to this or another issue.

If I create a subscription for a publication that doesn't exist (yet) I 
see 'publication mypub does not exist" in the subscribers log


If I then create the publication on the publisher the error doesn't go 
away, the worker process keeps spinning with the same error.


If I then drop the subscription and recreate it it works.




--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Logical replication - TRAP: FailedAssertion in pgstat.c

2017-04-16 Thread Andres Freund
On 2017-04-16 10:46:21 +0200, Erik Rijkers wrote:
> On 2017-04-15 04:47, Erik Rijkers wrote:
> > 
> > 0001-Reserve-global-xmin-for-create-slot-snasphot-export.patch +
> > 0002-Don-t-use-on-disk-snapshots-for-snapshot-export-in-l.patch+
> > 0003-Prevent-snapshot-builder-xmin-from-going-backwards.patch  +
> > 0004-Fix-xl_running_xacts-usage-in-snapshot-builder.patch  +
> > 0005-Skip-unnecessary-snapshot-builds.patch
> 
> I am now using these newer patches:
> https://www.postgresql.org/message-id/30242bc6-eca4-b7bb-670e-8d0458753a8c%402ndquadrant.com
> 
> > It builds fine, but when I run the old pbench-over-logical-replication
> > test I get:
> > 
> > TRAP: FailedAssertion("!(entry->trans == ((void *)0))", File:
> > "pgstat.c", Line: 828)
> 
> 
> To get that error:

I presume this is the fault of
http://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=139eb9673cb84c76f493af7e68301ae204199746
if you git revert that individual commit, do things work again?

- Andres


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Variable substitution in psql backtick expansion

2017-04-16 Thread Pavel Stehule
2017-04-12 1:42 GMT+02:00 Fabien COELHO :

>
> Hmmm. Although I do not buy this, it could work as a replacement for \set
>>> which it seems cannot be upgraded because some people may rely on it to
>>> just store whatever comes after it in a variable.
>>>
>>
>> I have no strong opinion on how expressive expressions should be, but
>> having a separate \expr (or \setexpr, etc) gives us a green field to
>> develop them.
>>
>
> Yep.
>
> One possible approach would be to reuse pgbench expression engine in order
> to avoid redevelopping yet another lexer & parser & evaluator. This would
> mean some abstraction work, but it looks like the simplest & most effective
> approach right now. Currently it supports an SQL-expression subset about
> int & float, and there is an ongoing submission to add booleans and a few
> functions. If this is done this way, this suggests that variable management
> should/could be merged as well, but there are some differences (psql
> variables are not typed, it relies on a list, there is a "namespace" thing
> I'm not sure I understood...).
>
> Pavel also suggested some support for TEXT, although I would like to see a
> use case. That could be another extension to the engine.
>

I checked the pgbench expr related code.

Now, there are not a boolean operations, and value compare operations. But
there are lot of functions for random numbers - it is nice for regress
tests.

The text support should be really minimalist - eq op - or can be removed,
if we will have special functions for SQLSTATE (but simple string eq
operation should be useful for writing some tests).


>
> A drawback is that pgbench needs more powerfull client-side expressions
> than psql, thus it adds some useless complexity to psql, but avoiding
> another engine seems desirable.


The pgbench expression language is perfect for us - there is not any new
dependency - it is working on all supported platforms.

Can be nice, if we can reuse pgbench expressions in psql - there are some
task that should be solved first (it is definitely topic for next release)

1. synchronise lexers  - the psql lexer doesn't supports types, but
supports variable escaping
2. move pgbench expressions to separate module
3. teach pgbench expressions booleans and strings
4. because pgbench doesn't do early variable evaluation, implementation of
"defined" function is easy - we can introduce some new syntax for
implementation some bash patterns like "default value" or "own undefined
message"
5. we can introduce \setexpr in psql, and \if can use pgbench expr too (the
result of expression) must be boolean value like now
6. the psql builtin variables should be enhanced about server side and
client side numeric versions
7. the psql builtin variables should be enhanced about sqlstate - we are
able to handle errors due setting ON_ERROR_STOP already
8. the psql builtin variables can be enhanced about info about processed
rows

There is a benefit for pgbench - the code can be reduced after migration
expr related code to independent module.

The pgbench can take \if command and \setexpr command (although \setexpr
can be redundant there, there can be nice compatibility with psql)

Regards

Pavel


>
> --
> Fabien.
>


Re: [HACKERS] Why does logical replication launcher set application_name?

2017-04-16 Thread Tom Lane
Craig Ringer  writes:
> On 12 April 2017 at 13:34, Kuntal Ghosh  wrote:
>> For backend_type=background worker, application_name shows the name of
>> the background worker (BackgroundWorker->bgw_name). I think we need
>> some way to distinguish among different background workers. But,
>> application_name may not be the correct field to show the information.

> It's better than (ab)using 'query' IMO.

> I'd rather an abbreviated entry to address Tom's concerns about
> format. 'lrlaunch' or whatever.

Basically the problem I've got with the LR launcher is that it looks
utterly unlike any other background process in pg_stat_activity.
Leaving out all-null columns to make my point:

regression=# select 
pid,usesysid,usename,application_name,backend_start,wait_event_type,wait_event,backend_type
 from pg_stat_activity where application_name != 'psql';
  pid  | usesysid | usename  |   application_name   | 
backend_start | wait_event_type | wait_event  |backend_type 

---+--+--+--+---+-+-+-
 25416 |  |  |  | 2017-04-16 
12:32:46.987076-04 | Activity| AutoVacuumMain  | autovacuum launcher
 25418 |   10 | postgres | logical replication launcher | 2017-04-16 
12:32:46.988859-04 | Activity| LogicalLauncherMain | background worker
 25414 |  |  |  | 2017-04-16 
12:32:46.986745-04 | Activity| BgWriterHibernate   | background writer
 25413 |  |  |  | 2017-04-16 
12:32:46.986885-04 | Activity| CheckpointerMain| checkpointer
 25415 |  |  |  | 2017-04-16 
12:32:46.9871-04   | Activity| WalWriterMain   | walwriter
(5 rows)

Why has it got non-null entries for usesysid and usename, never mind
application_name?  Why does it not follow the well-established convention
that backend_type is what identifies background processes?

I'm sure the answer to those questions is "it's an implementation artifact
from using the generic bgworker infrastructure", but that does not make it
look any less like sloppy, half-finished work.

If it is a limitation of the bgworker infrastructure that we can't make
the LR processes look more like the other kinds of built-in processes,
then I think we need to fix that limitation.  And I further assert that
we need to do it for v10, because once we ship v10 people will adjust
their tools for this bogus output, and we'll face complaints about
backwards compatibility if we fix it later.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Inadequate parallel-safety check for SubPlans

2017-04-16 Thread Tom Lane
Noah Misch  writes:
> On Wed, Apr 12, 2017 at 02:41:09PM -0400, Tom Lane wrote:
>> This is 100% wrong.  It's failing to recurse into the subexpressions of
>> the SubPlan, which could very easily include parallel-unsafe function
>> calls.  Example:

> The above-described topic is currently a PostgreSQL 10 open item.  Robert,
> since you committed the patch believed to have created it, you own this open
> item.  If some other commit is more relevant or if this does not belong as a
> v10 open item, please let us know.

FYI, I have this on my to-look-at list, and expect to fix it before Robert
returns from vacation.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Quorum commit for multiple synchronous replication.

2017-04-16 Thread Fujii Masao
On Sun, Apr 16, 2017 at 1:19 PM, Noah Misch  wrote:
> On Fri, Apr 14, 2017 at 11:58:23PM -0400, Noah Misch wrote:
>> On Wed, Apr 05, 2017 at 09:51:02PM -0400, Noah Misch wrote:
>> > On Thu, Apr 06, 2017 at 12:48:56AM +0900, Fujii Masao wrote:
>>
>> > > > On Mon, Dec 19, 2016 at 09:49:58PM +0900, Fujii Masao wrote:
>> > > >> (2)
>> > > >> There will be still many source comments and documentations that
>> > > >> we need to update, for example, in high-availability.sgml. We need to
>> > > >> check and update them throughly.
>> > > >>
>> > > >> (3)
>> > > >> The priority value is assigned to each standby listed in s_s_names
>> > > >> even in quorum commit though those priority values are not used at 
>> > > >> all.
>> > > >> Users can see those priority values in pg_stat_replication.
>> > > >> Isn't this confusing? If yes, it might be better to always assign 1 as
>> > > >> the priority, for example.
>>
>> > > Regarding the item (2), Sawada-san told me that he will work on it after
>> > > this CommitFest finishes. So we would receive the patch for the item from
>> > > him next week. If there will be no patch even after the end of next week
>> > > (i.e., April 14th), I will. Let's wait for Sawada-san's action at first.
>> >
>> > Sounds reasonable; I will look for your update on 14Apr or earlier.
>>
>> This PostgreSQL 10 open item is past due for your status update.  Kindly send
>> a status update within 24 hours, and include a date for your subsequent 
>> status
>> update.  Refer to the policy on open item ownership:
>> https://www.postgresql.org/message-id/20170404140717.GA2675809%40tornado.leadboat.com

Sorry for the delay.

I will review Sawada-san's patch and commit something in next three days.
So next target date is April 19th.

>> > Since you do want (3) to change, please own it like any other open item,
>> > including the mandatory status updates.
>>
>> Likewise.

As I told firstly this is not a bug. There are some proposals for better design
of priority column in pg_stat_replication, but we've not reached the consensus
yet. So I think that it's better to move this open item to "Design Decisions to
Recheck Mid-Beta" section so that we can hear more opinions.

Regards,

-- 
Fujii Masao


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PANIC in pg_commit_ts slru after crashes

2017-04-16 Thread Simon Riggs
On 15 April 2017 at 21:30, Jeff Janes  wrote:
> On Fri, Apr 14, 2017 at 9:33 PM, Pavan Deolasee 
> wrote:
>
>>
>> Since all those offsets fall on a page boundary, my guess is that we're
>> somehow failing to handle a new page correctly.
>>
>> Looking at the patch itself, my feeling is that the following code in
>> src/backend/access/transam/twophase.c might be causing the problem.
>>
>> 1841
>> 1842 /* update nextXid if needed */
>> 1843 if (TransactionIdFollowsOrEquals(maxsubxid,
>> ShmemVariableCache->nextXid))
>> 1844 {
>> 1845 LWLockAcquire(XidGenLock, LW_EXCLUSIVE);
>> 1846 ShmemVariableCache->nextXid = maxsubxid;
>> 1847 TransactionIdAdvance(ShmemVariableCache->nextXid);
>> 1848 LWLockRelease(XidGenLock);
>> 1849 }
>>
>> The function PrescanPreparedTransactions() gets called at the start of the
>> redo recovery and this specific block will get exercised irrespective of
>> whether there are any prepared transactions or not. What I find particularly
>> wrong here is that we are initialising maxsubxid to current value of
>> ShmemVariableCache->nextXid when the function enters, but this block would
>> then again increment ShmemVariableCache->nextXid, when there are no prepared
>> transactions in the system.
>>
>> I wonder if we should do as in attached patch.
>
>
> That solves it for me.

Thanks for patching and testing. I'll review and probably commit tomorrow.

-- 
Simon Riggshttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Logical replication - TRAP: FailedAssertion in pgstat.c

2017-04-16 Thread Erik Rijkers

On 2017-04-15 04:47, Erik Rijkers wrote:


0001-Reserve-global-xmin-for-create-slot-snasphot-export.patch +
0002-Don-t-use-on-disk-snapshots-for-snapshot-export-in-l.patch+
0003-Prevent-snapshot-builder-xmin-from-going-backwards.patch  +
0004-Fix-xl_running_xacts-usage-in-snapshot-builder.patch  +
0005-Skip-unnecessary-snapshot-builds.patch


I am now using these newer patches:
https://www.postgresql.org/message-id/30242bc6-eca4-b7bb-670e-8d0458753a8c%402ndquadrant.com


It builds fine, but when I run the old pbench-over-logical-replication
test I get:

TRAP: FailedAssertion("!(entry->trans == ((void *)0))", File: 
"pgstat.c", Line: 828)



To get that error:

--
#!/bin/sh

port1=6972 port2=6973 scale=25 clients=16 duration=60

   echo "drop table if exists pgbench_accounts;
 drop table if exists pgbench_branches;
 drop table if exists pgbench_tellers;
 drop table if exists pgbench_history;" | psql -qXp $port1 \
&& echo "drop table if exists pgbench_accounts;
 drop table if exists pgbench_branches;
 drop table if exists pgbench_tellers;
 drop table if exists pgbench_history;" | psql -qXp $port2 \
&& pgbench -p $port1 -qis ${scale//_/} && echo "
alter table pgbench_history add column hid serial primary key;
" | psql -q1Xp $port1  \
  && pg_dump -F c -p $port1   \
   --exclude-table-data=pgbench_history  \
   --exclude-table-data=pgbench_accounts \
   --exclude-table-data=pgbench_branches \
   --exclude-table-data=pgbench_tellers  \
   -t pgbench_history  \
   -t pgbench_accounts \
   -t pgbench_branches \
   -t pgbench_tellers  \
  | pg_restore -1 -p $port2 -d testdb

appname=pgbench_derail
echo "create publication pub1 for all tables;" | psql -p $port1 -aqtAX
echo "create subscription sub1 connection 'port=${port1} 
application_name=${appname}' publication pub1 with (disabled);

alter subscription sub1 enable;
" | psql -p $port2 -aqtAX

echo "-- pgbench -p $port1 -c $clients -T $duration -n   -- scale $scale 
"

 pgbench -p $port1 -c $clients -T $duration -n

--


Erik Rijkers


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Inadequate parallel-safety check for SubPlans

2017-04-16 Thread Noah Misch
On Wed, Apr 12, 2017 at 02:41:09PM -0400, Tom Lane wrote:
> While poking at the question of parallel_safe marking for Plans,
> I noticed that max_parallel_hazard_walker() does this:
> 
> /* We can push the subplans only if they are parallel-safe. */
> else if (IsA(node, SubPlan))
> return !((SubPlan *) node)->parallel_safe;
> 
> This is 100% wrong.  It's failing to recurse into the subexpressions of
> the SubPlan, which could very easily include parallel-unsafe function
> calls.  Example:
> 
> regression=# explain select * from tenk1 where int4(unique1 + random()) not 
> in (select ten from tenk2);
>  QUERY PLAN  
> -
>  Gather  (cost=470.00..884.25 rows=5000 width=244)
>Workers Planned: 4
>->  Parallel Seq Scan on tenk1  (cost=470.00..884.25 rows=1250 width=244)
>  Filter: (NOT (hashed SubPlan 1))
>  SubPlan 1
>->  Seq Scan on tenk2  (cost=0.00..445.00 rows=1 width=4)
> 
> random() is parallel-restricted so it's not cool that the SubPlan was
> allowed to be passed down to workers.
> 
> I tried to fix it like this:
> 
> else if (IsA(node, SubPlan))
> {
> if (!((SubPlan *) node)->parallel_safe &&
> max_parallel_hazard_test(PROPARALLEL_RESTRICTED, context))
> return true;
> }
> 
> but got some failures in the regression tests due to things not getting
> parallelized when expected.  Poking into that, I remembered that for some
> SubPlans, the testexpr contains Params representing the output columns
> of the sub-select.  So the recursive call of max_parallel_hazard_walker
> visits those and deems the expression parallel-restricted.
> 
> Basically, therefore, this logic is totally inadequate.  I think what
> we need to do is improve matters so that while looking at the testexpr
> of a SubPlan, we pass down a whitelist saying that the Params having
> the numbers indicated by the SubLink's paramIds list are OK.

The above-described topic is currently a PostgreSQL 10 open item.  Robert,
since you committed the patch believed to have created it, you own this open
item.  If some other commit is more relevant or if this does not belong as a
v10 open item, please let us know.  Otherwise, I will look for an update
according to your current blanket status update[1]:

  I will begin investigating this no later than April 24th, my first day back
  from vacation, and will provide a further update by that same day.

Thanks.

[1] 
https://www.postgresql.org/message-id/CA+Tgmoa1-529KFwdB-+A1eG2MFc3j9eqJenBUFU=mst-h9q...@mail.gmail.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] some review comments on logical rep code

2017-04-16 Thread Noah Misch
On Fri, Apr 14, 2017 at 04:47:12AM +0900, Fujii Masao wrote:
> Though I've read only a part of the logical rep code yet, I'd like to
> share some (relatively minor) review comments that I got so far.
> 
> In ApplyWorkerMain(), DatumGetInt32() should be used to get integer
> value from the argument, instead of DatumGetObjectId().
> 
> No one resets on_commit_launcher_wakeup flag to false.
> 
> ApplyLauncherWakeup() should be static function.
> 
> Seems not only CREATE SUBSCRIPTION but also ALTER SUBSCRIPTION's
> subcommands (e.g., ENABLED one) should wake the launcher up on commit.
> 
> Why is IsBackendPid() check necessary in ApplyLauncherWakeup()?
> 
> Normal statements that the walsender for logical rep runs are logged
> by log_replication_commands. So if log_statement is also set,
> those commands are logged twice.
> 
> LogicalRepCtx->launcher_pid isn't reset to 0 when the launcher emits
> an error. The callback function to reset it needs to be registered
> via on_shmem_exit().
> 
> Typo: "an subscriber" should be "a subscriber" in some places.
> 
> /* Get conninfo */
> datum = SysCacheGetAttr(SUBSCRIPTIONOID,
> tup,
> Anum_pg_subscription_subconninfo,
> );
> Assert(!isnull);
> 
> This assertion makes me think that pg_subscription.subconnifo should
> have NOT NULL constraint. Also subslotname and subpublications
> should have NOT NULL constraint because of the same reason.
> 
> SpinLockAcquire(>relmutex);
> MyLogicalRepWorker->relstate =
>   GetSubscriptionRelState(MyLogicalRepWorker->subid,
> MyLogicalRepWorker->relid,
> >relstate_lsn,
> false);
> SpinLockRelease(>relmutex);
> 
> Non-"short-term" function like GetSubscriptionRelState() should not
> be called while spinlock is being held.

[Action required within three days.  This is a generic notification.]

The above-described topic is currently a PostgreSQL 10 open item.  Peter,
since you committed the patch believed to have created it, you own this open
item.  If some other commit is more relevant or if this does not belong as a
v10 open item, please let us know.  Otherwise, please observe the policy on
open item ownership[1] and send a status update within three calendar days of
this message.  Include a date for your subsequent status update.  Testers may
discover new open items at any time, and I want to plan to get them all fixed
well in advance of shipping v10.  Consequently, I will appreciate your efforts
toward speedy resolution.  Thanks.

[1] 
https://www.postgresql.org/message-id/20170404140717.GA2675809%40tornado.leadboat.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Re: logical replication and PANIC during shutdown checkpoint in publisher

2017-04-16 Thread Noah Misch
On Wed, Apr 12, 2017 at 10:55:08PM +0900, Fujii Masao wrote:
> When I shut down the publisher while I repeated creating and dropping
> the subscription in the subscriber, the publisher emitted the following
> PANIC error during shutdown checkpoint.
> 
> PANIC:  concurrent transaction log activity while database system is
> shutting down
> 
> The cause of this problem is that walsender for logical replication can
> generate WAL records even during shutdown checkpoint.
> 
> Firstly walsender keeps running until shutdown checkpoint finishes
> so that all the WAL including shutdown checkpoint record can be
> replicated to the standby. This was safe because previously walsender
> could not generate WAL records. However this assumption became
> invalid because of logical replication. That is, currenty walsender for
> logical replication can generate WAL records, for example, by executing
> CREATE_REPLICATION_SLOT command. This is an oversight in
> logical replication patch, I think.
> 
> To fix this issue, we should terminate walsender for logical replication
> before shutdown checkpoint starts. Of course walsender for physical
> replication still needs to keep running until shutdown checkpoint ends,
> though.

[Action required within three days.  This is a generic notification.]

The above-described topic is currently a PostgreSQL 10 open item.  Peter,
since you committed the patch believed to have created it, you own this open
item.  If some other commit is more relevant or if this does not belong as a
v10 open item, please let us know.  Otherwise, please observe the policy on
open item ownership[1] and send a status update within three calendar days of
this message.  Include a date for your subsequent status update.  Testers may
discover new open items at any time, and I want to plan to get them all fixed
well in advance of shipping v10.  Consequently, I will appreciate your efforts
toward speedy resolution.  Thanks.

[1] 
https://www.postgresql.org/message-id/20170404140717.GA2675809%40tornado.leadboat.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Interval for launching the table sync worker

2017-04-16 Thread Noah Misch
On Thu, Apr 06, 2017 at 11:33:22AM +0900, Masahiko Sawada wrote:
> While testing table sync worker for logical replication I noticed that
> if the table sync worker of logical replication failed to insert the
> data for whatever reason, the table sync worker process exits with
> error. And then the main apply worker launches the table sync worker
> again soon without interval. This routine is executed at very high
> frequency without interval.
> 
> Should we do put a interval (wal_retrieve_interval or make a new GUC
> parameter?) for launching the table sync worker?

[Action required within three days.  This is a generic notification.]

The above-described topic is currently a PostgreSQL 10 open item.  Peter,
since you committed the patch believed to have created it, you own this open
item.  If some other commit is more relevant or if this does not belong as a
v10 open item, please let us know.  Otherwise, please observe the policy on
open item ownership[1] and send a status update within three calendar days of
this message.  Include a date for your subsequent status update.  Testers may
discover new open items at any time, and I want to plan to get them all fixed
well in advance of shipping v10.  Consequently, I will appreciate your efforts
toward speedy resolution.  Thanks.

[1] 
https://www.postgresql.org/message-id/20170404140717.GA2675809%40tornado.leadboat.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_dump emits ALTER TABLE ONLY partitioned_table

2017-04-16 Thread Noah Misch
On Thu, Apr 13, 2017 at 11:38:08AM -0400, Robert Haas wrote:
> On Thu, Apr 13, 2017 at 11:05 AM, Stephen Frost  wrote:
> > Sure, though I won't be able to today and I've got some doubts about the
> > other patches.  I'll have more time tomorrow though.
> 
> OK, cool.  I'll mark you down as the owner on the open items list.

[Action required within three days.]

The above-described topic is currently a PostgreSQL 10 open item.  Stephen, as
its owner, please observe the policy on open item ownership[1] and send a
status update within three calendar days of this message.  Include a date for
your subsequent status update.  Thanks.

[1] 
https://www.postgresql.org/message-id/20170404140717.GA2675809%40tornado.leadboat.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers