Re: [HACKERS] EXPLAIN VERBOSE with parallel Aggregate

2016-04-20 Thread Tom Lane
David Rowley  writes:
> I'd like to admit that I'm a bit confused as to why
> generate_function_name(), when it already knows the funcid, bothers to
> call func_get_detail(), which performs a search for the function based
> on the name and argument types, to find the function, most likely with
> the same funcid as the one which it already knew.

> Could you explain why this has to happen?

The point is exactly to find out whether a search for the function
given only the name and argument types would find the same function,
or a similar function in a different schema --- which would happen
if that other schema is earlier in the search path than the desired
one, or maybe the desired one isn't in search_path at all.  In such
a case we must schema-qualify the function name in the printed
expression.

To some extent this is because ruleutils serves two masters.  We would
probably not care that much about schema exactness for EXPLAIN's purposes,
but we do care for dumping views and rules.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] EXPLAIN VERBOSE with parallel Aggregate

2016-04-20 Thread David Rowley
On 18 April 2016 at 02:22, Tom Lane  wrote:
> David Rowley  writes:
>> Note that I've done nothing for the weird schema prefixing problem I
>> mentioned. I think I'd need input on the best way to solve this. If
>> it's actually a problem.
>
> It sure looks like a problem to me.  Maybe it's only cosmetic, but
> it's going to cause confusion and bug reports.  EXPLAIN output for
> parallel queries is going to be confusing enough anyway; we need
> to avoid having artifacts like this.

I'd like to admit that I'm a bit confused as to why
generate_function_name(), when it already knows the funcid, bothers to
call func_get_detail(), which performs a search for the function based
on the name and argument types, to find the function, most likely with
the same funcid as the one which it already knew.

Could you explain why this has to happen?

I also tried patching with the attached and running the regression
tests to see what breaks... nothing did. So it seems that, at least in
the tests, that code path is never hit.

With that in mind, perhaps the fix for the namespace problem is just
to special-case the combine Aggrefs in get_agg_expr() and have it just
look up the pg_proc entry for aggref->aggfnoid, and use that
proname.

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


generate_function_name_experiment.patch
Description: Binary data



Re: [HACKERS] Fix of doc for synchronous_standby_names.

2016-04-20 Thread Amit Langote
On 2016/04/21 12:25, Kyotaro HORIGUCHI wrote:
> At Wed, 20 Apr 2016 23:07:41 -0400, Robert Haas wrote:
>> On Sun, Apr 17, 2016 at 11:56 PM, Kyotaro HORIGUCHI wrote:
>>>
 There is no mechanism to enforce uniqueness. In case of
 duplicates one of the matching standbys will be considered as
 higher priority, though exactly which one is indeterminate.
>>>
>>> The patch attached edits the above to the following.
>>>
 There is no mechanism to enforce uniqueness. In case of
 duplicates some of the matching standbys will be considered as
 higher priority, though they are chosen in an indeterminate way.
>>>
>>> Does this make sense?
>>
>> I don't see what the problem is with the existing language.  I don't
>> find your rewrite to be clearer.
> 
> My first sentence shows my concern. I don't want to make something
> clearer; I want to fix a description that seems to me to be
> wrong.
> 
> If the existing description fits the case where two or more
> matching standbys are chosen as 'higher priority', I'm quite bad
> at reading..

ISTM the sentence describes what happens in a *single instance* of
encountering a duplicate (the same name found in primary_conninfo of 2 or
more standbys).  It's still one name, but which of the standbys claims the
spot (for that name) of being a synchronous standby with the given priority
is indeterminate.

Now, there can be multiple instances of encountering duplicates, each for
a different sync slot.  But this particular sentence seems to be talking
about what's the case for any given slot.

Does that make sense?

Thanks,
Amit






Re: [HACKERS] Should XLogInsert() be done only inside a critical section?

2016-04-20 Thread Michael Paquier
On Thu, Apr 21, 2016 at 5:44 AM, Tom Lane  wrote:
> Anyway, I went through our tree and added START/END_CRIT_SECTION calls
> around all XLogInsert calls that could currently be reached without one;
> see attached.  Since this potentially breaks third-party code I would
> not propose back-patching it, but I think it's reasonable to propose
> applying it to HEAD.

+1 for sanitizing those code paths this way. This patch looks sane to
me after a read-through and some testing.

--- a/src/backend/access/brin/brin.c
+++ b/src/backend/access/brin/brin.c
@@ -610,15 +610,12 @@ brinbuild(Relation heap, Relation index,
IndexInfo *indexInfo)
elog(ERROR, "index \"%s\" already contains data",
 RelationGetRelationName(index));

-   /*
-* Critical section not required, because on error the creation of the
-* whole relation will be rolled back.
-*/
Perhaps Alvaro has an opinion to offer regarding this bit removed in brin.c?
-- 
Michael




Re: [HACKERS] PGCTLTIMEOUT in pg_regress, or skink versus the clock

2016-04-20 Thread Tom Lane
Noah Misch  writes:
> On Wed, Apr 20, 2016 at 06:38:56PM -0400, Tom Lane wrote:
>> I am thinking that we missed a bet in commit 2ffa86962077c588
>> et al, and that pg_regress's hard-wired 60-second start timeout ought to
>> be overridable from an environment variable just as pg_ctl's timeout is.
>> It might as well be the same environment variable, so I propose the
>> attached patch.

>> Any objections?

> Looks reasonable.

Pushed.

I did some more testing here and concluded that slow postmaster startup is
almost certainly the right explanation for skink's problems.  On my
otherwise-idle workstation, postmaster startup under valgrind takes about
10 seconds, of which six or seven seem to involve valgrind just collecting
its thoughts :-(.  The postmaster's socket file does not appear until nine
seconds in, and then by ten seconds it is ready to accept connections.
So that's how come I see just one "FATAL: the database system is starting
up" log entry --- pg_regress's previous eight launches of pg_ctl just failed
with "no such socket file".  So I now think the observed failures on skink
can be explained by supposing that valgrind sometimes takes around a
minute to start up on that platform.  The skink log I quoted before would
fit with the postmaster almost but not quite reaching "ready" status
before pg_ctl's timeout expires.  The other two skink failures actually
have empty postmaster log files, suggesting that valgrind was so slow that
we didn't even get to the "database system was shut down" log message.

regards, tom lane




Re: [HACKERS] Postgres_fdw join pushdown - INNER - FULL OUTER join combination generating wrong result

2016-04-20 Thread Robert Haas
On Tue, Mar 29, 2016 at 10:20 AM, Ashutosh Bapat
 wrote:
>> I think the reason for that is in foreign_join_ok.  This in that function:
>>
>> wrongly pulls up remote_conds from joining relations in the FULL JOIN
>> case.  I think we should not pull up such conditions in the FULL JOIN case.
>>
>
> Right. For a full outer join, since each joining relation acts as outer for
> the other, we cannot pull up the quals into either join clauses or other
> clauses. So, in such a case, we will need to encapsulate the joining
> relation with conditions into a subquery. Unfortunately, the current deparse
> logic does not handle this encapsulation. Adding that functionality so close
> to the feature freeze might be risky given the amount of code changes
> required.
>
> PFA patch with a quick fix. A full outer join with either of the joining
> relations having WHERE conditions (or other clauses) is not pushed down. In
> the particular case that was reported, the bug triggered because of the way
> conditions are handled for an inner join. For an inner join, all the
> conditions in ON as well as WHERE clause are treated like they are part of
> WHERE clause. This allows pushing down a join even if there are unpushable
> join clauses. But the pushable conditions can be put back into the ON
> clause. This avoids using subqueries while deparsing.

Committed.

I think we should introduce subquery-based deparsing for 9.7, but I
agree it's better not to do it now.  I think we should try to handle
SEMI and ANTI joins that way, too.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Fix of doc for synchronous_standby_names.

2016-04-20 Thread Kyotaro HORIGUCHI
At Wed, 20 Apr 2016 23:07:41 -0400, Robert Haas  wrote 
in 
> On Sun, Apr 17, 2016 at 11:56 PM, Kyotaro HORIGUCHI
>  wrote:
> > Hello, synchronous_standby_names can now be told to ensure more
> > than one synchronous standby, but the doc for it seems to assume
> > only one synchronous standby.
> >
> >> There is no mechanism to enforce uniqueness. In case of
> >> duplicates one of the matching standbys will be considered as
> >> higher priority, though exactly which one is indeterminate.
> >
> > The patch attached edits the above to the following.
> >
> >> There is no mechanism to enforce uniqueness. In case of
> >> duplicates some of the matching standbys will be considered as
> >> higher priority, though they are chosen in an indeterminate way.
> >
> > Does this make sense?
> 
> I don't see what the problem is with the existing language.  I don't
> find your rewrite to be clearer.

My first sentence shows my concern. I don't want to make something
clearer; I want to fix a description that seems to me to be
wrong.

If the existing description fits the case where two or more
matching standbys are chosen as 'higher priority', I'm quite bad
at reading..

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center






Re: [HACKERS] Fix of doc for synchronous_standby_names.

2016-04-20 Thread Robert Haas
On Sun, Apr 17, 2016 at 11:56 PM, Kyotaro HORIGUCHI
 wrote:
> Hello, synchronous_standby_names can now be told to ensure more
> than one synchronous standby, but the doc for it seems to assume
> only one synchronous standby.
>
>> There is no mechanism to enforce uniqueness. In case of
>> duplicates one of the matching standbys will be considered as
>> higher priority, though exactly which one is indeterminate.
>
> The patch attached edits the above to the following.
>
>> There is no mechanism to enforce uniqueness. In case of
>> duplicates some of the matching standbys will be considered as
>> higher priority, though they are chosen in an indeterminate way.
>
> Does this make sense?

I don't see what the problem is with the existing language.  I don't
find your rewrite to be clearer.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] "parallel= " information is not coming in pg_dumpall for create aggregate

2016-04-20 Thread Robert Haas
On Mon, Apr 18, 2016 at 10:47 AM, Fabrízio de Royes Mello
 wrote:
>> I checked in PG 9.6 , if we create an aggregate function with saying -
>> parallel=safe/restricted/unsafe and then take
>> a pg_dumpall of the entire cluster , "parallel= " is missing from create
>> aggregate syntax
>>
>> Steps to reproduce -
>>
>> .)connect to psql terminal and create an aggregate function
>>
>> postgres=# CREATE AGGREGATE unsafe_sum100 (float8)
>> (
>> stype = float8,
>> sfunc = float8pl,
>> mstype = float8,
>> msfunc = float8pl,
>> minvfunc = float8mi,
>> parallel=safe);
>> CREATE AGGREGATE
>>
>> .)perform pg_dumpall against that cluster
>>
>> .)check the content of create aggregate unsafe_sum100 in the file
>>
>> "
>> -
>> -- Name: unsafe_sum100(double precision); Type: AGGREGATE; Schema: public;
>> Owner: centos
>> --
>>
>> CREATE AGGREGATE unsafe_sum100(double precision) (
>> SFUNC = float8pl,
>> STYPE = double precision,
>> MSFUNC = float8pl,
>> MINVFUNC = float8mi,
>> MSTYPE = double precision
>> );
>>
>> "
>
> You're correct... try the attached patch to fix it.

Nice catch, Tushar.  Thanks for the patch, Fabrízio.  Committed.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] pg_stat_activity crashes

2016-04-20 Thread Kyotaro HORIGUCHI
Hello,

At Wed, 20 Apr 2016 15:14:16 +0200, Petr Jelinek  wrote 
in <571780a8.4070...@2ndquadrant.com>
> I noticed sporadic segfaults when selecting from pg_stat_activity on
> current HEAD.
> 
> The culprit is the 53be0b1add7064ca5db3cd884302dfc3268d884e commit
> which added more wait info into the pg_stat_get_activity(). More
> specifically, the following code is broken:
> 
> +   proc = BackendPidGetProc(beentry->st_procpid);
> + wait_event_type = pgstat_get_wait_event_type(proc->wait_event_info);
> 
> This needs to check if proc is NULL. When reading the code I noticed
> that the new functions pg_stat_get_backend_wait_event_type() and
> pg_stat_get_backend_wait_event() suffer from the same problem.

Good catch.

> Here is PoC patch which fixes the problem. I am wondering if we should
> raise warning in the pg_stat_get_backend_wait_event_type() and
> pg_stat_get_backend_wait_event() like the pg_signal_backend() does
> when proc is NULL instead of just returning NULL which is what this
> patch does though.

It still leaves the two relevant columns in pg_stat_activity
inconsistent with each other, since it reads the procarray entry twice
without holding a lock on the procarray.

The attached patch adds pgstat_get_wait_event_info to read
wait_event_info in a more appropriate way, then changes
pg_stat_get_wait_event(_type) to take the wait_event_info.

Does this work for you?

We may still have an inconsistency between wait_event and query,
or the beentry itself, but preventing that would require keeping local
copies of wait_event_info for all corresponding entries in the
procarray, which would be overkill.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
diff --git a/src/backend/postmaster/pgstat.c b/src/backend/postmaster/pgstat.c
index 41f4374..999b7e7 100644
--- a/src/backend/postmaster/pgstat.c
+++ b/src/backend/postmaster/pgstat.c
@@ -55,6 +55,7 @@
 #include "storage/latch.h"
 #include "storage/lmgr.h"
 #include "storage/pg_shmem.h"
+#include "storage/procarray.h"
 #include "storage/procsignal.h"
 #include "storage/sinvaladt.h"
 #include "utils/ascii.h"
@@ -3118,6 +3119,27 @@ pgstat_read_current_status(void)
 }
 
 /* --
+ * pgstat_get_wait_event_info() -
+ *
+ *	Return the wait_event_info for a procid. 0 if the proc is no longer
+ *	living or has no waiting event.
+ */
+uint32
+pgstat_get_wait_event_info(int procpid)
+{
+	uint32		ret = 0;
+	PGPROC	   *proc;
+
+	LWLockAcquire(ProcArrayLock, LW_SHARED);
+	proc = BackendPidGetProcWithLock(procpid);
+	if (proc)
+		ret = proc->wait_event_info;
+	LWLockRelease(ProcArrayLock);
+
+	return ret;
+}
+
+/* --
  * pgstat_get_wait_event_type() -
  *
  *	Return a string representing the current wait event type, backend is
diff --git a/src/backend/utils/adt/pgstatfuncs.c b/src/backend/utils/adt/pgstatfuncs.c
index 64c4cc4..72776ab 100644
--- a/src/backend/utils/adt/pgstatfuncs.c
+++ b/src/backend/utils/adt/pgstatfuncs.c
@@ -679,7 +679,7 @@ pg_stat_get_activity(PG_FUNCTION_ARGS)
 		bool		nulls[PG_STAT_GET_ACTIVITY_COLS];
 		LocalPgBackendStatus *local_beentry;
 		PgBackendStatus *beentry;
-		PGPROC	   *proc;
+		uint32		wait_event_info;
 		const char *wait_event_type;
 		const char *wait_event;
 
@@ -716,6 +716,14 @@ pg_stat_get_activity(PG_FUNCTION_ARGS)
 			continue;
 		}
 
+		/*
+		 * Read wait event. This may be inconsistent with the beentry when the
+		 * corresponding procarray entry has been removed or modified after
+		 * the beentry was copied, but we don't need such strict consistency
+		 * for this information.
+		 */
+		wait_event_info = pgstat_get_wait_event_info(beentry->st_procpid);
+
 		/* Values available to all callers */
 		values[0] = ObjectIdGetDatum(beentry->st_databaseid);
 		values[1] = Int32GetDatum(beentry->st_procpid);
@@ -782,14 +790,13 @@ pg_stat_get_activity(PG_FUNCTION_ARGS)
 
 			values[5] = CStringGetTextDatum(beentry->st_activity);
 
-			proc = BackendPidGetProc(beentry->st_procpid);
-			wait_event_type = pgstat_get_wait_event_type(proc->wait_event_info);
+			wait_event_type = pgstat_get_wait_event_type(wait_event_info);
 			if (wait_event_type)
 values[6] = CStringGetTextDatum(wait_event_type);
 			else
 nulls[6] = true;
 
-			wait_event = pgstat_get_wait_event(proc->wait_event_info);
+			wait_event = pgstat_get_wait_event(wait_event_info);
 			if (wait_event)
 values[7] = CStringGetTextDatum(wait_event);
 			else
@@ -983,7 +990,6 @@ pg_stat_get_backend_wait_event_type(PG_FUNCTION_ARGS)
 {
 	int32		beid = PG_GETARG_INT32(0);
 	PgBackendStatus *beentry;
-	PGPROC	   *proc;
 	const char *wait_event_type;
 
 	if ((beentry = pgstat_fetch_stat_beentry(beid)) == NULL)
@@ -992,8 +998,8 @@ pg_stat_get_backend_wait_event_type(PG_FUNCTION_ARGS)
 		wait_event_type = "";
 	else
 	{
-		proc = BackendPidGetProc(beentry->st_procpid);
-		wait_event_type = pgstat_get_wait_event_type(proc->wait_event_info);
+		wait_event_type = pgstat_get_wait_event_type(
+			

Re: [HACKERS] pg_dump dump catalog ACLs

2016-04-20 Thread Stephen Frost
Noah,

On Wednesday, April 20, 2016, Noah Misch  wrote:

> On Wed, Apr 20, 2016 at 11:12:44AM -0400, Stephen Frost wrote:
> > * Noah Misch (n...@leadboat.com ) wrote:
> > > On Sun, Apr 17, 2016 at 11:02:28PM -0400, Noah Misch wrote:
> > > > (3) pg_dumpall became much slower around the time of these commits.
> On one
> > > > machine (POWER7 3.55 GHz), a pg_dumpall just after initdb slowed
> from 0.25s at
> > > > commit 6c268df^ to 4.0s at commit 7a54270.  On a slower machine
> (Opteron
> > > > 1210), pg_dumpall now takes 19s against such a fresh cluster.
> > >
> > > [This is a generic notification.]
> > >
> > > The above-described topic is currently a PostgreSQL 9.6 open item.
> Stephen,
> > > since you committed the patch believed to have created it, you own
> this open
> > > item.  If that responsibility lies elsewhere, please let us know whose
> > > responsibility it is to fix this.  Since new open items may be
> discovered at
> > > any time and I want to plan to have them all fixed well in advance of
> the ship
> > > date, I will appreciate your efforts toward speedy resolution.  Please
> > > present, within 72 hours, a plan to fix the defect within seven days
> of this
> > > message.  Thanks.
> >
> > I'm at PGConf.US but will be reviewing this in detail afterwards.  The schema
> > qualification will be easily taken care of; the additional time for
> > pg_dump is due to the queries looking at the catalog objects and is
> > therefore a relatively fixed cost, which is only a large fraction of the
> > total time when dumping databases that are mostly empty.
>
> Do you think it would be okay to release 9.6 with pg_dump still adding that
> amount of time per database?
>

For my 2c, the answer is "yes". I've actually looked at how this could be
improved using a bit of caching in pg_dump for certain things, but I didn't
think those changes would be appropriate to include in this patch, as they
would be a general pg_dump performance improvement.

I'm certainly open to improving these issues now if we agree that they
should be fixed for 9.6.  If we don't want to include such changes in 9.6,
then I will propose them for post-9.6.

Thanks!

Stephen


Re: [HACKERS] EXPLAIN VERBOSE with parallel Aggregate

2016-04-20 Thread Robert Haas
On Tue, Apr 19, 2016 at 9:22 PM, Noah Misch  wrote:
> On Sun, Apr 17, 2016 at 10:22:24AM -0400, Tom Lane wrote:
>> David Rowley  writes:
>> > On 16 April 2016 at 04:27, Tom Lane  wrote:
>> >> +1 for the latter, if we can do it conveniently.  I think exposing
>> >> the names of the aggregate implementation functions would be very
>> >> user-unfriendly, as nobody but us hackers knows what those are.
>>
>> > It does not really seem all that convenient to do this. It also seems
>> > a bit strange to me to have a parent node report a column which does
>> > not exist in any nodes descending from it. Remember that the combine
>> > Aggref does not have the same ->args as its corresponding partial
>> > Aggref. It's not all that clear to me if there is any nice way to do
>> > have this work the way you'd like. If we were to just call
>> > get_rule_expr() on the first arg of the combine aggregate node, it
>> > would re-print the PARTIAL keyword again.
>>
>> This suggests to me that the parsetree representation for partial
>> aggregates was badly chosen.  If EXPLAIN can't recognize them, then
>> neither can anything else, and we may soon find needs for telling
>> the difference that are not just cosmetic.  Maybe we need more
>> fields in Aggref.
>
> [This is a generic notification.]
>
> The above-described topic is currently a PostgreSQL 9.6 open item.  Robert,
> since you committed the patch believed to have created it, you own this open
> item.  If that responsibility lies elsewhere, please let us know whose
> responsibility it is to fix this.  Since new open items may be discovered at
> any time and I want to plan to have them all fixed well in advance of the ship
> date, I will appreciate your efforts toward speedy resolution.  Please
> present, within 72 hours, a plan to fix the defect within seven days of this
> message.  Thanks.

I'll do my best to work on this soon.  I'm not happy with the output
produced by David's patch, but I don't expect I'll be able to do
better without putting some time into it.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] pg_dump dump catalog ACLs

2016-04-20 Thread Noah Misch
On Wed, Apr 20, 2016 at 11:12:44AM -0400, Stephen Frost wrote:
> * Noah Misch (n...@leadboat.com) wrote:
> > On Sun, Apr 17, 2016 at 11:02:28PM -0400, Noah Misch wrote:
> > > (3) pg_dumpall became much slower around the time of these commits.  On 
> > > one
> > > machine (POWER7 3.55 GHz), a pg_dumpall just after initdb slowed from 
> > > 0.25s at
> > > commit 6c268df^ to 4.0s at commit 7a54270.  On a slower machine (Opteron
> > > 1210), pg_dumpall now takes 19s against such a fresh cluster.
> > 
> > [This is a generic notification.]
> > 
> > The above-described topic is currently a PostgreSQL 9.6 open item.  Stephen,
> > since you committed the patch believed to have created it, you own this open
> > item.  If that responsibility lies elsewhere, please let us know whose
> > responsibility it is to fix this.  Since new open items may be discovered at
> > any time and I want to plan to have them all fixed well in advance of the 
> > ship
> > date, I will appreciate your efforts toward speedy resolution.  Please
> > present, within 72 hours, a plan to fix the defect within seven days of this
> > message.  Thanks.
> 
> I'm at PGConf.US but will be reviewing this in detail afterwards.  The schema
> qualification will be easily taken care of; the additional time for
> pg_dump is due to the queries looking at the catalog objects and is
> therefore a relatively fixed cost, which is only a large fraction of the
> total time when dumping databases that are mostly empty.

Do you think it would be okay to release 9.6 with pg_dump still adding that
amount of time per database?




Re: [HACKERS] PGCTLTIMEOUT in pg_regress, or skink versus the clock

2016-04-20 Thread Noah Misch
On Wed, Apr 20, 2016 at 06:38:56PM -0400, Tom Lane wrote:
> I am thinking that we missed a bet in commit 2ffa86962077c588
> et al, and that pg_regress's hard-wired 60-second start timeout ought to
> be overridable from an environment variable just as pg_ctl's timeout is.
> It might as well be the same environment variable, so I propose the
> attached patch.

> Any objections?

Looks reasonable.




Re: [HACKERS] Avoid parallel full and right join paths.

2016-04-20 Thread Peter Geoghegan
On Wed, Apr 20, 2016 at 6:43 PM, Alvaro Herrera
 wrote:
> The brin.sql test does that ...

I actually copied brin.sql when creating regression tests for external
sorting, primarily because I wanted to test a variety of collations,
without having any control of what they happen to be on the target.
Those went into amcheck's regression tests, and so have yet to be
committed.

I think that your approach there has plenty to recommend it, at least
where requirements are more complicated.

-- 
Peter Geoghegan




Re: [HACKERS] VS 2015 support in src/tools/msvc

2016-04-20 Thread Noah Misch
On Wed, Apr 20, 2016 at 02:03:16PM +0900, Michael Paquier wrote:
> On Wed, Apr 20, 2016 at 1:39 PM, Noah Misch  wrote:
> > On Tue, Apr 19, 2016 at 02:42:24AM -0300, Alvaro Herrera wrote:
> >> This thread seems to have stalled.  I thought we were going to consider
> >> these patches for 9.6.
> >
> > Committers have given this thread's patches a generous level of 
> > consideration.
> > At this point, if $you wouldn't back-patch them to at least 9.5, they don't
> > belong in 9.6.  However, a back-patch to 9.3 does seem fair, assuming the
> > final patch looks anything like the current proposals.
> 
> Er, the change is rather localized and fully controlled by _MSC_VER >=
> 1900, so it represents no risk for existing compilations on Windows,
> don't you agree?

Yes.  That is why I said a back-patch to 9.3 seems fair.

> >> Should we simply push them to see what the
> >> buildfarm thinks?
> >
> > No.  The thread has been getting suitable test reports for a few weeks now.
> > If it were not, better to make the enhancement wait as long as necessary 
> > than
> > to use the buildfarm that way.  Buildfarm results wouldn't even be 
> > pertinent;
> > they would merely tell us whether the patch broke non-VS 2015 compilers.
> 
> Well, they could push them; the results won't really matter, and
> existing machines won't be impacted, as no buildfarm machine is using
> _MSC_VER >= 1900 as of now. Petr has one ready, though, as mentioned
> upthread.

Here you've presented two additional good reasons to not "simply push them to
see what the buildfarm thinks."




Re: [HACKERS] Avoid parallel full and right join paths.

2016-04-20 Thread Alvaro Herrera
Michael Paquier wrote:
> On Thu, Apr 21, 2016 at 7:13 AM, Peter Geoghegan  wrote:
> > On Wed, Apr 20, 2016 at 2:49 PM, Robert Haas  wrote:
> >> Committed.  But I think the regression test needs more thought, so I
> >> left that out.
> >
> > It would be nice if there was a fuzz testing infrastructure that
> > verified that parallel plans produce the same answer as serial plans.
> 
> Results of parallel plans and serial plans could be stored in
> temporary tables in the test, then that's a matter of comparing them I
> guess. That's largely doable.

The brin.sql test does that ...

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] Disallow unique index on system columns

2016-04-20 Thread Tom Lane
Eric Ridge  writes:
> I've got an extension that's actually a custom Access Method, and for
> reasons that are probably too boring to go into here, it requires that the
> first column in the index be a function that takes the ctid.  Ie, something
> akin to:
>CREATE INDEX idx ON table (my_func('table', ctid), other_func(table));

That's ... creative.

> The AM implementation itself doesn't actually use the result of my_func(),
> but that construct is necessary so I can detect certain queries that look
> like:
> SELECT FROM table WHERE my_func('table', ctid) ==> 'index condition'

Um, why's the ctid important here, or perhaps more directly, what is
it you're really trying to do?

> I don't mind that you're changing this for 9.6... 9.6 is going to change so
> much other stuff around custom AMs that I'll deal with it when the time
> comes, but back-patching this into 9.3/4/5 would make life very difficult.

We weren't planning to do that.

regards, tom lane




Re: [HACKERS] Re: [COMMITTERS] pgsql: Avoid extra locks in GetSnapshotData if old_snapshot_threshold

2016-04-20 Thread Ants Aasma
On Tue, Apr 19, 2016 at 6:11 PM, Kevin Grittner  wrote:
> On Tue, Apr 19, 2016 at 9:57 AM, Amit Kapila  wrote:
>> On Sun, Apr 17, 2016 at 2:26 AM, Andres Freund  wrote:
>>>
>>> On 2016-04-16 16:44:52 -0400, Noah Misch wrote:
>>> > That is more controversial than the potential ~2% regression for
>>> > old_snapshot_threshold=-1.  Alvaro[2] and Robert[3] are okay releasing
>>> > that way, and Andres[4] is not.
>>>
>>> FWIW, I could be kinda convinced that it's temporarily ok, if there'd be
>>> a clear proposal on the table how to solve the scalability issue around
>>> MaintainOldSnapshotTimeMapping().
>>
>> It seems that for read-only workloads, MaintainOldSnapshotTimeMapping()
>> takes EXCLUSIVE LWLock which seems to be a probable reason for a performance
>> regression.  Now, here the question is do we need to acquire that lock if
>> xmin is not changed since the last time value of
>> oldSnapshotControl->latest_xmin is updated or xmin is lesser than equal to
>> oldSnapshotControl->latest_xmin?
>> If we don't need it for above cases, I think it can address the performance
>> regression to a good degree for read-only workloads when the feature is
>> enabled.
>
> Thanks, Amit -- I think something along those lines is the right
> solution to the scaling issues when the feature is enabled.  For
> now I'm focusing on the back-patching issues and the performance
> regression when the feature is disabled, but I'll shift focus to
> this once the "killer" issues are in hand.

I had an idea I wanted to test out. The gist of it is to effectively
store the last slot of the timestamp-to-xid map in the latest_xmin
field and only update the mapping when slot boundaries are crossed.
See the attached WIP patch for details. This way the exclusive lock only
needs to be acquired once per minute. The common case is a spinlock
that could be replaced with atomics later. It also seems to me that the
mutex_threshold taken in TestForOldSnapshot() can get pretty hot
under some workloads, so that may also need some tweaking.
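
The bucket-boundary idea can be sketched in miniature. This is a toy model, not PostgreSQL's actual data structures (the name OldSnapshotMap and its fields are invented for illustration); it only shows that the heavyweight map lock is needed just once per minute, when a snapshot crosses into a new slot, while the common path touches only the spinlock-protected latest values:

```python
import threading

BUCKET = 60  # seconds per map slot, mirroring the one-minute buckets

class OldSnapshotMap:
    """Toy model: the newest bucket lives in (latest_bucket, latest_xmin);
    the shared timestamp->xid map is touched only on a boundary crossing."""

    def __init__(self):
        self.spin = threading.Lock()       # stands in for the spinlock
        self.map_lock = threading.Lock()   # stands in for the exclusive LWLock
        self.latest_bucket = None          # start timestamp of newest bucket
        self.latest_xmin = None
        self.ts_to_xmin = {}               # bucket start -> xmin for that bucket

    def maintain(self, now, xmin):
        bucket = now - (now % BUCKET)
        with self.spin:
            crossed = bucket != self.latest_bucket
            if crossed:
                old_bucket, old_xmin = self.latest_bucket, self.latest_xmin
                self.latest_bucket, self.latest_xmin = bucket, xmin
            elif xmin > self.latest_xmin:
                self.latest_xmin = xmin    # common, cheap path
        if crossed and old_bucket is not None:
            with self.map_lock:            # rare path: once per minute
                self.ts_to_xmin[old_bucket] = old_xmin
```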

I think a better approach would be to base the whole mechanism on a
periodically updated counter, instead of timestamps. Autovacuum
launcher looks like a good candidate to play the clock keeper, without
it the feature has little point anyway. AFAICS only the clock keeper
needs to have the timestamp xid mapping, others can make do with a
couple of periodically updated values. I haven't worked it out in
detail, but it feels like the code would be simpler. But this was a
larger change than I felt comfortable trying out, so I went with the
simple change first.

However, while checking out if my proof of concept patch actually
works I hit another issue. I couldn't get my test for the feature to
actually work. The test script I used is attached. Basically I have a
table with 1000 rows, one high throughput worker deleting old rows and
inserting new ones, one long query that acquires a snapshot and sleeps
for 30min, and one worker that has a repeatable read snapshot and
periodically does count(*) on the table. Based on documentation I
would expect the following:

* The interfering query gets cancelled
* The long running query gets to run
* Old rows will start to be cleaned up after the threshold expires.

However, testing on commit 9c75e1a36b6b2f3ad9f76ae661f42586c92c6f7c,
I'm seeing that the old rows do not get cleaned up, and that I'm only
seeing the interfering query get cancelled when old_snapshot_threshold
= 0. Larger values do not result in cancellation. Am I doing something
wrong or is the feature just not working at all?

Regards,
Ants Aasma
diff --git a/src/backend/utils/time/snapmgr.c b/src/backend/utils/time/snapmgr.c
index 8aa1f49..dc00d91 100644
--- a/src/backend/utils/time/snapmgr.c
+++ b/src/backend/utils/time/snapmgr.c
@@ -80,8 +80,11 @@ typedef struct OldSnapshotControlData
 	 */
 	slock_t		mutex_current;			/* protect current timestamp */
 	int64		current_timestamp;		/* latest snapshot timestamp */
-	slock_t		mutex_latest_xmin;		/* protect latest snapshot xmin */
+	slock_t		mutex_latest_xmin;		/* protect latest snapshot xmin
+		 * and next_map_update
+		 */
 	TransactionId latest_xmin;			/* latest snapshot xmin */
+	int64		next_map_update;		/* latest snapshot valid up to */
 	slock_t		mutex_threshold;		/* protect threshold fields */
 	int64		threshold_timestamp;	/* earlier snapshot is old */
 	TransactionId threshold_xid;		/* earlier xid may be gone */
@@ -95,7 +98,9 @@ typedef struct OldSnapshotControlData
 	 * count_used value of old_snapshot_threshold means that the buffer is
 	 * full and the head must be advanced to add new entries.  Use timestamps
 	 * aligned to minute boundaries, since that seems less surprising than
-	 * aligning based on the first usage timestamp.
+	 * aligning based on the first usage timestamp. The latest bucket is
+	 * effectively stored within latest_xmin. The circular buffer is updated
+	 * 

Re: [HACKERS] Disallow unique index on system columns

2016-04-20 Thread Eric Ridge
On Sat, Apr 16, 2016 at 12:14 PM Tom Lane  wrote:

>
> Pushed.  I moved the check into DefineIndex, as that's where user-facing
> complaints about indexes generally ought to be.
>

If you're planning on back-patching this, please don't.  :)  It'll
literally ruin my life.

I've got an extension that's actually a custom Access Method, and for
reasons that are probably too boring to go into here, it requires that the
first column in the index be a function that takes the ctid.  Ie, something
akin to:

   CREATE INDEX idx ON table (my_func('table', ctid), other_func(table));

The AM implementation itself doesn't actually use the result of my_func(),
but that construct is necessary so I can detect certain queries that look
like:
SELECT FROM table WHERE my_func('table', ctid) ==> 'index condition'

I don't mind that you're changing this for 9.6... 9.6 is going to change so
much other stuff around custom AMs that I'll deal with it when the time
comes, but back-patching this into 9.3/4/5 would make life very difficult.

Thanks for listening!

eric


[HACKERS] PGCTLTIMEOUT in pg_regress, or skink versus the clock

2016-04-20 Thread Tom Lane
Buildfarm member skink has failed three times recently like this:
http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=skink&dt=2016-04-15%2001%3A20%3A44

the relevant part of that being

pg_regress: postmaster did not respond within 60 seconds
Examine /home/andres/build/buildfarm/REL9_5_STABLE/pgsql.build/src/interfaces/ecpg/test/log/postmaster.log
for the reason

where the postmaster log shows nothing particularly surprising,
it's just not reached the ready state yet:

LOG:  database system was shut down at 2016-04-15 05:11:18 UTC
FATAL:  the database system is starting up
LOG:  MultiXact member wraparound protections are now enabled


Now, there are some reasons to suspect that there might be more here than
meets the eye; for one thing, it stretches credulity a bit to believe that
it's only random chance that all three failures are in the 9.5 branch and
all are in the ecpg regression test step.  I'm also curious as to why we
see only one "FATAL: the database system is starting up" connection
rejection and not sixty.  However, by far the simplest explanation for
this failure is just that the postmaster took more than 60 seconds to
start up; and seeing that skink is running Valgrind and is on an AWS
instance, that's not that much of a stretch of credulity either.

Hence, I am thinking that we missed a bet in commit 2ffa86962077c588
et al, and that pg_regress's hard-wired 60-second start timeout ought to
be overridable from an environment variable just as pg_ctl's timeout is.
It might as well be the same environment variable, so I propose the
attached patch.  Note that since the shutdown end of things in pg_regress
uses "pg_ctl stop", that end of it already responds to PGCTLTIMEOUT.
(I could not find any user-facing documentation for pg_regress, so there's
no apparent need for a docs update.)

Any objections?

regards, tom lane

diff --git a/src/test/regress/pg_regress.c b/src/test/regress/pg_regress.c
index 2f6f56d..574f5b8 100644
*** a/src/test/regress/pg_regress.c
--- b/src/test/regress/pg_regress.c
*** regression_main(int argc, char *argv[], 
*** 2185,2190 
--- 2185,2192 
  	if (temp_instance)
  	{
  		FILE	   *pg_conf;
+ 		const char *env_wait;
+ 		int			wait_seconds;
  
  		/*
  		 * Prepare the temp instance
*** regression_main(int argc, char *argv[], 
*** 2335,2345 
  		}
  
  		/*
! 		 * Wait till postmaster is able to accept connections (normally only a
! 		 * second or so, but Cygwin is reportedly *much* slower).  Don't wait
! 		 * forever, however.
  		 */
! 		for (i = 0; i < 60; i++)
  		{
  			/* Done if psql succeeds */
  			if (system(buf2) == 0)
--- 2337,2359 
  		}
  
  		/*
! 		 * Wait till postmaster is able to accept connections; normally this
! 		 * is only a second or so, but Cygwin is reportedly *much* slower, and
! 		 * test builds using Valgrind or similar tools might be too.  Hence,
! 		 * allow the default timeout of 60 seconds to be overridden from the
! 		 * PGCTLTIMEOUT environment variable.
  		 */
! 		env_wait = getenv("PGCTLTIMEOUT");
! 		if (env_wait != NULL)
! 		{
! 			wait_seconds = atoi(env_wait);
! 			if (wait_seconds <= 0)
! wait_seconds = 60;
! 		}
! 		else
! 			wait_seconds = 60;
! 
! 		for (i = 0; i < wait_seconds; i++)
  		{
  			/* Done if psql succeeds */
  			if (system(buf2) == 0)
*** regression_main(int argc, char *argv[], 
*** 2360,2368 
  
  			pg_usleep(100L);
  		}
! 		if (i >= 60)
  		{
! 			fprintf(stderr, _("\n%s: postmaster did not respond within 60 seconds\nExamine %s/log/postmaster.log for the reason\n"), progname, outputdir);
  
  			/*
  			 * If we get here, the postmaster is probably wedged somewhere in
--- 2374,2383 
  
  			pg_usleep(100L);
  		}
! 		if (i >= wait_seconds)
  		{
! 			fprintf(stderr, _("\n%s: postmaster did not respond within %d seconds\nExamine %s/log/postmaster.log for the reason\n"),
! 	progname, wait_seconds, outputdir);
  
  			/*
  			 * If we get here, the postmaster is probably wedged somewhere in



Re: [HACKERS] Avoid parallel full and right join paths.

2016-04-20 Thread Michael Paquier
On Thu, Apr 21, 2016 at 7:13 AM, Peter Geoghegan  wrote:
> On Wed, Apr 20, 2016 at 2:49 PM, Robert Haas  wrote:
>> Committed.  But I think the regression test needs more thought, so I
>> left that out.
>
> It would be nice if there was a fuzz testing infrastructure that
> verified that parallel plans produce the same answer as serial plans.

Results of parallel plans and serial plans could be stored in
temporary tables in the test, then that's a matter of comparing them I
guess. That's largely doable.
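
Since a parallel plan may legitimately emit rows in a different order than the serial plan, such a harness would need to compare the two result sets as multisets rather than ordered lists, e.g.:

```python
from collections import Counter

def same_result(serial_rows, parallel_rows):
    # Row order is not guaranteed to match between a serial and a
    # parallel execution of the same query, but the multiset of rows
    # (including duplicates) must be identical.
    return Counter(serial_rows) == Counter(parallel_rows)
```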
-- 
Michael




Re: [HACKERS] Avoid parallel full and right join paths.

2016-04-20 Thread Peter Geoghegan
On Wed, Apr 20, 2016 at 2:49 PM, Robert Haas  wrote:
> Committed.  But I think the regression test needs more thought, so I
> left that out.

It would be nice if there was a fuzz testing infrastructure that
verified that parallel plans produce the same answer as serial plans.

-- 
Peter Geoghegan




Re: [HACKERS] Avoid parallel full and right join paths.

2016-04-20 Thread Robert Haas
On Tue, Apr 19, 2016 at 10:21 AM, Mithun Cy  wrote:
> Tests:
> create table mytab(x int,x1 char(9),x2 varchar(9));
> create table mytab1(y int,y1 char(9),y2 varchar(9));
> insert into mytab values (generate_series(1,5),'aa','aaa');
> insert into mytab1 values (generate_series(1,1),'aa','aaa');
> insert into mytab values (generate_series(1,50),'aa','aaa');
> insert into mytab values (generate_series(1,50),'aa','aaa');
> analyze mytab;
> analyze mytab1;
> vacuum mytab;
> vacuum mytab1;
>
> set max_parallel_degree=0;
> SET
> df=# SELECT count(*) FROM mytab RIGHT OUTER JOIN mytab1
> ON mytab.x = mytab1.y;
> count
> ---
> 3
> (1 row)
>
> # set max_parallel_degree=5;
> SET
> df=# SELECT count(*) FROM mytab RIGHT OUTER JOIN mytab1
> ON mytab.x = mytab1.y;
> count
> ---
> 39089
> (1 row)
>
> Cause:
> ==
> Normal plan
> ==
> explain SELECT count(*) FROM mytab RIGHT OUTER JOIN mytab1
> ON mytab.x = mytab1.y;postgres-#
> QUERY PLAN
> --
> Aggregate (cost=21682.71..21682.72 rows=1 width=8)
> -> Hash Right Join (cost=289.00..21629.07 rows=21457 width=0)
> Hash Cond: (mytab.x = mytab1.y)
> -> Seq Scan on mytab (cost=0.00..17188.00 rows=105 width=4)
> -> Hash (cost=164.00..164.00 rows=1 width=4)
> -> Seq Scan on mytab1 (cost=0.00..164.00 rows=1 width=4)
> =
>
> Parallel plan.
> ==
> explain SELECT count(*) FROM mytab RIGHT OUTER JOIN mytab1
> ON mytab.x = mytab1.y;postgres-#
> QUERY PLAN
> ---
> Finalize Aggregate (cost=14135.88..14135.89 rows=1 width=8)
> -> Gather (cost=14135.67..14135.88 rows=2 width=8)
> Number of Workers: 2
> -> Partial Aggregate (cost=13135.67..13135.68 rows=1 width=8)
> -> Hash Right Join (cost=289.00..13082.02 rows=21457 width=0)
> Hash Cond: (mytab.x = mytab1.y)
> -> Parallel Seq Scan on mytab (cost=0.00..11063.00 rows=437500 width=4)
> -> Hash (cost=164.00..164.00 rows=1 width=4)
> -> Seq Scan on mytab1 (cost=0.00..164.00 rows=1 width=4)
>
>
> As shown above, right and full join paths cannot be parallelized, as they
> can produce false null-extended rows because the outer side is a partial
> path and is not completely visible to each worker.
> Attaching a patch to fix the same.

Committed.  But I think the regression test needs more thought, so I
left that out.
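
The wrong-result mechanism is easy to model: each worker sees only its chunk of the outer side but the whole inner side, so it null-extends inner rows whose real match sits in another worker's chunk. A toy sketch (illustrative numbers, not the test case quoted above):

```python
def right_join_count(outer_part, inner):
    # COUNT(*) of: outer_part RIGHT JOIN inner ON o = i.  Every inner row
    # appears at least once; unmatched ones come back null-extended.
    total = 0
    for i in inner:
        matches = sum(1 for o in outer_part if o == i)
        total += matches if matches else 1
    return total

outer = [1, 2, 3, 4]       # outer side, split across workers
inner = [2, 4, 9]          # inner side, fully visible to each worker

serial = right_join_count(outer, inner)
parallel = sum(right_join_count(chunk, inner)
               for chunk in ([1, 2], [3, 4]))
# Each worker null-extends inner rows matched only in the OTHER chunk,
# so summing the per-worker counts overstates the true result.
```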

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Proposal: Remove regress-python3-mangle.mk

2016-04-20 Thread Tom Lane
Yury Zhuravlev  writes:
> Tom Lane wrote:
>> Also, I would bet that those tools would not know anything
>> about converting PL language names between 'plpythonu' and 'plpython2u'
>> and 'plpython3u'; so even if we used one of them, we would still need a
>> layer pretty similar to what we have.

> I do not see this as a problem because we can build Postgres only with one 
> version of Python (2 or 3).

What's your point?  The tests still have to load the correct version
of the PL language by name, and create functions that reference that
version by name.  Moreover, "make installcheck" should work even
against an installation with both .so's installed.

regards, tom lane




[HACKERS] Should XLogInsert() be done only inside a critical section?

2016-04-20 Thread Tom Lane
Over in <17456.1460832...@sss.pgh.pa.us> I speculated about whether
we should be enforcing that WAL insertion happen only inside critical
sections.  We don't currently, and a survey of the backend says that
there are quite a few calls that aren't inside critical sections.
But there are at least two good reasons why we should, IMO:

1. It's not very clear that XLogInsert will recover cleanly if it's
invoked outside a critical section and hits a failure.  Certainly,
if we allow such usage, then every potential error inside that code
has to be analyzed under both critical-section and normal rules.

2. With no such check, it's quite easy for calling code to forget to
create a critical section around code stanzas where one is *necessary*
(because you're changing shared-buffer contents).

Both of these points represent pretty clear hazards for introduction
of future bugs, whether or not there are any such bugs today.

As against this, it could be argued that adding critical sections where
they're not absolutely necessary must make crashing failures more probable
than they need to be.  But first you'd have to prove that they're not
absolutely necessary, which I'm unsure about because of point #1.

Anyway, I went through our tree and added START/END_CRIT_SECTION calls
around all XLogInsert calls that could currently be reached without one;
see attached.  Since this potentially breaks third-party code I would
not propose back-patching it, but I think it's reasonable to propose
applying it to HEAD.

Thoughts?

regards, tom lane

diff --git a/src/backend/access/brin/brin.c b/src/backend/access/brin/brin.c
index 89bad05..8e41802 100644
*** a/src/backend/access/brin/brin.c
--- b/src/backend/access/brin/brin.c
*** brinbuild(Relation heap, Relation index,
*** 610,624 
  		elog(ERROR, "index \"%s\" already contains data",
  			 RelationGetRelationName(index));
  
- 	/*
- 	 * Critical section not required, because on error the creation of the
- 	 * whole relation will be rolled back.
- 	 */
- 
  	meta = ReadBuffer(index, P_NEW);
  	Assert(BufferGetBlockNumber(meta) == BRIN_METAPAGE_BLKNO);
  	LockBuffer(meta, BUFFER_LOCK_EXCLUSIVE);
  
  	brin_metapage_init(BufferGetPage(meta), BrinGetPagesPerRange(index),
  	   BRIN_CURRENT_VERSION);
  	MarkBufferDirty(meta);
--- 610,621 
  		elog(ERROR, "index \"%s\" already contains data",
  			 RelationGetRelationName(index));
  
  	meta = ReadBuffer(index, P_NEW);
  	Assert(BufferGetBlockNumber(meta) == BRIN_METAPAGE_BLKNO);
  	LockBuffer(meta, BUFFER_LOCK_EXCLUSIVE);
  
+ 	START_CRIT_SECTION();
+ 
  	brin_metapage_init(BufferGetPage(meta), BrinGetPagesPerRange(index),
  	   BRIN_CURRENT_VERSION);
  	MarkBufferDirty(meta);
*** brinbuild(Relation heap, Relation index,
*** 644,649 
--- 641,648 
  
  	UnlockReleaseBuffer(meta);
  
+ 	END_CRIT_SECTION();
+ 
  	/*
  	 * Initialize our state, including the deformed tuple state.
  	 */
diff --git a/src/backend/access/gin/ginfast.c b/src/backend/access/gin/ginfast.c
index 2ddf568..7fc47ec 100644
*** a/src/backend/access/gin/ginfast.c
--- b/src/backend/access/gin/ginfast.c
*** ginHeapTupleFastInsert(GinState *ginstat
*** 277,282 
--- 277,284 
  		memset(, 0, sizeof(GinMetaPageData));
  		makeSublist(index, collector->tuples, collector->ntuples, );
  
+ 		START_CRIT_SECTION();
+ 
  		if (needWal)
  			XLogBeginInsert();
  
*** ginHeapTupleFastInsert(GinState *ginstat
*** 291,298 
  			/*
  			 * Main list is empty, so just insert sublist as main list
  			 */
- 			START_CRIT_SECTION();
- 
  			metadata->head = sublist.head;
  			metadata->tail = sublist.tail;
  			metadata->tailFreeSize = sublist.tailFreeSize;
--- 293,298 
*** ginHeapTupleFastInsert(GinState *ginstat
*** 314,321 
  
  			Assert(GinPageGetOpaque(page)->rightlink == InvalidBlockNumber);
  
- 			START_CRIT_SECTION();
- 
  			GinPageGetOpaque(page)->rightlink = sublist.head;
  
  			MarkBufferDirty(buffer);
--- 314,319 
*** ginHeapTupleFastInsert(GinState *ginstat
*** 353,363 
  
  		data.ntuples = collector->ntuples;
  
  		if (needWal)
  			XLogBeginInsert();
  
- 		START_CRIT_SECTION();
- 
  		/*
  		 * Increase counter of heap tuples
  		 */
--- 351,361 
  
  		data.ntuples = collector->ntuples;
  
+ 		START_CRIT_SECTION();
+ 
  		if (needWal)
  			XLogBeginInsert();
  
  		/*
  		 * Increase counter of heap tuples
  		 */
diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index 950bfc8..6010164 100644
*** a/src/backend/access/heap/heapam.c
--- b/src/backend/access/heap/heapam.c
*** log_heap_cleanup_info(RelFileNode rnode,
*** 7151,7161 
--- 7151,7165 
  	xlrec.node = rnode;
  	xlrec.latestRemovedXid = latestRemovedXid;
  
+ 	START_CRIT_SECTION();
+ 
  	XLogBeginInsert();
  	XLogRegisterData((char *) , SizeOfHeapCleanupInfo);
  
  	recptr = XLogInsert(RM_HEAP2_ID, 

Re: [HACKERS] Proposal: Remove regress-python3-mangle.mk

2016-04-20 Thread Yury Zhuravlev

Tom Lane wrote:

> Also, I would bet that those tools would not know anything
> about converting PL language names between 'plpythonu' and 'plpython2u'
> and 'plpython3u'; so even if we used one of them, we would still need a
> layer pretty similar to what we have.

I do not see this as a problem because we can build Postgres only with one
version of Python (2 or 3).

Or am I missing something?

> may be a bit of a kluge

For a Python developer it is really strange...


PS I see Postgres philosophy - do not touch what works well, even if it 
looks terrible. :)

--
Yury Zhuravlev
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company




Re: [HACKERS] max_parallel_degree > 0 for 9.6 beta

2016-04-20 Thread Tom Lane
Andres Freund  writes:
> max_parallel_degree currently defaults to 0.  I think we should enable
> it by default for at least the beta period. Otherwise we're primarily
> going to get reports back after the release.

> Then, at the end of beta, we can decide what the default should be.

+1, but let's put an entry on the 9.6 open-items page to remind us to
make that decision at the right time.

regards, tom lane




Re: [HACKERS] Defaults for replication/backup

2016-04-20 Thread Magnus Hagander
On Sun, Feb 14, 2016 at 9:58 AM, Magnus Hagander 
wrote:

>
>
> On Sun, Feb 14, 2016 at 2:00 AM, Robert Haas 
> wrote:
>
>> On Sat, Feb 13, 2016 at 11:31 AM, Andres Freund 
>> wrote:
>> > On 2016-02-13 11:10:58 -0500, Tom Lane wrote:
>> >> Magnus Hagander  writes:
>> >> > The big thing is, IIRC, that we turn off the optimizations in
>> >> > min_wal_level. *most* people will see no impact of their regular
>> >> > application runtime from that, but it might definitely have an
>> effect on
>> >> > data loads and such. For normal runtime, there should be very close
>> to zero
>> >> > difference, no?
>> >>
>> >> I was asking for a demonstration of that, not just handwaving.  Even if
>> >> it was measured years ago, I wouldn't assume the comparison would be
>> >> the same on current Postgres.
>> >
>> > Well, let's look at what the difference between wal_level's are:
>> > 1) the (currently broken) optimization of not WAL logging COPY et al,
>> >for newly created relations.
>> > 2) relation AccessExclusiveLocks are WAL logged on >= hot_standby
>> > 3) Subtransaction assignment records are generated for >= hot_standby
>> >after 64 records.
>> > 4) checkpoints and bgwriter occasionally generate XLOG_RUNNING_XACTS
>> >records
>> > 5) btreevacuum() and _bt_getbuf() sometimes do additional WAL logging on
>> >>= hot_standby
>> > 6) Once per vacuum we issue a XLOG_HEAP2_CLEANUP_INFO
>> >
>> >
>> > 1) obviously can have performance impact; but only in a relatively
>> > narrow set of cases. I doubt any of the others really play that major a
>> > role.  But I really think minor efficiency differences are besides the
>> > point. Making backups and replication easier has a far bigger positive
>> > impact, for far more users.
>>
>> I think that there are definitely people for whom #1 is an issue, but
>> maybe that's not a sufficient reason not to change the default.
>>
>
> There definitely are people. I'd say most of those would already be tuning
> their config in different ways as well, so changing it is a lot lower cost
> for them. And there's fewer of them.
>
>
>
>> As a thought experiment, how about teaching initdb how to tailor the
>> configuration to a couple of common scenarios via some new flag?  I'll
>> call it --setup although that's probably not best.
>>
>> --setup=replication   --- Preconfigure for replication.
>> --setup=standalone  --- Preconfigure for standalone mode.
>> --setup=ephemeral  --- Preconfigure for an ephemeral instance that
>> doesn't need durability because we'll blow it up soon.
>>
>> Whichever mode we make the default, I think this kind of thing would
>> make life easier for users.
>>
>
> I'd like to reiterate that this is not just about replication, it's even
> more about decent backups. As soon as your database goes to the point that
> pg_dump is not a great solution for backup (and that happens pretty
> quickly, given the performance of pg_restore), the response is "oh, go
> change these 3 parameters in your config and then restart the db
> disconnecting all your users" which gives interesting reactions from
> people...
>
> I could go with something like
> --setup=small
> --setup=normal
> --setup=ephemeral
>
> which would be a more proper mapping I think. Of course, this would also
> give us the ability to bikeshed about three different colors, one in each
> predefined set! :)
>


So what more do we need to just get going with this? Given feature freeze
we're perhaps too late to actually build the parameter feature for initdb,
but we could still change the defaults (and then we could add such a
parameter for next release).

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


Re: [HACKERS] max_parallel_degree > 0 for 9.6 beta

2016-04-20 Thread Andreas Joseph Krogh
On Wednesday 20 April 2016 at 19:46:31, Andres Freund wrote:
> Hi,
>
> max_parallel_degree currently defaults to 0.  I think we should enable
> it by default for at least the beta period. Otherwise we're primarily
> going to get reports back after the release.
>
> Then, at the end of beta, we can decide what the default should be.
 
+1

Not enabling it by default sends the signal "it's not safe".

-- 
Andreas Joseph Krogh
CTO / Partner - Visena AS
Mobile: +47 909 56 963
andr...@visena.com
www.visena.com


[HACKERS] max_parallel_degree > 0 for 9.6 beta

2016-04-20 Thread Andres Freund
Hi,

max_parallel_degree currently defaults to 0.  I think we should enable
it by default for at least the beta period. Otherwise we're primarily
going to get reports back after the release.

Then, at the end of beta, we can decide what the default should be.

- Andres




Re: [HACKERS] Updated backup APIs for non-exclusive backups

2016-04-20 Thread Magnus Hagander
On Wed, Apr 20, 2016 at 1:12 AM, Fujii Masao  wrote:

> On Sun, Apr 17, 2016 at 1:22 AM, Magnus Hagander 
> wrote:
> > On Wed, Apr 13, 2016 at 4:07 AM, Noah Misch  wrote:
> >>
> >> On Tue, Apr 12, 2016 at 10:08:23PM +0200, Magnus Hagander wrote:
> >> > On Tue, Apr 12, 2016 at 8:39 AM, Noah Misch 
> wrote:
> >> > > On Mon, Apr 11, 2016 at 11:22:27AM +0200, Magnus Hagander wrote:
> >> > > > Well, if we *don't* do the rewrite before we release it, then we
> >> > > > have to
> >> > > > instead put information about the new version of the functions
> into
> >> > > > the
> >> > > old
> >> > > > structure I think.
> >> > > >
> >> > > > So I think it's an open issue.
> >> > >
> >> > > Works for me...
> >> > >
> >> > > [This is a generic notification.]
> >> > >
> >> > > The above-described topic is currently a PostgreSQL 9.6 open item.
> >> > > Magnus,
> >> > > since you committed the patch believed to have created it, you own
> >> > > this
> >> > > open
> >> > > item.  If that responsibility lies elsewhere, please let us know
> whose
> >> > > responsibility it is to fix this.  Since new open items may be
> >> > > discovered
> >> > > at
> >> > > any time and I want to plan to have them all fixed well in advance
> of
> >> > > the
> >> > > ship
> >> > > date, I will appreciate your efforts toward speedy resolution.
> Please
> >> > > present, within 72 hours, a plan to fix the defect within seven days
> >> > > of
> >> > > this
> >> > > message.  Thanks.
> >> > >
> >> >
> >> > I won't have time to do the bigger rewrite/reordeirng by then, but I
> can
> >> > certainly commit to having the smaller updates done to cover the new
> >> > functionality in less than a week. If nothing else, that'll be
> something
> >> > for me to do on the flight over to pgconf.us.
> >>
> >> Thanks for that plan; it sounds good.
> >
> >
> > Here's a suggested patch.
> >
> > There is some duplication between the non-exclusive and exclusive backup
> > sections, but I wanted to make sure that each set of instructions can
> just
> > be followed top-to-bottom.
> >
> > I've also removed some tips that aren't really necessary as part of the
> > step-by-step instructions in order to keep things from exploding in size.
> >
> > Finally, I've changed references to "backup dump" to just be "backup",
> > because it's confusing to call them something with dumps in when it's not
> > pg_dump. Enough that I got partially confused myself while editing...
> >
> > Comments?
>
> +Low level base backups can be made in a non-exclusive or an exclusive
> +way. The non-exclusive method is recommended and the exclusive one
> will
> +at some point be deprecated and removed.
>
> I don't object to add a non-exclusive mode of low level backup,
> but I disagree to mark an exclusive backup as deprecated at least
> until we can alleviate some pains that a non-exclusive mode causes.
>

Note that we have not marked them as deprecated. We're just giving warnings
that they will be deprecated.


>
> One example of the pain, in a non-exclusive backup, we need to keep
> the IDLE connection which was used to execute pg_start_backup(),
> until the end of backup. Of course a backup can take a very
> long time. In this case the IDLE connection also needs to remain
> for such a long time. If it's accidentally terminated (e.g., because
> of IDLE connection), the backup fails and needs to be taken again
> from the beginning.
>


Yes, it's a bit annoying. But it's something you can handle. Unlike the
problems that exist with an exclusive backup which you *cannot* handle from
the backup script/commands.



>
> Another pain in a non-exclusive backup is to have to execute both
> pg_start_backup() and pg_stop_backup() on the same connection.
>

Pretty sure that's the same one you just listed?



> Please imagine the case where psql is used to execute those two
> backup functions (I believe that there are many users who do this).
> For example,
>
> psql -c "SELECT pg_start_backup()"
> rsync, cp, tar, storage backup, or something
> psql -c "SELECT pg_stop_backup()"
>
> A non-exclusive backup breaks the above very simple steps because
> two backup functions are executed on different connections.

> So, how should we modify the steps for a non-exclusive backup?

You should at least not use psql in that way. You need to open psql in a
pipe and drive it through that. Or use a more appropriate client.
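
As a sketch of that approach: the helper name build_backup_script and the rsync command below are illustrative assumptions, and pg_start_backup's three-argument form (label, fast, exclusive) is the non-exclusive variant under discussion. The point is simply that one psql process means one connection for both backup functions:

```python
def build_backup_script(label, copy_cmd):
    # Hypothetical helper: emit a psql script so that pg_start_backup()
    # and pg_stop_backup() run on the SAME connection, with the file
    # copy executed in between via psql's \! shell escape.
    return (
        "SELECT pg_start_backup(%r, false, false);\n"
        "\\! %s\n"
        "SELECT * FROM pg_stop_backup(false);\n" % (label, copy_cmd)
    )

script = build_backup_script("nightly", "rsync -a $PGDATA/ /backup/data/")
# One psql process == one connection for the whole backup, e.g.:
#   subprocess.run(["psql", "-v", "ON_ERROR_STOP=1"],
#                  input=script.encode(), check=True)
```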


> Basically we need to pause psql after pg_start_backup(), signal it
> to resume after the copy of database cluster is taken, and make
> it execute pg_stop_backup(). I'm afraid that the backup script
> will be complicated because of this pain of non-exclusive backup.
>

Yes, if you insist on using psql - which is not a good tool for this - then
yes. But you could for example make it a psql script that does something
along the lines of:
SELECT pg_start_backup();
\! rsync ...
SELECT pg_stop_backup()

Or as said 

Re: [HACKERS] Declarative partitioning

2016-04-20 Thread Amit Langote
Hi Ildar,

On Wed, Apr 20, 2016 at 11:46 PM, Ildar Musin  wrote:
> Thanks for clarification! I tried the updated patch, now it works correctly.

Great, thanks!

> I encountered another problem that concerns expressions as partitioning key.
> Probably there is still some work in progress. But if it will help here is
> my case:
>
> create table inh(a int, b int) partition by range ((a+b));
> create table inh_1 partition of inh for values start (0) end (10);
> create table inh_2 partition of inh for values start (10) end (20);
>
> Then if we run any SELECT query it crashes postgres:
>
> select * from inh;
>
> Crash occurs in get_check_expr_from_partbound(). It seems that function is
> not yet expecting an expression key and designed to handle only simple
> attributes keys. Backtrace:

You're right, silly mistake. :-(

Will fix

Thanks,
Amit




Re: [HACKERS] pg_dump dump catalog ACLs

2016-04-20 Thread Stephen Frost
Noah, all,

* Noah Misch (n...@leadboat.com) wrote:
> On Sun, Apr 17, 2016 at 11:02:28PM -0400, Noah Misch wrote:
> > On Tue, Apr 05, 2016 at 05:50:18PM -0400, Stephen Frost wrote:
> > > I'll be doing more testing, review and clean-up (there are some
> > > whitespace only changes in the later patches that really should be
> > > included in the earlier patches where the code was actually changed)
> > > tonight with plans to push this tomorrow night.
> 
> > (2) pg_dump queries:
> > 
> > > @@ -14187,18 +14869,65 @@ dumpTable(Archive *fout, TableInfo *tbinfo)
> > 
> > > +   "FROM pg_catalog.pg_attribute at "
> > > +   "JOIN pg_catalog.pg_class c ON (at.attrelid = c.oid) "
> > > +   "LEFT JOIN pg_init_privs pip ON "
> > 
> > Since pg_attribute and pg_class require schema qualification here, so does
> > pg_init_privs.  Likewise elsewhere in the patch's pg_dump changes.
> > 
> > 
> > (3) pg_dumpall became much slower around the time of these commits.  On one
> > machine (POWER7 3.55 GHz), a pg_dumpall just after initdb slowed from 0.25s 
> > at
> > commit 6c268df^ to 4.0s at commit 7a54270.  On a slower machine (Opteron
> > 1210), pg_dumpall now takes 19s against such a fresh cluster.
> 
> [This is a generic notification.]
> 
> The above-described topic is currently a PostgreSQL 9.6 open item.  Stephen,
> since you committed the patch believed to have created it, you own this open
> item.  If that responsibility lies elsewhere, please let us know whose
> responsibility it is to fix this.  Since new open items may be discovered at
> any time and I want to plan to have them all fixed well in advance of the ship
> date, I will appreciate your efforts toward speedy resolution.  Please
> present, within 72 hours, a plan to fix the defect within seven days of this
> message.  Thanks.

I'm at PGConf.US but will be reviewing this in detail after.  The schema
qualification will be easily taken care of.  The additional time for
pg_dump is due to the queries looking at the catalog objects and is
therefore relatively fixed; it is primarily a large fraction of the total
time only when dumping databases which are mostly empty.

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] Declarative partitioning

2016-04-20 Thread Ildar Musin

Hi Amit,

On 20.04.2016 13:28, Amit Langote wrote:

On 2016/04/19 23:52, Amit Langote wrote:

On Tue, Apr 19, 2016 at 11:26 PM, Alexander Korotkov

Another question is that it might be NOT what users expect from that.  From
the syntax side it very looks like defining something boxes regions for two
keys which could be replacement for subpartitioning.  But it isn't so.

Need to check why query with qual b < 100 behaves the way it does.
Something's going wrong there with the constraints (partition
predicates) that are being generated internally (as mentioned before,
still driven by constraint exclusion using the constraints generated
on-the-fly).

As for the composite range partition bounds in Ildar's example, it's
as if the second value in the key never determines the fate of a row
going into some partition, therefore no constraints should have been
generated for column b of the key.  I'm afraid that's not the case as
per the latest patch.  Will fix.

The strange behavior that Ildar reported should have been fixed with the
attached updated set of patches (v2):

create table test(a int, b int) partition by range (a, b);
create table test_1 partition of test for values start (0, 0) end (100, 100);
create table test_2 partition of test for values start (100, 100) end
(200, 200);
create table test_3 partition of test for values start (200, 200) end
(300, 300);
CREATE TABLE
CREATE TABLE
CREATE TABLE
CREATE TABLE

insert into test(a, b) values (150, 50);
INSERT 0 1

select * from test where b < 100;
  a  | b
-----+----
 150 | 50
(1 row)

explain (costs off) select * from test where b < 100;
         QUERY PLAN
----------------------------
 Append
   ->  Seq Scan on test
         Filter: (b < 100)
   ->  Seq Scan on test_1
         Filter: (b < 100)
   ->  Seq Scan on test_2
         Filter: (b < 100)
   ->  Seq Scan on test_3
         Filter: (b < 100)
(9 rows)


Multi-column range partitioning seems a bit tricky as far as generating
constraints on individual columns using a partition's lower and upper
bounds (both composite values) is concerned.  I mentally pictured
something like the following example scenario:

create table test(a int, b int, c int)
 partition by range (a, b, c);
create table test_1 partition of test
 for values start (0, 0, 0) end (0, 2, 0);
create table test_2 partition of test
 for values start (0, 2, 0) end (0, 3, 0);
create table test_3 partition of test
 for values start (0, 3, 0) end (0, 4, 0);
create table test_4 partition of test
 for values start (0, 4, 0) end (1, 0, 0);
create table test_5 partition of test
 for values start (1, 0, 0) end (1, 2, 0);
create table test_6 partition of test
 for values start (1, 2, 0) end (1, 3, 0);
create table test_7 partition of test
 for values start (1, 3, 0) end (1, 4, 0);
create table test_8 partition of test
 for values start (1, 4, 0) end (2, 0, 0);

It is useful to think of the above as the sequence of ranges [000, 020),
[020, 030), [030, 040), [040, 100), [100, 120), [120, 130), [130, 140),
[140, 200) for the purpose of finding the partition for a row.

Then constraints generated internally for each partition:

test_1: a = 0 AND b >= 0 AND b <= 2
test_2: a = 0 AND b >= 2 AND b <= 3
test_3: a = 0 AND b >= 3 AND b <= 4
test_4: a >= 0 AND a <= 1
test_5: a = 1 AND b >= 0 AND b <= 2
test_6: a = 1 AND b >= 2 AND b <= 3
test_7: a = 1 AND b >= 3 AND b <= 4
test_8: a >= 1 AND a <= 2
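
As a sanity check on those per-column constraints, the routing itself is a
plain lexicographic (row-wise) comparison of the whole key against the
bounds.  A small self-contained sketch (illustrative C, not the patch's
code) makes the point concrete: a row such as (0, 5, 7) lands in test_4
even though b = 5 is outside any naive per-column bound on b, which is
exactly why test_4's constraint can only mention a.

```c
#include <assert.h>

/* Toy model of multi-column range partition routing: a row belongs to the
 * partition whose [lower, upper) bounds enclose its whole key under
 * lexicographic comparison.  Illustrative only, not the patch's code. */
#define NKEYS 3

static int
key_cmp(const int *a, const int *b)
{
    for (int i = 0; i < NKEYS; i++)
    {
        if (a[i] != b[i])
            return (a[i] < b[i]) ? -1 : 1;
    }
    return 0;
}

static int
key_in_range(const int *key, const int *lower, const int *upper)
{
    /* lower bound inclusive, upper bound exclusive, whole-key comparison */
    return key_cmp(key, lower) >= 0 && key_cmp(key, upper) < 0;
}
```
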

I will try further to poke holes in my thinking about this.  Please feel
free to point out if you find any.

Thanks,
Amit

Thanks for clarification! I tried the updated patch, now it works correctly.

I encountered another problem that concerns expressions as partitioning 
keys.  Probably there is still some work in progress, but in case it 
helps, here is my case:


create table inh(a int, b int) partition by range ((a+b));
create table inh_1 partition of inh for values start (0) end (10);
create table inh_2 partition of inh for values start (10) end (20);

Then if we run any SELECT query it crashes postgres:

select * from inh;

Crash occurs in get_check_expr_from_partbound().  It seems that the 
function does not yet expect an expression key and is designed to handle 
only simple attribute keys.  Backtrace:


#0  strlen () at ../sysdeps/x86_64/strlen.S:106
#1  0x004add8a in hashname (fcinfo=0x7ffdbdb9c760) at hashfunc.c:145
#2  0x0099cc08 in DirectFunctionCall1Coll (func=0x4add66 
, collation=0, arg1=0) at fmgr.c:1027
#3  0x009724dd in CatalogCacheComputeHashValue (cache=0x26590b0, 
nkeys=2, cur_skey=0x7ffdbdb9cbf0) at catcache.c:207
#4  0x00974979 in SearchCatCache (cache=0x26590b0, v1=32807, 
v2=0, v3=0, v4=0) at catcache.c:1151
#5  0x00988e35 in SearchSysCache (cacheId=6, key1=32807, key2=0, 
key3=0, key4=0) at syscache.c:1006
#6  0x00988fe3 in SearchSysCacheAttName (relid=32807, 
attname=0x0) at syscache.c:1106
#7  0x0098a744 in get_attnum (relid=32807, attname=0x0) at 
lsyscache.c:825
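
The bottom frame shows get_attnum() being handed attname = 0x0, which
strlen() then dereferences.  A minimal self-contained sketch (illustrative
names, not the real catalog API) of the defensive shape such a lookup
needs once a key column can be an expression, and therefore has no
attribute name:

```c
#include <assert.h>
#include <string.h>

/* Toy stand-in for a name-based attribute lookup: an expression partition
 * key has no attribute name, so the lookup must tolerate NULL instead of
 * letting the hash/strlen path crash on it. */
static const char *rel_attnames[] = {"a", "b"};

static int
get_attnum_checked(const char *attname)
{
    if (attname == NULL)
        return 0;               /* InvalidAttrNumber-style sentinel */
    for (int i = 0; i < 2; i++)
        if (strcmp(rel_attnames[i], attname) == 0)
            return i + 1;       /* attribute numbers are 1-based */
    return 0;
}
```

The real fix is of course to make get_check_expr_from_partbound() handle
expression keys rather than to paper over the NULL, but the guard shows
where the crash originates.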

Re: [HACKERS] Re: [COMMITTERS] pgsql: Avoid extra locks in GetSnapshotData if old_snapshot_threshold

2016-04-20 Thread Andres Freund
On 2016-04-19 20:27:31 +0530, Amit Kapila wrote:
> On Sun, Apr 17, 2016 at 2:26 AM, Andres Freund  wrote:
> >
> > On 2016-04-16 16:44:52 -0400, Noah Misch wrote:
> > > That is more controversial than the potential ~2% regression for
> > > old_snapshot_threshold=-1.  Alvaro[2] and Robert[3] are okay releasing
> > > that way, and Andres[4] is not.
> >
> > FWIW, I could be kinda convinced that it's temporarily ok, if there'd be
> > a clear proposal on the table how to solve the scalability issue around
> > MaintainOldSnapshotTimeMapping().
> >
> 
> It seems that for read-only workloads, MaintainOldSnapshotTimeMapping()
> takes EXCLUSIVE LWLock which seems to be a probable reason for a
> performance regression.

Yes, that's the major problem.


> Now, here the question is do we need to acquire that lock if xmin is
> not changed since the last time value of
> oldSnapshotControl->latest_xmin is updated or xmin is lesser than
> equal to oldSnapshotControl->latest_xmin?  If we don't need it for
> above cases, I think it can address the performance regression to a
> good degree for read-only workloads when the feature is enabled.

I think the more fundamental issue is that the time->xid mapping is
built at GetSnapshotData() time (via MaintainOldSnapshotTimeMapping()),
and not when xids are assigned. Snapshots are created a lot more
frequently in nearly all use-cases than xids are assigned.  That's what
forces the exclusive lock to be in the read path, rather than the write
path.

What's the reason for this?

Greetings,

Andres Freund
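
Amit's suggestion upthread — consult oldSnapshotControl->latest_xmin first
and take the exclusive lock only when the caller's xmin would actually
advance it — can be modeled in a few lines.  This is a single-threaded
sketch of the locking discipline only (illustrative names, not the
PostgreSQL code); the counters stand in for lock acquisitions:

```c
#include <assert.h>

typedef unsigned int TransactionId;

/* Sketch: skip the exclusive lock when the mapping cannot change. */
static TransactionId latest_xmin = 0;
static int shared_acquires = 0;     /* cheap read-side lock */
static int exclusive_acquires = 0;  /* expensive write-side lock */

static void
maintain_mapping(TransactionId xmin)
{
    shared_acquires++;              /* read latest_xmin under shared lock */
    if (xmin <= latest_xmin)
        return;                     /* map unchanged: no exclusive lock */

    exclusive_acquires++;           /* upgrade only when xmin advances */
    if (xmin > latest_xmin)         /* re-check after the (modeled) upgrade */
        latest_xmin = xmin;
}
```

In a read-mostly workload xmin rarely advances, so almost every
GetSnapshotData() call would take only the shared path.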


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] snapshot too old, configured by time

2016-04-20 Thread Kevin Grittner
On Tue, Apr 19, 2016 at 12:49 PM, Alvaro Herrera
 wrote:

> Well, it seems I'm outvoted.

The no-op change to attempt to force an explicit choice of whether
to test for snapshot age after calling BufferGetPage() has been
reverted.  This eliminates about 500 back-patching pain points in
65 files.

In case anyone notices some code left at the bottom of bufmgr.h
related to inline functions, that was left on purpose, because I am
pretty sure that the fix for the performance regression observed
when the "snapshot too old" feature is disabled will involve making
at least part of TestForOldSnapshot() an inline function -- so it
seemed dumb to rip that out now only to put it back again right
away.
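
As a rough illustration of why inlining matters here: with the feature
disabled (old_snapshot_threshold = -1) the check should collapse to a
single inlined comparison with no function call.  This is a hedged sketch
with made-up names, not the actual bufmgr.h code:

```c
#include <assert.h>

/* Sketch: the common disabled case inlines down to one comparison and
 * never reaches the expensive out-of-line path. */
static int old_snapshot_threshold = -1;  /* GUC stand-in; -1 = disabled */
static int slow_path_calls = 0;

static void
test_for_old_snapshot_slow(void)
{
    /* the real snapshot-age checking would live here */
    slow_path_calls++;
}

static inline void
test_for_old_snapshot(void)
{
    if (old_snapshot_threshold >= 0)  /* cheap inlined early exit */
        test_for_old_snapshot_slow();
}
```
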

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Proposal: Remove regress-python3-mangle.mk

2016-04-20 Thread Tom Lane
Yury Zhuravlev  writes:
> Noah Misch wrote:
>> I may not understand that second sentence.  We have multiple buildfarm 
>> members
>> verifying the python2 case and multiple members verifying the python3 case.

> I wrote about python3-specific features that are not tested today.  We test 
> Python3 as a subset of Python2. 

It's not our job to test Python features, especially not version-dependent
features.

> If we do not want to have two different code bases, then it is necessary to 
> use something like this: https://pythonhosted.org/six/
> The license allows us to copy this code into postgres.
> Or use the standard Python tool 2to3: 
> https://docs.python.org/2/library/2to3.html

You have not explained why we need to change anything.  The current
approach may be a bit of a kluge, but it works fine for our purposes,
AFAICS.  Also, I would bet that those tools would not know anything
about converting PL language names between 'plpythonu' and 'plpython2u'
and 'plpython3u'; so even if we used one of them, we would still need a
layer pretty similar to what we have.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] pg_stat_activity crashes

2016-04-20 Thread Petr Jelinek

Hi,

I noticed sporadic segfaults when selecting from pg_stat_activity on 
current HEAD.


The culprit is the 53be0b1add7064ca5db3cd884302dfc3268d884e commit which 
added more wait info into the pg_stat_get_activity(). More specifically, 
the following code is broken:


+   proc = BackendPidGetProc(beentry->st_procpid);
+   wait_event_type = 
pgstat_get_wait_event_type(proc->wait_event_info);


This needs to check if proc is NULL. When reading the code I noticed 
that the new functions pg_stat_get_backend_wait_event_type() and 
pg_stat_get_backend_wait_event() suffer from the same problem.


Here is a PoC patch which fixes the problem.  I am wondering, though, 
whether pg_stat_get_backend_wait_event_type() and 
pg_stat_get_backend_wait_event() should raise a warning when proc is 
NULL, like pg_signal_backend() does, instead of just returning NULL, 
which is what this patch does.


--
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services
diff --git a/src/backend/utils/adt/pgstatfuncs.c b/src/backend/utils/adt/pgstatfuncs.c
index 64c4cc4..355e58c 100644
--- a/src/backend/utils/adt/pgstatfuncs.c
+++ b/src/backend/utils/adt/pgstatfuncs.c
@@ -783,13 +783,23 @@ pg_stat_get_activity(PG_FUNCTION_ARGS)
 			values[5] = CStringGetTextDatum(beentry->st_activity);
 
 			proc = BackendPidGetProc(beentry->st_procpid);
-			wait_event_type = pgstat_get_wait_event_type(proc->wait_event_info);
+			if (proc != NULL)
+			{
+wait_event_type = pgstat_get_wait_event_type(proc->wait_event_info);
+wait_event = pgstat_get_wait_event(proc->wait_event_info);
+
+			}
+			else
+			{
+wait_event_type = NULL;
+wait_event = NULL;
+			}
+
 			if (wait_event_type)
 values[6] = CStringGetTextDatum(wait_event_type);
 			else
 nulls[6] = true;
 
-			wait_event = pgstat_get_wait_event(proc->wait_event_info);
 			if (wait_event)
 values[7] = CStringGetTextDatum(wait_event);
 			else
@@ -990,11 +1000,8 @@ pg_stat_get_backend_wait_event_type(PG_FUNCTION_ARGS)
 		wait_event_type = "";
 	else if (!has_privs_of_role(GetUserId(), beentry->st_userid))
 		wait_event_type = "";
-	else
-	{
-		proc = BackendPidGetProc(beentry->st_procpid);
+	else if ((proc = BackendPidGetProc(beentry->st_procpid)) != NULL)
 		wait_event_type = pgstat_get_wait_event_type(proc->wait_event_info);
-	}
 
 	if (!wait_event_type)
 		PG_RETURN_NULL();
@@ -1014,11 +1021,8 @@ pg_stat_get_backend_wait_event(PG_FUNCTION_ARGS)
 		wait_event = "";
 	else if (!has_privs_of_role(GetUserId(), beentry->st_userid))
 		wait_event = "";
-	else
-	{
-		proc = BackendPidGetProc(beentry->st_procpid);
+	else if ((proc = BackendPidGetProc(beentry->st_procpid)) != NULL)
 		wait_event = pgstat_get_wait_event(proc->wait_event_info);
-	}
 
 	if (!wait_event)
 		PG_RETURN_NULL();

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Proposal: Remove regress-python3-mangle.mk

2016-04-20 Thread Yury Zhuravlev

Noah Misch wrote:

Considering we have 2756 lines of Python test SQL and just thirteen lines of
sed to mangle them, the current method is scaling nicely.
What pitfalls does this approach hide?  Converting python2 code to python3 
is really bad practice. 


I may not understand that second sentence.  We have multiple buildfarm members
verifying the python2 case and multiple members verifying the python3 case.


I wrote about python3-specific features that are not tested today.  We test 
Python3 as a subset of Python2. 


The PL/Python implementation does not view them as different languages;

But they really are different languages.

If we do not want to have two different code bases, then it is necessary to 
use something like this: https://pythonhosted.org/six/


The license allows us to copy this code into postgres.
Or use the standard Python tool 2to3: 
https://docs.python.org/2/library/2to3.html


Thanks. 
--

Yury Zhuravlev
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] FATAL: could not send end-of-streaming message to primary: no COPY in progress

2016-04-20 Thread Kyotaro HORIGUCHI
At Wed, 20 Apr 2016 16:16:40 +0900, Fujii Masao  wrote 
in 
> On Thu, Mar 31, 2016 at 9:15 AM, Thomas Munro
>  wrote:
> > Hi hackers,
> >
> > If you shut down a primary server, a standby that is streaming from it 
> > says:
> >
> > LOG:  replication terminated by primary server
> > DETAIL:  End of WAL reached on timeline 1 at 0/14F4B68.
> > FATAL:  could not send end-of-streaming message to primary: no COPY in 
> > progress
> >
> > Isn't that FATAL ereport a bug?
> 
> ISTM that the cause is that walsender exits and replication connection is
> closed just after "COPY 0" is sent. That is, then after receiving "COPY 0",
> walreceiver tries to send an end-of-copy message to the primary, but fails
> because the connection has been already closed.

Though the message is followed by repetitions of other FATAL
messages, the message above itself seems a bit alarming.

> > How is clean server shutdown supposed to work?
> 
> One option is to make walsender wait for end-of-copy message from walreceiver
> before it closes the connection and exits, after sending "COPY 0" message.
> But one question is; how should walsender behave when walreceiver gets stuck
> and cannot reply an end-of-copy message to walsender? Probably we need
> the timeout (maybe we can use wal_sender_timeout here but not sure yet
> if it's appropriate or not).

-1. It is totally useless other than to avoid the FATAL message.

> Another option is to prevent walreceiver from sending an end-of-copy message.
> If "COPY 0" always means the exit of walsender and the termination of
> the connection, there seems to be no need to send back an end-of-copy message.
> I've not checked yet how this interferes with other replication logics, 
> though.

Looking into walsender.c, walsender treats "COPY 0" as a signal that its
own death comes just after, that is, proc_exit(0).

On the other hand the comment at the beginning of walreceiver.c
says that,

 * If the primary server ends streaming, but doesn't disconnect, walreceiver
 * goes into "waiting" mode, and waits for the startup process to give new
 * instructions. The startup process will treat that the same as
 * disconnection, and will rescan the archive/pg_xlog directory. But when the
 * startup process wants to try streaming replication again, it will just
 * nudge the existing walreceiver process that's waiting, instead of launching
 * a new one.

If we assume this is a useful behavior and want to keep it, a
termination after the end of XLOG streaming is just the same as
that for psql:

| FATAL:  terminating connection due to administrator command
| server closed the connection unexpectedly
| This probably means the server terminated abnormally
| before or while processing the request.

Or, we should provide another command to signal the termination.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: BUG #13685: Archiving while idle every archive_timeout with wal_level hot_standby

2016-04-20 Thread Michael Paquier
On Mon, Apr 4, 2016 at 3:22 PM, Michael Paquier
 wrote:
> The approach introducing the concept of WAL progress addresses the
> problem of unnecessary checkpoints and of unnecessary standby
> snapshots. If we take the problem only from the angle of RUNNING_XACTS
> the current logic used to check if a checkpoint should be skipped or
> not is not addressed.

This discussion has been stalling for a while regarding the main
patches... Attached are rebased versions based on the approach with
XLogInsertExtended(). And I have added as well a couple of days ago an
entry on the 9.6 open item page, not as a 9.6 open item, but as an
older bug to keep at least track of this problem.
-- 
Michael
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index d60c123..5ae6f49 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -8357,8 +8357,12 @@ CreateCheckPoint(int flags)
 	if ((flags & (CHECKPOINT_IS_SHUTDOWN | CHECKPOINT_END_OF_RECOVERY |
   CHECKPOINT_FORCE)) == 0)
 	{
+		elog(LOG, "Not a forced or shutdown checkpoint: progress_lsn %X/%X, ckpt %X/%X",
+			 (uint32) (progress_lsn >> 32), (uint32) progress_lsn,
+			 (uint32) (ControlFile->checkPoint >> 32), (uint32) ControlFile->checkPoint);
 		if (progress_lsn == ControlFile->checkPoint)
 		{
+			elog(LOG, "Checkpoint is skipped");
 			WALInsertLockRelease();
 			LWLockRelease(CheckpointLock);
 			END_CRIT_SECTION();
@@ -8525,7 +8529,11 @@ CreateCheckPoint(int flags)
 	 * recovery we don't need to write running xact data.
 	 */
 	if (!shutdown && XLogStandbyInfoActive())
-		LogStandbySnapshot();
+	{
+		XLogRecPtr lsn = LogStandbySnapshot();
+		elog(LOG, "snapshot taken by checkpoint %X/%X",
+			 (uint32) (lsn >> 32), (uint32) lsn);
+	}
 
 	START_CRIT_SECTION();
 
diff --git a/src/backend/postmaster/bgwriter.c b/src/backend/postmaster/bgwriter.c
index 79cfd7b..082e589 100644
--- a/src/backend/postmaster/bgwriter.c
+++ b/src/backend/postmaster/bgwriter.c
@@ -333,7 +333,9 @@ BackgroundWriterMain(void)
 GetLastCheckpointRecPtr() < current_progress_lsn &&
 last_progress_lsn < current_progress_lsn)
 			{
-(void) LogStandbySnapshot();
+XLogRecPtr lsn = LogStandbySnapshot();
+elog(LOG, "snapshot taken by bgwriter %X/%X",
+	 (uint32) (lsn >> 32), (uint32) lsn);
 last_snapshot_ts = now;
 last_progress_lsn = current_progress_lsn;
 			}
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index f9644db..d60c123 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -439,11 +439,30 @@ typedef struct XLogwrtResult
  * the WAL record is just copied to the page and the lock is released. But
  * to avoid the deadlock-scenario explained above, the indicator is always
  * updated before sleeping while holding an insertion lock.
+ *
+ * The progressAt values indicate the insertion progress used to determine
+ * WAL insertion activity since a previous checkpoint, which is aimed at
+ * finding out if a checkpoint should be skipped or not or if standby
+ * activity should be logged. Progress position is basically updated
+ * for all types of records, for the time being only snapshot logging
+ * is out of this scope to properly skip their logging on idle systems.
+ * Tracking the WAL activity directly in WALInsertLock has the advantage
+ * to not rely on taking an exclusive lock on all the WAL insertion locks,
+ * hence reducing the impact of the activity lookup. This takes also
+ * advantage to avoid 8-byte torn reads on some platforms by using the
+ * fact that each insert lock is located on the same cache line.
+ * XXX: There is still room for more improvements here, particularly
+ * WAL operations related to unlogged relations (INIT_FORKNUM) should not
+ * update the progress LSN as those relations are reset during crash
+ * recovery so enforcing buffers of such relations to be flushed for
+ * example in the case of a load only on unlogged relations is a waste
+ * of disk write.
  */
 typedef struct
 {
 	LWLock		lock;
 	XLogRecPtr	insertingAt;
+	XLogRecPtr	progressAt;
 } WALInsertLock;
 
 /*
@@ -881,6 +900,9 @@ static void WALInsertLockUpdateInsertingAt(XLogRecPtr insertingAt);
  * which pages need a full-page image, and retry.  If fpw_lsn is invalid, the
  * record is always inserted.
  *
+ * 'flags' gives more in-depth control on the record being inserted. As of
+ * now, this controls if the progress LSN positions are updated.
+ *
  * The first XLogRecData in the chain must be for the record header, and its
  * data must be MAXALIGNed.  XLogInsertRecord fills in the xl_prev and
  * xl_crc fields in the header, the rest of the header must already be filled
@@ -893,7 +915,9 @@ static void WALInsertLockUpdateInsertingAt(XLogRecPtr insertingAt);
  * WAL rule "write the log before the data".)
  */
 XLogRecPtr
-XLogInsertRecord(XLogRecData *rdata, XLogRecPtr fpw_lsn)
+XLogInsertRecord(XLogRecData 

Re: [HACKERS] Description of ForeignPath

2016-04-20 Thread Amit Langote

Fujita-san,

On 2016/04/20 16:20, Etsuro Fujita wrote:
> On 2016/04/18 17:31, Amit Langote wrote:
>> Is the following description now outdated:
>>
>> "ForeignPath represents a potential scan of a foreign table"
>>
>> Considering that there now exists FdwRoutine.GetForeignJoinPaths() whose
>> product is nothing else but a ForeignPath, should it now say (patch
>> attached):
>>
>> "ForeignPath represents a potential scan of foreign table(s)"
>>
>> Or something better.
> 
> I think it'd be better to match the comment with that for
> create_foreignscan_path().  So how about "ForeignPath represents a
> potential scan of a foreign table, foreign join, or foreign upper-relation
> processing"?  I think we would probably need to update the comment in
> src/backend/optimizer/README (L358), too.

That's a lot better.  Updated patch attached.

Thanks,
Amit
diff --git a/src/backend/optimizer/README b/src/backend/optimizer/README
index 5e12459..2407172 100644
--- a/src/backend/optimizer/README
+++ b/src/backend/optimizer/README
@@ -355,7 +355,7 @@ RelOptInfo  - a relation or joined relations
   BitmapHeapPath - top of a bitmapped index scan
   TidPath   - scan by CTID
   SubqueryScanPath - scan a subquery-in-FROM
-  ForeignPath   - scan a foreign table
+  ForeignPath   - scan a foreign table, foreign join or foreign upper-relation
   CustomPath- for custom scan providers
   AppendPath- append multiple subpaths together
   MergeAppendPath - merge multiple subpaths, preserving their common sort order
diff --git a/src/include/nodes/relation.h b/src/include/nodes/relation.h
index e9dfb66..45739c3 100644
--- a/src/include/nodes/relation.h
+++ b/src/include/nodes/relation.h
@@ -1030,7 +1030,8 @@ typedef struct SubqueryScanPath
 } SubqueryScanPath;
 
 /*
- * ForeignPath represents a potential scan of a foreign table
+ * ForeignPath represents a potential scan of a foreign table, foreign join
+ * or foreign upper-relation.
  *
  * fdw_private stores FDW private data about the scan.  While fdw_private is
  * not actually touched by the core code during normal operations, it's

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Description of ForeignPath

2016-04-20 Thread Etsuro Fujita

On 2016/04/18 17:31, Amit Langote wrote:

Is the following description now outdated:

"ForeignPath represents a potential scan of a foreign table"

Considering that there now exists FdwRoutine.GetForeignJoinPaths() whose
product is nothing else but a ForeignPath, should it now say (patch attached):

"ForeignPath represents a potential scan of foreign table(s)"

Or something better.


I think it'd be better to match the comment with that for 
create_foreignscan_path().  So how about "ForeignPath represents a 
potential scan of a foreign table, foreign join, or foreign 
upper-relation processing"?  I think we would probably need to update 
the comment in src/backend/optimizer/README (L358), too.


Best regards,
Etsuro Fujita




--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Support for N synchronous standby servers - take 2

2016-04-20 Thread Kyotaro HORIGUCHI
At Wed, 20 Apr 2016 11:51:09 +0900, Masahiko Sawada  
wrote in 
> On Mon, Apr 18, 2016 at 2:15 PM, Kyotaro HORIGUCHI
>  wrote:
> > At Fri, 15 Apr 2016 17:36:57 +0900, Masahiko Sawada  
> > wrote in 
> > 
> >> How about if check_hook just parses parameter in
> >> CurrentMemoryContext(i.g., T_AllocSetContext), and then the
> >> assign_hook copies syncrep_parse_result to TopMemoryContext.
> >> Because syncrep_parse_result is a global variable, these hooks can see it.
> >
> > Hmm. Somewhat uneasy but should work. The attached patch does it.
..
> Thank you for updating the patch.
> 
> Here are some comments.
> 
> +   Assert(syncrep_parse_result == NULL);
> +
> 
> Why do we need Assert at this point?
> It's possible that syncrep_parse_result is not NULL after setting
> s_s_names by ALTER SYSTEM.

Thank you for pointing it out. It is just a trace of an
assumption no longer useful.

> +   /*
> +* this memory block will be freed as a part of the
> memory contxt for
> +* config file processing.
> +*/
> 
> s/contxt/context/

Thanks. I removed whole the comment and the corresponding code
since it's meaningless.

assign_s_s_names causes SEGV when it is called without calling
check_s_s_names first.  I think that's not the case for this variable
because it cannot be reset in the middle of a session.  It still makes me
quite uneasy, but I don't see a proper means to reset
syncrep_parse_result.  A MemoryContext deletion hook would work, but it
seems to be overkill for this single use.


regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
diff --git a/src/backend/replication/syncrep.c b/src/backend/replication/syncrep.c
index 3c9142e..bdd6de0 100644
--- a/src/backend/replication/syncrep.c
+++ b/src/backend/replication/syncrep.c
@@ -68,6 +68,7 @@
 #include "storage/proc.h"
 #include "tcop/tcopprot.h"
 #include "utils/builtins.h"
+#include "utils/memutils.h"
 #include "utils/ps_status.h"
 
 /* User-settable parameters for sync rep */
@@ -361,11 +362,6 @@ SyncRepInitConfig(void)
 {
 	int			priority;
 
-	/* Update the config data of synchronous replication */
-	SyncRepFreeConfig(SyncRepConfig);
-	SyncRepConfig = NULL;
-	SyncRepUpdateConfig();
-
 	/*
 	 * Determine if we are a potential sync standby and remember the result
 	 * for handling replies from standby.
@@ -868,47 +864,50 @@ SyncRepUpdateSyncStandbysDefined(void)
 }
 
 /*
- * Parse synchronous_standby_names and update the config data
- * of synchronous standbys.
+ * Free a previously-allocated config data of synchronous replication.
  */
 void
-SyncRepUpdateConfig(void)
+SyncRepFreeConfig(SyncRepConfigData *config)
 {
-	int	parse_rc;
-
-	if (!SyncStandbysDefined())
+	if (!config)
 		return;
 
-	/*
-	 * check_synchronous_standby_names() verifies the setting value of
-	 * synchronous_standby_names before this function is called. So
-	 * syncrep_yyparse() must not cause an error here.
-	 */
-	syncrep_scanner_init(SyncRepStandbyNames);
-	parse_rc = syncrep_yyparse();
-	syncrep_scanner_finish();
-
-	if (parse_rc != 0)
-		ereport(ERROR,
-(errcode(ERRCODE_SYNTAX_ERROR),
- errmsg_internal("synchronous_standby_names parser returned %d",
- parse_rc)));
-
-	SyncRepConfig = syncrep_parse_result;
-	syncrep_parse_result = NULL;
+	list_free_deep(config->members);
+	pfree(config);
 }
 
 /*
- * Free a previously-allocated config data of synchronous replication.
+ * Returns a copy of a replication config data into the TopMemoryContext.
  */
-void
-SyncRepFreeConfig(SyncRepConfigData *config)
+SyncRepConfigData *
+SyncRepCopyConfig(SyncRepConfigData *oldconfig)
 {
-	if (!config)
-		return;
+	MemoryContext		oldcxt;
+	SyncRepConfigData  *newconfig;
+	ListCell		   *lc;
 
-	list_free_deep(config->members);
-	pfree(config);
+	if (!oldconfig)
+		return NULL;
+
+	oldcxt = MemoryContextSwitchTo(TopMemoryContext);
+
+	newconfig = (SyncRepConfigData *) palloc(sizeof(SyncRepConfigData));
+	newconfig->num_sync = oldconfig->num_sync;
+	newconfig->members = list_copy(oldconfig->members);
+
+	/*
+	 * The new members list is a combination of list cells on the new context
+	 * and data pointed from each cell on the old context. So we explicitly
+	 * copy the data.
+	 */
+	foreach (lc, newconfig->members)
+	{
+		lfirst(lc) = pstrdup((char *) lfirst(lc));
+	}
+
+	MemoryContextSwitchTo(oldcxt);
+
+	return newconfig;
 }
 
 #ifdef USE_ASSERT_CHECKING
@@ -952,13 +951,30 @@ SyncRepQueueIsOrderedByLSN(int mode)
  * ===
  */
 
+/*
+ * check_synchronous_standby_names and assign_synchronous_standby_names are to
+ * be used from guc.c. The former generates a result pointed by
+ * syncrep_parse_result in the current memory context as the side effect and
+ * the latter reads it. This won't 

Re: [HACKERS] FATAL: could not send end-of-streaming message to primary: no COPY in progress

2016-04-20 Thread Fujii Masao
On Thu, Mar 31, 2016 at 9:15 AM, Thomas Munro
 wrote:
> Hi hackers,
>
> If you shut down a primary server, a standby that is streaming from it says:
>
> LOG:  replication terminated by primary server
> DETAIL:  End of WAL reached on timeline 1 at 0/14F4B68.
> FATAL:  could not send end-of-streaming message to primary: no COPY in 
> progress
>
> Isn't that FATAL ereport a bug?

ISTM that the cause is that walsender exits and the replication connection is
closed just after "COPY 0" is sent. That is, after receiving "COPY 0",
walreceiver tries to send an end-of-copy message to the primary, but fails
because the connection has already been closed.

> How is clean server shutdown supposed to work?

One option is to make the walsender, after sending the "COPY 0" message, wait
for an end-of-copy message from the walreceiver before it closes the connection
and exits. But one question is: how should the walsender behave when the
walreceiver gets stuck and cannot send an end-of-copy message back? We probably
need a timeout here (maybe wal_sender_timeout can be reused, but I'm not yet
sure whether that's appropriate).

Another option is to prevent the walreceiver from sending an end-of-copy
message at all. If "COPY 0" always means that the walsender is exiting and the
connection is terminating, there seems to be no need to send back an
end-of-copy message. I've not yet checked how this would interfere with the
rest of the replication logic, though.

Regards,

-- 
Fujii Masao




Re: [HACKERS] brin_summarize_new_values error checking

2016-04-20 Thread Fujii Masao
On Wed, Jan 27, 2016 at 11:56 AM, Fujii Masao  wrote:
> On Mon, Jan 25, 2016 at 4:03 PM, Jeff Janes  wrote:
>> In reviewing one of my patches[1], Fujii-san has pointed out that I
>> didn't include checks for being in recovery, or for working on another
>> backend's temporary index.
>>
>> I think that brin_summarize_new_values in 9.5.0 commits those same
>> sins. In its case, I don't think those are critical, as they just
>> result in less specific error messages than one might hope for,
>> rather than something worse like a panic.
>>
>> But still, we might want to address them.
>
> Agreed to add those checks.

Attached patch does this.

Regards,

-- 
Fujii Masao
*** a/src/backend/access/brin/brin.c
--- b/src/backend/access/brin/brin.c
***
*** 795,800  brin_summarize_new_values(PG_FUNCTION_ARGS)
--- 795,806 
  	Relation	heapRel;
  	double		numSummarized = 0;
  
+ 	if (RecoveryInProgress())
+ 		ereport(ERROR,
+ 				(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ 				 errmsg("recovery is in progress"),
+ 				 errhint("BRIN page ranges cannot be summarized during recovery.")));
+ 
  	/*
  	 * We must lock table before index to avoid deadlocks.  However, if the
  	 * passed indexoid isn't an index then IndexGetRelation() will fail.
***
*** 817,822  brin_summarize_new_values(PG_FUNCTION_ARGS)
--- 823,838 
  				 errmsg("\"%s\" is not a BRIN index",
  						RelationGetRelationName(indexRel))));
  
+ 	/*
+ 	 * Reject attempts to read non-local temporary relations; we would be
+ 	 * likely to get wrong data since we have no visibility into the owning
+ 	 * session's local buffers.
+ 	 */
+ 	if (RELATION_IS_OTHER_TEMP(indexRel))
+ 		ereport(ERROR,
+ 				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ 				 errmsg("cannot access temporary indexes of other sessions")));
+ 
  	/* User must own the index (comparable to privileges needed for VACUUM) */
  	if (!pg_class_ownercheck(indexoid, GetUserId()))
  		aclcheck_error(ACLCHECK_NOT_OWNER, ACL_KIND_CLASS,
