Re: [HACKERS] [BUG] pg_basebackup from disconnected standby fails
Sorry, I was confused about minRecoveryPoint. I have reconsidered a bit.

At Tue, 14 Jun 2016 20:31:11 +0900 (Tokyo Standard Time), Kyotaro HORIGUCHI wrote in <20160614.203111.229211034.horiguchi.kyot...@lab.ntt.co.jp>
> > > After looking more closely, I found that the minRecoveryPoint
> > > tends to be too small as the backup end point, and up to the
> > > record at the lastReplayedRecPtr can affect the pages on disk and
> > > they can go into the backup just taken.
> > >
> > > My conclusion here is that do_pg_stop_backup should return
> > > lastReplayedRecPtr, not minRecoveryPoint.
> >
> > I have been thinking quite a bit about this patch, and this logic
> > sounds quite right to me. When stopping the backup we need to let the
> > user know up to which point it needs to replay WAL, and relation pages
> > are touched up to lastReplayedEndRecPtr.
>
> Yes, but by design, the changes in buffers don't go to disk
> until buffer replacement occurs, which updates minRecoveryPoint. My
> understanding is that the problem is that a restartpoint that is
> not accompanied by buffer updates advances only the redo point
> of the last checkpoint and doesn't update minRecoveryPoint, which
> may be behind the redo point at that time.
>
> It seems to me that we could more aggressively advance the
> minRecoveryPoint (but must not let it go too far..), but it is
> right for it to aim a bit smaller than the ideal location.

That was wrong. minRecoveryPoint should be greater than or equal to the
maximum buffer-touching LSN reached in previous recoveries; it can be
equal to replayEndRecPtr, but may lag behind it if no actual
modification of page files happens afterwards. xlog.c works that way.
A minRecoveryPoint smaller than the redo point of the last checkpoint
with no buffer flush is allowable from this point of view, but it is
not proper as the end point of a backup.
If we skip recording the last checkpoint position when it eventually
causes no buffer flush, minRecoveryPoint is again usable for the
purpose. However, that causes repeated restartpoint attempts on the
same (skipped) checkpoint record. As a consequence, we can also solve
this problem by explicitly updating minRecoveryPoint for an executed
restartpoint that entailed no buffer flush. The attached patch takes
this approach and also solves the problem. Which one do you think is
preferable? Or any other solution?

This patch updates minRecoveryPoint only for restartpoints that caused
no buffer flush, but such a restriction might not be necessary.

regards,

--
Kyotaro Horiguchi
NTT Open Source Software Center

>From 5927a196295eea63424011c15d7359ed8141812c Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi
Date: Wed, 15 Jun 2016 12:00:33 +0900
Subject: [PATCH] Advance minRecoveryPoint for an executed empty restartpoint.

Currently, a restartpoint with no buffer flush doesn't update
minRecoveryPoint but does update lastCheckPoint. This can cause
do_pg_stop_backup() to return a stop LSN smaller than the start LSN
given by do_pg_start_backup(), which leads to failure in taking a base
backup. This patch lets CreateRestartPoint update minRecoveryPoint if
an executed restartpoint is accompanied by no buffer flush.
---
 src/backend/access/transam/xlog.c     | 9 +
 src/test/recovery/t/001_stream_rep.pl | 5 +
 2 files changed, 14 insertions(+)

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index e4645a3..7697223 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -8797,6 +8797,15 @@ CreateRestartPoint(int flags)
 	CheckPointGuts(lastCheckPoint.redo, flags);
 
 	/*
+	 * basebackup needs minRecoveryPoint to be greater than or equal to the
+	 * redo point of this checkpoint. If no buffer is flushed so far,
+	 * minRecoveryPoint has not advanced during this checkpoint. So explicitly
+	 * advance it to there for the case.
+	 */
+	if (CheckpointStats.ckpt_bufs_written == 0)
+		UpdateMinRecoveryPoint(lastCheckPointRecPtr, false);
+
+	/*
 	 * Remember the prior checkpoint's redo pointer, used later to determine
 	 * the point at which we can truncate the log.
 	 */
diff --git a/src/test/recovery/t/001_stream_rep.pl b/src/test/recovery/t/001_stream_rep.pl
index 7b42f21..cee4768 100644
--- a/src/test/recovery/t/001_stream_rep.pl
+++ b/src/test/recovery/t/001_stream_rep.pl
@@ -24,6 +24,11 @@ $node_standby_1->start;
 # pg_basebackup works on a standby).
 $node_standby_1->backup($backup_name);
 
+# Take a second backup of the standby while the master is offline.
+$node_master->stop;
+$node_standby_1->backup('my_backup_2');
+$node_master->start;
+
 # Create second standby node linking to standby 1
 my $node_standby_2 = get_new_node('standby_2');
 $node_standby_2->init_from_backup($node_standby_1, $backup_name,
-- 
1.8.3.1

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] increase message string buffer size of watch command of psql
Ioseph Kim writes:
> On 2016-06-15 01:56, Tom Lane wrote:
>> I take it from the vast silence that nobody particularly cares one way
>> or the other. On reflection I think that this would be a good change
>> to make, so I'll go do so unless I hear complaints soon. regards, tom
>> lane

> I propose changing from asctime() to the SQL current_timestamp value;
> then users can change the date format with the SET command and DateStyle.

That would require an additional SQL query each time through the loop,
which seems like undue expense. It's also not terribly friendly to the
goal of localization, I should think, given the limited number of
datestyle options and the fact that none of them actually change day or
month names to non-English choices. And it would imply changing the
timestamps from psql's timezone to the backend's. While that might have
been a good way to do it in a green field, it's not the way \watch has
worked in the past, and given the lack of complaints I'm disinclined to
change that.

			regards, tom lane
Re: [HACKERS] increase message string buffer size of watch command of psql
On 2016-06-15 01:56, Tom Lane wrote:
> I take it from the vast silence that nobody particularly cares one way
> or the other. On reflection I think that this would be a good change
> to make, so I'll go do so unless I hear complaints soon. regards, tom
> lane

I propose changing from asctime() to the SQL current_timestamp value;
then users can change the date format with the SET command and DateStyle.

Regards, Ioseph.
Re: [HACKERS] 10.0
On 06/14/2016 05:08 PM, Cat wrote:
> We have the capability to provide (semi-)structured data. Might be an
> idea to make greater use of it.

postgres=# SELECT * from to_json(row(current_setting('server_version_num'))) as version;

Sincerely,

jD

--
Command Prompt, Inc.  http://the.postgres.company/  +1-503-667-4564
PostgreSQL Centered full stack support, consulting and development.
Everyone appreciates your honesty, until you are honest with them.
Re: [HACKERS] 10.0
On 15/06/16 02:08, Cat wrote:
> is it possible to introduce a JSONB output to it.

No thanks.

--
Vik Fearing  +33 6 46 75 15 36
http://2ndQuadrant.fr  PostgreSQL : Expertise, Formation et Support
Re: [HACKERS] [sqlsmith] Failed assertion in postgres_fdw/deparse.c:1116
On 2016/06/15 0:50, Robert Haas wrote:
> On Tue, Jun 14, 2016 at 4:06 AM, Amit Langote wrote:
>> You're right. It indeed should be possible to push down the ft1-ft2 join.
>> However it could not be done without also modifying
>> build_tlist_to_deparse() a little (as Ashutosh proposed [1] to do
>> upthread). Attached new version of the patch with the following changes:
>>
>> 1) Modified the check in foreign_join_ok() so that it is not overly
>> restrictive. Previously, it used ph_needed as the highest level at which
>> the PHV is needed (by definition) and checked using it whether it is above
>> the join under consideration, which ended up being overkill. ISTM, we
>> can just decide from joinrel->relids of the join under consideration
>> whether we are above the lowest level where the PHV could be evaluated
>> (ph_eval_at) and stop if so. So in the example you provided, it won't now
>> reject {ft1, ft2}, but will {ft4, ft1, ft2}.
>>
>> 2) Modified build_tlist_to_deparse() to get rid of the PlaceHolderVars in
>> the targetlist to deparse by using the PVC_RECURSE_PLACEHOLDERS flag of
>> pull_var_clause(). That is because we do not yet support anything other
>> than Vars in deparseExplicitTargetList() (+1 to your patch to change the
>> Assert to elog).
>
> OK, I committed this version with some cosmetic changes. I ripped out
> the header comment of foreign_join_ok(), which is just a nuisance to
> maintain. It unnecessarily recapitulates the tests contained within
> the function, requiring us to update the comments in two places
> instead of one every time we make a change, for no real gain. And I
> rewrote the comment you added.

Thanks.

Regards,
Amit
Re: [HACKERS] 10.0
On Tue, Jun 14, 2016 at 01:38:44PM -0700, Joshua D. Drake wrote:
> On 06/14/2016 12:46 PM, Jim Nasby wrote:
>
> > Any ideas on naming for such a function? version_detail()? I suppose
> > while we're at this we might as well provide the compile details as well.
>
> version(detail) or version(verbose)

If we're looking at forward-only changes, is it possible to introduce a
JSONB output for it? Then people can rip out whichever component they
want at will. For example:

  { "full": 10.0, "major": 10, "patchlevel": 0 }

and whatever else may be pertinent. I used numeric types above, but they
can be strings if that works better.

We have the capability to provide (semi-)structured data. Might be an
idea to make greater use of it.

--
"A search of his car uncovered pornography, a homemade sex aid, women's
stockings and a Jack Russell terrier."
- http://www.dailytelegraph.com.au/news/wacky/indeed/story-e6frev20-118083480
Re: [HACKERS] Reviewing freeze map code
On Wed, Jun 15, 2016 at 12:44 AM, Robert Haas wrote:
> On Tue, Jun 14, 2016 at 8:11 AM, Robert Haas wrote:
>>>> I noticed that the tuples that it reported were always offset 1 in a
>>>> page, and that the page always had a maxoff over a couple of hundred,
>>>> and that we called record_corrupt_item because VM_ALL_VISIBLE returned
>>>> true but HeapTupleSatisfiesVacuum on the first tuple returned
>>>> HEAPTUPLE_DELETE_IN_PROGRESS instead of the expected HEAPTUPLE_LIVE.
>>>> It did that because HEAP_XMAX_COMMITTED was not set and
>>>> TransactionIdIsInProgress returned true for xmax.
>>>
>>> So this seems like it might be a visibility map bug rather than a bug
>>> in the test code, but I'm not completely sure of that. How was it
>>> legitimate to mark the page as all-visible if a tuple on the page
>>> still had a live xmax? If xmax is live and not just a locker then the
>>> tuple is not visible to the transaction that wrote xmax, at least.
>>
>> Ah, wait a minute. I see how this could happen. Hang on, let me
>> update the pg_visibility patch.
>
> The problem should be fixed in the attached revision of
> pg_check_visible. I think what happened is:
>
> 1. pg_check_visible computed an OldestXmin.
> 2. Some transaction committed.
> 3. VACUUM computed a newer OldestXmin and marked a page all-visible with it.
> 4. pg_check_visible then used its older OldestXmin to check the
> visibility of tuples on that page, and saw delete-in-progress as a
> result.
>
> I added a guard against a similar scenario involving xmin in the last
> version of this patch, but forgot that we need to protect xmax in the
> same way. With this version of the patch, I can no longer get any
> TIDs to pop up out of pg_check_visible in my testing. (I haven't run
> your test script for lack of the proper Python environment...)

I can still reproduce the problem with this new patch. What I see is
that the OldestXmin, the new RecomputedOldestXmin and the tuple's xmax
are all the same.
--
Thomas Munro
http://www.enterprisedb.com
Re: [HACKERS] 10.0
On Tue, Jun 14, 2016 at 5:58 PM, Merlin Moncure wrote:
> On Tue, Jun 14, 2016 at 4:13 PM, Jim Nasby wrote:
>> On 6/14/16 3:56 PM, Tom Lane wrote:
>>> Jim Nasby writes:
>>>> On 6/14/16 3:01 PM, Robert Haas wrote:
>>>>> This seems kind of silly, because anybody who is writing code that
>>>>> might have to run against an existing version of the database won't
>>>>> be able to use it. The one thing that absolutely has to be
>>>>> cross-version is the method of determining which version you're
>>>>> running against.
>>>
>>>> We're talking about a function that doesn't currently exist anyway.
>>>
>>> Huh? We're talking about PQserverVersion(), comparisons to
>>> PG_VERSION_NUM, and related APIs. Those most certainly exist now, and
>>> trying to supplant them seems like a giant waste of time.
>>>
>>> On the other hand, parsing fields out of version() mechanically has
>>> been deprecated for as long as those other APIs have existed (which is
>>> since 8.0 or so). version() is only meant for human consumption, so I
>>> see no reason it shouldn't just start returning "10.0", "10.1", etc.
>>> If that breaks anyone's code, well, they should have switched to one
>>> of the easier methods years ago.
>>
>> The original post was:
>>
>>> IF substring(version() FROM $q$([0-9]+\.[0-9]+)$q$)::NUMERIC >= 9.3
>>
>> and \df *version* on my HEAD doesn't show any other options.
>
> Right. It's the only way to handle things well at the SQL level; that,
> and pg_settings approaches. In other words, there is no easier way. I
> think it's pretty reasonable to assume there's a lot more code
> interfacing with the database from SQL than from C.

We could stand to be more explicit here. The docs for version() indicate
that server_version_num should be used for "machine processing".
The implied correct way to access the canonical server version is thus:

SELECT current_setting('server_version_num');

I'd say we should at least provide the above as an example; the reader
can find their way to Chapter 18.1 if they are curious about
alternatives.

On the topic of option "D", I'd be fine with functions: and ; but that
is independent of this discussion.

Option E: Give the DBA control. If they know they have one or more
misbehaving applications, but it is out of their control to patch the
code to work properly, they can change this supposedly human-readable
output to conform to the historical x.y.z format. This would be
disabled by default. Humans can easily interpret both versions, so
please save the comment about not having GUCs that influence
user-visible behavior. If your argument for changing the format
outright is "it's for human consumption only", then apparently this
output should not be considered important enough to adhere to that
rule. Non-humans depending on its format are subject to our, or the
DBA's, whims.

David J.
Re: [HACKERS] 10.0
On Tue, Jun 14, 2016 at 4:13 PM, Jim Nasby wrote:
> On 6/14/16 3:56 PM, Tom Lane wrote:
>> Jim Nasby writes:
>>> On 6/14/16 3:01 PM, Robert Haas wrote:
>>>> This seems kind of silly, because anybody who is writing code that
>>>> might have to run against an existing version of the database won't be
>>>> able to use it. The one thing that absolutely has to be cross-version
>>>> is the method of determining which version you're running against.
>>
>>> We're talking about a function that doesn't currently exist anyway.
>>
>> Huh? We're talking about PQserverVersion(), comparisons to
>> PG_VERSION_NUM, and related APIs. Those most certainly exist now, and
>> trying to supplant them seems like a giant waste of time.
>>
>> On the other hand, parsing fields out of version() mechanically has
>> been deprecated for as long as those other APIs have existed (which is
>> since 8.0 or so). version() is only meant for human consumption, so I
>> see no reason it shouldn't just start returning "10.0", "10.1", etc.
>> If that breaks anyone's code, well, they should have switched to one
>> of the easier methods years ago.
>
> The original post was:
>
>> IF substring(version() FROM $q$([0-9]+\.[0-9]+)$q$)::NUMERIC >= 9.3
>
> and \df *version* on my HEAD doesn't show any other options.

Right. It's the only way to handle things well at the SQL level; that,
and pg_settings approaches. In other words, there is no easier way. I
think it's pretty reasonable to assume there's a lot more code
interfacing with the database from SQL than from C.

merlin
Re: [HACKERS] 10.0
On 6/14/16 3:56 PM, Tom Lane wrote:
> Jim Nasby writes:
>> On 6/14/16 3:01 PM, Robert Haas wrote:
>>> This seems kind of silly, because anybody who is writing code that
>>> might have to run against an existing version of the database won't be
>>> able to use it. The one thing that absolutely has to be cross-version
>>> is the method of determining which version you're running against.
>
>> We're talking about a function that doesn't currently exist anyway.
>
> Huh? We're talking about PQserverVersion(), comparisons to
> PG_VERSION_NUM, and related APIs. Those most certainly exist now, and
> trying to supplant them seems like a giant waste of time.
>
> On the other hand, parsing fields out of version() mechanically has
> been deprecated for as long as those other APIs have existed (which is
> since 8.0 or so). version() is only meant for human consumption, so I
> see no reason it shouldn't just start returning "10.0", "10.1", etc.
> If that breaks anyone's code, well, they should have switched to one
> of the easier methods years ago.

The original post was:

> IF substring(version() FROM $q$([0-9]+\.[0-9]+)$q$)::NUMERIC >= 9.3

and \df *version* on my HEAD doesn't show any other options.

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)   mobile: 512-569-9461
Re: [HACKERS] Use of index for 50% column restriction
On Tue, Jun 14, 2016 at 03:24:12PM -0500, Jim Nasby wrote:
> On 6/8/16 4:36 PM, Bruce Momjian wrote:
> > Just a follow-up, but even with a randomized correlation order, it
> > seems 25% restrictivity generates a Bitmap Index Scan:
>
> AFAIK we do the bitmap heap scan in heap order, thereby eliminating the
> effect of correlation?

High correlation would still cause us to access fewer heap pages than
random data.

--
  Bruce Momjian        http://momjian.us
  EnterpriseDB         http://enterprisedb.com

+ As you are, so once was I. As I am, so you will be. +
+                     Ancient Roman grave inscription +
Re: [HACKERS] 10.0
On 6/14/16 3:38 PM, Joshua D. Drake wrote:
> On 06/14/2016 12:46 PM, Jim Nasby wrote:
>> Any ideas on naming for such a function? version_detail()? I suppose
>> while we're at this we might as well provide the compile details as well.
>
> version(detail) or version(verbose)

I don't think that makes as much sense as a different function name,
since the output is fundamentally different than version().

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)   mobile: 512-569-9461
[HACKERS] Should pg_export_snapshot() and currtid() be tagged parallel-unsafe?
Digging through the sqlsmith logging db, I noticed the following errors:

ERROR:  cannot update SecondarySnapshot during a parallel operation
ERROR:  cannot assign XIDs during a parallel operation

Queries raising the first one always contain calls to currtid() or
currtid2(). Queries raising the second one always contain a call to
pg_export_snapshot(). Patch to tag them as unsafe attached.

regards,
Andreas

diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h
index f33c3ff..6a65e77 100644
--- a/src/include/catalog/pg_proc.h
+++ b/src/include/catalog/pg_proc.h
@@ -1347,9 +1347,9 @@
 DESCR("trigger to suppress updates when new and old records match");
 DATA(insert OID = 1292 (  tideq PGNSP PGUID 12 1 0 0 0 f f f t t f i s 2 0 16 "27 27" _null_ _null_ _null_ _null_ _null_ tideq _null_ _null_ _null_ ));
-DATA(insert OID = 1293 (  currtid PGNSP PGUID 12 1 0 0 0 f f f f t f v s 2 0 27 "26 27" _null_ _null_ _null_ _null_ _null_ currtid_byreloid _null_ _null_ _null_ ));
+DATA(insert OID = 1293 (  currtid PGNSP PGUID 12 1 0 0 0 f f f f t f v u 2 0 27 "26 27" _null_ _null_ _null_ _null_ _null_ currtid_byreloid _null_ _null_ _null_ ));
 DESCR("latest tid of a tuple");
-DATA(insert OID = 1294 (  currtid2 PGNSP PGUID 12 1 0 0 0 f f f f t f v s 2 0 27 "25 27" _null_ _null_ _null_ _null_ _null_ currtid_byrelname _null_ _null_ _null_ ));
+DATA(insert OID = 1294 (  currtid2 PGNSP PGUID 12 1 0 0 0 f f f f t f v u 2 0 27 "25 27" _null_ _null_ _null_ _null_ _null_ currtid_byrelname _null_ _null_ _null_ ));
 DESCR("latest tid of a tuple");
 DATA(insert OID = 1265 (  tidne PGNSP PGUID 12 1 0 0 0 f f f t t f i s 2 0 16 "27 27" _null_ _null_ _null_ _null_ _null_ tidne _null_ _null_ _null_ ));
 DATA(insert OID = 2790 (  tidgt PGNSP PGUID 12 1 0 0 0 f f f t t f i s 2 0 16 "27 27" _null_ _null_ _null_ _null_ _null_ tidgt _null_ _null_ _null_ ));
@@ -3135,7 +3135,7 @@
 DESCR("xlog filename, given an xlog location");
 DATA(insert OID = 3165 (  pg_xlog_location_diff PGNSP PGUID 12 1 0 0 0 f f f f t f i s 2 0 1700 "3220 3220" _null_ _null_ _null_ _null_ _null_ pg_xlog_location_diff _null_ _null_ _null_ ));
 DESCR("difference in bytes, given two xlog locations");
-DATA(insert OID = 3809 (  pg_export_snapshot PGNSP PGUID 12 1 0 0 0 f f f f t f v r 0 0 25 "" _null_ _null_ _null_ _null_ _null_ pg_export_snapshot _null_ _null_ _null_ ));
+DATA(insert OID = 3809 (  pg_export_snapshot PGNSP PGUID 12 1 0 0 0 f f f f t f v u 0 0 25 "" _null_ _null_ _null_ _null_ _null_ pg_export_snapshot _null_ _null_ _null_ ));
 DESCR("export a snapshot");
 DATA(insert OID = 3810 (  pg_is_in_recovery PGNSP PGUID 12 1 0 0 0 f f f f t f v s 0 0 16 "" _null_ _null_ _null_ _null_ _null_ pg_is_in_recovery _null_ _null_ _null_ ));
Re: [HACKERS] 10.0
Jim Nasby writes:
> On 6/14/16 3:01 PM, Robert Haas wrote:
>> This seems kind of silly, because anybody who is writing code that
>> might have to run against an existing version of the database won't be
>> able to use it. The one thing that absolutely has to be cross-version
>> is the method of determining which version you're running against.

> We're talking about a function that doesn't currently exist anyway.

Huh? We're talking about PQserverVersion(), comparisons to
PG_VERSION_NUM, and related APIs. Those most certainly exist now, and
trying to supplant them seems like a giant waste of time.

On the other hand, parsing fields out of version() mechanically has been
deprecated for as long as those other APIs have existed (which is since
8.0 or so). version() is only meant for human consumption, so I see no
reason it shouldn't just start returning "10.0", "10.1", etc. If that
breaks anyone's code, well, they should have switched to one of the
easier methods years ago.

			regards, tom lane
Re: [HACKERS] 10.0
On 6/14/16 3:01 PM, Robert Haas wrote:
>> D) Add a version function to 10.0 that returns both parts separately.
>>
>> My vote is D. Parsing version() output is a wart, but coming out with a
>> split output version of that in 9.6 that still has to support 3 numbers
>> would also be a wart. We've lived with the parsing wart this long, so
>> lets just add an explicit output version to 10.0.
>>
>> Any ideas on naming for such a function? version_detail()? I suppose
>> while we're at this we might as well provide the compile details as well.
>
> This seems kind of silly, because anybody who is writing code that
> might have to run against an existing version of the database won't be
> able to use it. The one thing that absolutely has to be cross-version
> is the method of determining which version you're running against.

We're talking about a function that doesn't currently exist anyway. So
no matter what, you won't be able to use it if you're interested in
<10.0 (or <9.6 if we went with one of the other proposals). Unless
folks were thinking this is something that would be backpatched?

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)   mobile: 512-569-9461
Re: [HACKERS] Use of index for 50% column restriction
On 6/8/16 4:36 PM, Bruce Momjian wrote:
> Just a follow-up, but even with a randomized correlation order, it seems
> 25% restrictivity generates a Bitmap Index Scan:

AFAIK we do the bitmap heap scan in heap order, thereby eliminating the
effect of correlation?

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)   mobile: 512-569-9461
Re: [HACKERS] 10.0
On 06/14/2016 12:46 PM, Jim Nasby wrote:
> Any ideas on naming for such a function? version_detail()? I suppose
> while we're at this we might as well provide the compile details as well.

version(detail) or version(verbose)

JD

--
Command Prompt, Inc.  http://the.postgres.company/  +1-503-667-4564
PostgreSQL Centered full stack support, consulting and development.
Everyone appreciates your honesty, until you are honest with them.
Re: [HACKERS] gettimeofday is at the end of its usefulness?
On 6/8/16 9:56 AM, Tom Lane wrote:
> Thom Brown writes:
>> On 15 May 2014 at 19:56, Bruce Momjian wrote:
>>> On Tue, May 13, 2014 at 06:58:11PM -0400, Tom Lane wrote:
>>>> A recent question from Tim Kane prompted me to measure the overhead
>>>> costs of EXPLAIN ANALYZE, which I'd not checked in awhile. Things
>>>> are far worse than I thought. On my current server (by no means
>>>> lavish hardware: Xeon E5-2609 @2.40GHz) a simple seqscan can run at
>>>> something like 110 nsec per row:
>
>> Did this idea die, or is it still worth considering?
>
> We still have a problem, for sure. I'm not sure that there was any
> consensus on what to do about it. Using clock_gettime(CLOCK_REALTIME)
> if available would be a straightforward change that should ameliorate
> gettimeofday()'s 1-usec-precision-limit problem; but it doesn't do
> anything to fix the excessive-overhead problem. The ideas about the
> latter were all over the map, and none of them looked easy. If you're
> feeling motivated to work on this area, feel free.

Semi-related: someone (Robert, I think) recently mentioned investigating
"vectorized" executor nodes, where multiple tuples would be processed in
one shot. If we had that, presumably the EXPLAIN penalty would be a moot
point.

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)   mobile: 512-569-9461
Re: [HACKERS] Prepared statements and generic plans
On Tue, Jun 14, 2016 at 08:20:12AM -0400, ''br...@momjian.us' *EXTERN*' wrote:
> > This has caused confusion in the past, see
> > https://www.postgresql.org/message-id/flat/561E749D.4090301%40socialserve.com#561e749d.4090...@socialserve.com
> >
> > > Right. Updated patch attached.
> >
> > I am happy with the patch as it is.

Good. Patch applied. Thanks for the assistance.

--
  Bruce Momjian        http://momjian.us
  EnterpriseDB         http://enterprisedb.com

+ As you are, so once was I. As I am, so you will be. +
+                     Ancient Roman grave inscription +
Re: [HACKERS] parallel.c is not marked as test covered
On Tue, Jun 14, 2016 at 1:14 PM, Tom Lane wrote:
> Robert Haas writes:
>> On Tue, Jun 14, 2016 at 12:51 PM, Tom Lane wrote:
>>> FWIW, I follow all of your reasoning except this. If we believe that
>>> the parallel worker context line is useful, then it is a bug that
>>> plpgsql suppresses it. If we don't believe it's useful, then we should
>>> get rid of it. "Do nothing" is simply not a consistent stance here.
>
>> Well, if PL/pgSQL suppresses context and nobody's complained about
>> that up until now, fixing it doesn't seem any more urgent than any
>> other bug we've had for N releases.
>
> I have not dug into the code enough to find out exactly what's happening
> in Peter's complaint, but it seems like it would be a good idea to find
> that out before arguing along these lines. It seems entirely likely
> to me that this is somehow parallel-query-specific. Even if it isn't,
> I don't buy your argument. Adding a new case in which context is
> suppressed is a perfectly reasonable basis for thinking that an old
> bug has acquired new urgency.

OK. I find this whole thing slightly puzzling because Noah wrote this
in the test file:

-- Provoke error in worker.  The original message CONTEXT contains a worker
-- PID that must be hidden in the test output.  PL/pgSQL conveniently
-- substitutes its own CONTEXT.

The phrasing of the comment implies that (1) the behavior is desirable,
at least for the purpose at hand, and (2) the behavior is unrelated to
parallel query. However, it's surely possible that (2) is false and (1)
is not the consensus. Noah is usually pretty careful about this sort of
thing, but he might have made a mistake. Considering that, I decided to
take a look. I wasn't able in a quick test to verify that this behavior
doesn't happen without parallel query. I also wasn't able to figure out
exactly why it was happening. I did verify that this statement generates
an error that is propagated to the leader.
The expected output looks like this: ERROR: invalid input syntax for integer: "BA" CONTEXT: SQL statement "select stringu1::int2 from tenk1 where unique1 = 1" PL/pgSQL function inline_code_block line 5 at SQL statement The first line of context begins with "SQL statement" and I believe that's coming from spi.c. The second line of the context seems to come from PL/pgsql. But what accounts for the failure of the "parallel query" line to appear in the context message? This could be explained by _SPI_error_callback flushing the existing context in its handler, but it doesn't seem to do that. I guess this needs some more looking at. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] 10.0
On Tue, Jun 14, 2016 at 3:46 PM, Jim Nasby wrote: > On 6/13/16 2:12 PM, Merlin Moncure wrote: >> >> A) make a variant of version() that returns major/minor/bugfix as >> separate fields with minor being set to 0 for all released versions >> 10.0 and beyond. We could then add a NOTE to the version function and >> other places suggesting to use that instead for 9.6. >> >> B) Preserve x.y.z as returned by version() and show server_version for >> those usages only, with 'y' being always 0 for 10.0 and beyond. For >> all other purposes (marketing/documentation/etc) that structure would >> *not* be preserved, and we would put notes in the documentation >> describing why the extra digit is there. >> >> C) Do neither A or B, and let our users discover such issues on their own. > > > D) Add a version function to 10.0 that returns both parts separately. > > My vote is D. Parsing version() output is a wart, but coming out with a > split output version of that in 9.6 that still has to support 3 numbers > would also be a wart. We've lived with the parsing wart this long, so let's > just add an explicit output version to 10.0. > > Any ideas on naming for such a function? version_detail()? I suppose while > we're at this we might as well provide the compile details as well. This seems kind of silly, because anybody who is writing code that might have to run against an existing version of the database won't be able to use it. The one thing that absolutely has to be cross-version is the method of determining which version you're running against. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
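[Editor's illustration, not code from the thread: Robert's point is that version-detection code must work against every release a client might connect to, which means tolerating both the historical three-part numbering ("9.6.3") and the two-part scheme proposed for 10.0 and later. A minimal, hedged sketch of such tolerant parsing:]

```python
def parse_pg_version(version_string):
    """Split a PostgreSQL version string into (major, minor).

    Pre-10 releases use three numbers ("9.6.3": major series "9.6",
    minor 3); the scheme discussed for 10 and later uses two
    ("10.4": major series "10", minor 4).
    """
    parts = version_string.split(".")
    if int(parts[0]) >= 10:
        # Two-part scheme: MAJOR.MINOR
        return parts[0], int(parts[1]) if len(parts) > 1 else 0
    # Historical three-part scheme: MAJOR.MAJOR.MINOR
    return ".".join(parts[:2]), int(parts[2]) if len(parts) > 2 else 0
```

A client that parses this way keeps working whichever naming the project settles on; relying on a new server-side function, as Robert notes, would not help against existing servers.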
Re: [HACKERS] WIP: Data at rest encryption
On 6/12/16 2:13 AM, Ants Aasma wrote: On Fri, Jun 10, 2016 at 5:23 AM, Haribabu Kommi wrote: > 1. Instead of doing the entire database files encryption, how about > providing user an option to protect only some particular tables that > wants the encryption at table/tablespace level. This not only provides > an option to the user, it reduces the performance impact on tables > that doesn't need any encryption. The problem with this approach > is that every xlog record needs to validate to handle the encryption > /decryption, instead of at page level. Is there a real need for this? The customers I have talked to want to encrypt the whole database and my goal is to make the feature fast enough to make that feasible for pretty much everyone. I guess switching encryption off per table would be feasible, but the key setup would still need to be done at server startup. Per record encryption would result in some additional information leakage though. Overall I thought it would not be worth it, but I'm willing to have my mind changed on this. I actually design with this in mind. Tables that contain sensitive info go into designated schemas, partly so that you can blanket move all of those to an encrypted tablespace (or safer would be to move things not in those schemas to an unencrypted tablespace). Since that can be done with an encrypted filesystem maybe that's good enough. (It's not really clear to me what this buys us over an encrypted FS, other than a feature comparison checkmark...) -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com 855-TREBLE2 (855-873-2532) mobile: 512-569-9461
Re: [HACKERS] Parallel safety tagging of extension functions
On Tue, Jun 14, 2016 at 1:51 PM, Robert Haas wrote: > On Tue, Jun 14, 2016 at 6:37 AM, Andreas Karlsson wrote: >> I have rebased all my patches on the current master now (and skipped the >> extensions I previously listed). > > Thanks, this is really helpful. It was starting to get hard to keep > track of what hadn't been applied yet. I decided to prioritize > getting committed the patches where the extension version had already > been bumped by 749a787c5b25ae33b3d4da0ef12aa05214aa73c7, so I've now > committed the patches for cube, hstore, intarray, ltree, pg_trgm, and > seg. I've now also committed the patches for sslinfo, unaccent, uuid-ossp, and xml2. I took a look at the patch for tsearch2, but I think token_type() is mismarked. You have it marked PARALLEL SAFE but it seems to depend on the result of GetCurrentParser(), which returns a backend-private state variable. That was the only clear mistake I found, but I tend to think that changing the markings on anything defined by UNSUPPORTED_FUNCTION() is pretty silly, because there's no point in going to extra planner effort to generate a parallel plan only to error out as soon as we try to execute it. I think you should leave all of those out of the patch. I also took a look at the patch for tablefunc. I think that you've got the markings right here, but I think that it would be good to add PARALLEL UNSAFE explicitly to the 1.1 version of the file for the functions that are unsafe, and add a comment like "-- query might do anything" or some other indication as to why they are so marked, for the benefit of future readers. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] 10.0
On 6/13/16 2:12 PM, Merlin Moncure wrote: A) make a variant of version() that returns major/minor/bugfix as separate fields with minor being set to 0 for all released versions 10.0 and beyond. We could then add a NOTE to the version function and other places suggesting to use that instead for 9.6. B) Preserve x.y.z as returned by version() and show server_version for those usages only, with 'y' being always 0 for 10.0 and beyond. For all other purposes (marketing/documentation/etc) that structure would *not* be preserved, and we would put notes in the documentation describing why the extra digit is there. C) Do neither A or B, and let our users discover such issues on their own. D) Add a version function to 10.0 that returns both parts separately. My vote is D. Parsing version() output is a wart, but coming out with a split output version of that in 9.6 that still has to support 3 numbers would also be a wart. We've lived with the parsing wart this long, so let's just add an explicit output version to 10.0. Any ideas on naming for such a function? version_detail()? I suppose while we're at this we might as well provide the compile details as well. -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com 855-TREBLE2 (855-873-2532) mobile: 512-569-9461
Re: [HACKERS] Parallel safety tagging of extension functions
On Tue, Jun 14, 2016 at 6:37 AM, Andreas Karlsson wrote: > I have rebased all my patches on the current master now (and skipped the > extensions I previously listed). Thanks, this is really helpful. It was starting to get hard to keep track of what hadn't been applied yet. I decided to prioritize getting committed the patches where the extension version had already been bumped by 749a787c5b25ae33b3d4da0ef12aa05214aa73c7, so I've now committed the patches for cube, hstore, intarray, ltree, pg_trgm, and seg. However, in pg_trgm, I changed some of the functions that you had marked as PARALLEL RESTRICTED to be PARALLEL SAFE, because I didn't see any reason why they needed to be PARALLEL RESTRICTED. It's OK for a parallel-safe function to depend on GUC values, because those are synchronized from the leader to all worker processes. Random global variables won't necessarily be kept in sync, but anything controlled by the GUC mechanism will be. If there's some other reason you think those functions aren't parallel-safe, please let me know. I'm still in favor of rejecting the adminpack patch. To me, that just seems like attaching a larger magazine to the gun pointed at your foot. I can't deny that in a strict sense those functions are parallel-safe, but I think they are better left alone. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] parallel.c is not marked as test covered
Robert Haas writes: > On Tue, Jun 14, 2016 at 12:51 PM, Tom Lane wrote: >> FWIW, I follow all of your reasoning except this. If we believe that the >> parallel worker context line is useful, then it is a bug that plpgsql >> suppresses it. If we don't believe it's useful, then we should get >> rid of it. "Do nothing" is simply not a consistent stance here. > Well, if PL/pgsql suppresses context and nobody's complained about > that up until now, fixing it doesn't seem any more urgent than any > other bug we've had for N releases. I have not dug into the code enough to find out exactly what's happening in Peter's complaint, but it seems like it would be a good idea to find that out before arguing along these lines. It seems entirely likely to me that this is somehow parallel-query-specific. Even if it isn't, I don't buy your argument. Adding a new case in which context is suppressed is a perfectly reasonable basis for thinking that an old bug has acquired new urgency. regards, tom lane
Re: [HACKERS] Rename max_parallel_degree?
On Tue, Jun 14, 2016 at 12:16 PM, Tom Lane wrote: > Robert Haas writes: >> Of course, it would be nice if we could make these counters 64-bit >> integers, but we can't, because we don't rely on 64-bit reads and >> writes to be atomic on all platforms. So instead they'll have to be >> uint32. That means they could wrap (if you really work at it) but >> subtraction will still return the right answer, so it's OK. > > OK ... > >> If we >> want to allow the number of parallel workers started to be available >> for statistical purposes, we can keep to uint32 values for that >> (parallel_register_count_lo and parallel_register_count_hi, for >> example), and increment the second one whenever the first one rolls >> over to zero. > > And that's going to be atomic how exactly? The only process that can look at that structure without taking a lock is the postmaster. And the postmaster would only examine parallel_register_count_lo. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
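[Editor's illustration, not code from the thread: the claim quoted above, that subtraction of two uint32 counters still returns the right answer even after the register counter has wrapped, follows from modular arithmetic. Python is used here only to model C unsigned behavior; `u32` stands in for the truncation C performs automatically:]

```python
UINT32_MAX = 2**32 - 1

def u32(x):
    """Truncate to uint32, the way C unsigned arithmetic wraps."""
    return x & UINT32_MAX

# Model counters where registrations have rolled past 2^32 but
# terminations have not yet caught up.
register_count = u32(UINT32_MAX + 5)   # wrapped around to 4
terminate_count = UINT32_MAX - 2       # not yet wrapped

# Modular subtraction still yields the true number of active
# workers, because (a - b) mod 2^32 is wrap-insensitive as long as
# the true difference fits in 32 bits.
active = u32(register_count - terminate_count)
assert active == 7
```

This is exactly why a pair of uint32 counters suffices for "currently active" bookkeeping even without 64-bit atomic reads.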
Re: [HACKERS] parallel.c is not marked as test covered
On Tue, Jun 14, 2016 at 12:51 PM, Tom Lane wrote: > Robert Haas writes: >> On Fri, Jun 10, 2016 at 4:12 PM, Robert Haas wrote: >>> On Fri, Jun 10, 2016 at 1:49 PM, Peter Eisentraut >>> wrote: Elsewhere in this thread I suggested getting rid of the parallel worker context by default (except for debugging), but if we do want to keep it, then it seems to be a bug that a PL/pgSQL function can just eliminate it. > >> This is currently listed as an open item, but it doesn't seem very >> actionable to me. The fact that PL/plpgsql chucks the existing >> context instead of appending to it is presumably a property of >> PL/plpgsql, not parallel query, and changing that seems like it ought >> to be out of scope for 9.6. > > FWIW, I follow all of your reasoning except this. If we believe that the > parallel worker context line is useful, then it is a bug that plpgsql > suppresses it. If we don't believe it's useful, then we should get > rid of it. "Do nothing" is simply not a consistent stance here. Well, if PL/pgsql suppresses context and nobody's complained about that up until now, fixing it doesn't seem any more urgent than any other bug we've had for N releases. That would go on the 9.6 open items list in the section entitled "Older Bugs", where it would have plenty of company. Any time somebody wants to fix one of those, they can, and that would be great, but there's no more or less urgency right now than, say, four months ago, or six months from now. It can't be said that this open item is holding up the release if it's just a rediscovery of an existing behavior which somebody happens not to like. On the other hand, if PL/pgsql does not suppress context in general but suppresses only this one particular bit of context from parallel query, then that is probably a bug in new code which should be fixed before release. But I don't think that's what is being argued. Am I confused?
-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] increase message string buffer size of watch command of psql
I wrote: > Robert Haas writes: >> On Sun, Jun 12, 2016 at 10:55 AM, Ioseph Kim wrote: >>> Increase size of this title, please. >>> 50 bytes is so small for multi language. >>> And. I suggest that date string might be local language, >>> or current_timestamp string. >> This was already changed in commit dea2b5960. > Well, we did part of that, but it's still using asctime(). Should we > change that to strftime(..."%c"...) to be less English-centric? > (The result seems to be the same in C locale. pg_controldata has done > it that way for a long time, with few complaints.) If we want to do so, > now would be the time, since 9.6 already whacked around the format > of \watch output. I take it from the vast silence that nobody particularly cares one way or the other. On reflection I think that this would be a good change to make, so I'll go do so unless I hear complaints soon. regards, tom lane
Re: [HACKERS] parallel.c is not marked as test covered
Robert Haas writes: > On Fri, Jun 10, 2016 at 4:12 PM, Robert Haas wrote: >> On Fri, Jun 10, 2016 at 1:49 PM, Peter Eisentraut >> wrote: >>> Elsewhere in this thread I suggested getting rid of the parallel worker >>> context by default (except for debugging), but if we do want to keep it, >>> then it seems to be a bug that a PL/pgSQL function can just eliminate it. > This is currently listed as an open item, but it doesn't seem very > actionable to me. The fact that PL/plpgsql chucks the existing > context instead of appending to it is presumably a property of > PL/plpgsql, not parallel query, and changing that seems like it ought > to be out of scope for 9.6. FWIW, I follow all of your reasoning except this. If we believe that the parallel worker context line is useful, then it is a bug that plpgsql suppresses it. If we don't believe it's useful, then we should get rid of it. "Do nothing" is simply not a consistent stance here. regards, tom lane
Re: [HACKERS] parallel.c is not marked as test covered
On Fri, Jun 10, 2016 at 4:12 PM, Robert Haas wrote: > On Fri, Jun 10, 2016 at 1:49 PM, Peter Eisentraut > wrote: >> Regarding the patch that ended up being committed, I wonder if it is >> intentional that PL/pgSQL overwrites the context from the parallel worker. >> Shouldn't the context effectively look like >> >> ERROR: message >> CONTEXT: parallel worker >> CONTEXT: PL/pgSQL function >> >> Elsewhere in this thread I suggested getting rid of the parallel worker >> context by default (except for debugging), but if we do want to keep it, >> then it seems to be a bug that a PL/pgSQL function can just eliminate it. > > Several people have suggested getting rid of that now, so maybe we > should just go ahead and do it. > > How would we make it available for debugging, though? That seems like > something people will want. This is currently listed as an open item, but it doesn't seem very actionable to me. The fact that PL/plpgsql chucks the existing context instead of appending to it is presumably a property of PL/plpgsql, not parallel query, and changing that seems like it ought to be out of scope for 9.6. As far as the context is concerned, the reason why that's there is because I felt (and still feel, to some extent) that users might experience errors that happen inside a parallel worker and it might be important for debugging purposes to know that. Suppose, for example, that you mislabel a function as PARALLEL SAFE when in fact it relies on some backend-local state (and should therefore be PARALLEL RESTRICTED). When it runs in a worker, it might generate an error message like this: ERROR: can't generate random numbers because you haven't specified a seed ...to which the user will reply, "oh yes I did; in fact I ran SELECT magic_setseed(42) just before I ran the offending query!". They'll then contact an expert (hopefully) who will very possibly spend a lot of time troubleshooting the wrong thing.
If the message says: ERROR: can't generate random numbers because you haven't specified a seed CONTEXT: parallel worker, PID 62413 ...then the expert has a very good chance of guessing what has gone wrong right away. Now, against this scenario, there is every possibility that this message will just annoy a lot of people. It is certainly annoying for regression test authors because the PID changes and, even if we took the PID out, any given error might sometimes happen in the leader rather than the worker, depending on timing. I am not aware of a concrete reason why it will annoy people in other situations, but there may well be one (or more). The simplest thing we could do here is nothing. We can wait to see what happens with PostgreSQL 9.6 and then decide whether to make a change. Or we can do something now. We can eliminate the context entirely and take our chances with scenarios like the one mentioned above. Or we can add a new GUC, something like hide_parallel_context, which suppresses it and turn that off when running regression tests, or even by default. It's all just a matter of how much it bothers you to sometimes get an extra context line vs. how much annoyance you think will be caused by people wasting time troubleshooting because they didn't realize parallel query was involved. Personally, I'm very doubtful about our ability to paper over artifacts like this. Sophisticated users are going to end up having to understand that there are parallel workers and that they propagate their messages back to the leader, which appends its own context. We often seem to assume users won't find out about internal details like that and so we don't make an effort to expose and document them, but I don't think that often works out. There are a whole lot of people who know that they need to run EXPLAIN 6 times to find out what's really going on. I don't expect this case to be any different. 
I'm actually sort of flattered that parallel query is working well enough that multiple people think they might not ever need to know whether an error comes from the worker or from the leader, but I'm rather doubtful that will be true in practice. All that having been said, I don't know everything and this is just a judgement call. I exercised my judgement and that's how we got here. If there's a consensus that my judgement should be overruled, I'm fine with that. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
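[Editor's illustration, not code from the thread: the mechanics under debate — the leader appending a "parallel worker" CONTEXT line to a propagated error, and an intervening handler replacing rather than extending the context list — can be modeled with a toy stack. This is a sketch of the concept only, not PostgreSQL's elog machinery:]

```python
class ErrorContextStack:
    """Toy model of error-context callbacks: each frame contributes
    one CONTEXT line to the reported error."""

    def __init__(self):
        self.frames = []

    def push(self, line):
        self.frames.append(line)

    def render(self):
        return ["CONTEXT: " + f for f in self.frames]

# An error raised in a worker arrives at the leader with an extra
# context line identifying the worker...
stack = ErrorContextStack()
stack.push('SQL statement "select ..."')
stack.push("parallel worker, PID 62413")
assert len(stack.render()) == 2

# ...but a handler that *replaces* the frame list (rather than
# appending its own frame) loses the worker line, which is the
# suppression behavior discussed in this thread.
stack.frames = ["PL/pgSQL function inline_code_block line 5"]
assert "parallel worker" not in " ".join(stack.render())
```

The debate above is whether that worker line is valuable enough (for diagnosing mislabeled PARALLEL SAFE functions, say) that dropping it should count as a bug.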
Re: [HACKERS] ERROR: ORDER/GROUP BY expression not found in targetlist
Amit Kapila writes: > On Mon, Jun 13, 2016 at 9:37 PM, Tom Lane wrote: >> ... I got a core dump in the window.sql test: >> which I think may be another manifestation of the failure-to-apply-proper-pathtarget issue we're looking at in this thread. Or maybe it's just >> an unjustified assumption in make_partialgroup_input_target that the >> input path must always have some sortgrouprefs assigned. > I don't see this problem after your recent commit > - 89d53515e53ea080029894939118365b647489b3. Are you still getting it? No. I am suspicious that there's still a bug there, ie we're looking at the wrong pathtarget; but the regression test doesn't actually crash. That might only be because we don't choose the parallelized path. regards, tom lane
Re: [HACKERS] Rename max_parallel_degree?
Robert Haas writes: > Of course, it would be nice if we could make these counters 64-bit > integers, but we can't, because we don't rely on 64-bit reads and > writes to be atomic on all platforms. So instead they'll have to be > uint32. That means they could wrap (if you really work at it) but > subtraction will still return the right answer, so it's OK. OK ... > If we > want to allow the number of parallel workers started to be available > for statistical purposes, we can keep to uint32 values for that > (parallel_register_count_lo and parallel_register_count_hi, for > example), and increment the second one whenever the first one rolls > over to zero. And that's going to be atomic how exactly? regards, tom lane
Re: [HACKERS] ERROR: ORDER/GROUP BY expression not found in targetlist
Robert Haas writes: > On Mon, Jun 13, 2016 at 6:21 PM, Tom Lane wrote: >> I think this is bad because it forces any external creators of >> UPPERREL_GROUP_AGG to duplicate that code --- and heaven help everybody >> if anyone is out of sync on whether to set the flag. So I'd rather keep >> set_grouped_rel_consider_parallel(), even if all it does is the above. >> And make it global not static. Ditto for the other upperrels. > I'm slightly mystified by this because we really shouldn't be setting > that flag more than once. We don't want to do that work repeatedly, > just once, and prior to adding any paths to the rel. Are you > imagining that somebody would try to create grouped paths before we > finish scan/join planning? Exactly. The original pathification design contemplated that FDWs and other extensions could inject Paths for the upperrels whenever they felt like it, specifically during relation path building. Even if you think that's silly, it *must* be possible to do it during GetForeignUpperPaths, else that hook is completely useless. Therefore, adding any data other than new Paths to the upperrel during create_grouping_paths is a broken design unless there is some easy, reliable, and not too expensive way for an FDW to make the same thing happen a bit earlier. See the commentary in optimizer/README starting at line 922. I generally don't like rel->consider_parallel at all for basically this reason, but it may be too late to undo that aspect of the parallel join planning design. I will settle for having an API call that allows FDWs to get the flag set correctly. Another possibility is to set the flag before calling GetForeignUpperPaths, but that might be messy too. regards, tom lane
Re: [HACKERS] [sqlsmith] Failed assertion in postgres_fdw/deparse.c:1116
On Tue, Jun 14, 2016 at 4:06 AM, Amit Langote wrote: > On 2016/06/14 6:51, Robert Haas wrote: >> On Fri, Jun 10, 2016 at 4:14 PM, Robert Haas wrote: >>> Although I have done a bit of review of this patch, it needs more >>> thought than I have so far had time to give it. I will update again >>> by Tuesday. >> >> I've reviewed this a bit further and have discovered an infelicity. >> The following is all with the patch applied. >> >> By itself, this join can be pushed down: >> >> contrib_regression=# EXPLAIN SELECT 13 FROM ft1 RIGHT JOIN ft2 ON >> ft1.c1 = ft2.c1; >> QUERY PLAN >> -- >> Foreign Scan (cost=100.00..137.66 rows=822 width=4) >>Relations: (public.ft2) LEFT JOIN (public.ft1) >> (2 rows) >> >> That's great. However, when that query is used as the outer rel of a >> left join, it can't: >> >> contrib_regression=# explain verbose select * from ft4 LEFT JOIN >> (SELECT 13 FROM ft1 RIGHT JOIN ft2 ON ft1.c1 = ft2.c1) q on true; >> QUERY PLAN >> - >> Nested Loop Left Join (cost=353.50..920.77 rows=41100 width=19) >>Output: ft4.c1, ft4.c2, ft4.c3, (13) >>-> Foreign Scan on public.ft4 (cost=100.00..102.50 rows=50 width=15) >> Output: ft4.c1, ft4.c2, ft4.c3 >> Remote SQL: SELECT c1, c2, c3 FROM "S 1"."T 3" >>-> Materialize (cost=253.50..306.57 rows=822 width=4) >> Output: (13) >> -> Hash Left Join (cost=253.50..302.46 rows=822 width=4) >>Output: 13 >>Hash Cond: (ft2.c1 = ft1.c1) >>-> Foreign Scan on public.ft2 (cost=100.00..137.66 >> rows=822 width=4) >> Output: ft2.c1 >> Remote SQL: SELECT "C 1" FROM "S 1"."T 1" >>-> Hash (cost=141.00..141.00 rows=1000 width=4) >> Output: ft1.c1 >> -> Foreign Scan on public.ft1 >> (cost=100.00..141.00 rows=1000 width=4) >>Output: ft1.c1 >>Remote SQL: SELECT "C 1" FROM "S 1"."T 1" >> (18 rows) >> >> Of course, because of the PlaceHolderVar, there's no way to push down >> the ft1-ft2-ft4 join to the remote side. But we could still push down >> the ft1-ft2 join and then locally perform the join between the result >> and ft4.
However, the proposed fix doesn't allow that, because >> ph_eval_at is (b 4 5) and relids for the ft1-ft2 join is also (b 4 5), >> and so the bms_is_subset(phinfo->ph_eval_at, relids) test returns >> true. > > You're right. It indeed should be possible to push down ft1-ft2 join. > However it could not be done without also modifying > build_tlist_to_deparse() a little (as Ashutosh proposed [1] to do > upthread). Attached new version of the patch with following changes: > > 1) Modified the check in foreign_join_ok() so that it is not overly > restrictive. Previously, it used ph_needed as the highest level at which > the PHV is needed (by definition) and checked using it whether it is above > the join under consideration, which ended up being an overkill. ISTM, we > can just decide from joinrel->relids of the join under consideration > whether we are above the lowest level where the PHV could be evaluated > (ph_eval_at) and stop if so. So in the example you provided, it won't now > reject {ft1, ft2}, but will {ft4, ft1, ft2}. > > 2) Modified build_tlist_to_deparse() to get rid of the PlaceHolderVars in > the targetlist to deparse by using PVC_RECURSE_PLACEHOLDER flag of > pull_var_clause(). That is because we do not yet support anything other > than Vars in deparseExplicitTargetList() (+1 to your patch to change the > Assert to elog). OK, I committed this version with some cosmetic changes. I ripped out the header comment to foreign_join_ok(), which is just a nuisance to maintain. It unnecessarily recapitulates the tests contained within the function, requiring us to update the comments in two places instead of one every time we make a change for no real gain. And I rewrote the comment you added. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] ERROR: ORDER/GROUP BY expression not found in targetlist
On Mon, Jun 13, 2016 at 10:36 PM, Tom Lane wrote: > > Robert Haas writes: > > On Mon, Jun 13, 2016 at 11:02 AM, Tom Lane wrote: > >> BTW, decent regression tests could be written without the need to create > >> enormous tables if the minimum rel size in create_plain_partial_paths() > >> could be configured to something less than 1000 blocks. I think it's > >> fairly crazy that that arbitrary constant is hard-wired anyway. Should > >> we make it a GUC? > > > That was proposed before, and I didn't do it mostly because I couldn't > > think of a name for it that didn't sound unbelievably corny. > > min_parallel_relation_size, or min_parallelizable_relation_size, or > something like that? > You are right that such a variable will make it simpler to write tests for parallel query. I have implemented such a GUC and chose to keep the name as min_parallel_relation_size. One thing to note is that in function create_plain_partial_paths(), currently it is using PG_INT32_MAX/3 for parallel_threshold to check for overflow, I have changed it to INT_MAX/3 so as to be consistent with guc.c. I am not sure if it is advisable to use PG_INT32_MAX in guc.c as other similar parameters use INT_MAX. With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com min_parallel_relation_size_v1.patch Description: Binary data
Re: [HACKERS] ERROR: ORDER/GROUP BY expression not found in targetlist
David Rowley writes: > Do you think it's worth also applying the attached so as to have > ressortgroupref set to NULL more consistently, instead of sometimes > NULL and other times allocated to memory wastefully filled with zeros? Meh --- that seems to add complication without buying a whole lot. regards, tom lane
Re: [HACKERS] Use of CREATE OR REPLACE in bloom--1.0.sql
Andreas Karlsson writes: > I do not think that extension SQL scripts should be using CREATE OR > REPLACE FUNCTION like bloom--1.0.sql currently does. I suspect that this > might just be a typo. It's definitely a bug. Grepping around found another instance in sslinfo, and I also noticed the lack of a standard header on this script and others. Pushed, thanks for noticing that! regards, tom lane
Re: [HACKERS] Using FDW AddForeignUpdateTargets for a hidden pseudo-column
Bernd Helmle writes: > --On 14 June 2016 10:32:13 + Albe Laurenz > wrote: >> I first thought of using the internal ROWID column that's probably >> similar to your case, but that wouldn't fit into a tid's 6 bytes, and I >> found that I could only add resjunk columns for existing columns of the >> table. >> Making the internal ROWID an explicit column in the foreign table seemed >> just too ugly. > The Informix FDW uses SelfItemPointerAttributeNumber. Luckily the Informix > ROWID is a 4 byte encoded identifier (3 first significant bytes are the > logical page number, last significant bytes is the slot number within that > page). Maybe you can find a way of logically addressing your data, too? It > only needs to fit within 6 bytes, afaik. There's been some speculation about allowing FDWs to control the type of the CTID column created for a foreign table, but it hasn't gotten past the speculation stage yet. regards, tom lane
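[Editor's illustration, not code from the thread: the 6-byte budget everyone keeps citing comes from ItemPointerData's layout, a 4-byte block number plus a 2-byte offset. A remote row identifier can ride in a tid only if it packs into that split, as the Informix ROWID happens to. A pack/unpack sketch of the constraint (the big-endian byte order here is arbitrary, chosen for illustration):]

```python
import struct

def pack_rowid_as_tid(page, slot):
    """Encode a (page, slot) pair into 6 bytes, ItemPointer-style:
    a uint32 block number followed by a uint16 offset."""
    if not (0 <= page < 2**32 and 0 <= slot < 2**16):
        raise ValueError("row identifier does not fit in a 6-byte tid")
    return struct.pack(">IH", page, slot)

def unpack_tid(tid_bytes):
    """Recover (page, slot) from the 6-byte encoding."""
    return struct.unpack(">IH", tid_bytes)

tid = pack_rowid_as_tid(123456, 7)
assert len(tid) == 6
assert unpack_tid(tid) == (123456, 7)
```

Identifiers wider than this (an 8-byte surrogate key, say) simply cannot be smuggled through a tid, which is what motivates both the hidden-pseudo-column hack in this thread and Tom's note about letting FDWs control the CTID column's type.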
Re: [HACKERS] Using FDW AddForeignUpdateTargets for a hidden pseudo-column
A very quick and dirty hack I did in src/backend/optimizer/plan/initsplan.c
(in 9.5.3):

--- initsplan.c.orig	2016-06-14 19:08:27.0 +0600
+++ initsplan.c	2016-06-14 19:10:55.0 +0600
@@ -185,9 +185,12 @@
 		if (IsA(node, Var))
 		{
 			Var		   *var = (Var *) node;
-			RelOptInfo *rel = find_base_rel(root, var->varno);
+			RelOptInfo *rel;
 			int			attno = var->varattno;
 
+			if (var->varno == INDEX_VAR)
+				continue;
+
+			rel = find_base_rel(root, var->varno);
 			if (bms_is_subset(where_needed, rel->relids))
 				continue;
 			Assert(attno >= rel->min_attr && attno <= rel->max_attr);

And then in my FDW I add the tuple id column like this:

static void
MyAddForeignUpdateTargets(Query *parsetree,
			  RangeTblEntry *target_rte,
			  Relation target_relation)
{
	Var		   *var;
	TargetEntry *tle;

	/* Make a Var representing the desired value */
	var = makeVar(INDEX_VAR,	/* instead of parsetree->resultRelation */
		      target_relation->rd_att->natts + 1,
		      INT8OID,
		      -1,
		      InvalidOid,
		      0);

	/* Wrap it in a resjunk TLE with the right name ... */
	tle = makeTargetEntry((Expr *) var,
			      list_length(parsetree->targetList) + 1,
			      pstrdup(MY_FDW_TUPLE_ID),
			      true);

	/* ... and add it to the query's targetlist */
	parsetree->targetList = lappend(parsetree->targetList, tle);
}

I was able to run successfully a couple of very simple tests with these.
This seems to indicate that tweaking the core to handle this case properly
is doable. The question is if this approach is conceptually correct and
if so what are the other required places to patch.

Regards,
Aleksey
Re: [HACKERS] ERROR: ORDER/GROUP BY expression not found in targetlist
On Tue, Jun 14, 2016 at 2:16 AM, Tom Lane wrote:
>
> Robert Haas writes:
> > In practice, we don't yet have the ability for
> > parallel-safe paths from subqueries to affect planning at higher query
> > levels, but that's because the pathification stuff landed too late in
> > the cycle for me to fully react to it.
>
> I wonder if that's not just from confusion between subplans and
> subqueries.
>

Won't the patch as written allow parallel-restricted things to be pushed
to workers for UNION ALL kind of queries?

Explain verbose
Select * from (SELECT c1+1 FROM t1 UNION ALL SELECT c1+1 FROM t2 where c1 < 10) ss(a);

In above kind of queries, set_rel_consider_parallel() might set
consider_parallel as true for rel, but later in set_append_rel_size(),
it might pull some unsafe clauses into the target of childrel.
Basically, I am wondering about the same problem as we discussed here [1].

[1] - https://www.postgresql.org/message-id/CAA4eK1Jz5tG2D2PCNYqYob2cb4dKowKYwVJ7OmP5vwyRyCMx4g%40mail.gmail.com

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
Re: [HACKERS] Using FDW AddForeignUpdateTargets for a hidden pseudo-column
--On 14. Juni 2016 10:32:13 + Albe Laurenz wrote:

> I first thought of using the internal ROWID column that's probably
> similar to your case, but that wouldn't fit into a tid's 6 bytes, and I
> found that I could only add resjunk columns for existing columns of the
> table.
> Making the internal ROWID an explicit column in the foreign table seemed
> just too ugly.

The Informix FDW uses SelfItemPointerAttributeNumber. Luckily the Informix
ROWID is a 4 byte encoded identifier (3 first significant bytes are the
logical page number, last significant bytes is the slot number within that
page). Maybe you can find a way of logically addressing your data, too?
It only needs to fit within 6 bytes, afaik.

--
Thanks

	Bernd
Re: [HACKERS] Reviewing freeze map code
On Tue, Jun 14, 2016 at 8:11 AM, Robert Haas wrote:
>>> I noticed that the tuples that it reported were always offset 1 in a
>>> page, and that the page always had a maxoff over a couple of hundred,
>>> and that we called record_corrupt_item because VM_ALL_VISIBLE returned
>>> true but HeapTupleSatisfiesVacuum on the first tuple returned
>>> HEAPTUPLE_DELETE_IN_PROGRESS instead of the expected HEAPTUPLE_LIVE.
>>> It did that because HEAP_XMAX_COMMITTED was not set and
>>> TransactionIdIsInProgress returned true for xmax.
>>
>> So this seems like it might be a visibility map bug rather than a bug
>> in the test code, but I'm not completely sure of that. How was it
>> legitimate to mark the page as all-visible if a tuple on the page
>> still had a live xmax? If xmax is live and not just a locker then the
>> tuple is not visible to the transaction that wrote xmax, at least.
>
> Ah, wait a minute. I see how this could happen. Hang on, let me
> update the pg_visibility patch.

The problem should be fixed in the attached revision of pg_check_visible.
I think what happened is:

1. pg_check_visible computed an OldestXmin.
2. Some transaction committed.
3. VACUUM computed a newer OldestXmin and marked a page all-visible with it.
4. pg_check_visible then used its older OldestXmin to check the visibility
   of tuples on that page, and saw delete-in-progress as a result.

I added a guard against a similar scenario involving xmin in the last
version of this patch, but forgot that we need to protect xmax in the
same way. With this version of the patch, I can no longer get any TIDs
to pop up out of pg_check_visible in my testing. (I haven't run your
test script for lack of the proper Python environment...)
-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company From 18815b0d6fcfc2048e47f104ef85ee981687d4de Mon Sep 17 00:00:00 2001 From: Robert Haas Date: Fri, 10 Jun 2016 14:42:46 -0400 Subject: [PATCH 1/2] Add integrity-checking functions to pg_visibility. The new pg_check_visible() and pg_check_frozen() functions can be used to verify that the visibility map bits for a relation's data pages match the actual state of the tuples on those pages. Amit Kapila and Robert Haas, reviewed by Andres Freund. Additional testing help by Thomas Munro. --- contrib/pg_visibility/Makefile| 2 +- contrib/pg_visibility/pg_visibility--1.0--1.1.sql | 17 ++ contrib/pg_visibility/pg_visibility--1.0.sql | 52 contrib/pg_visibility/pg_visibility--1.1.sql | 67 + contrib/pg_visibility/pg_visibility.c | 313 ++ contrib/pg_visibility/pg_visibility.control | 2 +- doc/src/sgml/pgvisibility.sgml| 28 +- src/tools/pgindent/typedefs.list | 1 + 8 files changed, 427 insertions(+), 55 deletions(-) create mode 100644 contrib/pg_visibility/pg_visibility--1.0--1.1.sql delete mode 100644 contrib/pg_visibility/pg_visibility--1.0.sql create mode 100644 contrib/pg_visibility/pg_visibility--1.1.sql diff --git a/contrib/pg_visibility/Makefile b/contrib/pg_visibility/Makefile index fbbaa2e..379591a 100644 --- a/contrib/pg_visibility/Makefile +++ b/contrib/pg_visibility/Makefile @@ -4,7 +4,7 @@ MODULE_big = pg_visibility OBJS = pg_visibility.o $(WIN32RES) EXTENSION = pg_visibility -DATA = pg_visibility--1.0.sql +DATA = pg_visibility--1.1.sql pg_visibility--1.0--1.1.sql PGFILEDESC = "pg_visibility - page visibility information" ifdef USE_PGXS diff --git a/contrib/pg_visibility/pg_visibility--1.0--1.1.sql b/contrib/pg_visibility/pg_visibility--1.0--1.1.sql new file mode 100644 index 000..2c97dfd --- /dev/null +++ b/contrib/pg_visibility/pg_visibility--1.0--1.1.sql @@ -0,0 +1,17 @@ +/* contrib/pg_visibility/pg_visibility--1.0--1.1.sql */ + +-- complain if script is sourced in psql, 
rather than via ALTER EXTENSION +\echo Use "ALTER EXTENSION pg_visibility UPDATE TO '1.1'" to load this file. \quit + +CREATE FUNCTION pg_check_frozen(regclass, t_ctid OUT tid) +RETURNS SETOF tid +AS 'MODULE_PATHNAME', 'pg_check_frozen' +LANGUAGE C STRICT; + +CREATE FUNCTION pg_check_visible(regclass, t_ctid OUT tid) +RETURNS SETOF tid +AS 'MODULE_PATHNAME', 'pg_check_visible' +LANGUAGE C STRICT; + +REVOKE ALL ON FUNCTION pg_check_frozen(regclass) FROM PUBLIC; +REVOKE ALL ON FUNCTION pg_check_visible(regclass) FROM PUBLIC; diff --git a/contrib/pg_visibility/pg_visibility--1.0.sql b/contrib/pg_visibility/pg_visibility--1.0.sql deleted file mode 100644 index da511e5..000 --- a/contrib/pg_visibility/pg_visibility--1.0.sql +++ /dev/null @@ -1,52 +0,0 @@ -/* contrib/pg_visibility/pg_visibility--1.0.sql */ - --- complain if script is sourced in psql, rather than via CREATE EXTENSION -\echo Use "CREATE EXTENSION pg_visibility" to load this file. \quit - --- Show visibility map information. -CREATE FUNCTION pg_visibility_map(regclass, blkno bigint, -
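For anyone following along, once the patch above is applied the new checking functions are used like this (a short sketch; `foo` stands in for any relation to verify):

```sql
-- Upgrade an existing installation of the extension to get the new functions.
ALTER EXTENSION pg_visibility UPDATE TO '1.1';

-- Each function returns the TIDs of tuples whose on-page state
-- contradicts the relation's visibility map bits; an empty result
-- means no inconsistency was found.
SELECT t_ctid FROM pg_check_visible('foo'::regclass);
SELECT t_ctid FROM pg_check_frozen('foo'::regclass);
```

Both functions are restricted to superusers by default, since the upgrade script revokes access from PUBLIC.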
Re: [HACKERS] [BUG] pg_basebackup from disconnected standby fails
On Tue, Jun 14, 2016 at 8:31 PM, Kyotaro HORIGUCHI wrote:
>> +# Take a second backup of the standby while the master is offline.
>> +$node_master->stop;
>> +$node_standby_1->backup('my_backup_2');
>> +$node_master->start;
>
> I'm not sure that adding the test case for a particular bug like
> this is appropriate. But it would be acceptable because it
> doesn't take long time and it is separate from standard checks.

We already take a backup from a standby when the master is connected;
it should not cost much in terms of time.

> It seems to me that we could more aggressively advance the
> minRecoveryPoint (but must not let it go too far..), but it is
> right for it to aim a bit smaller than the ideal location.

It may be risky to propose such a change for a backpatch. Anyway, in any
case there is no guarantee that when using the low-level backup routines
pg_start/stop_backup with a custom backup method the minimum recovery
point will be correct. pg_basebackup does that a bit more carefully if I
recall correctly (too lazy to look at the code now :)).
--
Michael
Re: [HACKERS] Prepared statements and generic plans
On Tue, Jun 14, 2016 at 08:37:12AM +, Albe Laurenz wrote:
> Bruce Momjian wrote:
> > However, for the wire protocol prepare/execute, how do you do EXPLAIN?
> > The only way I can see doing it is to put the EXPLAIN in the prepare
> > query, but I wasn't sure that works. So, I just wrote and tested the
> > attached C program and it properly output the explain information, e.g.
> >
> >	res = PQprepare(conn, "prep1", "EXPLAIN SELECT * FROM pg_language", 0,
> >			NULL);
> > ---
> > generated:
> >
> >                          QUERY PLAN
> > -----------------------------------------------------------
> >  Seq Scan on pg_language  (cost=0.00..1.04 rows=4 width=114)
> >
> > so that works --- good.
>
> Hm, yes.
>
> Were you just curious or is it relevant for the documentation update?

I was curious because if there was no way to do it, I should document
that.

> >>> Looking at how the code behaves, it seems custom plans that are _more_
> >>> expensive (plus planning cost) than the generic plan switch to the
> >>> generic plan after five executions, as now documented. Custom plans
> >>> that are significantly _cheaper_ than the generic plan _never_ use the
> >>> generic plan.
> >>
> >> Yes, that's what the suggested documentation improvement says as well,
> >> right?
> >
> > Yes. What is odd is that it isn't the plan of the actual supplied
> > parameters that is cheaper, just the generic plan that assumes each
> > distinct value in the query is equally likely to be used. So, when we
> > say the generic plan is cheaper, it is just comparing the custom plan
> > with the supplied parameters vs. the generic plan --- it is not saying
> > that running the supplied constants with the generic plan will execute
> > faster, because in fact we might be using a sub-optimal generic plan.
>
> Right, that's why it is important to document that it is estimates that
> are compared, not actual costs.
>
> This has caused confusion in the past, see
> https://www.postgresql.org/message-id/flat/561E749D.4090301%40socialserve.com#561e749d.4090...@socialserve.com
>
> > Right. Updated patch attached.
>
> I am happy with the patch as it is.

Good.

--
  Bruce Momjian  http://momjian.us
  EnterpriseDB   http://enterprisedb.com

+ As you are, so once was I. As I am, so you will be. +
+                     Ancient Roman grave inscription +
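The five-execution behavior discussed above is easy to observe from psql (a sketch; the table and data here are made up for illustration):

```sql
CREATE TABLE words (w text);
INSERT INTO words SELECT 'common' FROM generate_series(1, 10000);
INSERT INTO words VALUES ('rare');
ANALYZE words;

PREPARE q(text) AS SELECT count(*) FROM words WHERE w = $1;

-- For the first five executions the planner builds a custom plan and
-- EXPLAIN shows the parameter folded in as a constant:
EXPLAIN EXECUTE q('rare');

-- From the sixth execution on, if the generic plan's *estimated* cost
-- is not higher than the average custom-plan estimate, the plan starts
-- showing $1 instead of the constant -- the cached generic plan is in use:
EXPLAIN EXECUTE q('rare');
EXPLAIN EXECUTE q('rare');
EXPLAIN EXECUTE q('rare');
EXPLAIN EXECUTE q('rare');
EXPLAIN EXECUTE q('rare');
```

Note that, per the discussion above, the comparison is between estimated costs only, never measured execution times.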
Re: [HACKERS] Reviewing freeze map code
On Tue, Jun 14, 2016 at 8:08 AM, Robert Haas wrote:
> On Tue, Jun 14, 2016 at 2:53 AM, Thomas Munro wrote:
>> On Tue, Jun 14, 2016 at 4:02 AM, Robert Haas wrote:
>>> On Sat, Jun 11, 2016 at 5:00 PM, Robert Haas wrote:
>>>>> How about changing the return tuple of heap_prepare_freeze_tuple to
>>>>> a bitmap? Two flags: "Freeze [not] done" and "[No] more freezing
>>>>> needed"
>>>>
>>>> Yes, I think something like that sounds about right.
>>>
>>> Here's a patch. I took the approach of adding a separate bool out
>>> parameter instead. I am also attaching an update of the
>>> check-visibility patch which responds to assorted review comments and
>>> adjusting it for the problems found on Friday which could otherwise
>>> lead to false positives. I'm still getting occasional TIDs from the
>>> pg_check_visible() function during pgbench runs, though, so evidently
>>> not all is well with the world.
>>
>> I'm still working out how half this stuff works, but I managed to get
>> pg_check_visible() to spit out a row every few seconds with the
>> following brute force approach:
>>
>> CREATE TABLE foo (n int);
>> INSERT INTO foo SELECT generate_series(1, 10);
>>
>> Three client threads (see attached script):
>> 1. Run VACUUM in a tight loop.
>> 2. Run UPDATE foo SET n = n + 1 in a tight loop.
>> 3. Run SELECT pg_check_visible('foo'::regclass) in a tight loop, and
>> print out any rows it produces.
>>
>> I noticed that the tuples that it reported were always offset 1 in a
>> page, and that the page always had a maxoff over a couple of hundred,
>> and that we called record_corrupt_item because VM_ALL_VISIBLE returned
>> true but HeapTupleSatisfiesVacuum on the first tuple returned
>> HEAPTUPLE_DELETE_IN_PROGRESS instead of the expected HEAPTUPLE_LIVE.
>> It did that because HEAP_XMAX_COMMITTED was not set and
>> TransactionIdIsInProgress returned true for xmax.
>
> So this seems like it might be a visibility map bug rather than a bug
> in the test code, but I'm not completely sure of that. How was it
> legitimate to mark the page as all-visible if a tuple on the page
> still had a live xmax? If xmax is live and not just a locker then the
> tuple is not visible to the transaction that wrote xmax, at least.

Ah, wait a minute. I see how this could happen. Hang on, let me
update the pg_visibility patch.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] Reviewing freeze map code
On Tue, Jun 14, 2016 at 2:53 AM, Thomas Munro wrote:
> On Tue, Jun 14, 2016 at 4:02 AM, Robert Haas wrote:
>> On Sat, Jun 11, 2016 at 5:00 PM, Robert Haas wrote:
>>>> How about changing the return tuple of heap_prepare_freeze_tuple to
>>>> a bitmap? Two flags: "Freeze [not] done" and "[No] more freezing
>>>> needed"
>>>
>>> Yes, I think something like that sounds about right.
>>
>> Here's a patch. I took the approach of adding a separate bool out
>> parameter instead. I am also attaching an update of the
>> check-visibility patch which responds to assorted review comments and
>> adjusting it for the problems found on Friday which could otherwise
>> lead to false positives. I'm still getting occasional TIDs from the
>> pg_check_visible() function during pgbench runs, though, so evidently
>> not all is well with the world.
>
> I'm still working out how half this stuff works, but I managed to get
> pg_check_visible() to spit out a row every few seconds with the
> following brute force approach:
>
> CREATE TABLE foo (n int);
> INSERT INTO foo SELECT generate_series(1, 10);
>
> Three client threads (see attached script):
> 1. Run VACUUM in a tight loop.
> 2. Run UPDATE foo SET n = n + 1 in a tight loop.
> 3. Run SELECT pg_check_visible('foo'::regclass) in a tight loop, and
> print out any rows it produces.
>
> I noticed that the tuples that it reported were always offset 1 in a
> page, and that the page always had a maxoff over a couple of hundred,
> and that we called record_corrupt_item because VM_ALL_VISIBLE returned
> true but HeapTupleSatisfiesVacuum on the first tuple returned
> HEAPTUPLE_DELETE_IN_PROGRESS instead of the expected HEAPTUPLE_LIVE.
> It did that because HEAP_XMAX_COMMITTED was not set and
> TransactionIdIsInProgress returned true for xmax.

So this seems like it might be a visibility map bug rather than a bug
in the test code, but I'm not completely sure of that. How was it
legitimate to mark the page as all-visible if a tuple on the page
still had a live xmax? If xmax is live and not just a locker then the
tuple is not visible to the transaction that wrote xmax, at least.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] [BUG] pg_basebackup from disconnected standby fails
Hello, thank you for looking at this.

At Fri, 10 Jun 2016 17:39:59 +0900, Michael Paquier wrote in
> On Thu, Jun 9, 2016 at 9:55 PM, Kyotaro HORIGUCHI wrote:
> > Hello, I found that pg_basebackup from a replication standby
> > fails after the following steps, on 9.3 and the master.
> >
> > - start a replication master
> > - start a replication standby
> > - stop the master in the mode other than immediate.
> >
> > pg_basebackup to the standby will fail with the following error.
> >
> >> pg_basebackup: could not get transaction log end position from server:
> >> ERROR: could not find any WAL files
>
> Indeed, and you could just do the following to reproduce the failure
> with the recovery test suite, so I would suggest adding this test in
> the patch:
> --- a/src/test/recovery/t/001_stream_rep.pl
> +++ b/src/test/recovery/t/001_stream_rep.pl
> @@ -24,6 +24,11 @@ $node_standby_1->start;
>  # pg_basebackup works on a standby).
>  $node_standby_1->backup($backup_name);
>
> +# Take a second backup of the standby while the master is offline.
> +$node_master->stop;
> +$node_standby_1->backup('my_backup_2');
> +$node_master->start;

I'm not sure that adding a test case for a particular bug like this is
appropriate. But it would be acceptable because it doesn't take a long
time and it is separate from the standard checks.

> > After looking more closely, I found that the minRecoveryPoint
> > tends to be too small as the backup end point, and up to the
> > record at the lastReplayedRecPtr can affect the pages on disk and
> > they can go into the backup just taken.
> >
> > My conclusion here is that do_pg_stop_backup should return
> > lastReplayedRecPtr, not minRecoveryPoint.
>
> I have been thinking quite a bit about this patch, and this logic
> sounds quite right to me. When stopping the backup we need to let the
> user know up to which point it needs to replay WAL, and relation pages
> are touched up to lastReplayedEndRecPtr.

Yes, but by design, the changes in buffers don't go to disk until buffer
replacement occurs, which updates minRecoveryPoint. My understanding is
that the problem is that a restart point that is not accompanied by
buffer updates advances only the redo point of the last checkpoint and
doesn't update minRecoveryPoint, which may be behind the redo point at
the time.

It seems to me that we could more aggressively advance the
minRecoveryPoint (but must not let it go too far..), but it is right
for it to aim a bit lower than the ideal location. So I proposed the
patch as a solution instead of changing minRecoveryPoint's movement.
As a result, the explanation for it is accurate enough, but it obscures
the root cause.

> This LSN could be greater
> than the minimum recovery point as there is no record to mark the end
> of the backup, and pg_control has normally, well surely, being backup
> up last but that's not a new problem it would as well have been backup
> up before the minimum recovery point has been reached...

Yes. It would cause no new problem except that the amount of WAL files
to be copied could be larger than the ideal amount.

> Still, perhaps we could improve the documentation regarding that? Say
> it is recommended to enforce the minimum recovery point in pg_control
> to the stop backup LSN to ensure that the backup recovers to a
> consistent state when taking a backup from a standby.

If I understand that correctly, I don't find an explicit mention of the
minimum recovery point (and there should not be one, I suppose) in the
PITR and pg_basebackup pages of the documentation. The point we need to
reach for a consistent state from a backup is not the "Minimum recovery
ending point" in pg_control; that holds the consistency point of the
previous recovery. What we need for a backup is the WAL files up to the
LSN returned by (do_)pg_stop_backup(). All of those files should have
been archived on the master and should be automatically transferred by
pg_basebackup with the -X/-x option. (I don't know what to do when
-X/-x is not specified for pg_basebackup to a standby, though..)

> I am attaching an updated patch with the test btw.

regards,

--
Kyotaro Horiguchi
NTT Open Source Software Center
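To restate the failure mode compactly: a backup is only usable if the stop LSN returned by do_pg_stop_backup() is at or past the start LSN, and under this bug a restartpoint that flushed no buffers can leave the stop location (derived from minRecoveryPoint) behind the start location (the last checkpoint's redo point). Expressed with the pg_lsn type (9.4+; the LSN values below are invented purely for illustration):

```sql
-- Healthy case: stop LSN at or past the start LSN, backup is valid.
SELECT '0/3000060'::pg_lsn >= '0/3000028'::pg_lsn AS backup_valid;  -- true

-- Buggy case: stop LSN behind the start LSN; pg_basebackup then fails
-- with "could not find any WAL files" since no WAL range covers the backup.
SELECT '0/3000028'::pg_lsn >= '0/3000060'::pg_lsn AS backup_valid;  -- false
```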
Re: [HACKERS] ERROR: ORDER/GROUP BY expression not found in targetlist
On Mon, Jun 13, 2016 at 8:11 PM, Tom Lane wrote:
>
> Amit Kapila writes:
> > On Mon, Jun 13, 2016 at 7:51 PM, Tom Lane wrote:
> >> I think the real question here is why the code removed by 04ae11f62
> >> was wrong. It was unsafe to use apply_projection_to_path, certainly,
> >> but using create_projection_path directly would have avoided the
> >> stated problem. And it's very unclear that this new patch doesn't
> >> bring back that bug in a different place.
>
> > This new patch still doesn't seem to be right, but it won't bring back the
> > original problem because apply_projection_to_path will be only done if
> > grouped_rel is parallel_safe which means it doesn't have any
> > parallel-unsafe or parallel-restricted clause in quals or target list.
>
> The problem cited in 04ae11f62's commit message is that
> apply_projection_to_path would overwrite the paths' pathtargets in-place,
> causing problems if they'd been used for other purposes elsewhere.

The main reason for removal of that code is that there was no check there
to prevent assigning parallel-restricted clauses to the pathtarget of
partial paths. I think the same is indicated in the commit message as
well, if we focus on this part of it: "especially because this code has
no check that the PathTarget is in fact parallel-safe." For that reason,
I don't see how your suggestion to avoid the problem by using
create_projection_path can work.

> I do
> not share your confidence that using apply_projection_to_path within
> create_grouping_paths is free of such a hazard.

Fair enough; let me try to explain the problem and some ways to solve it
in more detail. The main thing that got missed by me in the patch
related to commit 04ae11f62 is that the partial path list of the rel
also needs to have a scanjoin_target. I was under the assumption that
create_grouping_paths would take care of assigning appropriate path
targets for the parallel paths generated by it. If we look,
create_grouping_paths() does take care of adding the appropriate path
targets for the paths generated by that function. However, it doesn't do
anything for partial paths.

The patch sent by me yesterday [1], which was trying to assign
partial_grouping_target to partial paths, won't be the right fix,
because (a) partial_grouping_target includes Aggregates (refer
make_partialgroup_input_target) which we don't want for partial paths;
(b) it is formed using the grouping_target passed into
create_grouping_paths(), whereas we need the path target formed from
final_target for partial paths (as partial paths are scan/join
relations), as is done for the main path list (in grouping_planner():
/* Forcibly apply that target to all the Paths for the scan/join rel. */).

Now, I think we have the following ways to solve it:

(a) Check whether the scanjoin_target contains any parallel-restricted
clause before applying it to the partial path list in grouping_planner.
However, that could lead to duplicate checks for parallel-restricted
clauses in some cases: once in apply_projection_to_path() for the main
pathlist and then again before applying to partial paths. I think we can
avoid that by having an additional parameter in
apply_projection_to_path() which can indicate whether the
parallel-safety check was done inside that function and what its result
was.

(b) Generate the appropriate scanjoin_target for the partial path list
in create_grouping_paths using final_target. However, I think this might
lead to some duplicate code in create_grouping_paths(), as we might have
to do something similar to what we have done in grouping_planner for
generating such a target. I think if we want we can pass scanjoin_target
to create_grouping_paths() as well. Again, we have to check
scanjoin_target for parallel-safety before applying it to partial paths,
but we need to do that only when grouped_rel is considered
parallel-safe.

Thoughts?

[1] - https://www.postgresql.org/message-id/CAA4eK1Jm6HrJAPX26xyLGWes%2B%2Br5%3DdOyOGRWeTa4q8NKd-UhVQ%40mail.gmail.com

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
Re: [HACKERS] Rename max_parallel_degree?
On 14/06/2016 04:09, Robert Haas wrote:
> On Mon, Jun 13, 2016 at 5:42 AM, Julien Rouhaud wrote:
>> Agreed, and fixed in attached v3.
>
> I don't entirely like the new logic in RegisterDynamicBackgroundWorker.

I'm not that happy with it either. We can avoid iterating over every
slot if the feature isn't activated, though (max_parallel_workers >=
max_worker_processes).

> I wonder if we can't drive this off
> of a couple of counters, instead of having the process registering the
> background worker iterate over every slot. Suppose we add two
> counters to BackgroundWorkerArray, parallel_register_count and
> parallel_terminate_count. Whenever a backend successfully registers a
> parallel worker, it increments parallel_register_count. Whenever the
> postmaster marks a parallel worker slot as no longer in use, it
> increments parallel_terminate_count. Then, the number of active
> parallel workers is just parallel_register_count -
> parallel_terminate_count. (We can't have the postmaster and the
> backends share the same counter, because then it would need locking,
> and the postmaster can't try to take spinlocks - can't even use
> atomics, because those might be emulated using spinlocks.)

I wanted to maintain counters at first, but it seemed more invasive, and
I thought that max_parallel_workers would be useful in environments with
lots of parallel workers and dynamic workers in use, where finding a
free slot would require iterating over most of the slots most of the
time anyway. I'm of course also ok with maintaining counters.

> If we want to allow the number of parallel workers started to be available
> for statistical purposes, we can keep two uint32 values for that
> (parallel_register_count_lo and parallel_register_count_hi, for
> example), and increment the second one whenever the first one rolls
> over to zero.

I didn't think about monitoring. I'm not sure this counter would be
really helpful without also having the number of times a parallel worker
couldn't be launched (and I'd really like to have that one).

--
Julien Rouhaud
http://dalibo.com - http://dalibo.org
[HACKERS] Use of CREATE OR REPLACE in bloom--1.0.sql
Hi,

I do not think that extension SQL scripts should be using CREATE OR
REPLACE FUNCTION like bloom--1.0.sql currently does. I suspect that this
might just be a typo. I have attached the tiny patch which fixes this.

Andreas

diff --git a/contrib/bloom/bloom--1.0.sql b/contrib/bloom/bloom--1.0.sql
index 87b5442..132a550 100644
--- a/contrib/bloom/bloom--1.0.sql
+++ b/contrib/bloom/bloom--1.0.sql
@@ -1,4 +1,4 @@
-CREATE OR REPLACE FUNCTION blhandler(internal)
+CREATE FUNCTION blhandler(internal)
 RETURNS index_am_handler
 AS 'MODULE_PATHNAME'
 LANGUAGE C;
Re: [HACKERS] Parallel safety tagging of extension functions
On 06/07/2016 05:54 PM, Andreas Karlsson wrote:
> On 06/07/2016 05:44 PM, Robert Haas wrote:
>> cube: I think we need a new extension version.
>> hstore: Does not apply for me.
>> intarray: Does not apply for me.
>
> Those three and ltree, pg_trgm, and seg depend on my patch with fixes
> for the GiST/GIN function signatures in
> https://www.postgresql.org/message-id/574f091a.3050...@proxel.se.
> The reason for the dependency is that I also mark those functions with
> changed signatures as parallel safe.

I have rebased all my patches on the current master now (and skipped the
extensions I previously listed).

Andreas

Attachments (application/gzip):
parallel-contrib-v5-adminpack.patch.gz, parallel-contrib-v5-cube.patch.gz,
parallel-contrib-v5-dblink.patch.gz, parallel-contrib-v5-hstore.patch.gz,
parallel-contrib-v5-intarray.patch.gz, parallel-contrib-v5-ltree.patch.gz,
parallel-contrib-v5-pg_trgm.patch.gz,
parallel-contrib-v5-pg_visibility.patch.gz, parallel-contrib-v5-seg.patch.gz,
parallel-contrib-v5-sslinfo.patch.gz, parallel-contrib-v5-tablefunc.patch.gz,
parallel-contrib-v5-tsearch2.patch.gz, parallel-contrib-v5-unaccent.patch.gz,
parallel-contrib-v5-uuid-ossp.patch.gz, parallel-contrib-v5-xml2.patch.gz
Re: [HACKERS] Using FDW AddForeignUpdateTargets for a hidden pseudo-column
Aleksey Demakov wrote: > I have a data store where tuples have unique identities that normally are not > visible. > I also have a FDW to work with this data store. As per the docs to implement > updates > for this data store I have AddForeignUpdateTargets() function that adds an > artificial > column to the target list. > > It seems to me that I cannot re-use a system attribute number for this > artificial resjunk > column (as, for instance, postgres_fdw uses SelfItemPointerAttributeNumber). > These > attributes have specific meaning not compatible with my tuple identity. > > On other hand using a regular AttrNumber might confuse the query planner. In > contrast > e..g with Oracle FDW that can use a unique key to identify the row, in my > data store > the tuple identity is normally not visible. So the data planner might break > if it sees a > Var node with an unexpected varattno number. > > What is the best approach to handle such a case? > > 1. Give up on this entirely and require a unique key for any table used thru > FDW. > > 2. Force the FDW to expose the identity column either explicitly by the user > who > creates a foreign table or automatically adding it in the corresponding > trigger > (preferably still making it hidden for normal scans). > > 3. Modify the postgresql core to nicely handle the case of an unknown target > column added by a FDW. > > 4. Something else? When implementing this for oracle_fdw, I went with your solution #1. The downside is that anything that does not have a unique key cannot be modified. I first thought of using the internal ROWID column that's probably similar to your case, but that wouldn't fit into a tid's 6 bytes, and I found that I could only add resjunk columns for existing columns of the table. Making the internal ROWID an explicit column in the foreign table seemed just too ugly. I don't know if #3 would be difficult, but it sure would make things easier for FDW developers. 
Yours, Laurenz Albe -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Using FDW AddForeignUpdateTargets for a hidden pseudo-column
Hi all,

I have a data store where tuples have unique identities that normally are not visible. I also have an FDW to work with this data store. As per the docs, to implement updates for this data store I have an AddForeignUpdateTargets() function that adds an artificial column to the target list.

It seems to me that I cannot re-use a system attribute number for this artificial resjunk column (as, for instance, postgres_fdw uses SelfItemPointerAttributeNumber). These attributes have specific meaning not compatible with my tuple identity.

On the other hand, using a regular AttrNumber might confuse the query planner. In contrast, e.g. with Oracle FDW, which can use a unique key to identify the row, in my data store the tuple identity is normally not visible. So the planner might break if it sees a Var node with an unexpected varattno number.

What is the best approach to handle such a case?

1. Give up on this entirely and require a unique key for any table used through an FDW.
2. Force the FDW to expose the identity column, either explicitly by the user who creates the foreign table or automatically by adding it in the corresponding trigger (preferably still keeping it hidden from normal scans).
3. Modify the PostgreSQL core to handle the case of an unknown target column added by an FDW gracefully.
4. Something else?

Regards, Aleksey
Re: [HACKERS] ERROR: ORDER/GROUP BY expression not found in targetlist
On 14 June 2016 at 04:07, Tom Lane wrote:
> Just as an experiment to see what would happen, I did
>
> - int parallel_threshold = 1000;
> + int parallel_threshold = 1;
>
> and ran the regression tests. I got a core dump in the window.sql test:
>
> Program terminated with signal 11, Segmentation fault.
> #0 0x00664dbc in make_partialgroup_input_target (root=0x1795018, input_rel=0x17957a8, target=0x17bf228, rollup_lists=0x0, rollup_groupclauses=0x0) at planner.c:4307
> 4307    Index sgref = final_target->sortgrouprefs[i];
> (gdb) bt
> #0 0x00664dbc in make_partialgroup_input_target (root=0x1795018, input_rel=0x17957a8, target=0x17bf228, rollup_lists=0x0, rollup_groupclauses=0x0) at planner.c:4307
> #1 create_grouping_paths (root=0x1795018, input_rel=0x17957a8, target=0x17bf228, rollup_lists=0x0, rollup_groupclauses=0x0) at planner.c:3420
> #2 0x00667405 in grouping_planner (root=0x1795018, inheritance_update=0 '\000', tuple_fraction=0) at planner.c:1794
> #3 0x00668c80 in subquery_planner (glob=, parse=0x1703580, parent_root=, hasRecursion=, tuple_fraction=0) at planner.c:769
> #4 0x00668ea5 in standard_planner (parse=0x1703580, cursorOptions=256, boundParams=0x0) at planner.c:308
> #5 0x006691b6 in planner (parse=, cursorOptions=, boundParams=) at planner.c:178
> #6 0x006fb069 in pg_plan_query (querytree=0x1703580, cursorOptions=256, boundParams=0x0) at postgres.c:798
> (gdb) p debug_query_string
> $1 = 0x1702078 "SELECT SUM(COUNT(f1)) OVER () FROM int4_tbl WHERE f1=42;"

I see you've committed a fix for this. Thank you. I thought it would be good to be consistent with the ressortgroupref handling, and I quite like your fix in that regard. Do you think it's worth also applying the attached, so as to have ressortgroupref set to NULL more consistently, instead of sometimes NULL and other times allocated to memory wastefully filled with zeros? The patch also saves allocating the array's memory when it's not needed.
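The memory-saving idea being proposed — leave the sortgrouprefs pointer NULL until a nonzero ressortgroupref actually appears, instead of always allocating a zero-filled array — can be sketched roughly in Python (dicts stand in for targetlist entries; this is an illustration, not the planner code):

```python
def build_sortgrouprefs(tlist):
    """Return a list of sortgroupref values for the targetlist,
    or None when every entry is zero -- mirroring the proposal to
    keep the pointer NULL rather than allocate an all-zero array."""
    refs = None
    for i, entry in enumerate(tlist):
        ref = entry.get("ressortgroupref", 0)
        if ref != 0:
            if refs is None:
                refs = [0] * len(tlist)  # allocate lazily, only on first use
            refs[i] = ref
    return refs

# No entry carries a sortgroupref: nothing is allocated.
assert build_sortgrouprefs([{}, {}]) is None
# One entry does: the whole array is built, zero-filled elsewhere.
assert build_sortgrouprefs([{}, {"ressortgroupref": 1}]) == [0, 1]
```

Callers then treat a None array the same as "no entry has a sortgroupref", which is what makes the NULL-vs-zeros distinction consistent.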
-- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services

make_pathtarget_from_tlist.patch
Description: Binary data
Re: [HACKERS] Prepared statements and generic plans
Bruce Momjian wrote:
> However, for the wire protocol prepare/execute, how do you do EXPLAIN? The only way I can see doing it is to put the EXPLAIN in the prepare query, but I wasn't sure that works. So, I just wrote and tested the attached C program and it properly output the explain information, e.g.
>
> res = PQprepare(conn, "prep1", "EXPLAIN SELECT * FROM pg_language", 0, NULL);
>
> generated:
>
> QUERY PLAN
> Seq Scan on pg_language (cost=0.00..1.04 rows=4 width=114)
>
> so that works --- good.

Hm, yes. Were you just curious, or is it relevant for the documentation update?

>>> Looking at how the code behaves, it seems custom plans that are _more_ expensive (plus planning cost) than the generic plan switch to the generic plan after five executions, as now documented. Custom plans that are significantly _cheaper_ than the generic plan _never_ use the generic plan.
>>
>> Yes, that's what the suggested documentation improvement says as well, right?
>
> Yes. What is odd is that it isn't the plan of the actual supplied parameters that is cheaper, just the generic plan that assumes each distinct value in the query is equally likely to be used. So, when we say the generic plan is cheaper, it is just comparing the custom plan with the supplied parameters vs. the generic plan --- it is not saying that running the supplied constants with the generic plan will execute faster, because in fact we might be using a sub-optimal generic plan.

Right, that's why it is important to document that it is estimates that are compared, not actual costs. This has caused confusion in the past; see https://www.postgresql.org/message-id/flat/561E749D.4090301%40socialserve.com#561e749d.4090...@socialserve.com

> Right. Updated patch attached.

I am happy with the patch as it is.
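The selection behavior described above — custom plans for the first five executions, then a switch to the generic plan only if its estimated cost (allowing for the saved planning cost) is no worse — can be caricatured in a few lines. This is a deliberate simplification of what plancache.c actually does, with illustrative names and thresholds:

```python
def choose_custom_plan(num_custom_plans, avg_custom_cost,
                       generic_cost, planning_cost):
    """Return True to build a custom plan for this execution,
    False to fall back to the cached generic plan."""
    # Always plan the first five executions with the supplied
    # parameters, accumulating cost estimates along the way.
    if num_custom_plans < 5:
        return True
    # Afterwards, keep re-planning only if the average *estimated*
    # custom cost, plus the price of planning every time, still
    # beats the generic estimate. Note these are estimates being
    # compared, never measured execution times.
    return avg_custom_cost + planning_cost < generic_cost

# First executions: always custom.
assert choose_custom_plan(0, 0.0, 100.0, 10.0) is True
# Custom plans much cheaper than generic: never switch.
assert choose_custom_plan(5, 50.0, 100.0, 10.0) is True
# Custom barely cheaper, but re-planning eats the gain: go generic.
assert choose_custom_plan(5, 95.0, 100.0, 10.0) is False
```

The third case is exactly the subtlety in the thread: the generic plan "wins" on estimates even though it may be sub-optimal for the actual constants supplied.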
Yours, Laurenz Albe
Re: [HACKERS] [sqlsmith] Failed assertion in postgres_fdw/deparse.c:1116
On 2016/06/14 6:51, Robert Haas wrote:
> On Fri, Jun 10, 2016 at 4:14 PM, Robert Haas wrote:
>> Although I have done a bit of review of this patch, it needs more thought than I have so far had time to give it. I will update again by Tuesday.
>
> I've reviewed this a bit further and have discovered an infelicity. The following is all with the patch applied.
>
> By itself, this join can be pushed down:
>
> contrib_regression=# EXPLAIN SELECT 13 FROM ft1 RIGHT JOIN ft2 ON ft1.c1 = ft2.c1;
>                        QUERY PLAN
> --------------------------------------------------
>  Foreign Scan  (cost=100.00..137.66 rows=822 width=4)
>    Relations: (public.ft2) LEFT JOIN (public.ft1)
> (2 rows)
>
> That's great. However, when that query is used as the outer rel of a left join, it can't:
>
> contrib_regression=# explain verbose select * from ft4 LEFT JOIN (SELECT 13 FROM ft1 RIGHT JOIN ft2 ON ft1.c1 = ft2.c1) q on true;
>                        QUERY PLAN
> --------------------------------------------------
>  Nested Loop Left Join  (cost=353.50..920.77 rows=41100 width=19)
>    Output: ft4.c1, ft4.c2, ft4.c3, (13)
>    ->  Foreign Scan on public.ft4  (cost=100.00..102.50 rows=50 width=15)
>          Output: ft4.c1, ft4.c2, ft4.c3
>          Remote SQL: SELECT c1, c2, c3 FROM "S 1"."T 3"
>    ->  Materialize  (cost=253.50..306.57 rows=822 width=4)
>          Output: (13)
>          ->  Hash Left Join  (cost=253.50..302.46 rows=822 width=4)
>                Output: 13
>                Hash Cond: (ft2.c1 = ft1.c1)
>                ->  Foreign Scan on public.ft2  (cost=100.00..137.66 rows=822 width=4)
>                      Output: ft2.c1
>                      Remote SQL: SELECT "C 1" FROM "S 1"."T 1"
>                ->  Hash  (cost=141.00..141.00 rows=1000 width=4)
>                      Output: ft1.c1
>                      ->  Foreign Scan on public.ft1  (cost=100.00..141.00 rows=1000 width=4)
>                            Output: ft1.c1
>                            Remote SQL: SELECT "C 1" FROM "S 1"."T 1"
> (18 rows)
>
> Of course, because of the PlaceHolderVar, there's no way to push down the ft1-ft2-ft4 join to the remote side. But we could still push down the ft1-ft2 join and then locally perform the join between the result and ft4.
> However, the proposed fix doesn't allow that, because ph_eval_at is (b 4 5) and relids for the ft1-ft2 join is also (b 4 5), and so the bms_is_subset(phinfo->ph_eval_at, relids) test returns true.

You're right. It indeed should be possible to push down the ft1-ft2 join. However, it could not be done without also modifying build_tlist_to_deparse() a little (as Ashutosh proposed [1] to do upthread). Attached is a new version of the patch with the following changes:

1) Modified the check in foreign_join_ok() so that it is not overly restrictive. Previously, it used ph_needed as the highest level at which the PHV is needed (by definition) and checked using it whether it is above the join under consideration, which ended up being overkill. ISTM we can just decide from joinrel->relids of the join under consideration whether we are above the lowest level where the PHV could be evaluated (ph_eval_at) and stop if so. So in the example you provided, it won't now reject {ft1, ft2}, but will reject {ft4, ft1, ft2}.

2) Modified build_tlist_to_deparse() to get rid of the PlaceHolderVars in the targetlist to deparse by using the PVC_RECURSE_PLACEHOLDERS flag of pull_var_clause(). That is because we do not yet support anything other than Vars in deparseExplicitTargetList() (+1 to your patch to change the Assert to elog).

Thanks, Amit

[1] https://www.postgresql.org/message-id/CAFjFpRfQRKNCvfPj8p%3DD9%2BDVZeuTfSN3hnGowKho%3DrKCSeD9dw%40mail.gmail.com

diff --git a/contrib/postgres_fdw/deparse.c b/contrib/postgres_fdw/deparse.c
index 7d2512c..1abff3f 100644
--- a/contrib/postgres_fdw/deparse.c
+++ b/contrib/postgres_fdw/deparse.c
@@ -731,7 +731,9 @@ build_tlist_to_deparse(RelOptInfo *foreignrel)
 	 * We require columns specified in foreignrel->reltarget->exprs and those
 	 * required for evaluating the local conditions.
 	 */
-	tlist = add_to_flat_tlist(tlist, foreignrel->reltarget->exprs);
+	tlist = add_to_flat_tlist(tlist,
+							  pull_var_clause((Node *) foreignrel->reltarget->exprs,
+											  PVC_RECURSE_PLACEHOLDERS));
 	tlist = add_to_flat_tlist(tlist,
 							  pull_var_clause((Node *) fpinfo->local_conds,
 											  PVC_RECURSE_PLACEHOLDERS));
diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out
index 1de0bc4..73900d9 100644
--- a/contrib/postgres_fdw/expected/postgres_fdw.out
+++ b/contrib/postgres_fdw/expected/postgres_fdw.out
@@ -2202,6 +2202,64 @@ SELECT t1.c1, t2.c1 FROM (ft5 t1 JOIN v_ft5 t2 ON (t1.c1 = t2.c1)) left join (ft
          Remote SQL: SELECT c1 FROM "S 1"."T 4"
 (27 rows)
 
+-- non-Var items
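The revised check in (1) can be sketched with Python frozensets standing in for Relids bitmapsets (names are illustrative, not the actual foreign_join_ok() code): pushdown stops as soon as the join's relids become a proper superset of some PlaceHolderVar's ph_eval_at.

```python
def join_is_pushable(joinrel_relids, ph_eval_at_sets):
    """Allow pushdown only while the join under consideration is
    not above the lowest level (ph_eval_at) at which some
    PlaceHolderVar could be evaluated, i.e. reject once joinrel's
    relids are a proper superset of any ph_eval_at."""
    for ph_eval_at in ph_eval_at_sets:
        if ph_eval_at < joinrel_relids:  # proper-subset test
            return False
    return True

ph_eval_at = frozenset({"ft1", "ft2"})
# The ft1-ft2 join itself is still pushable...
assert join_is_pushable(frozenset({"ft1", "ft2"}), [ph_eval_at])
# ...but the join with ft4, above the PHV's evaluation level, is not.
assert not join_is_pushable(frozenset({"ft4", "ft1", "ft2"}), [ph_eval_at])
```

This reproduces the intended outcome for the example in the thread: {ft1, ft2} is accepted and {ft4, ft1, ft2} is rejected, so the ft1-ft2 join goes to the remote side and the join against ft4 is performed locally.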
Re: [HACKERS] Possible gaps/garbage in the output of XLOG reader
On Thu, Apr 9, 2015 at 7:05 PM, Antonin Houska wrote:
> While playing with xlogreader, I was lucky enough to see one of the many record validations fail. After having some fun with gdb, I found out that in some cases the reader does not enforce enough data to be in state->readBuf before copying into state->readRecordBuf starts. This should not happen if the callback always reads XLOG_BLCKSZ bytes, but in fact only *reqLen* is the mandatory size of the chunk delivered.
>
> There are probably various ways to fix this problem. Attached is what I did in my environment. I hit the problem on 9.4.1, but the patch seems to apply to master too.

This looks similar to what is discussed here: https://www.postgresql.org/message-id/flat/caboikdpspbymig6j01dkq6om2+bnkxhtpkoyqhm2a4oywgk...@mail.gmail.com

And there is a different patch that takes a lower-level approach, which seems to me a cleaner way of calculating the record offset when it goes through several XLOG pages. Perhaps you could help review it? It is attached to the next commitfest.

-- Michael
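The hazard described above can be sketched abstractly (this is a toy model, not the real xlogreader API): the page-read callback is only obliged to supply reqLen bytes, so a reader that copies a fixed page-sized slice may read past the delivered data. The safe version copies no more than what was actually delivered:

```python
def copy_record_chunk(read_buf, delivered, need):
    """read_buf holds 'delivered' valid bytes supplied by the
    page-read callback, which only guarantees reqLen ('need')
    bytes -- possibly fewer than a full XLOG_BLCKSZ page. Copying
    a fixed page-sized slice would include undefined bytes, so
    validate the contract and copy only the requested amount."""
    if delivered < need:
        raise ValueError("callback delivered %d bytes but %d were required"
                         % (delivered, need))
    return read_buf[:need]

# Only 150 bytes were delivered; copying the 100 needed is fine,
# while a blind 8192-byte copy would have read garbage.
page = b"R" * 150
assert copy_record_chunk(page, 150, 100) == b"R" * 100
```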
Re: [HACKERS] Reviewing freeze map code
On Tue, Jun 14, 2016 at 4:02 AM, Robert Haas wrote:
> On Sat, Jun 11, 2016 at 5:00 PM, Robert Haas wrote:
>>> How about changing the return tuple of heap_prepare_freeze_tuple to a bitmap? Two flags: "Freeze [not] done" and "[No] more freezing needed"
>>
>> Yes, I think something like that sounds about right.
>
> Here's a patch. I took the approach of adding a separate bool out parameter instead. I am also attaching an update of the check-visibility patch which responds to assorted review comments and adjusts it for the problems found on Friday which could otherwise lead to false positives. I'm still getting occasional TIDs from the pg_check_visible() function during pgbench runs, though, so evidently not all is well with the world.

I'm still working out how half this stuff works, but I managed to get pg_check_visible() to spit out a row every few seconds with the following brute-force approach:

CREATE TABLE foo (n int);
INSERT INTO foo SELECT generate_series(1, 10);

Three client threads (see attached script):
1. Run VACUUM in a tight loop.
2. Run UPDATE foo SET n = n + 1 in a tight loop.
3. Run SELECT pg_check_visible('foo'::regclass) in a tight loop, and print out any rows it produces.

I noticed that the tuples it reported were always at offset 1 in a page, that the page always had a maxoff over a couple of hundred, and that we called record_corrupt_item because VM_ALL_VISIBLE returned true but HeapTupleSatisfiesVacuum on the first tuple returned HEAPTUPLE_DELETE_IN_PROGRESS instead of the expected HEAPTUPLE_LIVE. It did that because HEAP_XMAX_COMMITTED was not set and TransactionIdIsInProgress returned true for xmax. Not sure how much of this was already obvious! I will poke at it some more tomorrow.
-- Thomas Munro http://www.enterprisedb.com

# create table foo (n int);
# insert into foo select generate_series(1, 10);

import psycopg2
import threading
import time

keep_running = True

def vacuum_thread(dsn):
    conn = psycopg2.connect(dsn)
    conn.set_isolation_level(0)
    cursor = conn.cursor()
    while keep_running:
        cursor.execute("VACUUM")

def update_thread(dsn):
    conn = psycopg2.connect(dsn)
    cursor = conn.cursor()
    while keep_running:
        cursor.execute("UPDATE foo SET n = n + 1")
        conn.commit()

def check_thread(dsn):
    conn = psycopg2.connect(dsn)
    conn.set_isolation_level(0)
    cursor = conn.cursor()
    while keep_running:
        cursor.execute("SELECT pg_check_visible('foo'::regclass)")
        for row in cursor:
            print row
        conn.rollback()

if __name__ == "__main__":
    dsn = "dbname=postgres"
    t1 = threading.Thread(target = vacuum_thread, args = [dsn])
    t1.start()
    t2 = threading.Thread(target = update_thread, args = [dsn])
    t2.start()
    t3 = threading.Thread(target = check_thread, args = [dsn])
    t3.start()
    try:
        time.sleep(86400)
    except KeyboardInterrupt:
        keep_running = False
    t1.join()
    t2.join()
    t3.join()
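The inconsistency the script catches can be stated as a tiny invariant check (a toy version of the pg_check_visible logic described above, with illustrative names): if the visibility map claims a page is all-visible, every tuple on it must be HEAPTUPLE_LIVE.

```python
def check_page(vm_all_visible, tuple_states):
    """Return the 1-based offsets of tuples that contradict the
    visibility map: on a page the VM marks all-visible, any state
    other than HEAPTUPLE_LIVE (e.g. DELETE_IN_PROGRESS during a
    concurrent UPDATE) is a corrupt item to report."""
    if not vm_all_visible:
        return []  # nothing to cross-check when the VM bit is clear
    return [off for off, state in enumerate(tuple_states, start=1)
            if state != "HEAPTUPLE_LIVE"]

# The race observed in the thread: offset 1 is DELETE_IN_PROGRESS
# on a page the VM still calls all-visible, so offset 1 is reported.
assert check_page(True, ["HEAPTUPLE_DELETE_IN_PROGRESS",
                         "HEAPTUPLE_LIVE"]) == [1]
# With the VM bit clear, the same state is unremarkable.
assert check_page(False, ["HEAPTUPLE_DELETE_IN_PROGRESS"]) == []
```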
Re: [HACKERS] [sqlsmith] Failed assertion in postgres_fdw/deparse.c:1116
On Tue, Jun 14, 2016 at 4:10 AM, Robert Haas wrote:
> On Mon, Jun 13, 2016 at 5:51 PM, Robert Haas wrote:
>> On Fri, Jun 10, 2016 at 4:14 PM, Robert Haas wrote:
>>> Although I have done a bit of review of this patch, it needs more thought than I have so far had time to give it. I will update again by Tuesday.
>>
>> I've reviewed this a bit further and have discovered an infelicity.
>
> Also, independent of the fix for this particular issue, I think it would be smart to apply the attached patch to promote the assertion that failed here to an elog(). If we have more bugs of this sort, now or in the future, I'd like to catch them even in non-assert-enabled builds by getting a sensible error rather than just by failing an assertion. I think it's our general practice to check node types with elog() rather than Assert() when the nodes are coming from some far-distant part of the code, which is certainly the case here.
>
> I plan to commit this without delay unless there are vigorous, well-reasoned objections.

Fine with me. It serves the purpose for which I added the Assert, but in a better manner. Maybe the error message could read "non-Var nodes/targets/expressions not expected in target list"; I am not sure what we call individual (whole) members of a target list.

-- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company