Re: [HACKERS] Dockerfile for testing with Perl 5.8.8

2016-03-02 Thread Craig Ringer
On 3 March 2016 at 13:08, Craig Ringer  wrote:

> On 3 March 2016 at 11:06, Craig Ringer  wrote:
>
>> Hi all
>>
>> I've prepared a Dockerfile that produces a canned Perl 5.8.8 environment
>> based on CentOS 5 on any host with Docker. It has ccache installed and
>> enabled, git installed, IPC::Run installed, all the mess required to make
>> CPAN work sensibly done, etc.
>>
>>
> That one didn't include a new enough flex or bison, so it couldn't rebuild
> a fully clean tree that hadn't been built on the host first.
>
> Fixed in the attached along with some other cleanups.
>

You can now get this at https://github.com/2ndQuadrant/pgtap-perl


-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: [HACKERS] More stable query plans via more predictable column statistics

2016-03-02 Thread Shulgin, Oleksandr
On Wed, Mar 2, 2016 at 7:33 PM, Alvaro Herrera 
wrote:

> Shulgin, Oleksandr wrote:
>
> > Alright.  I'm attaching the latest version of this patch split in two
> > parts: the first one is NULLs-related bugfix and the second is the
> > "improvement" part, which applies on top of the first one.
>
> So is this null-related bugfix supposed to be backpatched?  (I assume
> it's not because it's very likely to change existing plans).
>

For the better, because cardinality estimates will be more accurate in
these cases, so yes, I would expect it to be back-patchable.

-- 
Alex


Re: [HACKERS] Confusing with commit time usage in logical decoding

2016-03-02 Thread Andres Freund
On 2016-03-01 18:31:42 +0100, Petr Jelinek wrote:
> On 01/03/16 18:18, Andres Freund wrote:
> >I'd rather just initialize commit_time to parsed->xact_time.
> >
> >This indeed is clearly a bug. I do wonder if anybody has a good idea
> >about how to add regression tests for this? It's rather annoying that
> >we have to suppress timestamps in the test_decoding tests, because
> >they're obviously not reproducible...
> >
> 
> The test for commit timestamps checks that the timestamps are within a
> reasonable time frame (for example, bigger than the value of a timestamp
> column in the table, since that's assigned before commit, obviously). It's
> not perfect, but a similar approach should catch issues like this one.

Fixed, including such a test. Thanks for the report; and for the idea of
the fix!
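
For readers following along, the shape of the fix is roughly this (a sketch
based on DecodeCommit() in decode.c; field names are assumed from
xl_xact_parsed_commit, and this is not the literal committed diff):

/* Default to the commit record's own timestamp ... */
TimestampTz	commit_time = parsed->xact_time;

if (parsed->xinfo & XACT_XINFO_HAS_ORIGIN)
{
	/* ... and only override it when a replication origin supplies one. */
	origin_lsn = parsed->origin_lsn;
	commit_time = parsed->origin_timestamp;
}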

Andres




Re: [HACKERS] Re: redo failed in physical streaming replication while stopping the master server

2016-03-02 Thread Michael Paquier
On Wed, Mar 2, 2016 at 4:25 PM, wcting  wrote:
> /*
>  * If at page start, we must skip over the page header.  But
> we can't
>  * do that until we've read in the page, since the header
> size is
>  * variable.
>  */
>
> I don't know the meaning behind this comment.
>
>  if ((RecPtr->xrecoff % XLogSegSize) == 0)
> it's a long page header, else a short page header,
>
> so "the header size" can be calculated, right?

This means that the page must be read first, before recalculating the
record pointer to be the first one after the page header. This is done
a little bit later in ReadRecord():

pageHeaderSize = XLogPageHeaderSize((XLogPageHeader) readBuf);
targetRecOff = RecPtr->xrecoff % XLOG_BLCKSZ;
if (targetRecOff == 0)
{
	/*
	 * At page start, so skip over page header.  The Assert checks that
	 * we're not scribbling on caller's record pointer; it's OK because we
	 * can only get here in the continuing-from-prev-record case, since
	 * XRecOffIsValid rejected the zero-page-offset case otherwise.
	 */
	Assert(RecPtr == &tmpRecPtr);
	RecPtr->xrecoff += pageHeaderSize;
	targetRecOff = pageHeaderSize;
}

And XLogPageHeaderSize() makes the difference between a long and a short header.
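
For reference, the macro in xlog_internal.h is essentially the following
(paraphrased from memory, so check the header for the exact form); the
long/short decision is carried in the page's own header flags, which is why
the page has to be read before the size is known:

#define XLogPageHeaderSize(hdr)	\
	(((hdr)->xlp_info & XLP_LONG_HEADER) ? SizeOfXLogLongPHD : SizeOfXLogShortPHD)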
-- 
Michael




Re: [HACKERS] TAP / recovery-test fs-level backups, psql enhancements etc

2016-03-02 Thread Craig Ringer
On 3 March 2016 at 13:23, Michael Paquier  wrote:

> On Thu, Mar 3, 2016 at 2:20 PM, Craig Ringer 
> wrote:
> > On the Perl 5.8.8 test env I've set up now, per
> >
> >
> http://www.postgresql.org/message-id/camsr+ygr6pu-guyp-ft98xwxasc9n6j-awzaqxvw_+p3rtc...@mail.gmail.com
> >
> > master currently fails with
> >
> > t/004_timeline_switch."remove_tree" is not exported by the File::Path
> > module
> >
> > remove_tree is only in File::Path 2.07 but Perl 5.8.8 has 1.08. No
> buildfarm
> > member has complained, so clearly we don't actually test with the stated
> > supported Perl version.
> >
> > The attached patch fixes it to use the legacy File::Path interface
> 'rmtree'
> > so master runs on 588 again.
>
> Crap. Thanks for spotting that, I was careless. You could still use
> qw() and not use File::Path::rmtree, but that's a matter of taste.
>

New patch series.

The first three are simple fixes that should go in without fuss:

001 fixes the above 5.8.8 compat issue.

002 fixes another minor whoopsie, a syntax error in
src/test/recovery/t/003_recovery_targets.pl that never got noticed because
exit codes are ignored.

003 runs perltidy on PostgresNode.pm to bring it back into conformance
after the recovery tests commit.

The rest are feature patches:

004 allows filtering on RecursiveCopy by a predicate function. Needed for
filesystem level backups (007). It could probably be squashed with 007 if
desired.

005 adds the new psql functions psql_expert and psql_check. Then 006
renames psql_expert to psql and fixes up all call sites across the tree to
use the new interface. I found the bug in 002 as part of that process. I
anticipate that 005 and 006 would be squashed into one commit to master,
but I've kept them separate in my tree for easier review.

007 adds PostgresNode support for hot and cold filesystem-level backups
using pg_start_backup and pg_stop_backup, which will be required for some
coming tests and are useful by themselves.


Each patch is pretty simple except for the psql patch, and even that's not
exactly hairy stuff. The potential risks are low as this is code that's not
part of the server and isn't even run on 'make check' by default. I've
tested it all under a real Perl 5.8.8 on CentOS 5.

I don't expect to be doing much more on the framework at this point, as I
want to be able to get back to the code that I had to enhance the framework in
order to test.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
From 7f5cf0ac7368790e4597642b59560fa1c16597d8 Mon Sep 17 00:00:00 2001
From: Craig Ringer 
Date: Thu, 3 Mar 2016 13:18:23 +0800
Subject: [PATCH 1/7] TAP: File::Path::remove_tree is new in 2.x, use 1.x
 rmtree instead

Preserve Perl 5.8.8 support by using the legacy rmtree call
instead of remove_tree from File::Path 2.x.
---
 src/test/recovery/t/004_timeline_switch.pl | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/test/recovery/t/004_timeline_switch.pl b/src/test/recovery/t/004_timeline_switch.pl
index 58bf580..a180b94 100644
--- a/src/test/recovery/t/004_timeline_switch.pl
+++ b/src/test/recovery/t/004_timeline_switch.pl
@@ -3,7 +3,7 @@
 # on a new timeline.
 use strict;
 use warnings;
-use File::Path qw(remove_tree);
+use File::Path;
 use PostgresNode;
 use TestLib;
 use Test::More tests => 1;
@@ -46,7 +46,7 @@ $node_master->teardown_node;
 $node_standby_1->promote;
 
 # Switch standby 2 to replay from standby 1
-remove_tree($node_standby_2->data_dir . '/recovery.conf');
+File::Path::rmtree($node_standby_2->data_dir . '/recovery.conf');
 my $connstr_1 = $node_standby_1->connstr;
 $node_standby_2->append_conf(
 	'recovery.conf', qq(
-- 
2.1.0

From c346e15f613931d0b8d65025076a998234dbe26e Mon Sep 17 00:00:00 2001
From: Craig Ringer 
Date: Thu, 3 Mar 2016 14:07:32 +0800
Subject: [PATCH 2/7] TAP: Fix syntax error in recovery tests

pg_create_restore_point was never running successfully in the
TAP tests for recovery because of a typo.
---
 src/test/recovery/t/003_recovery_targets.pl | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/test/recovery/t/003_recovery_targets.pl b/src/test/recovery/t/003_recovery_targets.pl
index 8b581cc..a0d8b45 100644
--- a/src/test/recovery/t/003_recovery_targets.pl
+++ b/src/test/recovery/t/003_recovery_targets.pl
@@ -83,8 +83,8 @@ $node_master->psql('postgres',
 my $recovery_name = "my_target";
 my $lsn4 =
   $node_master->psql('postgres', "SELECT pg_current_xlog_location();");
-$node_master->psql('postgres',
-	"SELECT pg_create_restore_point('$recovery_name'");
+$node_master->psql_check('postgres',
+	"SELECT pg_create_restore_point('$recovery_name');");
 
 # Force archiving of WAL file
 $node_master->psql('postgres', "SELECT pg_switch_xlog()");
-- 
2.1.0

From ca0c28be54329473ef0bed07358b519835c5b447 Mon Sep 17 00:00:00 2001
From: Craig Ringer 
Date: Tue, 1 Mar 2016 21:44:11 +0800
Subject: [PATCH 3/7] TAP: Perltidy PostgresNode.pm

Recen

Re: [HACKERS] Proposal: "Causal reads" mode for load balancing reads without stale data

2016-03-02 Thread Michael Paquier
On Thu, Mar 3, 2016 at 3:34 PM, Michael Paquier
 wrote:
> That's one of my concerns about this patch now: it is trying to do too
> much. I think that there is definitely a need for both things:
> applications may be fine paying the lagging price when remote_apply is
> used by not having an extra error handling layer, or they cannot
> accept a perhaps large amount of lag and are ready to have an extra error
> handling layer to manage read failures on standbys. So I would
> recommend a split to begin with:
> 1) Addition of remote_apply in synchronous_commit, which would be
> quite a simple patch, and I am convinced that there are benefits in
> having that. Compared to the previous patch sent, a standby message is
> sent back to the master once COMMIT has been applied, accelerating
> things a bit.

Hm. Looking now at
http://www.postgresql.org/message-id/canp8+j+jcpnoojc-kqltt4pdyox2sq6wywqcsy6aahwkvna...@mail.gmail.com
it would be nice to get a clear solution for it first, though the use
of signals to wake up the WAL receiver and force it to send a new
LSN apply position back to the master to unlock it asap does not look
very appealing. Seeing that no patch has been sent for 9.6 regarding
that, it would perhaps be better to simply drop this code from the
causal-reads patch...
-- 
Michael




Re: [HACKERS] Proposal: "Causal reads" mode for load balancing reads without stale data

2016-03-02 Thread Michael Paquier
On Tue, Mar 1, 2016 at 11:53 AM, Thomas Munro
 wrote:
> On Tue, Mar 1, 2016 at 2:46 PM, Amit Langote
>  wrote:
>>
>> Hi,
>>
>> On 2016/02/29 18:05, Thomas Munro wrote:
>>> On Mon, Feb 29, 2016 at 9:05 PM, Amit Langote wrote:
 + servers.  A transaction that is run with
 causal_reads set
 + to on is guaranteed either to see the effects of all
 + completed transactions run on the primary with the setting on, 
 or to
 + receive an error "standby is not available for causal reads".

 "A transaction that is run" means "A transaction that is run on a
 standby", right?
>>>
>>> Well, it could be any server, standby or primary.  Of course standbys
>>> are the interesting case since it was already true that if you run
>>> two sequential transactions on the primary, the second can see the
>>> effect of the first, but I like the idea of a general rule that
>>> applies anywhere, allowing you not to care which server it is.
>>
>> I meant actually in context of that sentence only.
>
> Ok, here's a new version that includes that change, fixes a conflict
> with recent commit 10b48522 and removes an accidental duplicate copy
> of the README file.

Finally I got my eyes on this patch.

To put it short, this patch introduces two different concepts:
- addition of a new value, remote_apply, for synchronous_commit, which
is actually where things overlap a bit with the N-sync patch, because
by combining the addition of remote_apply with the N-sync patch, it is
possible to ensure that an LSN is applied to multiple targets instead
of one now. In this case, as the master will wait for the LSN to be
applied on the sync standby, there is no need for the application to
have error handling in case a read transaction is happening on the
standby as the change is already visible on the standby when
committing it on master. However the cost here is that if the standby
node lags behind, it puts some extra wait as well on the master side.
- causal reads, which makes the master able to drop standbys that are
lagging too much behind and lets callers of standbys know when a standby
is lagging too much behind and cannot satisfy causal reads. In this case
error handling is needed by the application in case a standby is
lagging too much behind.

That's one of my concerns about this patch now: it is trying to do too
much. I think that there is definitely a need for both things:
applications may be fine paying the lagging price when remote_apply is
used by not having an extra error handling layer, or they cannot
accept a perhaps large amount of lag and are ready to have an extra error
handling layer to manage read failures on standbys. So I would
recommend a split to begin with:
1) Addition of remote_apply in synchronous_commit, which would be
quite a simple patch, and I am convinced that there are benefits in
having that. Compared to the previous patch sent, a standby message is
sent back to the master once COMMIT has been applied, accelerating
things a bit.
2) Patch for causal reads, with all its extra parametrization logic
and stuff to select standby candidates.

Here is as well a more detailed review...

Regression tests are failing, rules.sql is generating diffs because
pg_stat_replication is changed.

CausalReadsWaitForLSN() should be called for 2PC I think, for PREPARE,
COMMIT PREPARED and ROLLBACK PREPARED. WAL replay for 2PC should also
call XLogRequestWalReceiverReply() when needed.

The new parameters are missing from postgresql.conf.sample.

+static bool
+SyncRepCheckEarlyExit(void)
+{
Doing this refactoring would actually make sense as a separate patch I
think, and that would simplify the core patch for causal reads.

+For this reason we say that the causal reads guarantee only holds as
+long as the absolute difference between the system clocks of the
+machines is no more than max_clock_skew.  The theory is that NTP makes
+it possible to reason about the maximum possible clock difference
+between machines and choose a value that allows for a much larger
+difference.  However, we do make a best effort attempt to detect
+misconfigured systems as described above, to catch the case of servers
+not running ntp a correctly configured ntp daemon, or with a clock so
+far out of whack that ntp refuses to fix it.
Just wondering how this reacts when standby and master are in
different timezones. I know of two ways to measure WAL lag: one is
what you are doing, by using a timestamp and relying on the two servers
to be in sync at clock level. The second is in bytes with a WAL
quantity, say max_wal_lag or similar, expressed as the number of WAL
segments the standby is lagging behind; though it is less user-friendly
to set up, the concerns regarding clock sync across nodes go away. To
put it short, I find the timestamp approach easy to set up and
understand for the user, but not really solid as it depends much on the
state dependency between different hosts, while a difference
between flush and app

Re: [HACKERS] postgres_fdw vs. force_parallel_mode on ppc

2016-03-02 Thread Tom Lane
Noah Misch  writes:
> I've modified buildfarm member mandrill to use force_parallel_mode=regress and
> max_parallel_degree=5; a full run passes.  We'll now see if it intermittently
> fails the stats test, like Tom witnessed:
> http://www.postgresql.org/message-id/30385.1456077...@sss.pgh.pa.us

http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=mandrill&dt=2016-03-02%2023%3A34%3A10

regards, tom lane




Re: [HACKERS] Freeze avoidance of very large table.

2016-03-02 Thread Kyotaro HORIGUCHI
At Wed, 2 Mar 2016 17:57:27 -0600, Jim Nasby  wrote 
in <56d77de7.7080...@bluetreble.com>
> On 3/2/16 5:41 PM, Tom Lane wrote:
> > Jim Nasby  writes:
> >> On 3/2/16 4:21 PM, Peter Geoghegan wrote:
> >>> I think you should commit this. The chances of anyone other than you
> >>> and Masahiko recalling that you developed this tool in 3 years is
> >>> essentially nil. I think that the cost of committing a developer-level
> >>> debugging tool like this is very low. Modules like pg_freespacemap
> >>> currently already have no chance of being of use to ordinary users.
> >>> All you need to do is restrict the functions to throw an error when
> >>> called by non-superusers, out of caution.
> >>>
> >>> It's a problem that modules like pg_stat_statements and
> >>> pg_freespacemap are currently lumped together in the documentation,
> >>> but we all know that.
> >
> >> +1.
> >
> > Would it make any sense to stick it under src/test/modules/ instead of
> > contrib/ ?  That would help make it clear that it's a debugging tool
> > and not something we expect end users to use.
> 
> I haven't looked at it in detail; is there something inherently
> dangerous about it?

I don't see any danger but the interface doesn't seem to fit use
by DBAs.

> When I'm forced to wear a DBA hat, I'd really love to be able to find
> out what VM status for a large table is. If it's in contrib they'll
> know the tool is there; if it's under src then there's about 0 chance
> of that. I'd think SU-only and any appropriate warnings would be
> enough heads-up for DBAs to be careful with it.

It doesn't look like it exposes anything about table contents. At least,
anybody who can see the name of a table can safely be allowed to
use this on it.

A possible usage (for me) of this would be directly counting
(un)vacuumed or (un)frozen pages in a relation. It would be
convenient if the 'frozen' and 'vacuumed' bits were in separate
integers. It would also be usable if stats values for these bits were
shown in statistics views.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center






Re: [HACKERS] SP-GiST support for inet datatypes

2016-03-02 Thread Oleg Bartunov
On Thu, Mar 3, 2016 at 8:51 AM, Oleg Bartunov  wrote:

>
>
> On Wed, Mar 2, 2016 at 11:56 PM, Emre Hasegeli  wrote:
>
>> Attached patches add SP-GiST support to the inet datatypes.  The
>> operator class comes with a small change on the SP-GiST framework to allow
>> fixed number of child nodes.
>>
>> The index is like prefix tree except that it doesn't bother to split the
>> addresses into parts as text is split.  It also doesn't use labels to know
>> the part after the prefix, but relies on static node numbers.
>>
>>
> Thanks, Emre, for the interesting spgist work. We are a bit busy and will
> take a look at your patches when we come to our spgist patch.
>
>

Emre, I checked the original thread and didn't find sample data. Could you
provide it for testing?


> The GiST index released with version 9.4 performs really bad with real
>> world data.  SP-GiST works much better with the query posted to the
>> performance list [1] a while ago:
>>
>> > hasegeli=# SELECT DISTINCT route INTO hmm FROM routes_view WHERE asn =
>> 2914;
>> > SELECT 732
>> >
>> > hasegeli=# EXPLAIN ANALYZE SELECT routes.route FROM routes JOIN hmm ON
>> routes.route && hmm.route;
>> >QUERY
>> PLAN
>> >
>> 
>> >  Nested Loop  (cost=0.41..571742.27 rows=2248 width=7) (actual
>> time=12.643..20474.813 rows=8127 loops=1)
>> >->  Seq Scan on hmm  (cost=0.00..11.32 rows=732 width=7) (actual
>> time=0.017..0.524 rows=732 loops=1)
>> >->  Index Only Scan using route_gist on routes  (cost=0.41..552.05
>> rows=22900 width=7) (actual time=4.851..27.948 rows=11 loops=732)
>> >  Index Cond: (route && (hmm.route)::inet)
>> >  Heap Fetches: 8127
>> >  Planning time: 1.507 ms
>> >  Execution time: 20475.605 ms
>> > (7 rows)
>> >
>> > hasegeli=# DROP INDEX route_gist;
>> > DROP INDEX
>> >
>> > hasegeli=# CREATE INDEX route_spgist ON routes USING spgist (route);
>> > CREATE INDEX
>> >
>> > hasegeli=# EXPLAIN ANALYZE SELECT routes.route FROM routes JOIN hmm ON
>> routes.route && hmm.route;
>> >   QUERY PLAN
>> >
>> -
>> >  Nested Loop  (cost=0.41..588634.27 rows=2248 width=7) (actual
>> time=0.081..16.961 rows=8127 loops=1)
>> >->  Seq Scan on hmm  (cost=0.00..11.32 rows=732 width=7) (actual
>> time=0.022..0.079 rows=732 loops=1)
>> >->  Index Only Scan using route_spgist on routes  (cost=0.41..575.13
>> rows=22900 width=7) (actual time=0.014..0.021 rows=11 loops=732)
>> >  Index Cond: (route && (hmm.route)::inet)
>> >  Heap Fetches: 8127
>> >  Planning time: 1.376 ms
>> >  Execution time: 15.936 ms
>>
>> [1]
>> http://www.postgresql.org/message-id/flat/alpine.DEB.2.11.1508251504160.31004@pyrite#alpine.DEB.2.11.1508251504160.31004@pyrite
>>
>>
>>
>> --
>> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
>> To make changes to your subscription:
>> http://www.postgresql.org/mailpref/pgsql-hackers
>>
>>
>


Re: [HACKERS] SP-GiST support for inet datatypes

2016-03-02 Thread Oleg Bartunov
On Wed, Mar 2, 2016 at 11:56 PM, Emre Hasegeli  wrote:

> Attached patches add SP-GiST support to the inet datatypes.  The operator
> class comes with a small change on the SP-GiST framework to allow fixed
> number of child nodes.
>
> The index is like prefix tree except that it doesn't bother to split the
> addresses into parts as text is split.  It also doesn't use labels to know
> the part after the prefix, but relies on static node numbers.
>
>
Thanks, Emre, for the interesting spgist work. We are a bit busy and will
take a look at your patches when we come to our spgist patch.


> The GiST index released with version 9.4 performs really bad with real
> world data.  SP-GiST works much better with the query posted to the
> performance list [1] a while ago:
>
> > hasegeli=# SELECT DISTINCT route INTO hmm FROM routes_view WHERE asn =
> 2914;
> > SELECT 732
> >
> > hasegeli=# EXPLAIN ANALYZE SELECT routes.route FROM routes JOIN hmm ON
> routes.route && hmm.route;
> >QUERY PLAN
> >
> 
> >  Nested Loop  (cost=0.41..571742.27 rows=2248 width=7) (actual
> time=12.643..20474.813 rows=8127 loops=1)
> >->  Seq Scan on hmm  (cost=0.00..11.32 rows=732 width=7) (actual
> time=0.017..0.524 rows=732 loops=1)
> >->  Index Only Scan using route_gist on routes  (cost=0.41..552.05
> rows=22900 width=7) (actual time=4.851..27.948 rows=11 loops=732)
> >  Index Cond: (route && (hmm.route)::inet)
> >  Heap Fetches: 8127
> >  Planning time: 1.507 ms
> >  Execution time: 20475.605 ms
> > (7 rows)
> >
> > hasegeli=# DROP INDEX route_gist;
> > DROP INDEX
> >
> > hasegeli=# CREATE INDEX route_spgist ON routes USING spgist (route);
> > CREATE INDEX
> >
> > hasegeli=# EXPLAIN ANALYZE SELECT routes.route FROM routes JOIN hmm ON
> routes.route && hmm.route;
> >   QUERY PLAN
> >
> -
> >  Nested Loop  (cost=0.41..588634.27 rows=2248 width=7) (actual
> time=0.081..16.961 rows=8127 loops=1)
> >->  Seq Scan on hmm  (cost=0.00..11.32 rows=732 width=7) (actual
> time=0.022..0.079 rows=732 loops=1)
> >->  Index Only Scan using route_spgist on routes  (cost=0.41..575.13
> rows=22900 width=7) (actual time=0.014..0.021 rows=11 loops=732)
> >  Index Cond: (route && (hmm.route)::inet)
> >  Heap Fetches: 8127
> >  Planning time: 1.376 ms
> >  Execution time: 15.936 ms
>
> [1]
> http://www.postgresql.org/message-id/flat/alpine.DEB.2.11.1508251504160.31004@pyrite#alpine.DEB.2.11.1508251504160.31004@pyrite
>
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>
>


Re: [HACKERS] TAP / recovery-test fs-level backups, psql enhancements etc

2016-03-02 Thread Michael Paquier
On Thu, Mar 3, 2016 at 2:20 PM, Craig Ringer  wrote:
> On the Perl 5.8.8 test env I've set up now, per
>
> http://www.postgresql.org/message-id/camsr+ygr6pu-guyp-ft98xwxasc9n6j-awzaqxvw_+p3rtc...@mail.gmail.com
>
> master currently fails with
>
> t/004_timeline_switch."remove_tree" is not exported by the File::Path
> module
>
> remove_tree is only in File::Path 2.07 but Perl 5.8.8 has 1.08. No buildfarm
> member has complained, so clearly we don't actually test with the stated
> supported Perl version.
>
> The attached patch fixes it to use the legacy File::Path interface 'rmtree'
> so master runs on 588 again.

Crap. Thanks for spotting that, I was careless. You could still use
qw() and not use File::Path::rmtree, but that's a matter of taste.
-- 
Michael




Re: [HACKERS] pg_hba_lookup function to get all matching pg_hba.conf entries

2016-03-02 Thread Haribabu Kommi
On Fri, Feb 5, 2016 at 2:29 PM, Haribabu Kommi  wrote:
>
>This patch needs to be applied on top of the discard_hba_and_ident_cxt patch
>that was posted earlier.

Here I have attached a patch rebased onto the latest head, with the
discard_hba_ident_cxt patch included, for easier review as a single patch.

Regards,
Hari Babu
Fujitsu Australia


pg_hba_lookup_poc_v13.patch
Description: Binary data



Re: [HACKERS] TAP / recovery-test fs-level backups, psql enhancements etc

2016-03-02 Thread Craig Ringer
On 3 March 2016 at 05:26, Alvaro Herrera  wrote:

>
> Pushed it with that fix.  I also added a further "data_" prefix, so it's
> "data_${name}_" now.  Hopefully this is less problematic than
> yesterday's ...
>
>
On the Perl 5.8.8 test env I've set up now, per

http://www.postgresql.org/message-id/camsr+ygr6pu-guyp-ft98xwxasc9n6j-awzaqxvw_+p3rtc...@mail.gmail.com

master currently fails with

t/004_timeline_switch."remove_tree" is not exported by the File::Path
module

remove_tree is only in File::Path 2.07 but Perl 5.8.8 has 1.08. No
buildfarm member has complained, so clearly we don't actually test with the
stated supported Perl version.

The attached patch fixes it to use the legacy File::Path interface 'rmtree'
so master runs on 588 again.


-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
From a703cd78cf834df794253bb1cfd897daa96d244a Mon Sep 17 00:00:00 2001
From: Craig Ringer 
Date: Thu, 3 Mar 2016 13:18:23 +0800
Subject: [PATCH] TAP: File::Path::remove_tree is new in 2.x, use 1.x rmtree
 instead

Preserve Perl 5.8.8 support by using the legacy rmtree call
instead of remove_tree from File::Path 2.x.
---
 src/test/recovery/t/004_timeline_switch.pl | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/test/recovery/t/004_timeline_switch.pl b/src/test/recovery/t/004_timeline_switch.pl
index 58bf580..a180b94 100644
--- a/src/test/recovery/t/004_timeline_switch.pl
+++ b/src/test/recovery/t/004_timeline_switch.pl
@@ -3,7 +3,7 @@
 # on a new timeline.
 use strict;
 use warnings;
-use File::Path qw(remove_tree);
+use File::Path;
 use PostgresNode;
 use TestLib;
 use Test::More tests => 1;
@@ -46,7 +46,7 @@ $node_master->teardown_node;
 $node_standby_1->promote;
 
 # Switch standby 2 to replay from standby 1
-remove_tree($node_standby_2->data_dir . '/recovery.conf');
+File::Path::rmtree($node_standby_2->data_dir . '/recovery.conf');
 my $connstr_1 = $node_standby_1->connstr;
 $node_standby_2->append_conf(
 	'recovery.conf', qq(
-- 
2.1.0




Re: [HACKERS] Issue with NULLS LAST, with postgres_fdw sort pushdown

2016-03-02 Thread Tom Lane
Ashutosh Bapat  writes:
> On Thu, Mar 3, 2016 at 7:27 AM, Michael Paquier 
> wrote:
>> Per explain.c, this looks inconsistent to me. Shouldn't NULLS LAST be
>> applied only if DESC is used in this ORDER BY clause?

> ... In this case we are constructing a query to be
> sent to the foreign server and it's better not to leave the defaults to be
> interpreted by the foreign server, in case it interprets them in a different
> fashion. get_rule_orderby() also explicitly adds these options.

Yeah, I agree that we don't need to go out of our way to make the query
succinct here.  Explicitness is easier and safer too, so why not?

regards, tom lane




Re: [HACKERS] Dockerfile for testing with Perl 5.8.8

2016-03-02 Thread Craig Ringer
On 3 March 2016 at 11:06, Craig Ringer  wrote:

> Hi all
>
> I've prepared a Dockerfile that produces a canned Perl 5.8.8 environment
> based on CentOS 5 on any host with Docker. It has ccache installed and
> enabled, git installed, IPC::Run installed, all the mess required to make
> CPAN work sensibly done, etc.
>
>
That one didn't include a new enough flex or bison, so it couldn't rebuild
a fully clean tree that hadn't been built on the host first.

Fixed in the attached along with some other cleanups.

I'm not proposing to commit this of course, though I'll probably stash it
on github and link to it in the wiki.  I just think it'll be a useful tool
for those of us doing TAP framework stuff - and anyone writing tests, since
those too will have to work with $ancientperl. (Yes, I stand by "ancient" -
CentOS 5, Debian Etch? Ancient.)

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Dockerfile
Description: Binary data



Re: [HACKERS] On columnar storage (2)

2016-03-02 Thread Haribabu Kommi
On Mon, Feb 1, 2016 at 12:11 AM, Alvaro Herrera
 wrote:
> So we discussed some of this stuff during the developer meeting in
> Brussels and the main conclusion is that we're going to split this up in
> multiple independently useful pieces, and write up the general roadmap
> in the wiki so that we can discuss in detail on-list.
>
> I'm marking this as Returned with Feedback now.
>
> Thanks everybody,

Here I have attached the DBT-3 performance report measured on the prototype
patch written for columnar storage, using the WOS and ROS design I mentioned
in my earlier mail.

To measure the benefits of this design, we made the following changes:
1. Created the columnar storage index in a similar way to other index methods
2. Used a custom plan to generate plans that can use the columnar storage
3. Optimized parallelism to use the columnar storage

The code is not fully ready yet; I posted the performance results to get the
community's view on whether this approach is really beneficial.

I will provide the full details of the design and WIP patches later.

Regards,
Hari Babu
Fujitsu Australia


DBT3_performance_vci_community.xls
Description: MS-Excel spreadsheet



Re: [HACKERS] Re: PATCH: Split stats file per database WAS: autovacuum stress-testing our system

2016-03-02 Thread Tomas Vondra

Hi,

On 02/03/2016 06:46 AM, Noah Misch wrote:

On Mon, Feb 01, 2016 at 07:03:45PM +0100, Tomas Vondra wrote:

On 12/22/2015 03:49 PM, Noah Misch wrote:

On Mon, Feb 18, 2013 at 06:19:12PM -0300, Alvaro Herrera wrote:

I have pushed it now.  Further testing, of course, is always welcome.


While investigating stats.sql buildfarm failures, mostly on animals
axolotl and shearwater, I found that this patch (commit 187492b)
inadvertently removed the collector's ability to coalesce inquiries.
Every PGSTAT_MTYPE_INQUIRY received now causes one stats file write.
Before, pgstat_recv_inquiry() did:

if (msg->inquiry_time > last_statrequest)
last_statrequest = msg->inquiry_time;

and pgstat_write_statsfile() did:

globalStats.stats_timestamp = GetCurrentTimestamp();
... (work of writing a stats file) ...
last_statwrite = globalStats.stats_timestamp;
last_statrequest = last_statwrite;

If the collector entered pgstat_write_statsfile() with more inquiries
waiting in its socket receive buffer, it would ignore them as being too
old once it finished the write and resumed message processing. Commit
187492b converted last_statrequest to a "last_statrequests" list that we
wipe after each write.


So I've been looking at this today, and I think the attached patch should do
the trick. I can't really verify it, because I've been unable to reproduce the
non-coalescing; I presume it requires a much slower system (axolotl is an RPi,
not sure about shearwater).

The patch simply checks DBEntry.stats_timestamp in pgstat_recv_inquiry() and
ignores requests that are already resolved by the last write (maybe this should
also accept a file written up to PGSTAT_STAT_INTERVAL in the past).

The required field is already in DBEntry (otherwise it'd be impossible to
determine if backends need to send inquiries or not), so this is a pretty
trivial change. I can't think of a simpler patch.
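
In code terms the idea looks roughly like this (a sketch, not the attached
patch; the field names cutoff_time and stats_timestamp are assumed from
pgstat.h as of commit 187492b, and remember_statrequest() stands in for the
existing request-tracking logic):

static void
pgstat_recv_inquiry(PgStat_MsgInquiry *msg, int len)
{
	PgStat_StatDBEntry *dbentry = pgstat_get_db_entry(msg->databaseid, false);

	/*
	 * If the last write of this database's stats file is already at least
	 * as new as the requested cutoff, drop the inquiry instead of forcing
	 * yet another write.
	 */
	if (dbentry != NULL && dbentry->stats_timestamp >= msg->cutoff_time)
		return;

	/* Otherwise remember the request for the next write cycle, as before. */
	remember_statrequest(msg->databaseid, msg->cutoff_time);
}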


Can you try applying the patch on a machine where the problem is reproducible?
I might have some RPi machines lying around (for other purposes).


regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


pgstat-coalesce-v1.patch
Description: binary/octet-stream



Re: [HACKERS] WIP: Upper planner pathification

2016-03-02 Thread Tom Lane
David Rowley  writes:
> I agree that it would be good to get this in as soon as possible. I'm
> currently very close to being done with writing Parallel Aggregate on
> top of the upper planner changes. So far this version is much cleaner
> as there's less cruft added compared with the other version,

Cool!  If you come across any points where it seems like it could be
done better or more easily, that would be tremendously valuable feedback
at this stage.

regards, tom lane




Re: [HACKERS] Issue with NULLS LAST, with postgres_fdw sort pushdown

2016-03-02 Thread Ashutosh Bapat
On Thu, Mar 3, 2016 at 7:27 AM, Michael Paquier 
wrote:

> On Wed, Mar 2, 2016 at 7:04 PM, Rajkumar Raghuwanshi
>  wrote:
> > On Wed, Mar 2, 2016 at 2:35 PM, Ashutosh Bapat
> >  wrote:
> >>
> >> Thanks Rajkumar for your report. Let me know if the attached patch fixes
> >> the issue.
>
>  if (pathkey->pk_nulls_first)
>  appendStringInfoString(buf, " NULLS FIRST");
> +else
> +appendStringInfoString(buf, " NULLS LAST");
> Per explain.c, this looks inconsistent to me. Shouldn't NULLS LAST be
> applied only if DESC is used in this ORDER BY clause?
>


I assume that you are referring to show_sortorder_options().

As per PG documentation
http://www.postgresql.org/docs/9.4/static/queries-order.html, "By default,
null values sort as if larger than any non-null value; that is, NULLS FIRST
is the default for DESC order, and NULLS LAST otherwise." What
show_sortorder_options() is doing is just trying to avoid printing the
defaults, which is arguably fine for an explain output; it leaves defaults
to be interpreted by the user. In this case we are constructing a query to be
sent to the foreign server, and it's better not to leave the defaults to be
interpreted by the foreign server, in case it interprets them in a different
fashion. get_rule_orderby() also explicitly adds these options.

-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company


Re: [HACKERS] extend pgbench expressions with functions

2016-03-02 Thread Michael Paquier
On Wed, Mar 2, 2016 at 3:10 AM, Robert Haas  wrote:
> On Wed, Feb 17, 2016 at 3:22 AM, Fabien COELHO  wrote:
>> Indeed. My gcc 4.8.4 with --Wall does not show the warning, too bad.
>>
>> Attached is the fixed patch for the array method.
>
> Committed with a few tweaks, including running pgindent over some of it.

Thanks. So the first set of functions is in, and the operators are
executed as functions as well. Fabien, are you planning to send
rebased versions of the rest? By that I mean the switch of the
existing subcommands into equivalent functions and the handling of
double values as parameters for those functions.
-- 
Michael




Re: [HACKERS] WIP: Upper planner pathification

2016-03-02 Thread David Rowley
On 3 March 2016 at 04:29, Alvaro Herrera  wrote:
> Simon Riggs wrote:
>
>> ISTM that we are clearly "going for it"; everybody agrees we should apply
>> the patch now.
>>
>> The longer we hold off on applying it, the longer we wait for dependent
>> changes.
>
> Agreed -- we need this in tree as soon as realistically possible.
>
> There is a a bit a problem here, because this patch conflicts heavily
> with at least one other patch that's been in the queue for a long time,
> which is Kommi/Rowley's patch for parallel aggregation; the more we
> delay applying this one, the worse the deadlines for that one.
>
> I assume they are hard at work updating that patch to apply on top of
> Tom's patch.  It's not realistic to expect that we would apply any
> further planner changes before this one is in.

I agree that it would be good to get this in as soon as possible. I'm
currently very close to being done with writing Parallel Aggregate on
top of the upper planner changes. So far this version is much cleaner
as there's less cruft added compared with the other version, of which
would need to be removed again after the upper planner changes are in
anyway. Putting parallel aggregate in first would be asking Tom to
re-invent parallel aggregate when he rebases the upper planner stuff
on the new master branch, which makes very little sense.

So I agree that it would be nice to get the upper planner changes in
first, but soon! not at the end of March 'fest, as doing so would most
likely kill parallel aggregate for 9.6, and I kinda think that would
be silly as (I think) it's pretty much the biggest missing piece of
the parallel query set.

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] psql completion for ids in multibyte string

2016-03-02 Thread Thomas Munro
On Thu, Mar 3, 2016 at 2:07 PM, Kyotaro HORIGUCHI
 wrote:
> Hello, thank you for the comments.
>
> At Wed, 2 Mar 2016 10:10:55 +1300, Thomas Munro 
>  wrote in 
> 
>> On Wed, Mar 2, 2016 at 7:54 AM, Robert Haas  wrote:
>> > On Mon, Feb 29, 2016 at 6:32 PM, Thomas Munro
>> >  wrote:
>> >> On Fri, Feb 26, 2016 at 6:34 PM, Kyotaro HORIGUCHI
>> >>  wrote:
>> >>> Hello, this is the second patch plitted out. This allows
>> >>> multibyte names to be completed in psql.
>> >>>
>> >>> At Fri, 06 Nov 2015 11:47:17 +0900 (Tokyo Standard Time), Kyotaro 
>> >>> HORIGUCHI  wrote in 
>> >>> <20151106.114717.170526268.horiguchi.kyot...@lab.ntt.co.jp>
>>  At Thu, 5 Nov 2015 18:32:59 +0900, Amit Langote 
>>   wrote in 
>>  <563b224b.3020...@lab.ntt.co.jp>
>>  > On 2015/11/05 18:10, Kyotaro HORIGUCHI wrote:
>>  > > Hello. I don't know whether this is a bug fix or improvement,
>>  >
>>  > Would it be 50-50? :-)
>> 
>>  Yeah, honestly saying, I doubt that this is valuable but feel
>>  uneasy to see some of the candidates vanish as completion proceeds
>>  for no persuasive reason.
>> >>
>> >> +1 from me, it's entirely reasonable to want to name database objects
>> >> in any human language and use auto-completion.  It's not working today
>> >> as you showed.
>> >>
>> >>> The current version of tab-completion failed to complete names
>> >>> with multibyte characters.
>> >>>
>> >>> =# create table いろは (あ int);
>> >>> =# create table いこい (あ int);
>> >>> =# drop table 
>> >>> "いろは" hint_plan.   pg_toast.
>> >>> "いこい" information_schema.  pg_toast_temp_1.
>> >>> ALL IN TABLESPACEpg_catalog.  public.
>> >>> dbms_stats.  pg_temp_1.
>> >>> postgres=# alter table "い
>> >>> =# drop table "い
>> >>> =# drop table "い   /* No candidate offered */
>> >>>
>> >>> This is because _complete_from_query counts the string length in
>> >>> bytes, instead of characters. With this patch the candidates will
>> >>> appear.
>> >>>
>> >>> =# drop table "い
>> >>> "いこい"  "いろは"
>> >>> postgres=# drpo table "い
>> >>
>> >> The patch looks correct to me: it counts characters rather than bytes,
>> >> which is the right thing to do because the value is passed to substr()
>> >> which also works in characters rather than bytes.  I tested with
>> >> "éclair", and without the patch, tab completion doesn't work if you
>> >> press tab after 'é'.  With the patch it does.
>> >
>> > OK, but I am a little concerned about this code:
>> >
>> > /* Find something that matches */
>> > if (result && PQresultStatus(result) == PGRES_TUPLES_OK)
>> > {
>> > const char *item;
>> >
>> > while (list_index < PQntuples(result) &&
>> >(item = PQgetvalue(result, list_index++, 0)))
>> > if (pg_strncasecmp(text, item, string_length) == 0)
>> > return pg_strdup(item);
>> > }
>> >
>> > Every other use of string_length in this function is using it as the
>> > argument to the SQL substring function, which cares about characters,
>> > not bytes.  But this use seems to care about bytes, not characters.
>> >
>> > Am I wrong?
>>
>> Ugh, and the other problem is that string_length is always 0 there if
>> state isn't 0 (in master it is static so that the value is reused for
>> subsequent calls, but this patch made it automatic).
>
> Thanks for pointing it out.
>
>> I think we should leave string_length as it is and use a new variable
>> for character-based length, as in the attached.
>
> Basically agreed but I like byte_length for the previous
> string_length and string_length for string_length_chars. Also
> text_length is renamed in the attached patch.
>
> What do you think about this?

It seems fine this way too.  Maybe you should also rename
string_length to byte_length in create_or_drop_command_generator, just
for consistency.
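
To spell out the byte/character distinction with a self-contained example
(UTF-8 is assumed here just for illustration; psql itself walks the string
using the client encoding):

#include <stdio.h>
#include <string.h>

/* Count characters by skipping UTF-8 continuation bytes. */
static int
utf8_strlen(const char *s)
{
	int			n = 0;

	for (; *s; s++)
		if ((*s & 0xC0) != 0x80)
			n++;
	return n;
}

int
main(void)
{
	const char *text = "\"いこ";	/* a partially typed, quoted identifier */

	/*
	 * 7 bytes but only 3 characters: substr() in the completion query needs
	 * the character count, while pg_strncasecmp() needs the byte count.
	 */
	printf("bytes = %zu, characters = %d\n", strlen(text), utf8_strlen(text));
	return 0;
}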

One peculiarity I noticed when testing this on MacOSX 10.10.2 with
Apple-supplied libedit is that the following sequence does something
strange:

CREATE TABLE いこい ();
CREATE TABLE いこいこ ();
CREATE TABLE いろは ();
SELECT *  FROM "い
-> the line magically changes to:
SELECT *  FROM 

If you press tab at any other point in any of those Japanese strings,
it works correctly.  I couldn't make anything similar happen with
Latin multibyte characters, and it doesn't happen on Debian with
libreadline, or on MacOSX with libreadline (installed from MacPorts).
I suspect this is a problem with Apple libedit and not with psql or
your patch, but I didn't have time to investigate further.

-- 
Thomas Munro
http://www.enterprisedb.com




Re: [HACKERS] The plan for FDW-based sharding

2016-03-02 Thread Oleg Bartunov
On Mar 3, 2016 4:47 AM, "Michael Paquier"  wrote:
>
> On Wed, Mar 2, 2016 at 6:54 PM, Alexander Korotkov
>  wrote:
> > If FDWs had existed when Postgres XC/XL were being developed then I
> > believe they would have tried to build a full-featured prototype of FDW
> > based sharding. If this prototype had succeeded then we could make a full
> > roadmap.
>
> Speaking here with my XC hat, that's actually the case. A couple of
> years back when I worked on it, there were discussions about reusing
> FDW routines for the purpose of XC, which would have been roughly
> reusing postgres_fdw + the possibility to send XID, snapshot and
> transaction timestamp to the remote nodes after getting that from the
> GTM (global transaction manager ensuring global data visibility and
> consistency), and have the logic for query pushdown in the FDW itself
> when planning query on what would have been roughly foreign tables
> (not entering in the details here, those would have not been entirely
> foreign tables). At this point the global picture was not completely
> set, XC being based on 9.1~9.2 and the FDW base routines were not as
> extended as they are now. As history has told, this global picture has
> never showed up, though it would should XC have been merged with 9.3.
> The point is that XC would have moved as using the FDW approach, as a
> set of plugins.
>
> This was a reason behind this email of 2013 on -hackers actually:
>
http://www.postgresql.org/message-id/cab7npqtdjf-58wuf-xz01nkj7wf0e+eukggqhd0igvsod4h...@mail.gmail.com

Good to remember!

> Michael
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Dockerfile for testing with Perl 5.8.8

2016-03-02 Thread Craig Ringer
Hi all

I've prepared a Dockerfile that produces a canned Perl 5.8.8 environment
based on CentOS 5 on any host with Docker. It has ccache installed and
enabled, git installed, IPC::Run installed, all the mess required to make
CPAN work sensibly done, etc.

Once you build the container once you can quickly re-use it to run test
builds. You can map a postgres source tree, working tree and ccache
directory from the host to make them persistent.

Understand that this container runs code fetched directly off the 'net as
root, albeit root within a container, during setup. It avoids running all
the perl/cpan stuff as root though, switching to a normal user as soon as
it's installed the required RPMs from centos repos and rpmforge, so it's
not just curl'ing random scripts into bash/perl as root.

To build the container:

mkdir pgtaptest
cp /path/to/Dockerfile pgtaptest
cd pgtaptest
docker build -t pgtaptest .

To run a shell in the container environment:

alias pgtaptest="docker run -i -t \
  -v /path/to/my/postgres/tree:/postgres \
  -v /path/to/builddir:/pgbuild \
  -v /path/to/ccache:/ccache \
  pgtaptest"

pgtaptest

Once in the working environment's shell, /postgres is the mapped source
tree, /pgbuild is the mapped build tree (or an empty dir, if you didn't map
it), and /ccache is the mapped ccache or - again - an empty tree if you
didn't map it. Unmapped volumes get discarded when the container instance
exits.

There's an alias preconfigured, recovery-check, that configures and makes
postgres and runs the recovery tests. To run it directly from the host,
assuming you created the alias above, just:

pgtaptest recovery-check

I've accepted some ugly inline script generation in the dockerfile to avoid
needing separate files, so it's self-contained.

Happy Perl 5.8'ing!

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Dockerfile
Description: Binary data



Re: [HACKERS] GinPageIs* don't actually return a boolean

2016-03-02 Thread Peter Eisentraut
On 2/11/16 9:30 PM, Michael Paquier wrote:
>> Well, Yury was saying upthread that some MSVC versions have a problem
>> with the existing coding, which would be a reason to back-patch ...
>> but I'd like to see a failing buildfarm member first.  Don't particularly
>> want to promise to support compilers not represented in the farm.
> 
> Grmbl. Forgot to attach the rebased patch upthread. Here is it now.
> 
> As of now the only complain has been related to MS2015 and MS2013. If
> we follow the pattern of cec8394b and [1], support to compile on newer
> versions of MSVC would be master and REL9_5_STABLE, but MS2013 is
> supported down to 9.3. Based on this reason, we would want to
> backpatch down to 9.3 the patch of this thread.

After reviewing this thread and relevant internet lore, I think this
might be the wrong way to address this problem.  It is in general not
guaranteed in C that a Boolean-sounding function or macro returns 0 or
1.  Prime examples are ctype.h functions such as isupper().  This is
normally not a problem because built-in conditionals such as if, &&, ||
handle this just fine.  So code like

-   Assert(!create || !!txn);
+   Assert(!create || txn != NULL);

is arguably silly either way.  There is no risk in writing just

Assert(!create || txn);

The problem only happens if you compare two "Boolean" values directly
with each other; and so maybe you shouldn't do that, or at least place
the extra care there, instead of fighting a permanent battle
with the APIs you're using.  (This isn't an outrageous requirement: You
can't compare floats or strings either without extra care.)

A quick look through the code based on the provided patch shows that
approximately the only place affected by this is

if (isLeaf != GinPageIsLeaf(page) || isData != GinPageIsData(page))
elog(ERROR, "right sibling of GIN page is of different type");

and that's not actually a problem because isLeaf and isData are earlier
populated by the same macros.  It would still be worth fixing, but a
localized fix seems better.
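
A self-contained illustration of the pitfall (made-up flag names, not
PostgreSQL code):

#include <stdio.h>

#define F_LEAF	0x1
#define F_DATA	0x2
#define PageIsData(flags)	((flags) & F_DATA)	/* yields 0 or 2, never 1 */

int
main(void)
{
	int			left = F_DATA;				/* a "data" page */
	int			right = F_LEAF | F_DATA;	/* also a "data" page */
	_Bool		isData = PageIsData(left);	/* conversion to _Bool gives 1 */

	/* Direct comparison: 1 != 2, so this wrongly reports a difference. */
	printf("direct:     %s\n",
		   isData != PageIsData(right) ? "different" : "same");

	/* Localized fix: normalize both sides before comparing. */
	printf("normalized: %s\n",
		   (isData != 0) != (PageIsData(right) != 0) ? "different" : "same");
	return 0;
}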


Now on the matter of stdbool, I tried putting an #include <stdbool.h>
near the top of c.h and compiled that to see what would happen.  This is
the first warning I see:

ginlogic.c: In function 'shimTriConsistentFn':
ginlogic.c:171:24: error: comparison of constant '2' with boolean
expression is always false [-Werror=bool-compare]
   if (key->entryRes[i] == GIN_MAYBE)
^

and then later on something related:

../../../../src/include/tsearch/ts_utils.h:107:13: note: expected '_Bool
(*)(void *, QueryOperand *) {aka _Bool (*)(void *, struct 
*)}' but argument is of type 'GinTernaryValue (*)(void *, QueryOperand
*) {aka char (*)(void *, struct  *)}'

So the compiler is actually potentially helpful, but as it stands,
PostgreSQL code is liable to break if you end up with stdbool.h somehow.

(plperl also fails to compile because of a hot-potato game about who is
actually responsible for defining bool.)

So one idea would be to actually get ahead of the game, include
stdbool.h if available, fix the mentioned issues, and maybe get more
robust code that way.

But the lore on the internet casts some doubt on that: There is no
guarantee that bool is 1 byte, that bool can be passed around like char,
or even that bool arrays are laid out like char arrays.  Maybe this all
works out okay, just like it has worked out so far that int is 4 bytes,
but we don't know enough about it.  We could probably add some configure
tests around that.

We could also go the other way and forcibly undefine an existing bool
type (since stdbool.h is supposed to use macros, not typedefs).  But
that might not work well if a header that is included later pulls in
stdbool.h on top of that.


My proposal on this particular patch is to do nothing.  The stdbool
issues should be looked into, for the sake of Windows and other
future-proofness.  But that will likely be an entirely different patch.





Re: [HACKERS] The plan for FDW-based sharding

2016-03-02 Thread Tatsuo Ishii
> On Wed, Mar 2, 2016 at 6:54 PM, Alexander Korotkov
>  wrote:
>> If FDWs had existed when Postgres XC/XL were being developed then I believe they
>> would have tried to build a full-featured prototype of FDW based sharding. If this
>> prototype had succeeded then we could make a full roadmap.
> 
> Speaking here with my XC hat, that's actually the case. A couple of
> years back when I worked on it, there were discussions about reusing
> FDW routines for the purpose of XC, which would have been roughly
> reusing postgres_fdw + the possibility to send XID, snapshot and
> transaction timestamp to the remote nodes after getting that from the
> GTM (global transaction manager ensuring global data visibility and
> consistency), and have the logic for query pushdown in the FDW itself
> when planning query on what would have been roughly foreign tables
> (not entering in the details here, those would have not been entirely
> foreign tables). At this point the global picture was not completely
> set, XC being based on 9.1~9.2 and the FDW base routines were not as
> extended as they are now. As history has told, this global picture has
> never showed up, though it would should XC have been merged with 9.3.
> The point is that XC would have moved as using the FDW approach, as a
> set of plugins.
> 
> This was a reason behind this email of 2013 on -hackers actually:
> http://www.postgresql.org/message-id/cab7npqtdjf-58wuf-xz01nkj7wf0e+eukggqhd0igvsod4h...@mail.gmail.com
> 
> There were as well discussions about making the connection pooler a
> background worker and plug in that in a shared memory context that all
> backends connecting to this XC-like-postgres_fdw would use, though
> this is another story, for another time...

Thanks for the history. Very interesting...

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp




Re: [HACKERS] Issue with NULLS LAST, with postgres_fdw sort pushdown

2016-03-02 Thread Michael Paquier
On Wed, Mar 2, 2016 at 7:04 PM, Rajkumar Raghuwanshi
 wrote:
> On Wed, Mar 2, 2016 at 2:35 PM, Ashutosh Bapat
>  wrote:
>>
>> Thanks Rajkumar for your report. Let me know if the attached patch fixes
>> the issue.

 if (pathkey->pk_nulls_first)
 appendStringInfoString(buf, " NULLS FIRST");
+else
+appendStringInfoString(buf, " NULLS LAST");
Per explain.c, this looks inconsistent to me. Shouldn't NULLS LAST be
applied only if DESC is used in this ORDER BY clause?
-- 
Michael




Re: [HACKERS] The plan for FDW-based sharding

2016-03-02 Thread Michael Paquier
On Wed, Mar 2, 2016 at 6:54 PM, Alexander Korotkov
 wrote:
> If FDWs had existed when Postgres XC/XL were being developed then I believe they
> would have tried to build a full-featured prototype of FDW based sharding. If this
> prototype had succeeded then we could make a full roadmap.

Speaking here with my XC hat, that's actually the case. A couple of
years back when I worked on it, there were discussions about reusing
FDW routines for the purpose of XC, which would have been roughly
reusing postgres_fdw + the possibility to send XID, snapshot and
transaction timestamp to the remote nodes after getting that from the
GTM (global transaction manager ensuring global data visibility and
consistency), and have the logic for query pushdown in the FDW itself
when planning query on what would have been roughly foreign tables
(not entering in the details here, those would have not been entirely
foreign tables). At this point the global picture was not completely
set, XC being based on 9.1~9.2 and the FDW base routines were not as
extended as they are now. As history has told, this global picture has
never showed up, though it would should XC have been merged with 9.3.
The point is that XC would have moved as using the FDW approach, as a
set of plugins.

This was a reason behind this email of 2013 on -hackers actually:
http://www.postgresql.org/message-id/cab7npqtdjf-58wuf-xz01nkj7wf0e+eukggqhd0igvsod4h...@mail.gmail.com

There were as well discussions about making the connection pooler a
background worker and plug in that in a shared memory context that all
backends connecting to this XC-like-postgres_fdw would use, though
this is another story, for another time...
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_dump / copy bugs with "big lines" ?

2016-03-02 Thread Michael Paquier
On Thu, Mar 3, 2016 at 12:47 AM, Alvaro Herrera
 wrote:
> Well, the CopyData message has an Int32 field for the message length.
> I don't know the FE/BE protocol very well but I suppose each row
> corresponds to one CopyData message, or perhaps each column corresponds
> to one CopyData message.  In either case, it's not possible to go beyond
> 2GB without changing the protocol ...

Based on what I know from this stuff (OOM libpq and other stuff
remnants), one 'd' message means one row. fe-protocol3.c and
CopySendEndOfRow in backend's copy.c are confirming that as well. I am
indeed afraid that having extra logic to get chunks of data will
require extending the protocol with a new message type for this
purpose.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Support for N synchronous standby servers - take 2

2016-03-02 Thread Thomas Munro
On Sun, Feb 28, 2016 at 8:04 AM, Masahiko Sawada  wrote:
> Attached latest version patch.
>
> The changes from previous version are,
> - Fix parser, lexer bugs.
> - Add regression test patch based on patch Suraji submitted.
>
> Please review it.
>
> [000_multi_sync_replication_v13.patch]

Hi Masahiko,


I have a couple of small suggestions for the documentation and comments:

+Specifies a standby names that can support
synchronous replication using
+either two types of syntax; comma-separated list or dedicated
language, as
+described in .
+Transcations waiting for commit will be allowed to proceed after the
+specified number of standby servers confirms receipt of their data.

Suggestion: Specifies the standby names that can support
synchronous replication using either of two syntaxes: a
comma-separated list, or a more flexible syntax described in .  Transactions waiting for commit
will be allowed to proceed after a configurable subset of standby
servers confirms receipt of their data.  For the simple
comma-separated list syntax, it is one server.

+If the current any of synchronous standbys disconnects for
whatever reason,

s/the current any of/any of the current/

+no mechanism to enforce uniqueness. For each specified standby name,
+only the specified count of standbys will be chosen to be synchronous
+standbys, though exactly which one is indeterminate, the rest will
+represent potential synchronous standbys.

s/one/ones/
s/indeterminate, the/indeterminate.  The/

+made by a transcation have been transferred to one or more
synchronous standby
+server. This extends that standard levelof durability

s/transcation/transaction/
s/that standard levelof/the standard level of/

 offered by a transaction commit. This level of protection is referred
 to as 2-safe replication in computer science theory.

Is this still called "2-safe" or does this patch make it "N-safe",
"group-safe", or something else?

-The minimum wait time is the roundtrip time between primary to standby.
+The minimum wait time is the roundtrip time between primary to standbys.

Suggestion: The minimum wait time is the roundtrip time between the
primary and the slowest synchronous standby.

+Multiple synchronous replication is set up by setting 
+using dedicated language. The syntax of dedicated language is following.

Suggestion:  Multiple synchronous replication is set up by setting
 using the following
syntax.

+Using dedicated language, we can define a synchronous group with
a number N.
+synchronous group can have some members which are consdiered as
synchronous standby using comma-separated list.
+Any standby name is accepted at any position of its list, but '*'
is accepted at only tailing of the standby list.
+The leading N is a number which specifies that how many standbys
the master server waits to commit for. This number
+must be less than actual number of members of its group.
+The listed standby are given highest priority from left defined
starting with 1.

Suggestion: This syntax allows us to define a synchronous group that
will wait for at least N standbys, and a comma-separated list of group
members.  The special value * is accepted at the tail of
the member list, and matches any standby.  The number N must not be
greater than the number of members listed in the group, unless
* is used.  Priority is given to servers in the order that
they appear in the list.  The first named server has the highest
priority.

+All ASCII characters except for special characters(',', '"',
'[', ']', ' ') are allowed as standby name.
+When these special characters are used as standby name, whole
standby name string need to be written in
+double-quoted representation.

Suggestion:  ... are allowed in unquoted standby names.  To use these
special characters, the standby name should be enclosed in double
quotes.

+ * In 9.5 we support the possibility to have multiple synchronous standbys,

s/9.5/9.6/

+ * as defined in synchronous_standby_names. Before on standby can become a

s/ on / a /

+ * Waiters will be released from the queue once the number of standbys
+ * specified in synchronous_standby_names have caught.

s/caught/processed the commit record/

+ * Check whether specified standby is active, which means not only having
+ * pid but also having any priority.

s/having any priority/having a non-zero priority (meaning it is
configured as potential sync standby)./

- announce_next_takeover = true;

By removing this, haven't we lost the ability to announce takeover
more than once per walsender?  I'm not sure exactly where this should
go now but the walsender needs to detect its own transition from
potential to sync state.  Also, that message, where it appears below
should probably be tweaked slightly s/the/a/, so "standby \"%s\" is
now a synchronous standby with priority %u", not "... the synchronous
s

Re: [HACKERS] psql completion for ids in multibyte string

2016-03-02 Thread Kyotaro HORIGUCHI
Hello, thank you for the comments.

At Wed, 2 Mar 2016 10:10:55 +1300, Thomas Munro  
wrote in 
> On Wed, Mar 2, 2016 at 7:54 AM, Robert Haas  wrote:
> > On Mon, Feb 29, 2016 at 6:32 PM, Thomas Munro
> >  wrote:
> >> On Fri, Feb 26, 2016 at 6:34 PM, Kyotaro HORIGUCHI
> >>  wrote:
> >>> Hello, this is the second patch plitted out. This allows
> >>> multibyte names to be completed in psql.
> >>>
> >>> At Fri, 06 Nov 2015 11:47:17 +0900 (Tokyo Standard Time), Kyotaro 
> >>> HORIGUCHI  wrote in 
> >>> <20151106.114717.170526268.horiguchi.kyot...@lab.ntt.co.jp>
>  At Thu, 5 Nov 2015 18:32:59 +0900, Amit Langote 
>   wrote in <563b224b.3020...@lab.ntt.co.jp>
>  > On 2015/11/05 18:10, Kyotaro HORIGUCHI wrote:
>  > > Hello. I don't know whether this is a bug fix or improvement,
>  >
>  > Would it be 50-50? :-)
> 
>  Yeah, honestly saying, I doubt that this is valuable but feel
>  uneasy to see some of the candidates vanish as completon proceeds
>  for no persuasive reason.
> >>
> >> +1 from me, it's entirely reasonable to want to name database objects
> >> in any human language and use auto-completion.  It's not working today
> >> as you showed.
> >>
> >>> The current version of tab-completion failed to complete names
> >>> with multibyte characters.
> >>>
> >>> =# create table いろは (あ int);
> >>> =# create table いこい (あ int);
> >>> =# drop table 
> >>> "いろは" hint_plan.   pg_toast.
> >>> "いこい" information_schema.  pg_toast_temp_1.
> >>> ALL IN TABLESPACEpg_catalog.  public.
> >>> dbms_stats.  pg_temp_1.
> >>> postgres=# alter table "い
> >>> =# drop table "い
> >>> =# drop table "い   /* No candidate offered */
> >>>
> >>> This is because _complet_from_query counts the string length in
> >>> bytes, instead of characters. With this patch the candidates will
> >>> appear.
> >>>
> >>> =# drop table "い
> >>> "いこい"  "いろは"
> >>> postgres=# drpo table "い
> >>
> >> The patch looks correct to me: it counts characters rather than bytes,
> >> which is the right thing to do because the value is passed to substr()
> >> which also works in characters rather than bytes.  I tested with
> >> "éclair", and without the patch, tab completion doesn't work if you
> >> press tab after 'é'.  With the patch it does.
> >
> > OK, but I am a little concerned about this code:
> >
> > /* Find something that matches */
> > if (result && PQresultStatus(result) == PGRES_TUPLES_OK)
> > {
> > const char *item;
> >
> > while (list_index < PQntuples(result) &&
> >(item = PQgetvalue(result, list_index++, 0)))
> > if (pg_strncasecmp(text, item, string_length) == 0)
> > return pg_strdup(item);
> > }
> >
> > Every other use of string_length in this function is using it as the
> > argument to the SQL substring function, which cares about characters,
> > not bytes.  But this use seems to care about bytes, not characters.
> >
> > Am I wrong?
> 
> Ugh, and the other problem is that string_length is always 0 there if
> state isn't 0 (in master it is static so that the value is reused for
> subsequent calls, but this patch made it automatic).

Thanks for pointing it out.

> I think we should leave string_length as it is and use a new variable
> for character-based length, as in the attached.

Basically agreed but I like byte_length for the previous
string_length and string_length for string_length_cars. Also
text_length is renamed in the attached patch.

What do you think about this?

# I named it as version 3 at my own decision.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 5f27120..ed92912 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -3198,9 +3198,8 @@ static char *
 _complete_from_query(int is_schema_query, const char *text, int state)
 {
 	static int	list_index,
-string_length;
+byte_length;
 	static PGresult *result = NULL;
-
 	/*
 	 * If this is the first time for this completion, we fetch a list of our
 	 * "things" from the backend.
@@ -3211,9 +3210,18 @@ _complete_from_query(int is_schema_query, const char *text, int state)
 		char	   *e_text;
 		char	   *e_info_charp;
 		char	   *e_info_charp2;
+		const char *pstr = text;
+		int			string_length = 0;
 
 		list_index = 0;
-		string_length = strlen(text);
+		byte_length = strlen(text);
+
+		/* Count length as number of characters (not bytes), for passing to substring */
+		while (*pstr)
+		{
+			string_length++;
+			pstr += PQmblen(pstr, pset.encoding);
+		}
 
 		/* Free any prior result */
 		PQclear(result);
@@ -3353,7 +3361,7 @@ _complete_from_query(int is_schema_query, const char *text, int state)
 
 		while (list_index < PQntuples(result) &&
 			   (item = PQgetvalue(result, list_index++, 0)))
-			if (pg_strncasecmp(text, item, string_length) == 0)
+			if (pg_strncasecmp(text, item, byte_length) 

Re: [HACKERS] Fixing wrong comment on PQmblen and PQdsplen.

2016-03-02 Thread Kyotaro HORIGUCHI
Hello,

At Tue, 1 Mar 2016 13:33:02 -0500, Robert Haas  wrote in 

> On Fri, Feb 26, 2016 at 12:33 AM, Kyotaro HORIGUCHI
>  wrote:
> > I divided the last patch into one typo-fix patch and one
> > improvement patch. This is the former one.
> 
> Committed.

Thank you.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] jsonb array-style subscription

2016-03-02 Thread Jim Nasby

On 3/2/16 6:24 PM, Tom Lane wrote:

If the patch were proposing a similar amount of new infrastructure to
support some datatype-extensible concept of subscripting, I'd be much
happier about it.


+1


I believe there's been some handwaving in the past about extensible
approaches to subscripting, though I haven't got time to troll the
archives for it right now.


I'd be able to make use of that in my ndarray data type. It would also 
be nice to be able to add things like matrix types, sparse arrays, and 
variable size arrays (ie: list of lists), and subscripting is how you'd 
want to interface with all of those.


Presumably the point type is handled specially today, so that should be 
taken care of too.

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] jsonb array-style subscription

2016-03-02 Thread Tom Lane
Vitaly Burovoy  writes:
> I'd like to be a reviewer for the patch. It does not look big and very 
> invasive.
> Is it a final decision or it has a chance? If something there hurts
> committers, it can end up as "Rejected with feedback" (since the patch
> is already in the CF[1])?

Well, it is pretty invasive, and I'm not sure anyone has bought in on the
design.  The problem I've got with it is that it's a one-off that embeds
a whole lot of new parser/planner/executor infrastructure to serve exactly
one datatype, ie jsonb.  That does not seem like a good design approach,
nor in keeping with the way Postgres usually goes at things.

If the patch were proposing a similar amount of new infrastructure to
support some datatype-extensible concept of subscripting, I'd be much
happier about it.

I believe there's been some handwaving in the past about extensible
approaches to subscripting, though I haven't got time to troll the
archives for it right now.
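
For readers skimming the thread, the usage the patch enables looks roughly
like this (a sketch of the proposed syntax; the table and data here are made
up for illustration):

-- Proposed jsonb subscripting, per the patch under discussion:
CREATE TABLE docs (js jsonb);
INSERT INTO docs VALUES ('{"a": {"b": 1}}');
SELECT js['a']['b'] FROM docs;        -- read a nested value via a subscript
UPDATE docs SET js['a']['b'] = '2';   -- assign through a subscript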

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP: Upper planner pathification

2016-03-02 Thread Tom Lane
Teodor Sigaev  writes:
> Yep, for now on my notebook (best from 5 tries):
> % pgbench -i -s 3000
> % pgbench  -s 3000 -c 4 -j 4 -P 1 -T 60
> HEAD569 tps
> patched 542 tps
> % pgbench  -s 3000 -c 4 -j 4 -P 1 -T 60 -S
> HEAD9500 tps
> patched 9458 tps

> Looks close to measurement error, but may be explained by the increased amount
> of work for planning. Including, maybe, a more complicated path tree.

Hmmm ... I'm now wondering about the "measurement error" theory.
I tried to repeat this measurement locally, focusing on the select-only
number since that should have a higher ratio of planning time to
execution.

Test setup:
cassert-off build
pgbench -i -s 100
sudo cpupower frequency-set --governor performance

repeat 3 times: pgbench -c 4 -j 4 -P 5 -T 60 -S

HEAD:
tps = 32508.217002 (excluding connections establishing)
tps = 33081.402766
tps = 32520.859913
average of 3: 32703 tps

WITH PATCH:
tps = 32815.922160 (excluding connections establishing)
tps = 33312.149718
tps = 32784.527489
average of 3: 32970 tps

(Hardware: dual quad-core Xeon E5-2609, running current RHEL6)

So I see no evidence for a slowdown on pgbench's SELECT queries.
Anybody else want to check performance on simple scan/join queries?

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Freeze avoidance of very large table.

2016-03-02 Thread Jim Nasby

On 3/2/16 5:41 PM, Tom Lane wrote:

Jim Nasby  writes:

On 3/2/16 4:21 PM, Peter Geoghegan wrote:

I think you should commit this. The chances of anyone other than you
and Masahiko recalling that you developed this tool in 3 years is
essentially nil. I think that the cost of committing a developer-level
debugging tool like this is very low. Modules like pg_freespacemap
currently already have no chance of being of use to ordinary users.
All you need to do is restrict the functions to throw an error when
called by non-superusers, out of caution.

It's a problem that modules like pg_stat_statements and
pg_freespacemap are currently lumped together in the documentation,
but we all know that.



+1.


Would it make any sense to stick it under src/test/modules/ instead of
contrib/ ?  That would help make it clear that it's a debugging tool
and not something we expect end users to use.


I haven't looked at it in detail; is there something inherently 
dangerous about it?


When I'm forced to wear a DBA hat, I'd really love to be able to find 
out what the VM status for a large table is. If it's in contrib they'll know 
the tool is there; if it's under src then there's about 0 chance of 
that. I'd think SU-only and any appropriate warnings would be enough 
heads-up for DBAs to be careful with it.
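
Something along these lines is the kind of query I have in mind (the function
and column names are invented purely for illustration; the actual module's
interface may look quite different):

-- Hypothetical per-block visibility map inspection for a table:
SELECT blkno, all_visible, all_frozen
  FROM pg_visibility_map('big_table'::regclass)
 WHERE NOT all_frozen;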

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Convert pltcl from strings to objects

2016-03-02 Thread Tom Lane
Jim Nasby  writes:
> On 3/1/16 5:06 PM, Tom Lane wrote:
>> If we don't do that, I'm at least going to put in a similar #error for
>> Tcl 8.0; but I really think we ought to just say 8.4 is the minimum.

> Just confirmed that should be completely reasonable. I'll take a look at 
> it in a few days if you don't beat me to it.

Done already.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] jsonb array-style subscription

2016-03-02 Thread Vitaly Burovoy
On 1/19/16, Alvaro Herrera  wrote:
> Dmitry Dolgov wrote:
>
>> I've cleaned up the code, created a separate JsonbRef node (and there are
>> a
>> lot of small changes because of that), abandoned an idea of "deep
>> nesting"
>> of assignments (because it doesn't relate to jsonb subscription, is more
>> about the
>> "jsonb_set" function, and anyway it's not a good idea). It looks fine for
>> me, and I need a little guidance - is it ok to propose this feature for
>> commitfest 2016-03 for a review?
>
> Has this patch been proposed in some commitfest previously?  One of the
> less-commonly-invoked rules of commitfests is that you can't add patches
> that are too invasive to the last one -- so your last chance for 9.6 was
> 2016-01.  This is harsh to patch submitters, but it helps keep the size
> of the last commitfest down to a reasonable level; otherwise we are
> never able to finish it.

I'd like to be a reviewer for the patch. It does not look big or very invasive.

Is it a final decision or does it have a chance? If something there hurts
committers, it can end up as "Rejected with feedback" (since the patch
is already in the CF[1])?

[1]https://commitfest.postgresql.org/9/485/
-- 
Best regards,
Vitaly Burovoy


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Publish autovacuum informations

2016-03-02 Thread Jim Nasby

On 3/2/16 10:48 AM, Julien Rouhaud wrote:

Good point, I don't see a lot of information available with these hooks
that native system statistics couldn't offer. To have the same amount
of information, I think we'd need a pg_stat_autovacuum view that shows a
realtime insight of the workers, and also add some aggregated counters
to PgStat_StatTabEntry. I wonder if adding counters to
PgStat_StatTabEntry would be accepted though.


I would also really like to see a means of logging (auto)vacuum activity 
in the database itself. We figured out how to do that with 
pg_stat_statements, which was a lot harder... it seems kinda silly not 
to offer that for vacuum. Hooks plus shared memory data should allow for 
that (the only tricky bit is the hook would need to start and then 
commit a transaction, but that doesn't seem onerous).


I think the shared memory structures should be done as well. Having that 
real-time info is also valuable.


I don't see too much point in adding stuff to the stats system for this.
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Freeze avoidance of very large table.

2016-03-02 Thread Tom Lane
Jim Nasby  writes:
> On 3/2/16 4:21 PM, Peter Geoghegan wrote:
>> I think you should commit this. The chances of anyone other than you
>> and Masahiko recalling that you developed this tool in 3 years is
>> essentially nil. I think that the cost of committing a developer-level
>> debugging tool like this is very low. Modules like pg_freespacemap
>> currently already have no chance of being of use to ordinary users.
>> All you need to do is restrict the functions to throw an error when
>> called by non-superusers, out of caution.
>> 
>> It's a problem that modules like pg_stat_statements and
>> pg_freespacemap are currently lumped together in the documentation,
>> but we all know that.

> +1.

Would it make any sense to stick it under src/test/modules/ instead of
contrib/ ?  That would help make it clear that it's a debugging tool
and not something we expect end users to use.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Convert pltcl from strings to objects

2016-03-02 Thread Jim Nasby

On 3/1/16 5:06 PM, Tom Lane wrote:

If we don't do that, I'm at least going to put in a similar #error for
Tcl 8.0; but I really think we ought to just say 8.4 is the minimum.


Just confirmed that should be completely reasonable. I'll take a look at 
it in a few days if you don't beat me to it.

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Convert pltcl from strings to objects

2016-03-02 Thread Jim Nasby

On 3/2/16 12:32 PM, Tom Lane wrote:

Jim Nasby  writes:

[ pltcl_objects_2.patch ]


I've pushed this with some minor fixes, as well as the followup work
mentioned in this thread.


Awesome, thanks!

I've asked Karl's opinion on increasing the minimum TCL version, but I 
suspect that won't be an issue.

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Freeze avoidance of very large table.

2016-03-02 Thread Jim Nasby

On 3/2/16 4:21 PM, Peter Geoghegan wrote:

I think you should commit this. The chances of anyone other than you
and Masahiko recalling that you developed this tool in 3 years is
essentially nil. I think that the cost of committing a developer-level
debugging tool like this is very low. Modules like pg_freespacemap
currently already have no chance of being of use to ordinary users.
All you need to do is restrict the functions to throw an error when
called by non-superusers, out of caution.

It's a problem that modules like pg_stat_statements and
pg_freespacemap are currently lumped together in the documentation,
but we all know that.


+1.
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] WHERE clause not used when index is used (9.5)

2016-03-02 Thread David G. Johnston
Placing this specific message onto -bugs while keeping -hackers and
removing -novice.

Editing subject to include version and remove list identifiers.

There is continuing discussion on -hackers though mostly about how to do
this right in the future.  The specific problem stems from an attempted
performance improvement which is likely to be reverted.

David J.


On Tue, Mar 1, 2016 at 8:40 AM, Tom Lane  wrote:

> Tobias Florek  writes:
> > When creating an index to use for an ORDER BY clause, a simple query
> > starts to return more results than expected.  See the following detailed
> > log.
>
> Ugh.  That is *badly* broken.  I thought maybe it had something to do with
> the "abbreviated keys" work, but the same thing happens if you change the
> numeric column to integer, so I'm not very sure where to look.  Who's
> touched btree key comparison logic lately?
>
> (Problem is reproducible in 9.5 and HEAD, but not 9.4.)
>
>
> > Create enough test data for the planner to use an index (if one exists) for the
> > condition.
>
> > CREATE TABLE "index_cond_test" AS
> > SELECT
> >   (10 + random() * 10)::int AS "final_score",
> >   round((10 + random() * 10)::numeric, 5) "time_taken"
> > FROM generate_series(1, 1) s;
>
>
> > Run control query without an index (will be less than 1 rows). Pay
> > attention to tuples of (20,a) with a > 11.
>
> > SELECT *
> > FROM "index_cond_test"
> > WHERE (final_score, time_taken) < (20, 11)
> > ORDER BY final_score DESC, time_taken ASC;
>
>
> > Or wrapped in count(*), to make it even more obvious
>
> > SELECT count(*) FROM ( SELECT *
> >FROM "index_cond_test"
> >WHERE (final_score, time_taken) < (20, 11)
> >ORDER BY final_score DESC, time_taken ASC) q;
>
> > Create the index
>
> > CREATE INDEX "index_cond_test_ranking" ON "index_cond_test" USING
> btree (final_score DESC, time_taken ASC);
>
> > Run test query (will return all 1 rows)
>
> > SELECT *
> > FROM "index_cond_test"
> > WHERE (final_score, time_taken) < (20, 11)
> > ORDER BY final_score DESC, time_taken ASC;
>
> > or wrapped
>
> > SELECT count(*) FROM ( SELECT *
> >FROM "index_cond_test"
> >WHERE (final_score, time_taken) < (20, 11)
> >ORDER BY final_score DESC, time_taken ASC) q;
>
> regards, tom lane
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>


Re: [HACKERS] plpgsql - DECLARE - cannot to use %TYPE or %ROWTYPE for composite types

2016-03-02 Thread Jim Nasby

On 3/2/16 3:52 PM, Pavel Stehule wrote:

Right, and it's arguably dubious that that doesn't already work.
Unfortunately, these % things are just random plpgsql parser hacks, not
real types.  Maybe this should be done in the main PostgreSQL parser
with parameter hooks, if we wanted this feature to be available outside
plpgsql as well.


I am not a fan of propagating this feature outside PL/pgSQL - it is a possible
new dependency between database objects, and the cost is higher than the
benefits.


I fail to see how it'd be a dependency. I'd expect it to look up the 
type when you run the command, just like plpgsql does. I think it'd be 
useful to have.


That said, I think that should be a completely separate patch and 
discussion. Lets at least get it into plpgsql first.


As for the array of element/element of array feature; I agree it would 
be nice, but we're pretty late in the game for that, and I don't see why 
that couldn't be added later.
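
For concreteness, the declarations being discussed would look like this (a
sketch only; the products table is hypothetical, and the array form is the
proposed extension, not something that works today):

-- PL/pgSQL sketch of %TYPE plus the proposed %TYPE[] array form:
CREATE FUNCTION price_summary() RETURNS void AS $$
DECLARE
    one_price  products.price%TYPE;     -- follows the column's type (works today)
    all_prices products.price%TYPE[];   -- proposed: an array of that same type
BEGIN
    SELECT array_agg(price) INTO all_prices FROM products;
    one_price := all_prices[1];
END;
$$ LANGUAGE plpgsql;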

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP: Upper planner pathification

2016-03-02 Thread Tom Lane
Alvaro Herrera  writes:
> Agreed -- we need this in tree as soon as realistically possible.

> There is a a bit a problem here, because this patch conflicts heavily
> with at least one other patch that's been in the queue for a long time,
> which is Kommi/Rowley's patch for parallel aggregation; the more we
> delay applying this one, the worse the deadlines for that one.

> I assume they are hard at work updating that patch to apply on top of
> Tom's patch.  It's not realistic to expect that we would apply any
> further planner changes before this one is in.

I don't think it's quite that bad: the patch doesn't touch scan/join
planning very much, so for instance I doubt that the pending unique-joins
patch is completely broken.  But yeah, anything having anything to do
with planning of grouping/aggregation or later stages is going to need
major revision to play with this.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Freeze avoidance of very large table.

2016-03-02 Thread Peter Geoghegan
On Tue, Mar 1, 2016 at 6:51 PM, Robert Haas  wrote:
> I removed the pgstat stuff.  I'm not sure we want that stuff in that
> form; it doesn't seem to fit with the rest of what's in that view, and
> it wasn't reliable in my testing.  I did however throw together a
> little contrib module for testing, which I attach here.  I'm not sure
> we want to commit this, and at the least someone would need to write
> documentation.  But it's certainly handy for checking whether this
> works.

I think you should commit this. The chances of anyone other than you
and Masahiko recalling that you developed this tool in 3 years is
essentially nil. I think that the cost of committing a developer-level
debugging tool like this is very low. Modules like pg_freespacemap
currently already have no chance of being of use to ordinary users.
All you need to do is restrict the functions to throw an error when
called by non-superusers, out of caution.

It's a problem that modules like pg_stat_statements and
pg_freespacemap are currently lumped together in the documentation,
but we all know that.

-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_dump dump catalog ACLs

2016-03-02 Thread Stephen Frost
* Joe Conway (m...@joeconway.com) wrote:
> On 03/02/2016 12:54 PM, Stephen Frost wrote:
> > * Joe Conway (m...@joeconway.com) wrote:
> >> On 03/01/2016 08:00 AM, Tom Lane wrote:
> >>> Yes, we'd need some way to mark non-null ACLs as being "built-in
> >>> defaults".  I do not see the need to have SQL syntax supporting that
> >>> though.
> >>
> >> I was thinking the supporting syntax might be used by extensions, for
> >> example.
> > 
> > I tend to agree with Tom that we don't really need SQL syntax for this.
> 
> > I don't see any reason it couldn't be used by extensions also, though
> > we'd have to do a bit more work on pg_dump to make it actually dump
> > out any non-default ACLs for extension-owned objects.
> 
> Without any syntax, what does the extension do, directly insert into
> this special catalog table?

Perhaps nothing - we are already tracking everything they've created; at
the end of the extension's script, we could simply go over all of the
objects which are part of the extension and save off the non-NULL ACLs
which exist.

* David G. Johnston (david.g.johns...@gmail.com) wrote:
> On Wed, Mar 2, 2016 at 2:44 PM, Joe Conway  wrote:
> 
> > On 03/02/2016 12:54 PM, Stephen Frost wrote:
> > > * Joe Conway (m...@joeconway.com) wrote:
> > >> On 03/01/2016 08:00 AM, Tom Lane wrote:
> > >>> Yes, we'd need some way to mark non-null ACLs as being "built-in
> > >>> defaults".  I do not see the need to have SQL syntax supporting that
> > >>> though.
> > >>
> > >> I was thinking the supporting syntax might be used by extensions, for
> > >> example.
> > >
> > > I tend to agree with Tom that we don't really need SQL syntax for this.
> >
> > > I don't see any reason it couldn't be used by extensions also, though
> > > we'd have to do a bit more work on pg_dump to make it actually dump
> > > out any non-default ACLs for extension-owned objects.
> >
> > Without any syntax, what does the extension do, directly insert into
> > this special catalog table?
> >
> >
> The desire in the thread I linked was for the user, not the extension, to
> alter the permissions.  In that context (and possibly here as well)
> PostgreSQL would (somehow?) recognize the target as being special and thus
> requiring adding or updating an entry into the supplemental catalog table
> when the usual GRANT/REVOKE SQL command is issued.

Not quite.

The idea here is for us to track in the catalog what the ACLs were at
the "start" (being "initdb" time for the database as a whole, and
"CREATE EXTENSION" time for the extension) and then dump out any ACLs
which have been changed since then.

That strikes me as much simpler than having to track every GRANT/REVOKE
done against some special set of objects..

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] plpgsql - DECLARE - cannot to use %TYPE or %ROWTYPE for composite types

2016-03-02 Thread Pavel Stehule
Hi

2016-02-24 22:18 GMT+01:00 Peter Eisentraut :

> On 1/18/16 4:21 PM, Robert Haas wrote:
> > One idea that occurs to me is: If you can DECLARE BAR FOO%TYPE, but
> > then you want to make BAR an array of that type rather than a scalar,
> > why not write that as DECLARE BAR FOO%TYPE[]?  That seems quite
> > natural to me.
>
> Right, and it's arguably dubious that that doesn't already work.
> Unfortunately, these % things are just random plpgsql parser hacks, not
> real types.  Maybe this should be done in the main PostgreSQL parser
> with parameter hooks, if we wanted this feature to be available outside
> plpgsql as well.
>

I am not a fan of propagating this feature outside PL/pgSQL - it is a possible new
dependency between database objects, and the cost is higher than the benefits.


>
> > I think the part of this patch that makes %TYPE work for more kinds of
> > types is probably a good idea, although I haven't carefully studied
> > exactly what it does.
>
> I agree that this should be more general.  For instance, this patch
> would allow you to get the element type of an array-typed variable, but
> there is no way to get the element type of just another type.  If we
> could do something like
>
> DECLARE
>   var ELEMENT OF point;
>

Isn't it a bug? What is the sense of this construct? In our other manipulations
with arrays we raise an error when we try to take an element from a non-array
value.

Today I did work on this patch and I am able to implement the syntax
proposed by you. It is proprietary, but similar to ADA anonymous types.

DECLARE x array() of type

Regards

Pavel


>
> (not necessary that syntax)
>
> then
>
> DECLARE
>   var ELEMENT OF othervar%TYPE;
>
> should just fall into place.
>
>


Re: [HACKERS] pg_dump dump catalog ACLs

2016-03-02 Thread David G. Johnston
On Wed, Mar 2, 2016 at 2:44 PM, Joe Conway  wrote:

> On 03/02/2016 12:54 PM, Stephen Frost wrote:
> > * Joe Conway (m...@joeconway.com) wrote:
> >> On 03/01/2016 08:00 AM, Tom Lane wrote:
> >>> Yes, we'd need some way to mark non-null ACLs as being "built-in
> >>> defaults".  I do not see the need to have SQL syntax supporting that
> >>> though.
> >>
> >> I was thinking the supporting syntax might be used by extensions, for
> >> example.
> >
> > I tend to agree with Tom that we don't really need SQL syntax for this.
>
> > I don't see any reason it couldn't be used by extensions also, though
> > we'd have to do a bit more work on pg_dump to make it actually dump
> > out any non-default ACLs for extension-owned objects.
>
> Without any syntax, what does the extension do, directly insert into
> this special catalog table?
>
>
The desire in the thread I linked was for the user, not the extension, to
alter the permissions.  In that context (and possibly here as well)
PostgreSQL would (somehow?) recognize the target as being special and thus
requiring adding or updating an entry into the supplemental catalog table
when the usual GRANT/REVOKE SQL command is issued.

In effect any object dependent upon an EXTENSION or that already exists in
this special catalog table would need to have the supplemental procedure
executed.

David J.


Re: [HACKERS] pg_dump dump catalog ACLs

2016-03-02 Thread Joe Conway
On 03/02/2016 12:54 PM, Stephen Frost wrote:
> * Joe Conway (m...@joeconway.com) wrote:
>> On 03/01/2016 08:00 AM, Tom Lane wrote:
>>> Yes, we'd need some way to mark non-null ACLs as being "built-in
>>> defaults".  I do not see the need to have SQL syntax supporting that
>>> though.
>>
>> I was thinking the supporting syntax might be used by extensions, for
>> example.
> 
> I tend to agree with Tom that we don't really need SQL syntax for this.

> I don't see any reason it couldn't be used by extensions also, though
> we'd have to do a bit more work on pg_dump to make it actually dump
> out any non-default ACLs for extension-owned objects.

Without any syntax, what does the extension do, directly insert into
this special catalog table?

Joe

-- 
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development



signature.asc
Description: OpenPGP digital signature


Re: [HACKERS] Random inconsistencies in GiST support function declarations

2016-03-02 Thread Tom Lane
Jeff Janes  writes:
> This work (9ff60273e35cad6e9) seems have broken pg_upgrade when
> tsearch2 is installed.

> On an empty 9.4 instance with nothing but tsearch2 installed, using
> HEAD's pg_upgrade gives this error:

> pg_restore: creating OPERATOR CLASS "public.gin_tsvector_ops"
> pg_restore: [archiver (db)] Error while PROCESSING TOC:
> pg_restore: [archiver (db)] Error from TOC entry 3037; 2616 19016
> OPERATOR CLASS gist_tp_tsquery_ops jjanes
> pg_restore: [archiver (db)] could not execute query: ERROR:  function
> gtsquery_consistent(internal, internal, integer, oid, internal) does
> not exist
> Command was: CREATE OPERATOR CLASS "gist_tp_tsquery_ops"
> FOR TYPE "pg_catalog"."tsquery" USING "gist" AS
> STORAGE bigint ,
> OPE...

Ah, drat.  I guess we'll have to continue to provide pg_proc entries with
the old signatures to support pg_upgrade.  Will fix.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] TAP / recovery-test fs-level backups, psql enhancements etc

2016-03-02 Thread Alvaro Herrera
Craig Ringer wrote:
> On 2 March 2016 at 07:07, Alvaro Herrera  wrote:
> 
> > Craig Ringer wrote:
> >
> > > diff --git a/src/test/perl/TestLib.pm b/src/test/perl/TestLib.pm
> > > index 3d11cbb..8c13655 100644
> > > --- a/src/test/perl/TestLib.pm
> > > +++ b/src/test/perl/TestLib.pm
> > > @@ -112,9 +112,11 @@ INIT
> > >  #
> > >  sub tempdir
> > >  {
> > > + my ($prefix) = @_;
> > > + $prefix = "tmp_test" if (!$prefix);
> >
> > This should be "unless defined $prefix".  Otherwise if you pass the
> > string literal "0" as prefix it will be ignored.
> >
> Ha. I thought something was funny with !$prefix when splitting that out of
> Kyotaro Horiguchi's patch but couldn't put my finger on what.
> 
> Will amend in the next push.

Pushed it with that fix.  I also added a further "data_" prefix, so it's
"data_${name}_" now.  Hopefully this is less problematic than
yesterday's ...

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_dump dump catalog ACLs

2016-03-02 Thread Stephen Frost
* David G. Johnston (david.g.johns...@gmail.com) wrote:
> On Wed, Mar 2, 2016 at 1:54 PM, Stephen Frost  wrote:
> > Rather than have two aclitem[] columns in every catalog, since this
> > information is only used by pg_dump and not during normal operation, we
> > could use the approach that pg_description took and have an independent
> > catalog table which just contains all non-NULL "system" ACLs.  We could
> > populate it at the bottom of system_views.sql, so that we don't have to
> > explicitly think about updating that table whenever there's a change to
> > what the default ACLs are.
> >
> > I don't see any reason it couldn't be used by extensions also, though
> > we'd have to do a bit more work on pg_dump to make it actually dump
> > out any non-default ACLs for extension-owned objects.
> >
> 
> It sounds like this train of thought would resolve this complaint?
> 
> http://www.postgresql.org/message-id/CADmxfmmz-ATwptaidTSAF0XE=cpeikmyc00sj6t9xf6kcv5...@mail.gmail.com
> 
> Namely allowing users to edit permissions on extension objects and have
> those changes dumped and then restored after the dependent CREATE EXTENSION
> command is executed during pg_restore.
> 
> Did I interpret that right?

Yes, I was following that thread also (as was Joe, I imagine) and that's
the idea.

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] pg_dump dump catalog ACLs

2016-03-02 Thread David G. Johnston
On Wed, Mar 2, 2016 at 1:54 PM, Stephen Frost  wrote:

> * Joe Conway (m...@joeconway.com) wrote:
> > On 03/01/2016 08:00 AM, Tom Lane wrote:
> > > Joe Conway  writes:
> > >> Would it be a terrible idea to add some attribute to ACLs which can be
> > >> used to indicate they should not be dumped (and supporting syntax)?
> > >
> > > Yes, we'd need some way to mark non-null ACLs as being "built-in
> > > defaults".  I do not see the need to have SQL syntax supporting that
> > > though.
> >
> > I was thinking the supporting syntax might be used by extensions, for
> > example.
>
> I tend to agree with Tom that we don't really need SQL syntax for this.
>
> > > Actually, wouldn't you need to mark individual aclitems as built-in
> > > or not?  Consider a situation where we have some function foo() that
> > > by default has EXECUTE permission granted to some built-in "pg_admin"
> > > role.  If a given installation then also grants EXECUTE to "joe",
> > > what you really want to have happen is for pg_dump to dump only the
> > > grant to "joe".  Mentioning pg_admin's grant would tie the dump to
> > > a particular major PG version's idea of what the built-in roles are,
> > > which is what I'm arguing we need to avoid.
> >
> > Yes, I guess it would need to be a per aclitem attribute.
>
> Agreed.
>
> > > I guess this could also be addressed by having two separate aclitem[]
> > > columns, one that is expected to be frozen after initdb and one for
> > > user-added grants.
> >
> > Yeah, that would work, but seems kind of ugly.
>
> Rather than have two aclitem[] columns in every catalog, since this
> information is only used by pg_dump and not during normal operation, we
> could use the approach that pg_description took and have an independent
> catalog table which just contains all non-NULL "system" ACLs.  We could
> populate it at the bottom of system_views.sql, so that we don't have to
> explicitly think about updating that table whenever there's a change to
> what the default ACLs are.
>
> I don't see any reason it couldn't be used by extensions also, though
> we'd have to do a bit more work on pg_dump to make it actually dump
> out any non-default ACLs for extension-owned objects.
>

It sounds like this train of thought would resolve this complaint?

http://www.postgresql.org/message-id/CADmxfmmz-ATwptaidTSAF0XE=cpeikmyc00sj6t9xf6kcv5...@mail.gmail.com

Namely allowing users to edit permissions on extension objects and have
those changes dumped and then restored after the dependent CREATE EXTENSION
command is executed during pg_restore.

Did I interpret that right?

David J.


[HACKERS] SP-GiST support for inet datatypes

2016-03-02 Thread Emre Hasegeli
Attached patches add SP-GiST support to the inet datatypes.  The operator
class comes with a small change to the SP-GiST framework to allow a fixed
number of child nodes.

The index is like a prefix tree except that it doesn't bother to split the
addresses into parts as text is split.  It also doesn't use labels to know
the part after the prefix, but relies on static node numbers.

The GiST index released with version 9.4 performs really badly with real
world data.  SP-GiST works much better with the query posted to the
performance list [1] a while ago:

> hasegeli=# SELECT DISTINCT route INTO hmm FROM routes_view WHERE asn = 2914;
> SELECT 732
>
> hasegeli=# EXPLAIN ANALYZE SELECT routes.route FROM routes JOIN hmm ON routes.route && hmm.route;
>                                                        QUERY PLAN
> ------------------------------------------------------------------------------------------------------------------------------
>  Nested Loop  (cost=0.41..571742.27 rows=2248 width=7) (actual time=12.643..20474.813 rows=8127 loops=1)
>    ->  Seq Scan on hmm  (cost=0.00..11.32 rows=732 width=7) (actual time=0.017..0.524 rows=732 loops=1)
>    ->  Index Only Scan using route_gist on routes  (cost=0.41..552.05 rows=22900 width=7) (actual time=4.851..27.948 rows=11 loops=732)
>          Index Cond: (route && (hmm.route)::inet)
>          Heap Fetches: 8127
>  Planning time: 1.507 ms
>  Execution time: 20475.605 ms
> (7 rows)
>
> hasegeli=# DROP INDEX route_gist;
> DROP INDEX
>
> hasegeli=# CREATE INDEX route_spgist ON routes USING spgist (route);
> CREATE INDEX
>
> hasegeli=# EXPLAIN ANALYZE SELECT routes.route FROM routes JOIN hmm ON routes.route && hmm.route;
>                                                        QUERY PLAN
> ------------------------------------------------------------------------------------------------------------------------------
>  Nested Loop  (cost=0.41..588634.27 rows=2248 width=7) (actual time=0.081..16.961 rows=8127 loops=1)
>    ->  Seq Scan on hmm  (cost=0.00..11.32 rows=732 width=7) (actual time=0.022..0.079 rows=732 loops=1)
>    ->  Index Only Scan using route_spgist on routes  (cost=0.41..575.13 rows=22900 width=7) (actual time=0.014..0.021 rows=11 loops=732)
>          Index Cond: (route && (hmm.route)::inet)
>          Heap Fetches: 8127
>  Planning time: 1.376 ms
>  Execution time: 15.936 ms

[1]
http://www.postgresql.org/message-id/flat/alpine.DEB.2.11.1508251504160.31004@pyrite#alpine.DEB.2.11.1508251504160.31004@pyrite


spgist-fixed-nnodes.patch
Description: Binary data


inet-spgist-v1.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_dump dump catalog ACLs

2016-03-02 Thread Stephen Frost
* Joe Conway (m...@joeconway.com) wrote:
> On 03/01/2016 08:00 AM, Tom Lane wrote:
> > Joe Conway  writes:
> >> Would it be a terrible idea to add some attribute to ACLs which can be
> >> used to indicate they should not be dumped (and supporting syntax)?
> > 
> > Yes, we'd need some way to mark non-null ACLs as being "built-in
> > defaults".  I do not see the need to have SQL syntax supporting that
> > though.
> 
> I was thinking the supporting syntax might be used by extensions, for
> example.

I tend to agree with Tom that we don't really need SQL syntax for this.

> > Actually, wouldn't you need to mark individual aclitems as built-in
> > or not?  Consider a situation where we have some function foo() that
> > by default has EXECUTE permission granted to some built-in "pg_admin"
> > role.  If a given installation then also grants EXECUTE to "joe",
> > what you really want to have happen is for pg_dump to dump only the
> > grant to "joe".  Mentioning pg_admin's grant would tie the dump to
> > a particular major PG version's idea of what the built-in roles are,
> > which is what I'm arguing we need to avoid.
> 
> Yes, I guess it would need to be a per aclitem attribute.

Agreed.

> > I guess this could also be addressed by having two separate aclitem[]
> > columns, one that is expected to be frozen after initdb and one for
> > user-added grants.
> 
> Yeah, that would work, but seems kind of ugly.

Rather than have two aclitem[] columns in every catalog, since this
information is only used by pg_dump and not during normal operation, we
could use the approach that pg_description took and have an independent
catalog table which just contains all non-NULL "system" ACLs.  We could
populate it at the bottom of system_views.sql, so that we don't have to
explicitly think about updating that table whenever there's a change to
what the default ACLs are.

I don't see any reason it couldn't be used by extensions also, though
we'd have to do a bit more work on pg_dump to make it actually dump
out any non-default ACLs for extension-owned objects.
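
A minimal sketch of that approach (all names here are hypothetical and only
illustrate the idea, not an actual catalog definition):

-- Record the ACLs present at initdb / CREATE EXTENSION time as a baseline.
CREATE TABLE pg_system_acl (
    objoid   oid,         -- object whose baseline ACL is recorded
    classoid oid,         -- system catalog the object belongs to
    initacl  aclitem[]    -- ACL as it stood when the object was created
);

-- pg_dump would then emit grants only where the current ACL differs from the
-- recorded baseline (sketch, relations only):
SELECT c.oid, c.relacl
  FROM pg_class c
  JOIN pg_system_acl s ON s.objoid = c.oid AND s.classoid = 'pg_class'::regclass
 WHERE c.relacl IS DISTINCT FROM s.initacl;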

Thoughts?

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] The plan for FDW-based sharding

2016-03-02 Thread Alexander Korotkov
On Wed, Mar 2, 2016 at 9:53 PM, Josh berkus  wrote:

> On 02/24/2016 01:22 AM, Konstantin Knizhnik wrote:
>
>> Sorry, but based on this plan it is possible to make a conclusion that
>> there are only two possible cluster solutions for Postgres:
> XC/XL and FDW-based.  From my point of view there are many more
> possible alternatives.
>>
>
> Definitely.
>
> Currently we have five approaches to sharding inside postgres in the
> field, in chronological order:
>
> 1. Greenplum's executor-based approach with motion nodes
>
> 2. Skype's function-based approach (PL/proxy)
>
> 3. XC/XL's approach, which I believe is also query executor-based
>
> 4. CitusDB's pg_shard which is based on query hooks
>
> 5. FDW-based (currently theoretical)
>
> One of the things which causes bad reactions and arguments, Bruce, is that
> a lot of your posts and presentations detailing plans for the FDW approach
> carry the subtext that all four of the other approaches are dead ends and
> not worth considering.  Given that the other approaches, whatever their
> limitations, have working code in the field and the FDW approach does not,
> that's more than a little offensive.
>
> If we want to move forwards on serious work on FDW-based sharding, the
> folks working on it should stop treating it as a "fait accompli" that this
> is the Chosen Way for the PostgreSQL project.  Otherwise, you'll spend all
> of your time arguing that point instead of working on features that matter.
>
> Bruce made a long comparison with built-in replication, but there's a big
> difference here.  We decided that WAL-based replication was the way to go
> for built-in as a community decision here on -hackers and at various
> conferences.  Both the plan and the implementation for replication
> transcended company backing, involving even active competitors, and
> involved discussions with maintainers of the older replication projects.
>
> In contrast, this FDW plan *still* feels very much like a small group made
> up of employees of only two companies came up with it in private and
> decided that it should be the plan for the whole project.  I know that
> Bruce and others have good reasons for starting the FDW project, but there
> hasn't been much of an attempt to obtain community consensus around it. If
> Bruce and others want contributors to work on FDWs instead of other
> sharding approaches, then they need to win over those people as to why they
> should do that.  It's how this community works.
>
> Alternately, you can just work on the individual FDW features, which
> *everyone* thinks are a good idea, and when most of them are done,
> FDW-based scaleout will be such an obvious solution that nobody will argue
> with it.


+1

Thank you, Josh. I think this is an excellent summary of the conversation about
FDW-based sharding.

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Re: [HACKERS] The plan for FDW-based sharding

2016-03-02 Thread Josh berkus

On 02/24/2016 01:22 AM, Konstantin Knizhnik wrote:

Sorry, but based on this plan it is possible to make a conclusion that
there are only two possible cluster solutions for Postgres:
XC/XL and FDW-based.  From my point of view there are many more
possible alternatives.


Definitely.

Currently we have five approaches to sharding inside postgres in the 
field, in chronological order:


1. Greenplum's executor-based approach with motion nodes

2. Skype's function-based approach (PL/proxy)

3. XC/XL's approach, which I believe is also query executor-based

4. CitusDB's pg_shard which is based on query hooks

5. FDW-based (currently theoretical)

One of the things which causes bad reactions and arguments, Bruce, is 
that a lot of your posts and presentations detailing plans for the FDW 
approach carry the subtext that all four of the other approaches are 
dead ends and not worth considering.  Given that the other approaches, 
whatever their limitations, have working code in the field and the FDW 
approach does not, that's more than a little offensive.


If we want to move forwards on serious work on FDW-based sharding, the 
folks working on it should stop treating it as a "fait accompli" that 
this is the Chosen Way for the PostgreSQL project.  Otherwise, you'll 
spend all of your time arguing that point instead of working on features 
that matter.


Bruce made a long comparison with built-in replication, but there's a 
big difference here.  We decided that WAL-based replication was the way 
to go for built-in as a community decision here on -hackers and at 
various conferences.  Both the plan and the implementation for 
replication transcended company backing, involving even active 
competitors, and involved discussions with maintainers of the older 
replication projects.


In contrast, this FDW plan *still* feels very much like a small group 
made up of employees of only two companies came up with it in private 
and decided that it should be the plan for the whole project.  I know 
that Bruce and others have good reasons for starting the FDW project, 
but there hasn't been much of an attempt to obtain community consensus 
around it. If Bruce and others want contributors to work on FDWs instead 
of other sharding approaches, then they need to win over those people as 
to why they should do that.  It's how this community works.


Alternately, you can just work on the individual FDW features, which 
*everyone* thinks are a good idea, and when most of them are done, 
FDW-based scaleout will be such an obvious solution that nobody will 
argue with it.


--
--
Josh Berkus
Red Hat OSAS
(any opinions are my own)


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] More stable query plans via more predictable column statistics

2016-03-02 Thread Alvaro Herrera
Shulgin, Oleksandr wrote:

> Alright.  I'm attaching the latest version of this patch split in two
> parts: the first one is NULLs-related bugfix and the second is the
> "improvement" part, which applies on top of the first one.

So is this null-related bugfix supposed to be backpatched?  (I assume
it's not because it's very likely to change existing plans).

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Convert pltcl from strings to objects

2016-03-02 Thread Tom Lane
Jim Nasby  writes:
> [ pltcl_objects_2.patch ]

I've pushed this with some minor fixes, as well as the followup work
mentioned in this thread.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] create opclass documentation outdated

2016-03-02 Thread Alvaro Herrera
Tom Lane wrote:

> In create_opclass.sgml, not only do we have the list of AMs supporting
> STORAGE, but there is also a paragraph describing which AMs do what
> for input datatypes of FUNCTION members (around line 175).

I placed BRIN together with gist/gin/spgist here, namely that the optype
defaults to the opclass' datatype.

> xindex.sgml has that one list you mentioned, but there's not much
> point in fixing that when BRIN indexes fail to be described either in
> 35.14.2 or 35.14.3, where they certainly should be.  And there are
> other places throughout that chapter that say that such-and-such AMs
> support each feature as it's discussed.

In 35.14.2 I added the table of strategies in the "minmax" opclasses;
the list for the "inclusion" opclasses would be much longer.  Since GIN
doesn't document anything other than its array support, I'm not really
sure that we want/need to document the "inclusion" opclasses also.

In 35.14.3 I added the table of four basic support procs.  I think
that's all that's strictly necessary.  For inclusion we have a bunch of
additional support procs.

I also added a paragraph to 35.14.5, indicating that "minmax" is pretty
much the same as btree.  "Inclusion" opclasses are supposed to support
cross-datatype operators too (supposedly we could support
above/below/containment etc for points relative to boxes, for example,
if we got around to make a decision regarding floating point comparisons
in geometry operators), but we don't have any, so I hesitate to say
anything about this.

Patch attached.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
diff --git a/doc/src/sgml/ref/create_opclass.sgml b/doc/src/sgml/ref/create_opclass.sgml
index 527f33c..7b9d55d 100644
--- a/doc/src/sgml/ref/create_opclass.sgml
+++ b/doc/src/sgml/ref/create_opclass.sgml
@@ -172,7 +172,7 @@ CREATE OPERATOR CLASS name [ DEFAUL
   the input data type(s) of the function (for B-tree comparison functions
   and hash functions)
   or the class's data type (for B-tree sort support functions and all
-  functions in GiST, SP-GiST and GIN operator classes).  These defaults
+  functions in GiST, SP-GiST, GIN and BRIN operator classes).  These defaults
   are correct, and so op_type need not be specified in
   FUNCTION clauses, except for the case of a B-tree sort
@@ -232,7 +232,7 @@ CREATE OPERATOR CLASS name [ DEFAUL
  
   The data type actually stored in the index.  Normally this is
   the same as the column data type, but some index methods
-  (currently GiST and GIN) allow it to be different.  The
+  (currently GiST, GIN and BRIN) allow it to be different.  The
   STORAGE clause must be omitted unless the index
   method allows a different type to be used.
  
diff --git a/doc/src/sgml/xindex.sgml b/doc/src/sgml/xindex.sgml
index aaf9d8c..7a81ddd 100644
--- a/doc/src/sgml/xindex.sgml
+++ b/doc/src/sgml/xindex.sgml
@@ -323,6 +323,49 @@

 
   
+   BRIN indexes are similar to GiST, SP-GiST and GIN indexes in that they
+   don't have a fixed set of strategies either.  Instead the support routines
+   of each operator class interpret the strategy numbers according to the
+   operator class's definition. As an example, the strategy numbers used by
+   the built-in Minmax operator classes are shown in
+   .
+  
+
+   
+BRIN Minmax Strategies
+
+ 
+  
+   Operation
+   Strategy Number
+  
+ 
+ 
+  
+   less than
+   1
+  
+  
+   less than or equal
+   2
+  
+  
+   equal
+   3
+  
+  
+   greater than or equal
+   4
+  
+  
+   greater than
+   5
+  
+ 
+
+   
+
+  
Notice that all the operators listed above return Boolean values.  In
practice, all operators defined as index method search operators must
return type boolean, since they must appear at the top
@@ -602,6 +645,53 @@

 
   
+   BRIN indexes have four basic support functions, as shown in
+   ; those basic functions
+   may require additional support functions to be provided.
+   (For more information see .)
+  
+
+   
+BRIN Support Functions
+
+ 
+  
+   Function
+   Description
+   Support Number
+  
+ 
+ 
+  
+   opcInfo
+   
+return internal information describing the indexed columns'
+summary data
+   
+   1
+  
+  
+   add_value
+   add a new value to an existing summary index tuple
+   2
+  
+  
+   consistent
+   determine whether value matches query condition
+   3
+  
+  
+   union
+   
+compute union of two summary tuples
+   
+   4
+  
+ 
+
+   
+
+  
Unlike search operators, support functions return whichever data
type the particular index method expects; for example in the case
of the comparis

Re: Commitfest Bug (was: [HACKERS] Re: Reusing abbreviated keys during second pass of ordered [set] aggregates)

2016-03-02 Thread Peter Geoghegan
On Wed, Mar 2, 2016 at 5:41 AM, Magnus Hagander  wrote:
> Ok, I've pushed a code that does that.

Thank you.


-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] More stable query plans via more predictable column statistics

2016-03-02 Thread Shulgin, Oleksandr
On Wed, Mar 2, 2016 at 5:46 PM, David Steele  wrote:

> On 3/2/16 11:10 AM, Shulgin, Oleksandr wrote:
> > On Wed, Feb 24, 2016 at 12:30 AM, Tomas Vondra
> > mailto:tomas.von...@2ndquadrant.com>>
> wrote:
> >
> > I think it'd be useful not to have all the changes in one lump, but
> > structure this as a patch series with related changes in separate
> > chunks. I doubt we'd like to mix the changes in a single commit, and
> > it makes the reviews and reasoning easier. So those NULL-handling
> > fixes should be in one patch, the MCV patches in another one.
> >
> >
> > OK, such a split would make sense to me.  Though, I'm a bit late as the
> > commitfest is already closed to new patches, I guess asking the CF
> > manager to split this might work (assuming I produce the patch files)?
>
> If the patch is broken into two files that gives the review/committer
> more options but I don't think it requires another CF entry.
>

Alright.  I'm attaching the latest version of this patch split in two
parts: the first one is NULLs-related bugfix and the second is the
"improvement" part, which applies on top of the first one.

--
Alex
From a6b1cb866f9374cdc893e9a318959eccaa5bfbc9 Mon Sep 17 00:00:00 2001
From: Oleksandr Shulgin 
Date: Wed, 2 Mar 2016 18:18:36 +0100
Subject: [PATCH 1/2] Account for NULLs in ANALYZE more strictly

Previously the ndistinct and avgcount calculation (for MCV list) could
be affected greatly by high fraction of NULLs in the sample.  Account
for that by subtracting the number of NULLs we've seen from the total
sample size explicitly.

At the same time, values that are considered "too wide" are accounted
for in ndistinct, but removed from sample size for MCV list
calculation.  In compute_distinct_stats() we need to do that manually,
in compute_scalar_stats() the value_cnt is already holding the number
of non-null, not too-wide values.
---
 src/backend/commands/analyze.c | 42 --
 1 file changed, 24 insertions(+), 18 deletions(-)

diff --git a/src/backend/commands/analyze.c b/src/backend/commands/analyze.c
index 8a5f07c..f05b496 100644
--- a/src/backend/commands/analyze.c
+++ b/src/backend/commands/analyze.c
@@ -2085,17 +2085,21 @@ compute_distinct_stats(VacAttrStatsP stats,
 		denom,
 		stadistinct;
 
-			numer = (double) samplerows *(double) d;
+			double		samplerows_nonnull = samplerows - null_cnt;
+			double		totalrows_nonnull
+			= totalrows * (1.0 - stats->stanullfrac);
 
-			denom = (double) (samplerows - f1) +
-(double) f1 *(double) samplerows / totalrows;
+			numer = samplerows_nonnull * (double) d;
+
+			denom = (samplerows_nonnull - f1) +
+(double) f1 * samplerows_nonnull / totalrows_nonnull;
 
 			stadistinct = numer / denom;
 			/* Clamp to sane range in case of roundoff error */
 			if (stadistinct < (double) d)
 stadistinct = (double) d;
-			if (stadistinct > totalrows)
-stadistinct = totalrows;
+			if (stadistinct > totalrows_nonnull)
+stadistinct = totalrows_nonnull;
 			stats->stadistinct = floor(stadistinct + 0.5);
 		}
 
@@ -2124,16 +2128,17 @@ compute_distinct_stats(VacAttrStatsP stats,
 			/* Track list includes all values seen, and all will fit */
 			num_mcv = track_cnt;
 		}
-		else
+		else if (track_cnt != 0)
 		{
+			int			sample_cnt = nonnull_cnt - toowide_cnt;
 			double		ndistinct = stats->stadistinct;
 			double		avgcount,
 		mincount;
 
 			if (ndistinct < 0)
-ndistinct = -ndistinct * totalrows;
+ndistinct = -ndistinct * sample_cnt;
 			/* estimate # of occurrences in sample of a typical value */
-			avgcount = (double) samplerows / ndistinct;
+			avgcount = (double) sample_cnt / ndistinct;
 			/* set minimum threshold count to store a value */
 			mincount = avgcount * 1.25;
 			if (mincount < 2)
@@ -2434,17 +2439,21 @@ compute_scalar_stats(VacAttrStatsP stats,
 		denom,
 		stadistinct;
 
-			numer = (double) samplerows *(double) d;
+			double		samplerows_nonnull = samplerows - null_cnt;
+			double		totalrows_nonnull
+			= totalrows * (1.0 - stats->stanullfrac);
+
+			numer = samplerows_nonnull * (double) d;
 
-			denom = (double) (samplerows - f1) +
-(double) f1 *(double) samplerows / totalrows;
+			denom = (samplerows_nonnull - f1) +
+(double) f1 * samplerows_nonnull / totalrows_nonnull;
 
 			stadistinct = numer / denom;
 			/* Clamp to sane range in case of roundoff error */
 			if (stadistinct < (double) d)
 stadistinct = (double) d;
-			if (stadistinct > totalrows)
-stadistinct = totalrows;
+			if (stadistinct > totalrows_nonnull)
+stadistinct = totalrows_nonnull;
 			stats->stadistinct = floor(stadistinct + 0.5);
 		}
 
@@ -2480,21 +2489,18 @@ compute_scalar_stats(VacAttrStatsP stats,
 		}
 		else
 		{
-			double		ndistinct = stats->stadistinct;
 			double		avgcount,
 		mincount,
 		maxmincount;
 
-			if (ndistinct < 0)
-ndistinct = -ndistinct * totalrows;
 			/* estimate # of occurrences in sample of a typical value 

Re: [HACKERS] Updated backup APIs for non-exclusive backups

2016-03-02 Thread Marco Nenciarini
Hi Magnus,

I've finally found some time to take a look at the patch.

It applies with some fuzziness on master, but the result looks correct.
Unfortunately the OID of the new pg_stop_backup function conflicts with
"pg_blocking_pids()" patch (52f5d578d6c29bf254e93c69043b817d4047ca67).

After changing it the patch does not compile:

gcc -Wall -Wmissing-prototypes -Wpointer-arith
-Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute
-Wformat-security -fno-strict-aliasing -fwrapv
-Wno-unused-command-line-argument -g -O0 -g -fno-omit-frame-pointer
-I../../../../src/include
-I/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.11.sdk/usr/include/libxml2
-I/usr/local/include -I/usr/local/opt/openssl/include -c -o xlog.o
xlog.c -MMD -MP -MF .deps/xlog.Po
xlog.c:1:19: error: use of undeclared identifier 'tblspc_mapfbuf';
did you mean 'tblspcmapfile'?
initStringInfo(&tblspc_mapfbuf);
^~
tblspcmapfile
xlog.c:9790:19: note: 'tblspcmapfile' declared here
StringInfo tblspcmapfile, bool infotbssize,
^
xlog.c:10073:22: error: use of undeclared identifier 'tblspc_mapfbuf';
did you mean 'tblspcmapfile'?
appendStringInfo(&tblspc_mapfbuf, "%s %s\n", ti->oid, ti->path);
^~
tblspcmapfile
xlog.c:9790:19: note: 'tblspcmapfile' declared here
StringInfo tblspcmapfile, bool infotbssize,
^
xlog.c:10092:19: error: use of undeclared identifier 'labelfbuf'; did
you mean 'labelfile'?
initStringInfo(&labelfbuf);
^
labelfile
xlog.c:9789:19: note: 'labelfile' declared here
StringInfo labelfile, DIR *tblspcdir, List **tablespaces,
^
xlog.c:10099:21: error: use of undeclared identifier 'labelfbuf'; did
you mean 'labelfile'?
appendStringInfo(&labelfbuf, "START WAL LOCATION: %X/%X (file %s)\n",
^
labelfile
xlog.c:9789:19: note: 'labelfile' declared here
StringInfo labelfile, DIR *tblspcdir, List **tablespaces,
^
xlog.c:10101:21: error: use of undeclared identifier 'labelfbuf'; did
you mean 'labelfile'?
appendStringInfo(&labelfbuf, "CHECKPOINT LOCATION: %X/%X\n",
^
labelfile
xlog.c:9789:19: note: 'labelfile' declared here
StringInfo labelfile, DIR *tblspcdir, List **tablespaces,
^
xlog.c:10103:21: error: use of undeclared identifier 'labelfbuf'; did
you mean 'labelfile'?
appendStringInfo(&labelfbuf, "BACKUP METHOD: %s\n",
^
labelfile
xlog.c:9789:19: note: 'labelfile' declared here
StringInfo labelfile, DIR *tblspcdir, List **tablespaces,
^
xlog.c:10105:21: error: use of undeclared identifier 'labelfbuf'; did
you mean 'labelfile'?
appendStringInfo(&labelfbuf, "BACKUP FROM: %s\n",
^
labelfile
xlog.c:9789:19: note: 'labelfile' declared here
StringInfo labelfile, DIR *tblspcdir, List **tablespaces,
^
xlog.c:10107:21: error: use of undeclared identifier 'labelfbuf'; did
you mean 'labelfile'?
appendStringInfo(&labelfbuf, "START TIME: %s\n", strfbuf);
^
labelfile
xlog.c:9789:19: note: 'labelfile' declared here
StringInfo labelfile, DIR *tblspcdir, List **tablespaces,
^
xlog.c:10108:21: error: use of undeclared identifier 'labelfbuf'; did
you mean 'labelfile'?
appendStringInfo(&labelfbuf, "LABEL: %s\n", backupidstr);
^
labelfile
xlog.c:9789:19: note: 'labelfile' declared here
StringInfo labelfile, DIR *tblspcdir, List **tablespaces,
^
xlog.c:10142:15: error: use of undeclared identifier 'labelfbuf'
if (fwrite(labelfbuf.data, labelfbuf.len, 1, fp) != 1 ||
^
xlog.c:10142:31: error: use of undeclared identifier 'labelfbuf'
if (fwrite(labelfbuf.data, labelfbuf.len, 1, fp) != 1 ||
^
xlog.c:10151:10: error: use of undeclared identifier 'labelfbuf'
pfree(labelfbuf.data);
^
xlog.c:10154:8: error: use of undeclared identifier 'tblspc_mapfbuf'
if (tblspc_mapfbuf.len > 0)
^
xlog.c:10178:16: error: use of undeclared identifier 'tblspc_mapfbuf'
if (fwrite(tblspc_mapfbuf.data, tblspc_mapfbuf.len, 1, fp) != 1 ||
^
xlog.c:10178:37: error: use of undeclared identifier 'tblspc_mapfbuf'
if (fwrite(tblspc_mapfbuf.data, tblspc_mapfbuf.len, 1, fp) != 1 ||
^
xlog.c:10189:10: error: use of undeclared identifier 'tblspc_mapfbuf'
pfree(tblspc_mapfbuf.data);
^
xlog.c:10193:17: error: use of undeclared identifier 'labelfbuf'
*labelfile = labelfbuf.data;
^
xlog.c:10194:8: error: use of undeclared identifier 'tblspc_mapfbuf'
if (tblspc_mapfbuf.len > 0)
^
xlog.c:10195:22: error: use of undeclared identifier 'tblspc_mapfbuf'
*tblspcmapfile = tblspc_mapfbuf.data;
^
19 errors generated.
make[4]: *** [xlog.o] Error 1
make[3]: *** [transam-recursive] Error 2
make[2]: *** [access-recursive] Error 2
make[1]: *** [all-backend-recurse] Error 2
make: *** [all-src-recurse] Error 2

I've searched in past commits for any recent change that involves the
affected lines, but I have not found any.
Maybe some changes are missing?

Regards,
Marco

-- 
Marco Nenciarini - 2ndQuadrant Italy
PostgreSQL Training, Services and Support
marco.nenciar...@2ndquadrant.it | www.2ndQuadrant.it





Re: [HACKERS] Publish autovacuum informations

2016-03-02 Thread Julien Rouhaud
On 02/03/2016 07:30, Michael Paquier wrote:
> On Tue, Mar 1, 2016 at 11:37 PM, Julien Rouhaud wrote:
>> I'm not sure what are the fancy things that Michael had in mind with
>> exposing the private structure.  Michael, was it something like having
>> the ability to change some of these data through an extension?
> 
> I was referring to you here :)
> I have witnessed already many fancy things coming out of brain, and I
> have no doubt you could make something out of just a sampling of data.

:D

> Jokes apart, what I mean with fancy things here is putting the
> sampling data in a shape that a user suits to him. It would be easy to
> exploit that for example in a background worker that scans
> periodically the shared memory, or have an extension that represents
> this shared memory data into something that can be queried at SQL
> level. Now, the main use case that I have with the data available in
> shared memory is more or less:
> - represent the current shared memory as SQL
> - Scan this data, reshape it and report it elsewhere, say a background
> worker printing out a file.
> 
> Now, take your set of hooks... There are 4 hooks here:
> - autovacuum_list_tables_hook, to do something with the list of tables
> a worker has collected, at the moment they have been collected.
> - autovacuum_end_table_hook, to do something when knowing that a table
> is skipped or cancelled
> - autovacuum_begin_table_hook, to trigger something when a relation is
> beginning to be processed.
> - autovacuum_database_finished_hook, to trigger something once a
> database is done with its processing.
> 
> The only use cases that I have in mind here for those hooks, which
> would help in decision-making to tune autovacuum-related parameters
> would be the following:
> - Log entries regarding those operations, then why not introducing a
> GUC parameter that is an on/off switch, like log_autovacuum (this is
> not a per-relation parameter), defaulting to off. Those could be used
> by pgbadger.

This would be nice, but the point of this proposal is to be able to have
this available at SQL level. (but big +1 on the feature)

> - Have system statistics with a new system relation like
> pg_stat_autovacuum, and make this information available to user.
> Are there other things that you think could make use those hooks? Your
> extension just does pg_stat_autovacuum and emulates the existing
> pg_stat_* facility when gathering information about the global
> autovacuum statistics. So it seems to me that those hooks are not that
> necessary, and that this may help in tuning a system, particularly the
> number of relations skipped would be interesting to have.
> 
> The stats could have a delay, the point being to have hints on how
> autovacuum workers are sorting things out. In short, I am doubtful
> about the need of hooks in those code paths, the thing that we should
> try to do instead is to improve native solutions to give user more
> information regarding how autovacuum works, which help in tuning it.
> 

Good point, I don't see a lot of information available with these hooks
that native system statistics couldn't offer. To have the same amount
of information, I think we'd need a pg_stat_autovacuum view that gives
real-time insight into the workers, and also to add some aggregated
counters to PgStat_StatTabEntry. I wonder if adding counters to
PgStat_StatTabEntry would be accepted though.

-- 
Julien Rouhaud
http://dalibo.com - http://dalibo.org


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_dump / copy bugs with "big lines" ?

2016-03-02 Thread Daniel Verite
Tomas Vondra wrote:

> My guess is this is a problem at the protocol level - the 'd' message is 
> CopyData, and all the messages use int32 to define length. So if there's 
> a 2GB row, it's likely to overflow.

Yes. Besides, the full message includes a negative length:

> postgres=# \copy big2 to /dev/null
> lost synchronization with server: got message type "d", length -1568669676

which happens to be the correct size if interpreted as an unsigned int32

-1568669676 = (int) (1300UL*1024*1024*2 + 3 + 3*4 + 1 + 4)
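
That arithmetic is easy to double-check outside the backend; a tiny
standalone sketch (nothing PostgreSQL-specific, just the Int32
reinterpretation):

#include <stdio.h>
#include <stdint.h>

int
main(void)
{
    /* the CopyData length computed above, kept as an unsigned 32-bit value */
    uint32_t    len = 1300UL * 1024 * 1024 * 2 + 3 + 3 * 4 + 1 + 4;

    /* the same bit pattern read back as a signed Int32, as the client did */
    printf("unsigned: %u  signed: %d\n", (unsigned) len, (int) (int32_t) len);
    return 0;
}

which on the usual two's-complement platforms prints 2726297620 and
-1568669676, the value psql complained about.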

One interpretation would be that putting an unsigned length in
CopyData message is a protocol violation.

However it's not clear to me that Int32 in the doc necessarily designates
a signed integer.

Int32 is defined as:
   Intn(i) 

   An n-bit integer in network byte order (most significant byte
   first). If i is specified it is the exact value that will appear,
   otherwise the value is variable. Eg. Int16, Int32(42).

There's at least one example where we use Int16 as unsigned:
the number of parameters in Bind (F) can be up to 65535.
This maximum is tested explicitly and referred to at several
places in fe-exec.

In some instances, Int32 is clearly signed, because -1 is accepted
to indicate NULLness, such as again in Bind (F) for the length of
the parameter value.

From this it seems to me that Intn is to be interpreted as
signed or unsigned on a case by case basis.

Back to CopyData (F & B), it's documented as:

  Byte1('d')
  Identifies the message as COPY data.

  Int32
  Length of message contents in bytes, including self.

  Byten
  Data that forms part of a COPY data stream. Messages sent from the
  backend will always correspond to single data rows, but messages sent
  by frontends might divide the data stream arbitrarily.

I don't see any hint that this length is signed, nor any reason for it
to be signed.

I guess before the patch it didn't matter, for the B case at least,
because the backend never sent more than 1GB.

Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] multivariate statistics v10

2016-03-02 Thread Tomas Vondra

Hi,

On 03/02/2016 05:17 PM, Thom Brown wrote:
...

Well, firstly, the patches all apply.

But I have a question (which is coming really late, but I'll ask it
anyway).  Is it intended that CREATE STATISTICS will only be for
multivariate statistics?  Or do you think we could add support for
expression statistics in future too?

e.g.

CREATE STATISTICS stats_comment_length ON comments (length(comment));


Hmmm, that's not a use case I had in mind while working on the patch, 
but it sounds interesting. I don't see why the syntax would not support 
this - I'd like to add support for expressions into the multivariate 
patch, but that will still require at least 2 columns to build 
multivariate statistics. But perhaps it'd be possible to relax the "at 
least 2 columns" requirement, and collect regular statistics somewhere.


So I don't see why the syntax could not work for that case too, but I'm 
not going to work on that.





I also note that the docs contain this:

CREATE STATISTICS [ IF NOT EXISTS ] statistics_name ON table_name ( [
   { column_name } ] [, ...])
[ WITH ( statistics_parameter [= value] [, ... ] )

The open square bracket before WITH doesn't get closed.  Also, it
indicates that columns are entirely optional, so () would be valid, but
that's not the case. Also, a space is missing after the first
ellipsis.  So I think this should read:

CREATE STATISTICS [ IF NOT EXISTS ] statistics_name ON table_name (
   { column_name } [, ... ])
[ WITH ( statistics_parameter [= value] [, ... ] ) ]


Yeah, will fix.


regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] More stable query plans via more predictable column statistics

2016-03-02 Thread David Steele
On 3/2/16 11:10 AM, Shulgin, Oleksandr wrote:
> On Wed, Feb 24, 2016 at 12:30 AM, Tomas Vondra
> mailto:tomas.von...@2ndquadrant.com>> wrote:
> 
> I think it'd be useful not to have all the changes in one lump, but
> structure this as a patch series with related changes in separate
> chunks. I doubt we'd like to mix the changes in a single commit, and
> it makes the reviews and reasoning easier. So those NULL-handling
> fixes should be in one patch, the MCV patches in another one.
> 
> 
> OK, such a split would make sense to me.  Though, I'm a bit late as the
> commitfest is already closed to new patches, I guess asking the CF
> manager to split this might work (assuming I produce the patch files)?

If the patch is broken into two files that gives the review/committer
more options but I don't think it requires another CF entry.

-- 
-David
da...@pgmasters.net





Re: [HACKERS] multivariate statistics v10

2016-03-02 Thread Thom Brown
On 2 March 2016 at 14:56, Tomas Vondra  wrote:
>
> Hi,
>
> Attached is v10 of the patch series. There are 9 parts at the moment:
>
>   0001-teach-pull_-varno-varattno-_walker-about-RestrictInf.patch
>   0002-shared-infrastructure-and-functional-dependencies.patch
>   0003-clause-reduction-using-functional-dependencies.patch
>   0004-multivariate-MCV-lists.patch
>   0005-multivariate-histograms.patch
>   0006-multi-statistics-estimation.patch
>   0007-multivariate-ndistinct-coefficients.patch
>   0008-change-how-we-apply-selectivity-to-number-of-groups-.patch
>   0009-fixup-of-regression-tests-plans-changes-by-group-by-.patch
>
> However, the first one is still just a temporary workaround that I plan to 
> address next, and the last 3 are all dealing with the ndistinct coefficients 
> (and shall be squashed into a single chunk).
>
>
> README docs
> ---
>
> Aside from fixing a few bugs, there are several major improvements, the main 
> one being that I've moved most of the comments explaining how it all works 
> into a set of regular README files, located in src/backend/utils/mvstats:
>
> 1) README.stats - Overview of available types of statistics, what
>clauses can be estimated, how multiple statistics are combined etc.
>This is probably the right place to start.
>
> 2) docs for each type of statistics currently available
>
>README.dependencies - soft functional dependencies
>README.mcv  - MCV lists
>README.histogram- histograms
>README.ndistinct- ndistinct coefficients
>
> The READMEs are added and modified through the patch series, so the best 
> thing to do is apply all the patches and start reading.
>
> I have not improved the user-oriented SGML documentation in this patch, 
> that's one of the tasks I'd lie to work on next. But the READMEs should give 
> you a good idea how it's supposed to work, and there are some examples of use 
> in the regression tests.
>
>
> Significantly simplified places
> ---
>
> The patch version also significantly simplifies several places that were 
> needlessly complex in the previous ones - firstly the function evaluating 
> clauses on multivariate histograms was rather needlessly bloated, so I've 
> simplified it a lot. Similarly for the code in clauselist_select() that 
> combines multiple statistics to estimate a list of clauses - that's much 
> simpler now too. And various other pieces.
>
> That being said, I still think the code in clausesel.c can be simplified. I 
> feel there's a lot of cruft, mostly due to unknowingly implementing something 
> that could be solved by an existing function.
>
> A prime example of that is inspecting the expression tree to check if we know 
> how to estimate the clauses using the multivariate statistics. That sounds 
> like a nice match for expression walker, but currently is done by custom 
> code. I plan to look at that next.
>
> Also, I'm not quite sure I understand what the varRelid parameter of 
> clauselist_selectivity is for, so the code may be handling that wrong (seems 
> to be working though).
>
>
> ndistinct coefficients
> --
>
> The one new piece in this patch is the GROUP BY estimation, based on the 
> ndistinct coefficients. So for example you can do this:
>
> CREATE TABLE t AS SELECT mod(i,1000) AS a, mod(i,1000) AS b
> FROM generate_series(1,100) s(i);
> ANALYZE t;
> EXPLAIN SELECT * FROM t GROUP BY a, b;
>
> which currently does this:
>
>   QUERY PLAN
> ---
>  Group  (cost=127757.34..135257.34 rows=6 width=8)
>Group Key: a, b
>->  Sort  (cost=127757.34..130257.34 rows=100 width=8)
>  Sort Key: a, b
>  ->  Seq Scan on t  (cost=0.00..14425.00 rows=100 width=8)
> (5 rows)
>
> but we know that there are only 1000 groups because the columns are 
> correlated. So let's create ndistinct statistics on the two columns:
>
> CREATE STATISTICS s1 ON t (a,b) WITH (ndistinct);
> ANALYZE t;
>
> which results in estimates like this:
>
>QUERY PLAN
> -
>  HashAggregate  (cost=19425.00..19435.00 rows=1000 width=8)
>Group Key: a, b
>->  Seq Scan on t  (cost=0.00..14425.00 rows=100 width=8)
> (3 rows)
>
> I'm not quite sure how to combine this type of statistics with MCV lists and 
> histograms, so for now it's used only for GROUP BY.

Well, firstly, the patches all apply.

But I have a question (which is coming really late, but I'll ask it
anyway).  Is it intended that CREATE STATISTICS will only be for
multivariate statistics?  Or do you think we could add support for
expression statistics in future too?

e.g.

CREATE STATISTICS stats_comment_length ON comments (length(comment));


I also note that the docs contain this:

CREATE STATISTICS [ IF NOT EXISTS ] statis

Re: [HACKERS] More stable query plans via more predictable column statistics

2016-03-02 Thread Shulgin, Oleksandr
On Wed, Feb 24, 2016 at 12:30 AM, Tomas Vondra  wrote:

> Hi,
>
> On 02/08/2016 03:01 PM, Shulgin, Oleksandr wrote:
> >
> ...
>
>>
>> I've incorporated this fix into the v2 of my patch, I think it is
>> related closely enough.  Also, added corresponding changes to
>> compute_distinct_stats(), which doesn't produce a histogram.
>>
>
> I think it'd be useful not to have all the changes in one lump, but
> structure this as a patch series with related changes in separate chunks. I
> doubt we'd like to mix the changes in a single commit, and it makes the
> reviews and reasoning easier. So those NULL-handling fixes should be in one
> patch, the MCV patches in another one.


OK, such a split would make sense to me.  Though, I'm a bit late as the
commitfest is already closed to new patches, I guess asking the CF manager
to split this might work (assuming I produce the patch files)?

--
Alex


Re: [HACKERS] pl/pgsql exported functions

2016-03-02 Thread Marko Tiikkaja

On 11/02/16 18:29, Magnus Hagander wrote:

Most of the pl/pgsql functions and variables are prefixed plpgsql_, so they
don't risk conflicting with other shared libraries loaded.

There are a couple that are not prefixed. Attached patch fixes that. It's
mostly a cleanup, but I think it's something we should do since it's only 2
variables and 2 functions.

AFAICT these are clearly meant to be internal. (the plugin variable is
accessed through find_rendezvous_variable)

Comments?


Looks good to me.
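
(For anyone who hasn't seen the rendezvous-variable mechanism: an external
profiler/debugger hooks into pl/pgsql roughly like the sketch below, so
nothing outside the module needs the unprefixed symbols directly.)

#include "postgres.h"
#include "fmgr.h"
#include "plpgsql.h"

PG_MODULE_MAGIC;

/* callback table; real code would fill in func_beg, stmt_beg, etc. */
static PLpgSQL_plugin my_plugin;

void
_PG_init(void)
{
    /* look up (or create) the named rendezvous pointer ... */
    PLpgSQL_plugin **plugin_ptr =
        (PLpgSQL_plugin **) find_rendezvous_variable("PLpgSQL_plugin");

    /* ... and point it at our callback table; pl_exec.c will call into it */
    *plugin_ptr = &my_plugin;
}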


.m


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Addition of extra commit fest entry to park future patches

2016-03-02 Thread Kevin Grittner
On Tue, Mar 1, 2016 at 7:47 PM, Michael Paquier
 wrote:

> 2016-09 has been created then:
> https://commitfest.postgresql.org/10/
> People, feel free to park future patches there.

I think that should be in status "open" rather than "future".

-- 
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Re: redo failed in physical streaming replication while stopping the master server

2016-03-02 Thread wcting
/* 
 * If at page start, we must skip over the page header.  But
we can't 
 * do that until we've read in the page, since the header
size is 
 * variable. 
 */ 

I don't know the meaning behind this comment,

 if ((RecPtr->xrecoff % XLogSegSize) == 0) 
it's a long page header, else a short page header,

so "the header size" can be calculated, right?



--
View this message in context: 
http://postgresql.nabble.com/redo-failed-in-physical-streaming-replication-while-stopping-the-master-server-tp5889961p5890124.html
Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] TAP / recovery-test fs-level backups, psql enhancements etc

2016-03-02 Thread Tom Lane
I wrote:
> Can't use string ("Test::Builder") as a HASH ref while "strict refs" in use 
> at /usr/share/perl5/Test/Builder.pm line 1798.

> The referenced line number is the end of the file,

Oh, scratch that; I was looking at the wrong file.  Actually,
/usr/share/perl5/Test/Builder.pm has

sub details {
my $self = shift;
return @{ $self->{Test_Results} };
}

and line 1798 is the "return" statement in that.  I still lack enough
Perl-fu to decipher this, though.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] create opclass documentation outdated

2016-03-02 Thread Alvaro Herrera
Tom Lane wrote:

> I'm inclined to feel that the right answer is to fix all these
> omissions, not to decide that we're not maintaining this documentation
> anymore.  Alvaro, I think this one's in your court.

Will fix.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_dump / copy bugs with "big lines" ?

2016-03-02 Thread Alvaro Herrera
Daniel Verite wrote:

> The cause of the crash turns out to be, in enlargeStringInfo():
> 
> needed += str->len + 1;   /* total space required now */
> 
> needed is an int and str->len is an int64, so it overflows when the
> size has to grow beyond 2^31 bytes, fails to enlarge the buffer and
> overwrites memory after it.
> 
> When fixing it with a local int64 copy of the variable, the backend
> no longer crashes and COPY big2 TO 'file' appears to work.

Great, thanks for debugging.

> However, getting it to the client with \copy big2 to 'file'
> still produces the error in psql:
>lost synchronization with server: got message type "d"
> and leaves an empty file, so there are more problems to solve to
> go beyond 2GB text per row.

Well, the CopyData message has an Int32 field for the message length.
I don't know the FE/BE protocol very well but I suppose each row
corresponds to one CopyData message, or perhaps each column corresponds
to one CopyData message.  In either case, it's not possible to go beyond
2GB without changing the protocol ...

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP: Upper planner pathification

2016-03-02 Thread Alvaro Herrera
Simon Riggs wrote:

> ISTM that we are clearly "going for it"; everybody agrees we should apply
> the patch now.
> 
> The longer we hold off on applying it, the longer we wait for dependent
> changes.

Agreed -- we need this in tree as soon as realistically possible.

There is a bit of a problem here, because this patch conflicts heavily
with at least one other patch that's been in the queue for a long time,
which is Kommi/Rowley's patch for parallel aggregation; the more we
delay applying this one, the worse the deadlines for that one.

I assume they are hard at work updating that patch to apply on top of
Tom's patch.  It's not realistic to expect that we would apply any
further planner changes before this one is in.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_dump / copy bugs with "big lines" ?

2016-03-02 Thread Tomas Vondra



On 03/02/2016 04:23 PM, Tom Lane wrote:

Tomas Vondra  writes:

On 03/02/2016 03:18 PM, Daniel Verite wrote:

However, getting it to the client with \copy big2 to 'file'
still produces the error in psql:
lost synchronization with server: got message type "d"
and leaves an empty file, so there are more problems to solve to
go beyond 2GB text per row.



My guess is this is a problem at the protocol level - the 'd' message is
CopyData, and all the messages use int32 to define length. So if there's
a 2GB row, it's likely to overflow.


I'm too lazy to check the exact wording, but I don't think there's a hard
and fast promise in the protocol doc that one CopyData message == one row.
So we could probably subdivide a very wide line into multiple messages.


Well, actually we claim this [1]:

Data that forms part of a COPY data stream. Messages sent from the
backend will always correspond to single data rows, but messages
sent by frontends might divide the data stream arbitrarily.

So that suggests 1:1 messages to rows, although I'm not sure how 
difficult it would be to relax this (or how much the clients might rely 
on this).


[1] http://www.postgresql.org/docs/9.5/static/protocol-message-formats.html
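
FWIW, on the frontend->backend side that freedom is already usable from
libpq today: a client can push a single huge row as many small CopyData
messages. A minimal sketch (blocking mode, error handling mostly omitted,
and it assumes a COPY ... FROM STDIN has already been started on conn):

#include <stddef.h>
#include "libpq-fe.h"

static int
send_big_row(PGconn *conn, const char *row, size_t rowlen)
{
    const size_t chunk = 8 * 1024 * 1024;   /* 8 MB per CopyData message */
    size_t      off;

    for (off = 0; off < rowlen; off += chunk)
    {
        size_t      n = (rowlen - off < chunk) ? rowlen - off : chunk;

        if (PQputCopyData(conn, row + off, (int) n) != 1)
            return -1;          /* error (or would-block in nonblocking mode) */
    }
    return PQputCopyEnd(conn, NULL) == 1 ? 0 : -1;
}

It's the backend->frontend direction that is nailed to one message per row.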

regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_dump / copy bugs with "big lines" ?

2016-03-02 Thread Tom Lane
Tomas Vondra  writes:
> On 03/02/2016 03:18 PM, Daniel Verite wrote:
>> However, getting it to the client with \copy big2 to 'file'
>> still produces the error in psql:
>> lost synchronization with server: got message type "d"
>> and leaves an empty file, so there are more problems to solve to
>> go beyond 2GB text per row.

> My guess is this is a problem at the protocol level - the 'd' message is 
> CopyData, and all the messages use int32 to define length. So if there's 
> a 2GB row, it's likely to overflow.

I'm too lazy to check the exact wording, but I don't think there's a hard
and fast promise in the protocol doc that one CopyData message == one row.
So we could probably subdivide a very wide line into multiple messages.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_dump / copy bugs with "big lines" ?

2016-03-02 Thread Tomas Vondra

Hi,

On 03/02/2016 03:18 PM, Daniel Verite wrote:

I wrote:


postgres=# \copy big2 to /dev/null
lost synchronization with server: got message type "d", length -1568669676

The backend aborts with the following backtrace:

Program terminated with signal 6, Aborted.
#0  0x7f82ee68e165 in raise () from /lib/x86_64-linux-gnu/libc.so.6
#1  0x7f82ee6913e0 in abort () from /lib/x86_64-linux-gnu/libc.so.6
#2  0x7f82ee6c839b in ?? () from /lib/x86_64-linux-gnu/libc.so.6
#3  0x7f82ee6d1be6 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
#4  0x7f82ee6d698c in free () from /lib/x86_64-linux-gnu/libc.so.6
#5  0x007b5a89 in AllocSetDelete (context=0x)
at aset.c:643
#6  0x007b63e3 in MemoryContextDelete (context=0x1fa58c8) at
mcxt.c:229
#7  0x0055fa25 in CopyTo (cstate=0x1fb1050) at copy.c:1967


The cause of the crash turns out to be, in enlargeStringInfo():

needed += str->len + 1;  /* total space required now */

needed is an int and str->len is an int64, so it overflows when the
size has to grow beyond 2^31 bytes, fails to enlarge the buffer and
overwrites memory after it.

When fixing it with a local int64 copy of the variable, the backend
no longer crashes and COPY big2 TO 'file' appears to work.

However, getting it to the client with \copy big2 to 'file'
still produces the error in psql:
lost synchronization with server: got message type "d"
and leaves an empty file, so there are more problems to solve to
go beyond 2GB text per row.


My guess is this is a problem at the protocol level - the 'd' message is 
CopyData, and all the messages use int32 to define length. So if there's 
a 2GB row, it's likely to overflow.


regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] 2016-03 Commitfest

2016-03-02 Thread David Steele
The 2016-03 commitfest is officially in progress!

There are currently a lot of patches waiting for review but with no
reviewers:

Needs review: 97
Needs *reviewer*: 58

Please check the "needs reviewer" list
(https://commitfest.postgresql.org/9/?reviewer=-2) for patches to
review.  The committers need our help to work through this enormous load
of patches.

If this is you:

Waiting on Author: 16

Then please post an updated patch as soon as possible so it can be
reviewed.  Some of these patches have not seen any activity from the
author in a long time.

The good news is we are already 14% done with the CF:

Committed: 17
Rejected: 2
Returned with Feedback: 1

I'll post a status update on this thread at least once a week and more
often as needed.  Going forward there will be more detail on individual
patches that are not making progress for whatever reason.

Let's get reviewing!

-- 
-David
da...@pgmasters.net





Re: [HACKERS] pg_dump / copy bugs with "big lines" ?

2016-03-02 Thread Daniel Verite
I wrote:

> postgres=# \copy big2 to /dev/null
> lost synchronization with server: got message type "d", length -1568669676
> 
> The backend aborts with the following backtrace:
> 
> Program terminated with signal 6, Aborted.
> #0  0x7f82ee68e165 in raise () from /lib/x86_64-linux-gnu/libc.so.6
> #1  0x7f82ee6913e0 in abort () from /lib/x86_64-linux-gnu/libc.so.6
> #2  0x7f82ee6c839b in ?? () from /lib/x86_64-linux-gnu/libc.so.6
> #3  0x7f82ee6d1be6 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
> #4  0x7f82ee6d698c in free () from /lib/x86_64-linux-gnu/libc.so.6
> #5  0x007b5a89 in AllocSetDelete (context=0x)
>at aset.c:643
> #6  0x007b63e3 in MemoryContextDelete (context=0x1fa58c8) at
> mcxt.c:229
> #7  0x0055fa25 in CopyTo (cstate=0x1fb1050) at copy.c:1967

The cause of the crash turns out to be, in enlargeStringInfo():

needed += str->len + 1; /* total space required now */

needed is an int and str->len is an int64, so it overflows when the
size has to grow beyond 2^31 bytes, fails to enlarge the buffer and
overwrites memory after it.

When fixing it with a local int64 copy of the variable, the backend
no longer crashes and COPY big2 TO 'file' appears to work.
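
For reference, the shape of that band-aid (just a sketch against the patched
StringInfoData where len/maxlen are int64, without the MaxAllocHugeSize
clamping a real fix would need):

void
enlargeStringInfo(StringInfo str, int needed)
{
    int64       total;
    int64       newlen;

    /* previously: needed += str->len + 1;  -- wraps once len nears 2^31 */
    total = (int64) needed + str->len + 1;  /* total space required now */

    if (total <= str->maxlen)
        return;                 /* got enough space already */

    newlen = 2 * str->maxlen;
    while (total > newlen)
        newlen = 2 * newlen;

    str->data = (char *) repalloc_huge(str->data, (Size) newlen);
    str->maxlen = newlen;
}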

However, getting it to the client with \copy big2 to 'file'
still produces the error in psql:
   lost synchronization with server: got message type "d"
and leaves an empty file, so there are more problems to solve to
go beyond 2GB text per row.

Or maybe another approach would be to advertise that this is the maximum
for a row in text mode, and limit the backend's string buffer to this size
for the time being? Notwithstanding that there's still the other direction
client->server to test.


Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP: Upper planner pathification

2016-03-02 Thread Tom Lane
Robert Haas  writes:
> On Tue, Mar 1, 2016 at 10:22 AM, Tom Lane  wrote:
>> I think the default pgbench queries are too simple to have any possible
>> benefit from this patch.  It does look like you're seeing some extra
>> planning time, which I think is likely due to redundant construction
>> of PathTargets.  The new function set_pathtarget_cost_width() is not
>> very cheap, and in order to minimize the delta in this patch I did
>> not worry much about avoiding duplicate calls of it.  That's another
>> thing in a long list of things to do later ;-).  There might be other
>> pain points I haven't recognized yet.

> Yikes.  The read-only test is an 0.5% hit which isn't great, but the
> read-write test is about 5% which I think is clearly not OK.  What's
> your plan for doing something about that?

I do plan to take a look at it.  Obviously, anything that *does* benefit
from this patch is going to see some planning slowdown as a consequence
of considering more Paths.  But ideally, a query that has no grouping/
aggregation/later steps wouldn't see any difference.  I think we can
get to that --- but I'd rather not complicate v1 with the hacks that
will probably be required.

(My first thought about how to fix that is to not force
set_pathtarget_cost_width to be done immediately on PathTarget
construction, but make it a decouplable step.  I believe that
set_pathtarget_cost_width is only expensive if it's run before
query_planner runs, and we can probably finagle things so that we do not
really care about the cost/width attached to targets made before that.
But this all depends on profiling that I've not done yet...)

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP: Upper planner pathification

2016-03-02 Thread Simon Riggs
On 2 March 2016 at 13:47, Robert Haas  wrote:

> On Tue, Mar 1, 2016 at 10:22 AM, Tom Lane  wrote:
> > Teodor Sigaev  writes:
> >>> I do not think the patch will make a lot of performance difference
> as-is;
> >>> its value is more in what it will let us do later.  There are a couple
> of
> >
> >> Yep, for now on my notebook (best from 5 tries):
> >> % pgbench -i -s 3000
> >> % pgbench  -s 3000 -c 4 -j 4 -P 1 -T 60
> >> HEAD569 tps
> >> patched 542 tps
> >> % pgbench  -s 3000 -c 4 -j 4 -P 1 -T 60 -S
> >> HEAD9500 tps
> >> patched 9458 tps
> >
> >> Looks close to measurement error, but may be explained increased amount
> of work
> >> for planning. Including, may be, more complicated path tree.
> >
> > I think the default pgbench queries are too simple to have any possible
> > benefit from this patch.  It does look like you're seeing some extra
> > planning time, which I think is likely due to redundant construction
> > of PathTargets.  The new function set_pathtarget_cost_width() is not
> > very cheap, and in order to minimize the delta in this patch I did
> > not worry much about avoiding duplicate calls of it.  That's another
> > thing in a long list of things to do later ;-).  There might be other
> > pain points I haven't recognized yet.
>
> Yikes.  The read-only test is an 0.5% hit which isn't great, but the
> read-write test is about 5% which I think is clearly not OK.  What's
> your plan for doing something about that?
>

Whether an artefact of the test or a real problem, it's clearly something fixable.

ISTM that we are clearly "going for it"; everybody agrees we should apply
the patch now.

The longer we hold off on applying it, the longer we wait for dependent
changes.

My vote is apply it early (i.e. now!) and clean up as we go.

-- 
Simon Riggshttp://www.2ndQuadrant.com/

PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: [HACKERS] WIP: Upper planner pathification

2016-03-02 Thread Robert Haas
On Tue, Mar 1, 2016 at 10:22 AM, Tom Lane  wrote:
> Teodor Sigaev  writes:
>>> I do not think the patch will make a lot of performance difference as-is;
>>> its value is more in what it will let us do later.  There are a couple of
>
>> Yep, for now on my notebook (best from 5 tries):
>> % pgbench -i -s 3000
>> % pgbench  -s 3000 -c 4 -j 4 -P 1 -T 60
>> HEAD569 tps
>> patched 542 tps
>> % pgbench  -s 3000 -c 4 -j 4 -P 1 -T 60 -S
>> HEAD9500 tps
>> patched 9458 tps
>
>> Looks close to measurement error, but may be explained increased amount of 
>> work
>> for planning. Including, may be, more complicated path tree.
>
> I think the default pgbench queries are too simple to have any possible
> benefit from this patch.  It does look like you're seeing some extra
> planning time, which I think is likely due to redundant construction
> of PathTargets.  The new function set_pathtarget_cost_width() is not
> very cheap, and in order to minimize the delta in this patch I did
> not worry much about avoiding duplicate calls of it.  That's another
> thing in a long list of things to do later ;-).  There might be other
> pain points I haven't recognized yet.

Yikes.  The read-only test is an 0.5% hit which isn't great, but the
read-write test is about 5% which I think is clearly not OK.  What's
your plan for doing something about that?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: Commitfest Bug (was: [HACKERS] Re: Reusing abbreviated keys during second pass of ordered [set] aggregates)

2016-03-02 Thread Magnus Hagander
On Wed, Mar 2, 2016 at 7:34 AM, Michael Paquier 
wrote:

> On Wed, Mar 2, 2016 at 9:19 PM, Magnus Hagander 
> wrote:
> > Needs Review -> Needs Review
> > Waiting on Author -> Refuse moving
> > Ready for committer -> Ready for Committer
> > Committed -> refuse moving
> > Moved to next cf -> refuse moving (if it's already set like this, it
> would
> > probably be a bug)
> > Returned with feedback -> Refuse moving
> > Which basically means we only move things that are in needs review or
> ready
> > for committer state, and that we never actually change the status of a
> patch
> > in the moving.
> >
> > Is that a correct summary?
>
> Yes, thanks for the summary. It looks like this is what people would
> expect based on what I read here.
>
>
Ok, I've pushed a code that does that.

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


Re: Commitfest Bug (was: [HACKERS] Re: Reusing abbreviated keys during second pass of ordered [set] aggregates)

2016-03-02 Thread Michael Paquier
On Wed, Mar 2, 2016 at 9:19 PM, Magnus Hagander  wrote:
> Needs Review -> Needs Review
> Waiting on Author -> Refuse moving
> Ready for committer -> Ready for Committer
> Committed -> refuse moving
> Moved to next cf -> refuse moving (if it's already set like this, it would
> probably be a bug)
> Returned with feedback -> Refuse moving
> Which basically means we only move things that are in needs review or ready
> for committer state, and that we never actually change the status of a patch
> in the moving.
>
> Is that a correct summary?

Yes, thanks for the summary. It looks like this is what people would
expect based on what I read here.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: Commitfest Bug (was: [HACKERS] Re: Reusing abbreviated keys during second pass of ordered [set] aggregates)

2016-03-02 Thread Magnus Hagander
On Tue, Mar 1, 2016 at 10:27 AM, Tom Lane  wrote:

> Alvaro Herrera  writes:
> > Magnus Hagander wrote:
> >> That said, we can certainly reconsider that. Would we always copy the
> value
> >> over? Even if it was, say, rejected? (so it would be copied to the new
> CF
> >> but still marked rejected) Or is there a subset of behaviors you're
> looking
> >> for?
>
> > I think the states "Ready for Committer" and "Needs Review" ought to be
> > kept in the new CF.
>
> Definitely.
>
> > I'm unclear on what to do about "Returned with Feedback" and "Waiting on
> > Author"; my first instinct is that if a patch is in those states, then
> > it shouldn't be possible to move to the next CF.  On the other hand, if
> > we force the state to change to "Needs Review" before moving it, we
> > would lose the information of what state it was closed with.  So perhaps
> > for any patch in those two states, the state in the next CF should be
> > "needs review" too.
>
> +1 for not moving such patches to the new CF until the author does
> something --- at which point they'd change to "Needs Review" state.
> But we should not change them into that state without author input.
> And I don't see the value of having them in a new CF until the
> author does something.


Well, "waiting on author" is not considered to be a "closing". So as long
as we have patches in that state we want to do *something* with them in a
CF. Which currently can be "move to next cf", "reject" or "return with
feedback". So basically it means we have to decide if we want to reject it
or return-with-feedback it, if we say it should not be possible to "move to
next cf".

The problem is if the author already has something ready of course, in
which case it will become a new entry on the new CF instead of a moved one
-- which means we lose the historical connection between those two.



> > I am even more unclear on "Rejected".  My instinct says we should refuse
> > a move-to-next-cf for such patches.
>
> Right.  Rejected is dead, it shouldn't propagate forward.
>
>
Ok, so let's try to summarize. We want the following actinos when someone
clicks "move to next cf":

Needs Review -> Needs Review
Waiting on Author -> Refuse moving
Ready for committer -> Ready for Committer
Committed -> refuse moving
Moved to next cf -> refuse moving (if it's already set like this, it would
probably be a bug)
Returned with feedback -> Refuse moving

Which basically means we only move things that are in needs review or ready
for committer state, and that we never actually change the status of a
patch in the moving.

Is that a correct summary?

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


Re: [HACKERS][REVIEW]: Password identifiers, protocol aging and SCRAM protocol

2016-03-02 Thread Michael Paquier
On Wed, Mar 2, 2016 at 5:43 PM, Valery Popov  wrote:
>
>>>
>>> db_user_namespace causes the client's and
>>> server's user name representation to differ.
>>> Authentication checks are always done with the server's user name
>>> so authentication methods must be configured for the
>>> server's user name, not the client's.  Because
>>> md5 uses the user name as salt on both the
>>> client and server, md5 cannot be used with
>>> db_user_namespace.
>>>
>
Also in doc/src/sgml/ref/create_role.sgml it should be, instead of
>   PASSWORD VERIFIERS (  class="PARAMETER">verifier_type = ' class="PARAMETER">password'
> like this
>   PASSWORD VERIFIERS (  class="PARAMETER">verifier_type = ' class="PARAMETER">password'

So the  markup is missing. Thanks. I am taking note of it.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [NOVICE] WHERE clause not used when index is used

2016-03-02 Thread Simon Riggs
On 1 March 2016 at 20:03, Tom Lane  wrote:


> In any event, I am now of the opinion that this patch needs to be reverted
> outright and returned to the authors for redesign.  There are too many
> things wrong with it and too little reason to think that we have to have
> it in 9.5.
>

Agreed; I'd already got to this thought while reading the thread.

I'll get on with that.

The patch is not of great importance to us. Since we are past the 9.6 deadline
it seems worth just reverting it and leaving it for the next release, coming
back with a more isolated optimization. I don't want to add to the last CF's
workload with this.

-- 
Simon Riggshttp://www.2ndQuadrant.com/

PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: [HACKERS] [NOVICE] WHERE clause not used when index is used

2016-03-02 Thread Simon Riggs
On 1 March 2016 at 17:22, Jeff Janes  wrote:

> On Tue, Mar 1, 2016 at 7:40 AM, Tom Lane  wrote:
> > Tobias Florek  writes:
> >> When creating an index to use for an ORDER BY clause, a simple query
> >> starts to return more results than expected.  See the following detailed
> >> log.
> >
> > Ugh.  That is *badly* broken.  I thought maybe it had something to do
> with
> > the "abbreviated keys" work, but the same thing happens if you change the
> > numeric column to integer, so I'm not very sure where to look.  Who's
> > touched btree key comparison logic lately?
> >
> > (Problem is reproducible in 9.5 and HEAD, but not 9.4.)
>
>
> Bisects down to:
>
> 606c0123d627b37d5ac3f7c2c97cd715dde7842f is the first bad commit
> commit 606c0123d627b37d5ac3f7c2c97cd715dde7842f
> Author: Simon Riggs 
> Date:   Tue Nov 18 10:24:55 2014 +
>
> Reduce btree scan overhead for < and > strategies
>
> For <, <=, > and >= strategies, mark the first scan key
> as already matched if scanning in an appropriate direction.
> If index tuple contains no nulls we can skip the first
> re-check for each tuple.
>
> Author: Rajeev Rastogi
> Reviewer: Haribabu Kommi
> Rework of the code and comments by Simon Riggs
>

Mea culpa.

Looks like we'll need a new release as soon as we can lock down a fix.

-- 
Simon Riggshttp://www.2ndQuadrant.com/

PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: [HACKERS] The plan for FDW-based sharding

2016-03-02 Thread Konstantin Knizhnik



On 01.03.2016 22:02, Bruce Momjian wrote:

On Tue, Mar  1, 2016 at 07:56:58PM +0100, Petr Jelinek wrote:

Note that I am not saying that other discussed approaches are any
better, I am saying that we should know approximately what we
actually want and not just beat FDWs with a hammer and hope sharding
will eventually emerge and call that the plan.

I will say it again --- FDWs are the only sharding method I can think of
that has a chance of being accepted into Postgres core.  It is a plan,
and if it fails, it fails.  If is succeeds, that's good.  What more do
you want me to say?  I know of no other way to answer the questions you
asked above.


I do not understand why it should fail.
The FDW approach may not be flexible enough for building optimal distributed
query execution plans for complex OLAP queries, but for simple queries it
should work fine. Simple queries correspond to OLTP and simple OLAP.
For OLTP we definitely need a transaction manager to provide global
consistency, and we actually have a prototype integrating postgres_fdw with
our pg_dtm and pg_tsdtm transaction managers.

The results are, IMHO, quite promising (see the attached diagram).

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



DTM-pgconf.pdf
Description: Adobe PDF document



Re: [HACKERS] Issue with NULLS LAST, with postgres_fdw sort pushdown

2016-03-02 Thread Rajkumar Raghuwanshi
Thanks Ashutosh. I retested the issue after applying the given patch; it is
fine now.

Thanks & Regards,
Rajkumar Raghuwanshi
QMG, EnterpriseDB Corporation

On Wed, Mar 2, 2016 at 2:35 PM, Ashutosh Bapat <
ashutosh.ba...@enterprisedb.com> wrote:

> Thanks Rajkumar for your report. Let me know if the attached patch fixes
> the issue.
>
> The code did not add a NULLS LAST clause in the case when pk_nulls_first is
> false in the pathkey. PFA the fix for the same. I have also added a few tests
> to postgres_fdw.sql for a few combinations of asc/desc and nulls first/last.
>
> On Mon, Feb 29, 2016 at 3:49 PM, Rajkumar Raghuwanshi <
> rajkumar.raghuwan...@enterprisedb.com> wrote:
>
>> Hi,
>>
>> I am testing postgres_fdw sort pushdown feature for PostgreSQL 9.6 DB,
>> and I observed below issue.
>>
>> *Observation:* If the nulls last option is given with the order by clause
>> as 'desc nulls last', the remote query does not apply nulls last and gives
>> a wrong result on version 9.6, while on 9.5 it gives the proper result.
>>
>> For testing, I have a table "fdw_sort_test" on the foreign server, for which
>> a postgres_fdw foreign table is created on the local server.
>>
>>  db2=# select * from fdw_sort_test ;
>>  id | name
>> +--
>>   1 | xyz
>>   3 |
>>   2 | abc
>>   4 | pqr
>> (4 rows)
>>
>>on version 9.6 :
>>
>>  db1=# select * from fdw_sort_test order by name desc nulls last;
>>   id | name
>>  ----+------
>>3 |
>>1 | xyz
>>4 | pqr
>>2 | abc
>>  (4 rows)
>>
>>  db1=# explain verbose select * from fdw_sort_test order by name desc nulls last;
>>                                      QUERY PLAN
>>  --------------------------------------------------------------------------------
>>   Foreign Scan on public.fdw_sort_test  (cost=100.00..129.95 rows=561 width=122)
>>     Output: id, name
>>     Remote SQL: SELECT id, name FROM public.fdw_sort_test ORDER BY name DESC
>>  (3 rows)
>>
>>
>> on version 9.5 :
>>  db1=# select * from fdw_sort_test order by name desc nulls last;
>>    id | name
>>   ----+------
>> 1 | xyz
>> 4 | pqr
>> 2 | abc
>> 3 |
>>   (4 rows)
>>
>>  db1=# explain verbose select * from fdw_sort_test order by name desc nulls last;
>>                                      QUERY PLAN
>>  --------------------------------------------------------------------------------
>>   Sort  (cost=152.44..153.85 rows=561 width=122)
>>     Output: id, name
>>     Sort Key: fdw_sort_test.name DESC NULLS LAST
>>     ->  Foreign Scan on public.fdw_sort_test  (cost=100.00..126.83 rows=561 width=122)
>>           Output: id, name
>>           Remote SQL: SELECT id, name FROM public.fdw_sort_test
>>
>> *steps to reproduce : *
>>
>> --connect to sql
>> \c postgres postgres
>> --create role and database db1, will act as local server
>> create role db1 password 'db1' superuser login;
>> create database db1 owner=db1;
>> grant all on database db1 to db1;
>>
>> --create role and database db2, will act as foreign server
>> create role db2 password 'db2' superuser login;
>> create database db2 owner=db2;
>> grant all on database db2 to db2;
>>
>> --connect to db2 and create a table
>> \c db2 db2
>> create table fdw_sort_test (id integer, name varchar(50));
>> insert into fdw_sort_test values (1,'xyz');
>> insert into fdw_sort_test values (3,null);
>> insert into fdw_sort_test values (2,'abc');
>> insert into fdw_sort_test values (4,'pqr');
>>
>> --connect to db1 and create postgres_fdw
>> \c db1 db1
>> create extension postgres_fdw;
>> create server db2_link_server foreign data wrapper postgres_fdw options
>> (host 'db2_machine_ip', dbname 'db2', port 'db_machine_port_no');
>> create user mapping for db1 server db2_link_server options (user 'db2',
>> password 'db2');
>>
>> --create a foreign table
>> create foreign table fdw_sort_test (id integer, name varchar(50)) server
>> db2_link_server;
>>
>> --run the below query and c

Re: [HACKERS] The plan for FDW-based sharding

2016-03-02 Thread Alexander Korotkov
On Tue, Mar 1, 2016 at 10:11 PM, Bruce Momjian  wrote:

> On Tue, Mar  1, 2016 at 02:02:44PM -0500, Bruce wrote:
> > On Tue, Mar  1, 2016 at 07:56:58PM +0100, Petr Jelinek wrote:
> > > Note that I am not saying that other discussed approaches are any
> > > better, I am saying that we should know approximately what we
> > > actually want and not just beat FDWs with a hammer and hope sharding
> > > will eventually emerge and call that the plan.
> >
> > I will say it again --- FDWs are the only sharding method I can think of
> > that has a chance of being accepted into Postgres core.  It is a plan,
> > and if it fails, it fails.  If is succeeds, that's good.  What more do
> > you want me to say?  I know of no other way to answer the questions you
> > asked above.
>
> I guess all I can say is that if FDWs existed when Postgres XC/XL were
> being developed, that they likely would have been used or at least
> considered.  I think we are basically making that attempt now.


If FDWs had existed when Postgres XC/XL were being developed, then I believe
they would have tried to build a full-featured prototype of FDW-based
sharding. If that prototype had succeeded, then we could make a full roadmap.
For now, we don't have a full roadmap, we have only some pieces. This is why
people doubt. When you're speaking about advances that are natural to FDWs,
there is no problem; nobody is against FDW advances. However, other things
are unclear.
You can try to build a full-featured prototype to convince people. Although it
would take some resources, it would save more resources, because it would save
us from errors.

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Re: [HACKERS] The plan for FDW-based sharding

2016-03-02 Thread Alexander Korotkov
On Tue, Mar 1, 2016 at 7:03 PM, Robert Haas  wrote:

> On Tue, Mar 1, 2016 at 10:37 AM, Bruce Momjian  wrote:
> > On Tue, Mar  1, 2016 at 10:19:45AM -0500, Robert Haas wrote:
> >> > Two reasons:
> >> > 1. There is no ideal implementation of DTM which will fit all
> possible needs
> >> > and be  efficient for all clusters.
> >>
> >> Hmm, what is the reasoning behind that statement?  I mean, it is
> >> certainly true that there are some places where we have decided that
> >> one-size-fits-all is not the right approach.  Indexing, for example.
> >
> > Uh, is that even true of indexing?  While the plug-in nature of indexing
> > allows for easier development and testing, does anyone create plug-in
> > indexing that isn't shipped by us?  I thought WAL support was something
> > that prevented external indexing solutions from working.
>
> True.  There is an API, though, and having pluggable WAL support seems
> desirable too.  At the same time, I don't think we know of anyone
> maintaining a non-core index AM ... and there are probably good
> reasons for that.


It's because we didn't offer a legal mechanism for pluggable AMs.


> We end up revising the index AM API pretty
> regularly every time somebody wants to do something new, so it's not
> really a stable API that extensions can just tap into.


I can't buy this argument. One may say this about any single API. Thinking
that way leads to rejecting any extensibility, and that would be in direct
contradiction to the original Postgres concept.
During the last 5 years we added 2 new AMs: SP-GiST and BRIN. And BRIN is very
different from any AM we had before.
I wouldn't say that the AM API had dramatic changes during that time. There
were some changes, but it would be normal work for extension maintainers to
adapt to these changes, as they do for other API changes.

There is a simple example of where we lack extensible AMs: fast full-text
search. We can't provide it with the current GIN, because it lacks positional
information. And we can't push these advances into core because the current
implementation does not have a perfect design. The ideal design would be to
push all required functionality into btree, then make GIN a wrapper over
btree, then add the required functionality on top. But that is a roadmap for
5-10 years, and during those 5-10 years users will suffer from having
third-party solutions for fast FTS instead of an in-core one. Our design
questions are actually not something that users care about; they are not
reliability questions. Having pluggable AMs would be a real chance in this
situation: users could use an extension right now, and when, after many years,
we finally implement the right design, they could migrate to the in-core
solution. But 5-10 years of fast FTS does matter.


> I suspect that
> a transaction manager API would end up similarly situated.
>

I disagree with you about the AM API. But I agree that a TM API would be in a
similar situation to the AM API.

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Re: [HACKERS] The plan for FDW-based sharding

2016-03-02 Thread Oleg Bartunov
On Wed, Mar 2, 2016 at 4:36 AM, Tomas Vondra 
wrote:

Hi,
>
> On 03/01/2016 08:02 PM, Bruce Momjian wrote:
>
>> On Tue, Mar  1, 2016 at 07:56:58PM +0100, Petr Jelinek wrote:
>>
>>> Note that I am not saying that other discussed approaches are any
>>> better, I am saying that we should know approximately what we
>>> actually want and not just beat FDWs with a hammer and hope sharding
>>> will eventually emerge and call that the plan.
>>>
>>
>> I will say it again --- FDWs are the only sharding method I can think
>> of that has a chance of being accepted into Postgres core.
>>
>
>
>
> While I disagree with Simon on various things, I absolutely understand why
> he was asking about a prototype, and some sort of analysis of what use cases
> we expect to support initially/later/never, and what pieces are missing to
> get the sharding working. IIRC at the FOSDEM Dev Meeting you've claimed
> you're essentially working on a prototype - once we have the missing FDW
> pieces, we'll know if it works. I disagree with that - it's not a prototype
> if it takes several years to learn the outcome.
>
>
Fully agree. Probably we all need to help build a prototype in the
between-releases period; I see no other way to resolve the situation.


>
> --
> Tomas Vondra  http://www.2ndQuadrant.com
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>


Re: [HACKERS] The plan for FDW-based sharding

2016-03-02 Thread Oleg Bartunov
On Tue, Mar 1, 2016 at 7:03 PM, Robert Haas  wrote:

> On Tue, Mar 1, 2016 at 10:37 AM, Bruce Momjian  wrote:
> > On Tue, Mar  1, 2016 at 10:19:45AM -0500, Robert Haas wrote:
> >> > Two reasons:
> >> > 1. There is no ideal implementation of DTM which will fit all
> possible needs
> >> > and be  efficient for all clusters.
> >>
> >> Hmm, what is the reasoning behind that statement?  I mean, it is
> >> certainly true that there are some places where we have decided that
> >> one-size-fits-all is not the right approach.  Indexing, for example.
> >
> > Uh, is that even true of indexing?  While the plug-in nature of indexing
> > allows for easier development and testing, does anyone create plug-in
> > indexing that isn't shipped by us?  I thought WAL support was something
> > that prevented external indexing solutions from working.
>
> True.  There is an API, though, and having pluggable WAL support seems
> desirable too.  At the same time, I don't think we know of anyone
> maintaining a non-core index AM ... and there are probably good
> reasons for that.  We end up revising the index AM API pretty
>

We'd love to develop a new special index AM; that's why we are all for
pluggable WAL. I think there will be other AM developers once we open the
door for that.


> regularly every time somebody wants to do something new, so it's not
> really a stable API that extensions can just tap into.  I suspect that
> a transaction manager API would end up similarly situated.
>

I don't expect many other TM developers, so there is no problem with
improving the API. We started from practical needs and analysis of many
academic papers. We spent a year playing with several prototypes to prove our
proposed API (expect more in several months). Everybody can download and test
them. I wish we could do the same with an FDW-based sharding solution.

Of course, we can fork Postgres as the XC/XL people did, and certainly
eventually will do so if the community doesn't accept our proposal, since it
is very difficult to work on cross-release projects. But then there will be
no winners, so why do we all so aggressively refuse to understand each other?
I watched XC/XL for years and thought I don't want to go down that path of
isolation from the community, so we decided to make the TM pluggable, to stay
with the community and let everybody prove their concepts. If you have ideas
on how to improve the TM API, we are open; if you know it's broken by design,
please help us fix it. I have my own understanding of FDWs, but I deliberately
don't participate in some very hot discussions, just because I don't feel
committed to working on them. Your group is very enthusiastic about FDWs, and
that's fine as long as you improve FDWs in a general way; I'm very happy with
the current work. I would prefer that you show a prototype of the sharding
solution which convinces us of its functionality and performance. I agree with
Tomas Vondra that we don't want to wait for years to see the result; we want
to evaluate results based on a prototype, which should be done between
releases. If you don't have enough resources for this, let's do it together
with the community. Nobody, as far as I've seen, is against FDW sharding;
people complained about "the only sharding solution" in Postgres without that
being proven.





>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>


Re: [HACKERS] Issue with NULLS LAST, with postgres_fdw sort pushdown

2016-03-02 Thread Ashutosh Bapat
Thanks Rajkumar for your report. Let me know if the attached patch fixes
the issue.

The code did not add a NULLS LAST clause in the case when pk_nulls_first is
false in the pathkey. PFA the fix for the same. I have also added a few tests
to postgres_fdw.sql for a few combinations of asc/desc and nulls first/last.
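
For anyone following along, a minimal sketch of the shape such deparse logic
takes is below. The function name and structure are illustrative assumptions,
not the actual postgres_fdw code nor the attached patch; the point is simply
that the NULLS ordering must be emitted explicitly for both values of
pk_nulls_first, because the remote defaults differ between ASC and DESC:

#include "postgres.h"

#include "access/stratnum.h"
#include "lib/stringinfo.h"
#include "nodes/relation.h"

/*
 * Sketch only: append ASC/DESC plus an explicit NULLS ordering for one
 * pathkey to the remote ORDER BY clause being assembled in buf.
 */
static void
append_pathkey_sort_options(StringInfo buf, PathKey *pathkey)
{
    if (pathkey->pk_strategy == BTLessStrategyNumber)
        appendStringInfoString(buf, " ASC");
    else
        appendStringInfoString(buf, " DESC");

    /*
     * The remote server defaults to NULLS LAST for ASC and NULLS FIRST for
     * DESC, so spell the ordering out in both branches; appending it only
     * when pk_nulls_first is true silently drops the NULLS LAST requirement
     * for descending sorts, which is the behaviour reported here.
     */
    if (pathkey->pk_nulls_first)
        appendStringInfoString(buf, " NULLS FIRST");
    else
        appendStringInfoString(buf, " NULLS LAST");
}

With something of that shape, the remote query for the reported case would
read SELECT id, name FROM public.fdw_sort_test ORDER BY name DESC NULLS LAST,
matching the local semantics.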

On Mon, Feb 29, 2016 at 3:49 PM, Rajkumar Raghuwanshi <
rajkumar.raghuwan...@enterprisedb.com> wrote:

> Hi,
>
> I am testing postgres_fdw sort pushdown feature for PostgreSQL 9.6 DB, and
> I observed below issue.
>
> *Observation:* If the nulls last option is given with the order by clause
> as 'desc nulls last', the remote query does not apply nulls last and gives
> a wrong result on version 9.6, while on 9.5 it gives the proper result.
>
> For testing, I have a table "fdw_sort_test" on the foreign server, for which
> a postgres_fdw foreign table is created on the local server.
>
>  db2=# select * from fdw_sort_test ;
>  id | name
> ----+------
>   1 | xyz
>   3 |
>   2 | abc
>   4 | pqr
> (4 rows)
>
>on version 9.6 :
>
>  db1=# select * from fdw_sort_test order by name desc nulls last;
>   id | name
>  ----+------
>3 |
>1 | xyz
>4 | pqr
>2 | abc
>  (4 rows)
>
>  db1=# explain verbose select * from fdw_sort_test order by name desc nulls last;
>                                      QUERY PLAN
>  --------------------------------------------------------------------------------
>   Foreign Scan on public.fdw_sort_test  (cost=100.00..129.95 rows=561 width=122)
>     Output: id, name
>     Remote SQL: SELECT id, name FROM public.fdw_sort_test ORDER BY name DESC
>  (3 rows)
>
>
> on version 9.5 :
>  db1=# select * from fdw_sort_test order by name desc nulls last;
>    id | name
>   ----+------
> 1 | xyz
> 4 | pqr
> 2 | abc
> 3 |
>   (4 rows)
>
>  db1=# explain verbose select * from fdw_sort_test order by name desc nulls last;
>                                      QUERY PLAN
>  --------------------------------------------------------------------------------
>   Sort  (cost=152.44..153.85 rows=561 width=122)
>     Output: id, name
>     Sort Key: fdw_sort_test.name DESC NULLS LAST
>     ->  Foreign Scan on public.fdw_sort_test  (cost=100.00..126.83 rows=561 width=122)
>           Output: id, name
>           Remote SQL: SELECT id, name FROM public.fdw_sort_test
>
> *steps to reproduce : *
>
> --connect to sql
> \c postgres postgres
> --create role and database db1, will act as local server
> create role db1 password 'db1' superuser login;
> create database db1 owner=db1;
> grant all on database db1 to db1;
>
> --create role and database db2, will act as foreign server
> create role db2 password 'db2' superuser login;
> create database db2 owner=db2;
> grant all on database db2 to db2;
>
> --connect to db2 and create a table
> \c db2 db2
> create table fdw_sort_test (id integer, name varchar(50));
> insert into fdw_sort_test values (1,'xyz');
> insert into fdw_sort_test values (3,null);
> insert into fdw_sort_test values (2,'abc');
> insert into fdw_sort_test values (4,'pqr');
>
> --connect to db1 and create postgres_fdw
> \c db1 db1
> create extension postgres_fdw;
> create server db2_link_server foreign data wrapper postgres_fdw options
> (host 'db2_machine_ip', dbname 'db2', port 'db_machine_port_no');
> create user mapping for db1 server db2_link_server options (user 'db2',
> password 'db2');
>
> --create a foreign table
> create foreign table fdw_sort_test (id integer, name varchar(50)) server
> db2_link_server;
>
> --run the below query and check the output
> select * from fdw_sort_test order by name desc nulls last;
>
> --check the explain plan
> explain verbose select * from fdw_sort_test order by name desc nulls last;
>
> Thanks & Regards,
> Rajkumar Raghuwanshi
> QMG, EnterpriseDB Corporation
>



-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

