Re: [HACKERS] SCRAM authentication, take three

2017-04-12 Thread Noah Misch
On Tue, Apr 11, 2017 at 08:10:23AM +0300, Heikki Linnakangas wrote:
> On 04/11/2017 04:52 AM, Peter Eisentraut wrote:
> >On 4/10/17 04:27, Heikki Linnakangas wrote:
> >>One thing to consider is that we just made the decision that "md5"
> >>actually means "md5 or scram-sha-256". Extrapolating from that, I think
> >>we'll want "scram-sha-256" to mean "scram-sha-256 or scram-sha-256-plus"
>>(i.e. the channel-binding variant) in the future. And if we get support
> >>for scram-sha-512, "scram-sha-256" would presumably allow that too.
> >
> >But how would you choose between scram-sha-256-plus and scram-sha-512?
> 
> Good question. We would need to decide the order of preference for those.
> 
> That question won't arise in practice, though. Firstly, if the server can do
> scram-sha-256-plus, it presumably can also do scram-sha-512-plus. Unless
> there's a change in the way the channel binding works, such that the
> scram-sha-512-plus variant needs a newer version of OpenSSL or something.
> Secondly, the user's pg_authid row will contain a SCRAM-SHA-256 or
> SCRAM-SHA-512 verifier, not both, so that will dictate which one to use.

[Action required within three days.  This is a generic notification.]

The above-described topic is currently a PostgreSQL 10 open item.  Heikki,
since you committed the patch believed to have created it, you own this open
item.  If some other commit is more relevant or if this does not belong as a
v10 open item, please let us know.  Otherwise, please observe the policy on
open item ownership[1] and send a status update within three calendar days of
this message.  Include a date for your subsequent status update.  Testers may
discover new open items at any time, and I want to plan to get them all fixed
well in advance of shipping v10.  Consequently, I will appreciate your efforts
toward speedy resolution.  Thanks.

[1] https://www.postgresql.org/message-id/20170404140717.GA2675809%40tornado.leadboat.com




Re: [HACKERS] snapbuild woes

2017-04-12 Thread Noah Misch
On Sat, Apr 08, 2017 at 07:30:59AM -0700, Andres Freund wrote:
> On 2017-04-08 16:29:10 +0200, Erik Rijkers wrote:
> > On 2017-04-08 15:56, Andres Freund wrote:
> > > On 2017-04-08 09:51:39 -0400, David Steele wrote:
> > > > On 3/2/17 7:54 PM, Petr Jelinek wrote:
> > > > >
> > > > > Yes the copy patch needs rebase as well. But these ones are fine.
> > > > 
> > > > This bug has been moved to CF 2017-07.
> > > 
> > > FWIW, as these are bug-fixes that need to be backpatched, I do plan to
> > > work on them soon.
> > > 
> > 
> > CF 2017-07 pertains to postgres 11, is that right?
> > 
> > But I hope you mean to commit these snapbuild patches before the postgres 10
> > release?  As far as I know, logical replication is still very broken without
> > them (or at least some of that set of 5 patches - I don't know which ones
> > are essential and which may not be).
> 
> Yes, these should go into 10 *and* earlier releases, and I don't plan to
> wait for 2017-07.

[Action required within three days.  This is a generic notification.]

The above-described topic is currently a PostgreSQL 10 open item.  Peter,
since you committed the patch believed to have created it, you own this open
item.  If some other commit is more relevant or if this does not belong as a
v10 open item, please let us know.  Otherwise, please observe the policy on
open item ownership[1] and send a status update within three calendar days of
this message.  Include a date for your subsequent status update.  Testers may
discover new open items at any time, and I want to plan to get them all fixed
well in advance of shipping v10.  Consequently, I will appreciate your efforts
toward speedy resolution.  Thanks.

[1] https://www.postgresql.org/message-id/20170404140717.GA2675809%40tornado.leadboat.com




Re: [HACKERS] Foreign Join pushdowns not working properly for outer joins

2017-04-12 Thread David Rowley
On 10 March 2017 at 17:33, Ashutosh Bapat
 wrote:
> Yes and I also forgot to update the function prologue to refer to the
> fpinfo_o/i instead of inner and outer relations. Attached patch
> corrects it.

Hi Ashutosh,

This seems to have conflicted with 28b04787. Do you want to rebase, or should I?


-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Supporting huge pages on Windows

2017-04-12 Thread Tsunakawa, Takayuki
Hello, Magnus
cc: Andres

From: pgsql-hackers-ow...@postgresql.org
> [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Magnus Hagander
> I think what you'd need to do is enumerate what privileges the user has
> *before* calling CreateRestrictedToken(), using GetTokenInformation().
> And then pass those into PrivilegesToDelete (except for
> SeChangeNotifyPrivilege) in the call to CreateRestrictedToken(), instead
> of using DISABLE_MAX_PRIVILEGE. (and add the privilege needed for huge pages
> before you start that whole process -- that needs to be added in the token
> used *before* we create the restricted one).
> 
> At least that's my guess from reading the docs and trying to remember :)

Oh, I got it now.  Thanks.  The revised patch is attached.  The only modified 
file is pg_ctl.c.  The patch worked as expected.

It is regrettable that I could not make it in time for PG 10, but I'd 
appreciate it if you could review and commit this patch early in PG 11 while 
our memory is fresh.  Thank you for your patience.  I'll create an entry in the 
next CF soon.

Regards
Takayuki Tsunakawa

win_large_pages_v13.patch
Description: win_large_pages_v13.patch



[HACKERS] Tab completion support for ALTER SUBSCRIPTION REFRESH PUBLICATION

2017-04-12 Thread Masahiko Sawada
Hi,

I've attached a patch for $subject. Please check it.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


fix_tab_completion_for_alter_sub.patch
Description: Binary data



Re: [HACKERS] Some thoughts about SCRAM implementation

2017-04-12 Thread Heikki Linnakangas

On 04/12/2017 11:22 AM, Magnus Hagander wrote:
> On Wed, Apr 12, 2017 at 3:25 AM, Bruce Momjian  wrote:
>
>> And which enterprises are using SSL without certificates?  And I thought
>> channel binding required certificates anyway, e.g.:
>>
>> https://en.wikipedia.org/wiki/Salted_Challenge_Response_Authentication_Mechanism#Channel_binding
>>
>>     For instance, for the tls-server-end-point channel binding, it is the
>>     server's TLS certificate.
>
> AFAIK it does require the TLS certificates, but it does not require TLS
> certificate *validation*. You can use channel binding with just self-signed
> certs.

The tls-server-end-point channel binding type relies on certificates. But
SCRAM uses "tls-unique" by default, and it does not use certificates.
It's a bit weird that the Wikipedia article uses tls-server-end-point as
the example; I don't know why anyone would use tls-server-end-point with
SCRAM.

> That said, I stand by my comment that I don't think it's the enterprises
> that need or want the channel binding. If they care about it, they have
> already put certificate validation in place, and it won't buy them
> anything.
>
> Because channel binding also only secures the authentication (SCRAM), not
> the actual contents and commands that are then sent across the channel,
> AFAIK?

TLS protects the contents and the commands. The point of channel binding
is to defeat a MITM attack, where the client connects to a malicious
server, using TLS, which then connects to the real server, using another
TLS connection. Channel binding will detect that the client and the real
server are not communicating over the same TLS connection, but two
different TLS connections, and make the authentication fail.

SSL certificates, with validation, achieve the same, but channel
binding achieves it without the hassle of certificates.
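
To make that concrete, here is a toy sketch in Perl (not libpq code; the
key and the TLS "Finished" values are invented for illustration) of why a
SCRAM proof bound to one TLS connection fails verification on another:

    #!/usr/bin/perl
    # Toy model: the channel-binding data is folded into the message that
    # the SCRAM proof signs, so the proof is only valid for the exact TLS
    # connection the client saw.
    use strict;
    use warnings;
    use MIME::Base64 qw(encode_base64);
    use Digest::SHA qw(hmac_sha256_hex);

    my $stored_key = 'key-derived-from-the-user-password';

    sub scram_client_proof {
        my ($tls_unique) = @_;    # this connection's TLS "Finished" message
        my $cbind = encode_base64('p=tls-unique,,' . $tls_unique, '');
        return hmac_sha256_hex("c=$cbind,r=client-and-server-nonces",
                               $stored_key);
    }

    # The client computes its proof over its own TLS connection (which,
    # under attack, terminates at the MITM); the real server verifies it
    # over the other TLS connection.  The bindings differ, so auth fails.
    my $proof = scram_client_proof('finished-bytes-client-to-mitm');
    my $check = scram_client_proof('finished-bytes-mitm-to-server');
    print $proof eq $check
        ? "proof accepted\n"
        : "proof rejected: two different TLS connections\n";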


- Heikki





Re: [HACKERS] Foreign Join pushdowns not working properly for outer joins

2017-04-12 Thread Ashutosh Bapat
On Wed, Apr 12, 2017 at 12:18 PM, David Rowley
 wrote:
> On 10 March 2017 at 17:33, Ashutosh Bapat
>  wrote:
>> Yes and I also forgot to update the function prologue to refer to the
>> fpinfo_o/i instead of inner and outer relations. Attached patch
>> corrects it.
>
> Hi Ashutosh,
>
> This seems to have conflicted with 28b04787. Do you want to rebase, or should 
> I?

Here you go.



-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company


0001-Fix-reporting-of-violation-in-ExecConstraints-again.patch
Description: Binary data



Re: [HACKERS] pg_dump emits ALTER TABLE ONLY partitioned_table

2017-04-12 Thread Amit Langote
Hi Stephen,

On 2017/04/11 22:12, Stephen Frost wrote:
> Amit,
> 
> * Amit Langote (langote_amit...@lab.ntt.co.jp) wrote:
>> On 2017/04/11 0:26, Robert Haas wrote:
>>> Children can have constraints (including NOT NULL constraints) which
>>> parents lack, and can have a different column order, but must have
>>> exactly the same column names and types.
>>
>> Also, what is different in the partitioned parent case is that NOT NULL
>> constraints must be inherited.  That is, one cannot add it only to the 
>> parent.
> 
> If I'm following, children can have additional constraints but any
> constraints on the parent must also exist on all the children.  Is that
> correct?

Yes.  As shown in the examples I posted, in traditional inheritance, only
CHECK constraints are treated that way, whereas NOT NULL constraints are
not.  With partitioning, even the NOT NULL constraints are inherited.  So
if a particular column is set NOT NULL in the parent, all partitions in
the tree must have that constraint and it cannot be dropped from a
partition without dropping it from the parent as well.  Traditional
inheritance applies that rule only to attributes and CHECK constraints.

>> --
>> -- partitioning inheritance
>> --
>> create table parted_parent (a int) partition by list (a);
>> create table part partition of parted_parent for values in (1);
>>
>> -- this is same as traditional inheritance
>> alter table only parted_parent add constraint chka check (a > 0);
>> -- ERROR:  constraint must be added to child tables too
> 
> Ok, this makes sense, but surely that constraint does, in fact, exist on
> the child already or we wouldn't be trying to dump out this constraint
> that exists on the parent?

Yes, that's right.

Now that I have thought about this a bit more, I think I can articulate
the problem statement more clearly.  The problem we have here is about
non-inherited constraints on partitions, specifically, the NOT NULL
constraints which need to be emitted per attribute.  We have two options
for doing it: emit it inline with CREATE TABLE, or separately with ALTER TABLE
ONLY.

I have been saying that it is preferable to use CREATE TABLE, because
ALTER TABLE ONLY will fail if the partition is itself partitioned.  Why
does it cause an error?  Because of the ONLY part.  It is a user error to
specify ONLY when adding a constraint to a partitioned table, but...

>>> In Amit's example from the original post, the child has an implicit
>>> NOT NULL constraint that does not exist in the parent.  p1.b isn't
>>> declared NOT NULL, but the fact that it is range-partitioned on b
>>> requires it to be so, just as we would do if b were declared as the
>>> PRIMARY KEY.  Somehow that's not playing nice with pg_dump, but I'm
>>> still fuzzy on the details.
>>
>> Actually, I would like to change the problem definition from "ALTER TABLE
>> ONLY partitioned_table should be avoided" to "Emitting partition's
>> attributes separately should be avoided".
> 
> I don't follow why we think doing:
> 
> CREATE TABLE t1 (c1 int);
> ALTER TABLE ONLY t1 SET c1 NOT NULL;
> 
> is really different from:
> 
> CREATE TABLE t1 (c1 int NOT NULL);
> 
> or why we should teach pg_dump that it's "correct" to consider those two
> to be different.  There are specific cases where they have to be done
> independently, but that's for views because we don't have a way to set a
> default on a view column during CREATE VIEW, or to deal with dropped
> columns or traditionally inherited columns.
> 
> What isn't clear to me is why the CREATE TABLE + ALTER TABLE isn't
> working, when apparently a CREATE TABLE with the NOT NULL included would
> work.  The issue here seems like it's the order in which the operations
> are happening in, and not that CREATE TABLE + ALTER TABLE is somehow
> different than just the CREATE TABLE.

I think you are right.  CREATE TABLE + ALTER TABLE ONLY should work just
fine in this particular case (i.e. the way pg_dump outputs it).

>> create table p (a int, b int) partition by list (a);
>> create table p1 partition of p for values in (1) partition by range (b);
>> create table p11 partition of p1 (
>> a not null default '1'
>> ) for values from (1) to (10);
> 
> Using the above example, doing a pg_dump and then a restore (into a
> clean initdb'd cluster), I get the following:
> 
> =# CREATE TABLE p (
> -# a integer,
> -# b integer
> -# )
> -# PARTITION BY LIST (a);
> CREATE TABLE
> 
> =*# CREATE TABLE p1 PARTITION OF p
> -*# FOR VALUES IN (1)
> -*# PARTITION BY RANGE (b);
> CREATE TABLE
> 
> =*# ALTER TABLE ONLY p1 ALTER COLUMN b SET NOT NULL;
> ERROR:  constraint must be added to child tables too
> 
> Now, perhaps I'm confused, but isn't p1 the child here?  Which is
> supposed to be able to have constraints that the parent doesn't?

Actually, p1 is a partitioned table, hence the error.  And I realize that
that's wrong behavior.  Currently the check is performed using only the
relkind, which is bogus.  Specifying ONLY should cause an error 

Re: [HACKERS] Some thoughts about SCRAM implementation

2017-04-12 Thread Magnus Hagander
On Wed, Apr 12, 2017 at 11:13 AM, Heikki Linnakangas 
wrote:

> On 04/12/2017 11:22 AM, Magnus Hagander wrote:
>
>> On Wed, Apr 12, 2017 at 3:25 AM, Bruce Momjian  wrote:
>>
>> And which enterprises are using SSL without certificates?  And I thought
>>> channel binding required certificates anyway, e.g.:
>>>
>>> https://en.wikipedia.org/wiki/Salted_Challenge_Response_Authentication_Mechanism#Channel_binding
>>>
>>> For instance, for the tls-server-end-point channel binding, it is the
>>> server's TLS certificate.
>>>
>>
>> AFAIK it does require the TLS certificates, but it does not require TLS
>> certificate *validation*. You can use channel binding with just
>> self-signed
>> certs.
>>
>
> tls-server-end-point channel binding type relies on certificates. But
> SCRAM uses "tls-unique" by default, and it does not use certificates. It's
> a bit weird that the wikipedia article uses tls-server-end-point as the
> example, I don't know why anyone would use tls-server-end-point with SCRAM.


Interesting. But we don't support TLS without certificates, do we? We
support it without client certificates, but we need a server certificate.
So the TLS connection itself still relies on the certificates, just not the
channel binding.



> That said, I stand by my comment that I don't think it's the enterprises
>> that need or want the channel binding. If they care about it, they have
>> already put certificate validation in place, and it won't buy them
>> anything.
>>
>> Because channel binding also only secures the authentication (SCRAM), not
>> the actual contents and commands that are then sent across the channel,
>> AFAIK?
>>
>
> TLS protects the contents and the commands. The point of channel binding
> is to defeat a MITM attack, where the client connects to a malicious
> server, using TLS, which then connects to the real server, using another
> TLS connection. Channel binding will detect that the client and the real
> server are not communicating over the same TLS connection, but two
> different TLS connections, and make the authentication fail.
>
> SSL certificates, with validation, achieve the same, but channel binding
> achieves it without the hassle of certificates.


Right. It also achieves some more things, but definitely with more hassle.

-- 
 Magnus Hagander
 Me: https://www.hagander.net/ 
 Work: https://www.redpill-linpro.com/ 


Re: [HACKERS] index-only count(*) for indexes supporting bitmap scans

2017-04-12 Thread Andrew Gierth
> "Alexander" == Alexander Kuzmenkov  writes:

 Alexander> Structurally, the patch consists of two major parts: a
 Alexander> specialized executor node

Why?

It strikes me that the significant fact here is not that we're doing
count(*), but that we don't need any columns from the bitmap heap scan
result.  Rather than creating a whole new node, can't the existing
bitmap heapscan be taught to skip fetching the actual table page in
cases where it's all-visible, not lossy, and no columns are needed?

(this would also have the advantage of getting parallelism for free)

-- 
Andrew (irc:RhodiumToad)




Re: [HACKERS] index-only count(*) for indexes supporting bitmap scans

2017-04-12 Thread Alexander Korotkov
On Wed, Apr 12, 2017 at 12:40 AM, Alexander Korotkov <
a.korot...@postgrespro.ru> wrote:

> On Tue, Apr 11, 2017 at 8:24 PM, Alexander Kuzmenkov <
> a.kuzmen...@postgrespro.ru> wrote:
>
>> I would like to propose a patch that speeds up the queries of the form
>> 'select
>> count(*) ... where ...',  where the restriction clauses can be satisfied
>> by some
>> indexes. At the moment, such queries use index-only scans followed by
>> aggregation. Index-only scans are only possible for indexes that are
>> capable of
>> returning indexed tuples, that is, support the 'amgettuple' access
>> method. They
>> are not applicable to indexes such as GIN and RUM. However, it is
>> possible to
>> improve count(*) performance for indexes that support bitmap scans. Having
>> performed a bitmap index scan or a combination of such, the bits in
>> bitmap can
>> be counted to obtain the final result. Of course, the bitmap pages that
>> are
>> lossy or not visible to all existing transactions will still require heap
>> access.
>>
>
> That's a cool feature for FTS users! Please, register this patch to the
> next commitfest.
>
> This patch has some important limitations:
>> * It only supports targetlist consisting of a single expression that can
>> be
>> projected from count(*).
>> * count(expr) is not supported. We could support it for cases where the
>> "expr is not null" restriction can be satisfied with an index.
>> * The current implementation does not support parallel execution. It
>> could be
>> implemented during the PostgreSQL 11 release cycle.
>> * For some indexes, the bitmap index scan will always require rechecking
>> all
>> the tuples. Bitmap count plans should not be used in such cases. For now,
>> this
>> check is not implemented.
>>
>
> Does this limitation cause a performance drawback?  When bitmap index scan
> returns all rechecks, alternative to Bitmap Count is still Aggregate +
> Bitmap Heap Scan.  Thus, I think Bitmap Count would have the same
> performance or even be slightly faster.  That's worth testing.
>
> Also, what is planning overhead of this patch?  That's worth testing too.
>

Another thing caught my eye.  The estimated number of rows in the Bitmap
Count node is the same as in the Bitmap Index Scan node.  Should it be 1,
since it always returns a single row?

test1=# explain analyze select count(*) from pglist where fts @@ to_tsquery( 'tom & lane' );
                                                                 QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------
 Bitmap Count on pglist  (cost=550.65..1095.68 rows=54503 width=8) (actual time=1120.281..1120.281 rows=1 loops=1)
   Recheck Cond: (fts @@ to_tsquery('tom & lane'::text))
   Heap Fetches: 0
   Heap Blocks: exact=105992
   ->  Bitmap Index Scan on idx_pglist_rum_fts  (cost=0.00..537.02 rows=54503 width=0) (actual time=1056.060..1056.060 rows=222813 loops=1)
         Index Cond: (fts @@ to_tsquery('tom & lane'::text))
 Planning time: 119.568 ms
 Execution time: 1121.409 ms
(8 rows)


--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Re: [HACKERS] Vacuum full stats reporting

2017-04-12 Thread Robert Haas
On Mon, Apr 10, 2017 at 4:36 AM, Magnus Hagander  wrote:
> Right now, VACUUM FULL is not reported in pgstat. That seems bad-ish. I can
> see two reasonable ways to proceed:
>
> 1. Start reporting VACUUM FULL as regular vacuums, so they count up
> vacuum_count and last_vacuum in pg_stat_*_tables.
>
> 2. Create a new set of counters for CLUSTER and VACUUM FULL (which are
> arguably the same in this regard, in how they behave wrt the system), and
> add counters cluster_count and last_cluster in pg_stat_*_tables.
>
> Option 2 clearly adds more information and I can see quite a few cases where
> this would be useful. But what do people think of the potential overhead
> there? Worth it?
>
> Due to this, it also does not (AFAICT) reset the counters for things like
> dead tuples. This part seems like a pure bug. Perhaps we should at least do
> something like (1) as a backpatch, even if we decide to do (2) in future
> versions (11+ I guess)?

Maybe we should have done #1 originally, but I think it's too late to
back-patch a behavior change.  We don't really want the behavior in
this area to be different depending on which minor release somebody is
running, or at least I don't.  I think it's better to wait and deal
with this in v11 in whatever way we think is best.  I'd favor #2
unless the overhead is unacceptable.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] TAP tests take a long time

2017-04-12 Thread Amit Kapila
On Tue, Apr 11, 2017 at 9:38 PM, Tom Lane  wrote:
> Robert Haas  writes:
>> On Tue, Apr 11, 2017 at 11:32 AM, Andrew Dunstan
>>  wrote:
>>> This buildfarm run as you can see takes 33m32s, and the Tap tests take a
>>> combined 19m52s of that time.
>
>> I don't think it's quite fair to complain about the TAP tests taking a
>> long time to run as a general matter.  Many people here have long
>> wanted a separate test-suite for long running tests that can be run to
>> really check everything possible, and apparently, TAP tests are it.
>
>> What I think would be more useful is to break down where the time is
>> getting spent.  It may be that some of those tests are not adding
>> value proportionate to their runtime.
>
> The other thing that might be useful here is to push on parallelizing
> buildfarm runs.  Admittedly that will do nothing for the oldest and
> slowest buildfarm critters, but for reasonably modern hardware the
> serialization of the tests really handicaps you.  We seem to have
> fixed that for manual application of "make check-world", at least
> if you know the right magic incantations to parallelize it; but
> AFAIK the buildfarm script is entirely serial.
>
> BTW, speaking of "value proportionate to runtime", the hash_index
> regression script is now visibly a bottleneck in the core regression
> tests.  Could we take another look at whether that needs to run
> quite so long?
>

I have looked into the tests and I think we can do some optimization
without losing much code coverage.  First, we are doing both VACUUM FULL
and VACUUM on hash_split_heap in the same test after executing a few
statements; it seems to me that we can avoid the VACUUM FULL.  Also, I
think we can try reducing the number of inserts into hash_split_heap to
see if we can still achieve coverage of the bucket-split code path.
Finally, I am not sure if REINDEX INDEX hash_split_index is required for
code coverage, but we might need it for the sake of testing that
operation.

Mithun, as you are the original author of these tests, can you please
try some of the above optimizations and any others you can think of
and see if we can reduce the time for hash index tests?


-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: [HACKERS] Foreign Join pushdowns not working properly for outer joins

2017-04-12 Thread David Rowley
On 12 April 2017 at 21:45, Ashutosh Bapat
 wrote:
> On Wed, Apr 12, 2017 at 12:18 PM, David Rowley
>  wrote:
>> On 10 March 2017 at 17:33, Ashutosh Bapat
>>  wrote:
>>> Yes and I also forgot to update the function prologue to refer to the
>>> fpinfo_o/i instead of inner and outer relations. Attached patch
>>> corrects it.
>>
>> Hi Ashutosh,
>>
>> This seems to have conflicted with 28b04787. Do you want to rebase, or 
>> should I?
>
> Here you go.

Looks like the wrong patch.


-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Cutting initdb's runtime (Perl question embedded)

2017-04-12 Thread Andreas Karlsson

On 04/12/2017 04:12 PM, Tom Lane wrote:
> 1. The best thing would still be to make genbki.pl do the conversion,
> and write numeric OIDs into postgres.bki.  The core stumbling block
> here seems to be that for most catalogs, Catalog.pm and genbki.pl
> never really break down a DATA line into fields --- and we certainly
> have got to do that, if we're going to replace the values of regproc
> fields.  The places that do need to do that approximate it like this:
>
>     # To construct fmgroids.h and fmgrtab.c, we need to inspect some
>     # of the individual data fields.  Just splitting on whitespace
>     # won't work, because some quoted fields might contain internal
>     # whitespace.  We handle this by folding them all to a simple
>     # "xxx". Fortunately, this script doesn't need to look at any
>     # fields that might need quoting, so this simple hack is
>     # sufficient.
>     $row->{bki_values} =~ s/"[^"]*"/"xxx"/g;
>     @{$row}{@attnames} = split /\s+/, $row->{bki_values};
>
> We would need a bullet-proof, non-hack, preferably not too slow way to
> split DATA lines into fields properly.  I'm one of the world's worst
> Perl programmers, but surely there's a way?
>
> 2. Alternatively, we could teach bootstrap mode to build a hashtable
> mapping proname to OID while it reads pg_proc data from postgres.bki,
> and then make regprocin's bootstrap path consult the hashtable instead
> of looking directly at the pg_proc file.  That I'm quite sure is do-able,
> but it seems like it's leaving money on the table compared to doing
> the transformation earlier.
>
> Thoughts?


Looked at this, and option 1 seems simple enough, if I am not missing
something.  I might hack something up later tonight.  Either way, I think
this improvement can be done separately from the proposed replacement of
the catalog header files.  Trying to fix everything at once often leads
to nothing being fixed at all.
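
For the record, the sort of splitter option 1 needs could look something
like this -- a sketch only, assuming (as a simplification) that DATA fields
are either bare tokens or double-quoted strings with backslash escapes; the
authoritative quoting rules live in bootscanner.l:

    #!/usr/bin/perl
    use strict;
    use warnings;

    # Split a DATA line into fields, keeping quoted strings intact
    # instead of folding them to "xxx".
    sub split_bki_fields {
        my ($line) = @_;
        my @fields;
        while ($line =~ /\G\s*("(?:[^"\\]|\\.)*"|\S+)/g) {
            push @fields, $1;
        }
        return @fields;
    }

    my $row = 'boolin 1 0 t f "select \"x\" from t" 42';
    print join("|", split_bki_fields($row)), "\n";
    # prints: boolin|1|0|t|f|"select \"x\" from t"|42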


Andreas




Re: [HACKERS] Logical replication and inheritance

2017-04-12 Thread Robert Haas
On Wed, Apr 5, 2017 at 8:25 AM, Peter Eisentraut
 wrote:
> After thinking about it some more, I think the behavior we want would be
> that changes to inheritance would reflect in the publication membership.
>  So if you have a partitioned table, adding more partitions over time
> would automatically add them to the publication.

I think the threshold question here is: is the partitioned table a
member of the publication, or are the partitions members of the
publication?  Probably both should be allowed.  If you add a partition
to the publication by name, then we should publish exactly that
partition, full stop.  If you add the partitioned table, then there is
a choice of behaviors, including:

(1) That's an error; the user should publish the partitions instead.
(2) That's a shorthand for all partitions, resolved at the time the
table is added to the publication.  Later changes affect nothing.
(3) All partitions are published, and changes in the set of partitions
change what is published.

In my view, (3) is more desirable than (1) which is more desirable than (2).

> We could implement this either by updating pg_publication_rel as
> inheritance changes, or perhaps more easily by checking and expanding
> inheritance in the output plugin/walsender.  There might be subtle
> behavioral differences between those two approaches that we should think
> through.  But I think one of these two should be done.

The latter approach sounds better to me.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] TAP tests take a long time

2017-04-12 Thread Tom Lane
Mithun Cy  writes:
> I have tried to optimize the tests while maintaining the same coverage we
> were able to get with them.

This patch looks good to me: on my workstation, it reduces the total
runtime of the parallel regression tests from ~20.5 to ~16.5 seconds.
I concur that it doesn't look like it would reduce test coverage
significantly.  It gets rid of a VACUUM FULL and a REINDEX, both of
which should be equivalent to an ordinary index build so far as hash is
concerned, and it trims the volume of data being run through the test.

If no objections, I'll push this soon.

regards, tom lane




Re: [HACKERS] Tab completion support for ALTER SUBSCRIPTION REFRESH PUBLICATION

2017-04-12 Thread Fujii Masao
On Wed, Apr 12, 2017 at 5:12 PM, Masahiko Sawada  wrote:
> Hi,
>
> I've attached a patch for $subject. Please check it.

+ COMPLETE_WITH_LIST8("WITH", "CONNECTION", "SET PUBLICATION", "ENABLE",
+ "DISABLE", "OWNER TO", "RENAME TO", "REFRESH PUBLICATION WITH");

"WITH" should be "WITH ("?
Also "REFRESH PUBLICATION WITH" should be "REFRESH PUBLICATION WITH ("?


BTW, I found that ALTER SUBSCRIPTION ... RENAME TO exists in tab-complete.c,
but not in the documentation. ALTER PUBLICATION has the same issue, too.
So I think that we need to apply the attached patch, which improves the
documentation for ALTER PUBLICATION and ALTER SUBSCRIPTION.

Regards,

-- 
Fujii Masao


bugfix.patch
Description: Binary data



Re: [HACKERS] Undefined psql variables

2017-04-12 Thread Robert Haas
On Sun, Apr 2, 2017 at 3:56 PM, Tom Lane  wrote:
> So my view of this is that "send the expression to the server" ought
> to be just one option for \if, not the only way to do it.

I heartily agree.  There should be some kind of client-side expression
language, and one thing it should allow is calling out to the server.
Then people who only want to call out to the server can do that, but
people who want to do something else have the option.  Insisting that
this facility isn't allowed to do anything other than consult the
server is (1) inconsistent with what we've already got in v10 and (2)
boxing ourselves into a corner for no very good reason.

Now, the optimal shape for that client-side expression language is not
very clear to me.  Do we want to invent our own language, or maybe
consider using something that already exists?  It's been previously
suggested that we should somehow embed Lua, and this might not be a
bad place to consider doing something like that.  That might be a way
to add a lot of power without having to invent an entirely new
programming language one bit at a time.  If we want to invent our own
expression language, what kind of syntax should it use?  Upon what
kind of design principles should it be based?  There's likely to be
vigorous debate on these topics, and probably also complaints that the
good designs are too much work and the easy-to-implement designs are
too limiting.  (Regular readers of this mailing list will likely be
able to guess which side of those debates I'll be on, but we need to
have them all the same.)

Regarding the ostensible topic of this thread, one thought I had while
reading through these various responses is that the original need
would be well-served by the (somewhat dubious) syntax that bash uses
for variable substitution.  Obviously, we aren't going to change the
interpolate-this-variable character from : to $, but bash has
${parameter:-word} to substitute a default for an unset parameter,
${parameter:=word} to substitute a default for an unset parameter and
also set the parameter to that value, ${parameter:?word} to error out
with word as the error message if parameter is not set, and so forth.
If we decide to roll our own, we might consider taking inspiration
from those constructs.
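
For illustration, analogous psql spellings might look like this (hypothetical
syntax only -- psql supports none of these today; the parenthesized glosses
describe the corresponding bash behavior):

    \echo :{var:-default}   (substitute "default" when var is unset)
    \echo :{var:=default}   (ditto, and also set var to "default")
    \echo :{var:?not set}   (error out with "not set" when var is unset)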

I think that one of the general problems of language design is, as
Larry Wall once said, that a good language should make simple things
simple and complex things possible.  But simple is not an absolute; it
depends on context.  The things which a language needs to make simple
are those things which will be done frequently *in that language*.  So
for example in this case, out-calls to SQL need to be very easy to
write.  Maybe the empty-parameter thing needs to be easy; not sure.
Coming up with a good solution here will involve understanding what
people typically want to do with a language of this type and then
making sure that stuff can be done succinctly - and ideally also
making sure that other stuff is also possible if you're willing to put
in more legwork.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] GCC 7 warnings

2017-04-12 Thread Peter Eisentraut
On 4/12/17 00:12, Tom Lane wrote:
> The change in setup_formatted_log_time seems a bit weird:
> 
> - charmsbuf[8];
> + charmsbuf[10];
> 
> The associated call is
> 
>   sprintf(msbuf, ".%03d", (int) (saved_timeval.tv_usec / 1000));
> 
> Now a human can see that saved_timeval.tv_usec must be 0..99, so
> that the %d format item must always emit exactly 3 characters, which
> means that really 5 bytes would be enough.  I wouldn't expect a
> compiler to know that, but if it's making a generic assumption about
> the worst-case width of %d, shouldn't it conclude that we might need
> as many as 13 bytes for the buffer?  Why does msbuf[10] satisfy it
> if msbuf[8] doesn't?

Because the /1000 takes off three digits?

The full message from an isolated test case is

test.c: In function 'f':
test.c:11:15: warning: '%03d' directive writing between 3 and 8 bytes
into a region of size 7 [-Wformat-overflow=]
  sprintf(buf, ".%03d", (int) (tv.tv_usec / 1000));
   ^
test.c:11:15: note: directive argument in the range [-2147483, 2147483]
test.c:11:2: note: '__builtin___sprintf_chk' output between 5 and 10
bytes into a destination of size 8
  sprintf(buf, ".%03d", (int) (tv.tv_usec / 1000));
  ^

(This is with -O2.  With -O0 it only asks for 5 bytes.)
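
A quick sanity check of that arithmetic, in Perl for convenience: an int
divided by 1000 is at worst INT_MIN/1000 = -2147483, and

    $ perl -e 'printf "%d\n", length sprintf("%03d", -2147483648/1000)'
    8

so the worst case is eight characters for the number, plus the leading '.'
and the trailing NUL: exactly the 10 bytes GCC asks for.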

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] Interval for launching the table sync worker

2017-04-12 Thread Peter Eisentraut
On 4/12/17 00:48, Masahiko Sawada wrote:
> On Wed, Apr 12, 2017 at 1:28 PM, Peter Eisentraut
>> Perhaps instead of a global last_start_time, we store a per relation
>> last_start_time in SubscriptionRelState?
> 
> I was thinking the same. But a problem is that the list of
> SubscriptionRelState is refreshed whenever the syncing table state
> becomes invalid (table_state_valid = false). I guess we need to
> improve these logic including GetSubscriptionNotReadyRelations().

The table states are invalidated on a syscache callback from
pg_subscription_rel, which happens roughly speaking when a table
finishes the initial sync.  So if we're worried about failing tablesync
workers relaunching too quickly, this would only be a problem if a
tablesync of another table finishes right in that restart window.  That
doesn't seem a terrible issue to me.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] TAP tests take a long time

2017-04-12 Thread Mithun Cy
On Wed, Apr 12, 2017 at 6:24 PM, Amit Kapila 
wrote:
>
> I have looked into the tests and I think we can do some optimization
> without losing much code coverage.  First, we are doing both VACUUM FULL
> and VACUUM on hash_split_heap in the same test after executing a few
> statements; it seems to me that we can avoid the VACUUM FULL.  Also, I
> think we can try reducing the number of inserts into hash_split_heap to
> see if we can still achieve coverage of the bucket-split code path.
> Finally, I am not sure if REINDEX INDEX hash_split_index is required for
> code coverage, but we might need it for the sake of testing that
> operation.
>
> Mithun, as you are the original author of these tests, can you please
> try some of the above optimizations and any others you can think of
> and see if we can reduce the time for hash index tests?

I have tried to optimize the tests while maintaining the same coverage we
were able to get with them.

Original coverage:

  Lines:      1466 / 2341  (62.6 %)
  Functions:    79 /  101  (78.2 %)

After test changes:

  Lines:      1534 / 2341  (65.5 %)
  Functions:    79 /  101  (78.2 %)

Changes done.  Earlier:

CREATE INDEX hash_split_index on hash_split_heap USING HASH (keycol);

+ Time: 8.950 ms

INSERT INTO hash_split_heap SELECT 1 FROM generate_series(1, 7) a;
+ Time: 3135.098 ms (00:03.135)

VACUUM FULL hash_split_heap;

+ Time: 2748.146 ms (00:02.748)
REINDEX INDEX hash_split_index;

+ Time: 114.290 ms

Insertion and VACUUM FULL were taking most of the time in the tests.  I have
removed the VACUUM FULL as suggested by you, which is taken care of by the
VACUUM statement at the end, and reduced the number of rows needed to trigger
a split by inserting some records before creating the hash index.  This way
we start with fewer buckets, so splits happen much earlier.

Now:

+ INSERT INTO hash_split_heap SELECT 1 FROM generate_series(1, 500) a;

+ Time: 2.946 ms

CREATE INDEX hash_split_index on hash_split_heap USING HASH (keycol);

! Time: 8.933 ms

! INSERT INTO hash_split_heap SELECT 1 FROM generate_series(1, 5000) a;

! Time: 81.674 ms

Also removed REINDEX INDEX as suggested, which was not adding to coverage.
With this, I think the tests should run significantly faster now.

-- 
Thanks and Regards
Mithun C Y
EnterpriseDB: http://www.enterprisedb.com


hash_index_optimize_testruns.patch
Description: Binary data



Re: [HACKERS] logical replication apply to run with sync commit off by default

2017-04-12 Thread Peter Eisentraut
On 4/9/17 22:17, Noah Misch wrote:
> [Action required within three days.  This is a generic notification.]

I'm expecting Petr to post an updated patch by the end of this week.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] Some thoughts about SCRAM implementation

2017-04-12 Thread Craig Ringer
On 12 Apr. 2017 17:27, "Magnus Hagander"  wrote:



On Wed, Apr 12, 2017 at 11:13 AM, Heikki Linnakangas 
wrote:

> On 04/12/2017 11:22 AM, Magnus Hagander wrote:
>
>> On Wed, Apr 12, 2017 at 3:25 AM, Bruce Momjian  wrote:
>>
>> And which enterprises are using SSL without certificates?  And I thought
>>> channel binding required certificates anyway, e.g.:
>>>
>>> https://en.wikipedia.org/wiki/Salted_Challenge_Response_Authentication_Mechanism#Channel_binding
>>>
>>> For instance, for the tls-server-end-point channel binding, it is the
>>> server's TLS certificate.
>>>
>>
>> AFAIK it does require the TLS certificates, but it does not require TLS
>> certificate *validation*. You can use channel binding with just
>> self-signed
>> certs.
>>
>
> tls-server-end-point channel binding type relies on certificates. But
> SCRAM uses "tls-unique" by default, and it does not use certificates. It's
> a bit weird that the wikipedia article uses tls-server-end-point as the
> example, I don't know why anyone would use tls-server-end-point with SCRAM.


Interesting. But we don't support TLS without certificates, do we? We
support it without client certificates, but we need a server certificate.
So the TLS connection itself still relies on the certificates, just not the
channel binding.


Actually, by default we just use the server certificate as a public key
container. libpq pretty much ignores the rest of the certificate's ASN.1
data (or rather, doesn't ask OpenSSL to care about it). It doesn't do any
trust chain checking to find a path to a root cert. It doesn't check the
host name/IP. It doesn't remember the key for later connections, SSH
known-hosts style.

You can freely MITM an sslmode=require libpq connection with your own
self-signed-cert-based proxy and nobody will be the wiser. It provides only
transport encryption. Or just have your MITM not offer SSL; libpq silently
degrades to unencrypted in its default sslmode=prefer mode :(

If you use sslmode=verify-full then you get the behaviour you'd normally
expect - validation of server key against local trusted cert, and
ip/hostname check.

Even then we don't do any sort of smart cert trust path finding like libnss
or Java's JSSE do. We just look at the cert file and expect it to have
signed the server's cert. But at least we can do some kind of validation.
It's painful if you talk to multiple different servers with different
certs, but it works.
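
Spelled out as connection strings (host and database names made up for
illustration):

    psql "host=db.example.com dbname=app sslmode=prefer"      # default; silently falls back to plaintext
    psql "host=db.example.com dbname=app sslmode=require"     # encrypts, but accepts any server cert
    psql "host=db.example.com dbname=app sslmode=verify-ca"   # cert must chain to ~/.postgresql/root.crt
    psql "host=db.example.com dbname=app sslmode=verify-full" # as verify-ca, plus the host name must match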

I'm not taking a position on the urgency of channel binding by pointing
this out. Just trying to set out current behaviour and limitations.


Re: [HACKERS] Logical replication and inheritance

2017-04-12 Thread Peter Eisentraut
On 4/9/17 22:16, Noah Misch wrote:
> On Wed, Apr 05, 2017 at 08:25:56AM -0400, Peter Eisentraut wrote:
>> After thinking about it some more, I think the behavior we want would be
>> that changes to inheritance would reflect in the publication membership.
>>  So if you have a partitioned table, adding more partitions over time
>> would automatically add them to the publication.
>>
>> We could implement this either by updating pg_publication_rel as
>> inheritance changes, or perhaps more easily by checking and expanding
>> inheritance in the output plugin/walsender.  There might be subtle
>> behavioral differences between those two approaches that we should think
>> through.  But I think one of these two should be done.
> 
> [Action required within three days.  This is a generic notification.]

Relative to some of the other open items I'm involved in, I consider
this a low priority, so I'm not working on it right now.  I would also
appreciate some guidance from those who are more involved in inheritance
and partitioning to determine the best behavior.  It could be argued
that the current behavior is correct, and also that there are several
other correct behaviors.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] snapbuild woes

2017-04-12 Thread Peter Eisentraut
On 4/12/17 02:31, Noah Misch wrote:
>>> But I hope you mean to commit these snapbuild patches before the postgres 10
>>> release?  As far as I know, logical replication is still very broken without
>>> them (or at least some of that set of 5 patches - I don't know which ones
>>> are essential and which may not be).
>>
>> Yes, these should go into 10 *and* earlier releases, and I don't plan to
>> wait for 2017-07.
> 
> [Action required within three days.  This is a generic notification.]

I'm hoping for a word from Andres on this.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




[HACKERS] the need to finish

2017-04-12 Thread Erik Rijkers

Logical replication emits log messages like these:

DETAIL:  90 transactions need to finish.
DETAIL:  87 transactions need to finish.
DETAIL:  70 transactions need to finish.

Could we get rid of that 'need'?  It strikes me as a bit off: it reads like
something a person would say, not like a mechanical message from a computer.
I dislike it strongly.


I would prefer the line to be more terse:

DETAIL:  90 transactions to finish.

Am I the only one who is annoyed by this phrase?


Thanks,


Erik Rijkers



Re: [HACKERS] the need to finish

2017-04-12 Thread Tom Lane
Erik Rijkers  writes:
> Logical replication emits log messages like these:
> DETAIL:  90 transactions need to finish.
> DETAIL:  87 transactions need to finish.
> DETAIL:  70 transactions need to finish.

> Could we get rid of that 'need'?  It strikes me as a bit off: it reads like
> something a person would say, not like a mechanical message from a computer.
> I dislike it strongly.

> I would prefer the line to be more terse:

> DETAIL:  90 transactions to finish.

> Am I the only one who is annoyed by this phrase?

Our style guidelines say that detail messages should be complete
sentences, so I don't like your proposal too much.

Maybe "N transactions remain to finish." ?

regards, tom lane




Re: [HACKERS] Some thoughts about SCRAM implementation

2017-04-12 Thread Bruce Momjian
On Wed, Apr 12, 2017 at 12:13:03PM +0300, Heikki Linnakangas wrote:
> >That said, I stand by my comment that I don't think it's the enterprises
> >that need or want the channel binding. If they care about it, they have
> >already put certificate validation in place, and it won't buy them anything.
> >
> >Because channel binding also only secures the authentication (SCRAM), not
> >the actual contents and commands that are then sent across the channel,
> >AFAIK?
> 
> TLS protects the contents and the commands. The point of channel binding is
> to defeat a MITM attack, where the client connects to a malicious server,
> using TLS, which then connects to the real server, using another TLS
> connection. Channel binding will detect that the client and the real server
> are not communicating over the same TLS connection, but two different TLS
> connections, and make the authentication fail.
> 
> SSL certificates, with validation, achieve the same, but channel binding
> achieves it without the hassle of certificates.

How does it do that?

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+  Ancient Roman grave inscription +




Re: [HACKERS] Why does logical replication launcher set application_name?

2017-04-12 Thread Craig Ringer
On 12 April 2017 at 13:34, Kuntal Ghosh  wrote:

> For backend_type=background worker, application_name shows the name of
> the background worker (BackgroundWorker->bgw_name). I think we need
> some way to distinguish among different background workers. But,
> application_name may not be the correct field to show the information.

It's better than (ab)using 'query' IMO.

I'd rather an abbreviated entry to address Tom's concerns about
format. 'lrlaunch' or whatever.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] TAP tests take a long time

2017-04-12 Thread Craig Ringer
On 12 April 2017 at 13:27, Craig Ringer  wrote:
> On 12 April 2017 at 08:22, Michael Paquier  wrote:
>> On Wed, Apr 12, 2017 at 9:12 AM, Craig Ringer
>>  wrote:
>>> Well, you can get a lot of information on timings in crake's latest
>>> builds for example, or nightjar's.
>>>
>>> Here's an interesting fact: almost all the time taken up in the scripts
>>> test set seems to be consumed in running initdb over and over.
>>>
>>> Is it worth adding an init(cache => 1) option to PostgresNode, where we
>>> stash an initdb'd db in tmp_check/  and just do a simple fs copy of it ?
>>
>> This looks like a good idea to me, but I don't like much the idea of
>> an option in the init() routine as that's hard to maintain.
>
> How so?
>
>> It would
>> make sense to me to be able to override the initialization with an
>> environment variable instead
>
> Yeah, that's reasonable.
>
> Patch attached. It supports setting CACHE_INITDB=0 to disable the cache.
>
> With cache: 3m55.063s (user 7.7s, sys 0m11.833s)
>
> Without cache:  4m11.229s (user 0m16.963s, sys 0m12.384s)
>
> so in a serial run it doesn't make a ton of difference.
>
> Adding some timing on initdb runs shows that initdb takes about 1s, a
> cached copy about 0.1s with our Perl based recursive copy code. So
> it's hardly an overwhelming benefit, but might be more beneficial with
> prove -j .

Of course, it'll need locking for prove -j. Added. Patch attached,
applies on top of prior patch.

With

make PROVE_FLAGS="--timer -j 9" check

I don't see much difference with/without caching initdb results -
saves about 4 seconds, from 74 to 70 seconds, but it's hard to tell given
the error margins.

So if we're going to do anything, defaulting to parallel prove seems a
good start, and I'm not sure caching initdb results is worth it.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
From db1aadf450a15cbd4041c5f9bf5a0b16f66c2ad5 Mon Sep 17 00:00:00 2001
From: Craig Ringer 
Date: Wed, 12 Apr 2017 14:03:25 +0800
Subject: [PATCH] Add locking for initdb caching to allow parallel prove

---
 src/test/perl/PostgresNode.pm | 23 ++-
 1 file changed, 18 insertions(+), 5 deletions(-)

diff --git a/src/test/perl/PostgresNode.pm b/src/test/perl/PostgresNode.pm
index eef9f66..0eacd41 100644
--- a/src/test/perl/PostgresNode.pm
+++ b/src/test/perl/PostgresNode.pm
@@ -94,6 +94,7 @@ use Socket;
 use Test::More;
 use TestLib ();
 use Scalar::Util qw(blessed);
+use Fcntl qw(:flock);
 
 our @EXPORT = qw(
   get_new_node
@@ -477,15 +478,27 @@ sub _initdb_cached
 	my $pgdata = $self->data_dir;
 
 	my $cachedir = TestLib::tmp_check_dir() . "/initdb_cache";
+	my $lockfile = TestLib::tmp_check_dir() . "/.initdb_cache.lock";
 
 	if (! -d $cachedir)
 	{
 		print(STDERR "initializing initdb cache\n");
-		# initdb into a tempdir, then rename the result so we don't risk
-		# leaving failed initdb results around
-		my $temp_dbdir = TestLib::tempdir("initdb_cache_");
-		$self->_initdb($temp_dbdir, %params);
-		rename($temp_dbdir, $cachedir);
+		open my $lockfh, ">", $lockfile
+			or die "cannot open $lockfile for writing: $!\n";
+		flock($lockfh, LOCK_EX)
+			or die "cannot flock($lockfile) in exclusive mode: $!\n";
+		# If running in parallel with other tests, someone else might've got
+		# the lock first and have swapped an initdb'd directory in already.
+		if (! -d $cachedir)
+		{
+			# initdb into a tempdir, then rename the result so we don't risk
+			# leaving failed initdb results around
+			my $temp_dbdir = TestLib::tempdir("initdb_cache_");
+			$self->_initdb($temp_dbdir, %params);
+			rename($temp_dbdir, $cachedir);
+		}
+		flock($lockfh, LOCK_UN)
+			or die "cannot unlock $lockfile: $!";
 	}
 
 	RecursiveCopy::copypath($cachedir, $pgdata);
-- 
2.5.5




Re: [HACKERS] Add pgstathashindex() to get hash index table statistics.

2017-04-12 Thread Noah Misch
On Thu, Apr 06, 2017 at 10:45:26AM +0530, Ashutosh Sharma wrote:
> >> Based on the earlier discussions, I have prepared a patch that would
> >> allow pgstathashindex() to show the number of unused pages in hash
> >> index. Please find the attached patch for the same. Thanks.
> >
> > My idea is that we shouldn't end up with both a zero_pages column and
> > an unused_pages column.  Instead, we should end up with just an
> > unused_pages column, which will include both pages that are all-zeroes
> > and pages that have a valid special space marked as LH_UNUSED.
> >
> > Also, I don't see why it's correct to test PageIsEmpty() here instead
> > of just testing the page type as we do in pageinspect.
> >
> > Attached is a revised patch that shows what I have in mind; please
> > review.  Along the way I made the code for examining the page type
> > more similar to what pageinspect does, because I see no reason for
> > those things to be different, and I think the pageinspect code is
> > better.
> 
> I have reviewed the patch and it looks good to me. Also, the idea of
> including both zero and unused pages in a single 'unused' column looks
> better. Thanks.

[Action required within three days.  This is a generic notification.]

The above-described topic is currently a PostgreSQL 10 open item.  Robert,
since you committed the patch believed to have created it, you own this open
item.  If some other commit is more relevant or if this does not belong as a
v10 open item, please let us know.  Otherwise, please observe the policy on
open item ownership[1] and send a status update within three calendar days of
this message.  Include a date for your subsequent status update.  Testers may
discover new open items at any time, and I want to plan to get them all fixed
well in advance of shipping v10.  Consequently, I will appreciate your efforts
toward speedy resolution.  Thanks.

[1] https://www.postgresql.org/message-id/20170404140717.GA2675809%40tornado.leadboat.com




Re: [HACKERS] Some thoughts about SCRAM implementation

2017-04-12 Thread Magnus Hagander
On Wed, Apr 12, 2017 at 3:25 AM, Bruce Momjian  wrote:

> On Tue, Apr 11, 2017 at 08:58:30PM -0400, Stephen Frost wrote:
> > > > But just a bit more is needed to make it really a big announcement and
> > > > provide real value to (I guess, mostly but very interesting) enterprise
> > > > customers, for which MITM and impersonating are big things. The good news is
> > > > that adding channel binding is like an inverse Pareto: 20% of extra effort
> > > > (I bet significantly less) leads to 80% improvement.
> > >
> > > I don't see why channel binding is a big deal for enterprises because I
> > > assume they are already using SSL:
> >
> > Channel binding should be used with SSL to ensure that there is no
> > man-in-the-middle attack being performed.  It's necessary when the
> > end-points aren't performing full, mutual, certificate-based
> > verification.
>
> And which enterprises are using SSL without certificates?  And I thought
> channel binding required certificates anyway, e.g.:
>
> https://en.wikipedia.org/wiki/Salted_Challenge_Response_Authentication_Mechanism#Channel_binding
>
>     For instance, for the tls-server-end-point channel binding, it is the
>     server's TLS certificate.
>

AFAIK it does require the TLS certificates, but it does not require TLS
certificate *validation*. You can use channel binding with just self-signed
certs.

That said, I stand by my comment that I don't think it's the enterprises
that need or want the channel binding. If they care about it, they have
already put certificate validation in place, and it won't buy them anything.

Because channel binding also only secures the authentication (SCRAM), not
the actual contents and commands that are then sent across the channel,
AFAIK?



> > > I think the big win for SCRAM is the inability to replay md5 packets
> > > after recording 16k sessions (salt was only 32-bit, so a 50% chance of
> > > replay after 16k sessions), and storage of SHA256 hashes instead of MD5
> > > in pg_authid, though the value of that is mostly a check-box item
> > > because collisions are not a problem for the way we use MD5.
> >
> > There are a lot of wins to having SCRAM implemented.  I disagree
> > strongly that securing PG from attacks based on latent information
> > gathering (backups which include pg_authid) is just a "check-box" item.
>
> Well, they have the entire backup so I don't think cracking the password
> is a huge win, though the password does potentially give them access to
> future data.  And it prevents them from reusing the password on another
> server, but _again_, I still think the big win is to prevent replay
> after 16k packets are sniffed.
>

In particular in multi-installation systems, it's quite likely that people
use the same password across multiple Postgres instances, for example, and
it would help against those.

It would also help against somebody who stole a backup, and then wants to
use that hash to log into the production system and *modify* things. From
the backup they can read all the data, but they can't modify anything --
but with a replayable hash, they can connect and modify data. If the
superuser has a password they can also use that password to escalate to OS
level privileges.

I think these are both big wins. And both of them more important than
channel binding.

-- 
 Magnus Hagander
 Me: https://www.hagander.net/ 
 Work: https://www.redpill-linpro.com/ 


Re: [HACKERS] Merge join for GiST

2017-04-12 Thread Sergey Mirvoda
Thank you, Alexander!

This is definitely the example we are looking for!
Hat tip to Dmitry especially for this commit
https://github.com/akorotkov/pgsphere/commit/971d2c5d61f17774a6d8d137ca3ad87e2883048f
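
For anyone skimming, the SpatialJoin recursion quoted below is compact
enough to sketch in a few lines of illustrative Perl (node layout and
helper names are made up here, this is not the pgsphere code; like the
quoted algorithm, it assumes both trees have equal height):

    # Each node: { leaf => 0/1, entries => [ { rect => [...], ref => child-or-TID } ] }
    sub spatial_join
    {
        my ($r, $s, $emit) = @_;
        foreach my $es (@{ $s->{entries} })
        {
            foreach my $er (@{ $r->{entries} })
            {
                next unless rects_intersect($er->{rect}, $es->{rect});
                if ($r->{leaf}) { $emit->($er->{ref}, $es->{ref}); }
                else            { spatial_join($er->{ref}, $es->{ref}, $emit); }
            }
        }
    }

    # rects are [xmin, ymin, xmax, ymax]
    sub rects_intersect
    {
        my ($p, $q) = @_;
        return $p->[0] <= $q->[2] && $q->[0] <= $p->[2]
            && $p->[1] <= $q->[3] && $q->[1] <= $p->[3];
    }

The real work, of course, is in making this page-aware and cost-based
rather than a plain nested recursion.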


Regards,
Sergey Mirvoda

On Tue, Apr 11, 2017 at 2:17 PM, Alexander Korotkov <
a.korot...@postgrespro.ru> wrote:

> On Mon, Apr 10, 2017 at 2:53 PM, Andrew Borodin 
> wrote:
>
>> ==Spatial joins==
>> Scientific papers from the dawn of R-trees and multidimensional
>> indexes feature a lot of algorithms for spatial joins.
>> I.e. you have two sets of geometries s1 and s2, you need to produce
>> all colliding pairs (p1,p2) where p1 in s1 and p2 in s2. For 2 R-trees
>> of equal heights with roots R and S most straightforward[1] algorithm
>> look like:
>>
>> SpatialJoin (R, S: Node)
>> {
>>   FOR (all ES in S)
>>     FOR (all ER in R with ER.rect intersecting ES.rect)
>>       IF (R is a leaf page)
>>       {
>>         Output (ER, ES)
>>       }
>>       ELSE
>>       {
>>         SpatialJoin (ER.ref, ES.ref)
>>       }
>> }
>>
>
> FYI, I've implemented this algorithm for pgsphere.  See following branch.
> https://github.com/akorotkov/pgsphere/tree/experimental
> It's implemented as a crossmatch() function which takes as arguments the
> names of two indexes over spoint and a maximum distance (it checks not
> overlap but proximity of points).  This function returns a setof pairs of TIDs.
>
> Later, Dmitry Ivanov made it a custom scan node.
> https://github.com/akorotkov/pgsphere/tree/crossmatch_cnode
>
> You also can find some experimental evaluation here:
> http://www.adass2016.inaf.it/images/presentations/10_Korotkov.pdf
>
> Feel free to experiment with this code and/or ask any question.
>
> --
> Alexander Korotkov
> Postgres Professional: http://www.postgrespro.com
> The Russian Postgres Company
>



-- 
--Regards, Sergey Mirvoda


Re: [HACKERS] [sqlsmith] ERROR: badly formatted node string "RESTRICTINFO...

2017-04-12 Thread Amit Kapila
On Wed, Apr 12, 2017 at 12:40 AM, Tom Lane  wrote:
> Amit Kapila  writes:
>
>> However, the worker will
>> never execute such a plan as we don't generate a plan where unsafe
>> sublan/initplan is referenced in the node passed to the worker.  If we
>> want to avoid passing parallel-unsafe subplans to workers, then I
>> think we can maintain a list of parallel safe subplans along with
>> subplans in PlannerGlobal and PlannedStmt or maybe keep a parallel
>> safe flag in Plan so that we can pass only parallel safe plans to
>> workers.
>
> Right, we could, say, leave a hole in the subplan list corresponding
> to any subplan that's not parallel-safe.  That seems like a good idea
> anyway because right now there's clearly no cross-check preventing
> a worker from trying to run such a subplan.
>
> Anyone want to draft a patch for this?
>

Please find patch attached based on above discussion.


-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


push-parallel-safe-subplans-workers_v1.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] logical rep worker for table sync can start and stop very frequently unexpectedly

2017-04-12 Thread Fujii Masao
On Wed, Apr 12, 2017 at 2:40 AM, Masahiko Sawada  wrote:
> On Wed, Apr 12, 2017 at 2:05 AM, Fujii Masao  wrote:
>> Hi,
>>
>> I observed $subject when I ran the following steps.
>>
>> 1. start the publisher server
>> 2. start the subscriber server
>> 3. create table with primary key and publication for it
>> 4. insert several records into the table
>> 5. create table with primary key and subscription
>> 6. wait for table sync to finish
>> 7. drop the subscription and recreate the subscription
>>
>> Then the launcher starts the worker, and the worker starts
>> another worker for table sync. But that worker for table sync
>> exits immediately because of primary key violation.
>> Then the worker tries to start a new worker for table sync immediately
>> and it also exits immediately because of primary key violation.
>> This sequence repeats very fast...
>>
>> Attached is the script that I used to reproduce the issue.
>>
>
> I encountered the same situation[1] and am proposing to put an interval
> between attempts to launch the table sync worker after a failure.
>
> [1] 
> https://www.postgresql.org/message-id/CAD21AoDCnyRJDUY%3DESVVe68AukvOP2dFomTeBFpAd1TiFbjsGg%40mail.gmail.com

Okay, let's discuss this in that thread. Thanks!

Regards,

-- 
Fujii Masao


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] pg_statistic_ext.staenabled might not be the best column name

2017-04-12 Thread David Rowley
I'd been thinking that staenabled is not the most suitable column name
for storing the types of statistics that are defined for the extended
statistics.  For me, this indicates that something can be disabled,
but there's no syntax for that, and even if there was, if we were to
enable/disable the kinds, we'd likely want to keep tabs on which kinds
were originally defined, otherwise it's more of an ADD and DROP than
an ENABLE/DISABLE.

"stakind" seems like a better name. I'd have personally gone with
"statype" but pg_statistic already thinks stakind is better.

A patch which changes this is attached.

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


ext_stats_rename_staenabled.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] my open items vs. my vacation

2017-04-12 Thread Jan de Visser
On Tuesday, April 11, 2017 1:36:44 PM EDT Robert Haas wrote:
> I apologize for any disruption this may cause, but I'm hopeful that it
> won't be too bad.

Spoken like a true American - apologizing for taking vacation :-)

Enjoy your time off. You probably deserved it, and more than the week you're 
taking now.



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [pgsql-www] [HACKERS] Small issue in online devel documentation build

2017-04-12 Thread Bruce Momjian
On Wed, Apr 12, 2017 at 01:31:51PM +0200, Magnus Hagander wrote:
> I think that only leaves the change to the javascript code that Bruce sent.
> Let's see if we can figure out a way to do that one without requiring
> javascript, but after that we have covered all listed issues I think?

Well, we have been using JavaScript for years, so my patch only fixes
the JavaScript we already have.  Seems applying the fix now makes sense
and then we can ponder other methods.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+  Ancient Roman grave inscription +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] logical replication and PANIC during shutdown checkpoint in publisher

2017-04-12 Thread Fujii Masao
Hi,

When I shut down the publisher while I repeated creating and dropping
the subscription in the subscriber, the publisher emitted the following
PANIC error during shutdown checkpoint.

PANIC:  concurrent transaction log activity while database system is
shutting down

The cause of this problem is that walsender for logical replication can
generate WAL records even during shutdown checkpoint.

Firstly walsender keeps running until shutdown checkpoint finishes
so that all the WAL including shutdown checkpoint record can be
replicated to the standby. This was safe because previously walsender
could not generate WAL records. However this assumption became
invalid because of logical replication. That is, currently walsender for
logical replication can generate WAL records, for example, by executing
CREATE_REPLICATION_SLOT command. This is an oversight in
logical replication patch, I think.

To fix this issue, we should terminate walsender for logical replication
before shutdown checkpoint starts. Of course walsender for physical
replication still needs to keep running until shutdown checkpoint ends,
though.

Regards,

-- 
Fujii Masao


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] index-only count(*) for indexes supporting bitmap scans

2017-04-12 Thread Alexander Kuzmenkov

On 12.04.2017 15:04, Tom Lane wrote:

> Andrew Gierth  writes:
>> "Alexander" == Alexander Kuzmenkov  writes:
>>  Alexander> Structurally, the patch consists of two major parts: a
>>  Alexander> specialized executor node
>
>> Why?
>
>> It strikes me that the significant fact here is not that we're doing
>> count(*), but that we don't need any columns from the bitmap heap scan
>> result.  Rather than creating a whole new node, can't the existing
>> bitmap heapscan be taught to skip fetching the actual table page in
>> cases where it's all-visible, not lossy, and no columns are needed?
>
> +1 ... while I hadn't actually looked at the code, it seemed to me that
> anything like the optimization-as-described would be impossibly klugy
> from the planner's standpoint.  Your formulation sounds lots nicer.
>
> Detecting that no columns are needed in the executor might be a bit tricky
> because of the planner's habit of inserting a "physical tlist" to avoid a
> projection step.  (See also nearby discussion about custom scan planning.)
> But we could fix that.  I think one rule that would make sense is to
> just disable the physical-tlist substitution if the relation's targetlist
> is empty --- it wouldn't be buying much in such a case anyhow.  Then the
> runtime tlist for the scan node would also be empty, and away you go.

When making an early prototype of this optimization, I did what you are 
describing with the bitmap heap scan executor node. In an internal 
review, it was said that the bitmap heap scan node is already 
complicated enough and shouldn't have more logic added to it, so I 
rewrote it as a separate node. To me, your approach looks good too, so if
the community is generally in favor of this, I could rewrite the 
executor as such.


With planner, the changes are more complex. Two things must be done 
there. First, when the tlist is empty, we must use a different cost 
function for bitmap heap scan, because the heap access pattern is 
different. Second, choose_bitmap_and() must use a different algorithm 
for choosing the right combination of paths. A standard algorithm 
chooses the combination based on cost. For count(*) purposes, the 
decisive factor is that the path has to check all the restrictions, or 
else we'll need heap access to recheck some of them, which defeats the 
purpose of having this optimization. The planner code that builds and 
costs the index path is fairly complex, and integrating this additional 
behavior into it didn't look good to me. Instead, I created a 
specialized path node and isolated the logic that handles it. The 
resulting implementation adds several functions, but it is mostly 
self-contained and has a single entry point in grouping_planner(). It 
handles the specific case of bitmap count plans and doesn't complicate 
the existing code any further.


The planner part is to some extent independent of whether we use a 
separate executor node or not. If we choose not to, the same count(*) 
optimization code I proposed could create plain bitmap heap scan paths.


--
Alexander Kuzmenkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Cutting initdb's runtime (Perl question embedded)

2017-04-12 Thread Tom Lane
Andres mentioned, and I've confirmed locally, that a large chunk of
initdb's runtime goes into regprocin's brute-force lookups of function
OIDs from function names.  The recent discussion about cutting TAP test
time prompted me to look into that question again.  We had had some
grand plans for getting genbki.pl to perform the name-to-OID conversion
as part of a big rewrite, but since that project is showing few signs
of life, I'm thinking that a more localized performance fix would be
a good thing to look into.  There seem to be a couple of plausible
routes to a fix:

1. The best thing would still be to make genbki.pl do the conversion,
and write numeric OIDs into postgres.bki.  The core stumbling block
here seems to be that for most catalogs, Catalog.pm and genbki.pl
never really break down a DATA line into fields --- and we certainly
have got to do that, if we're going to replace the values of regproc
fields.  The places that do need to do that approximate it like this:

# To construct fmgroids.h and fmgrtab.c, we need to inspect some
# of the individual data fields.  Just splitting on whitespace
# won't work, because some quoted fields might contain internal
# whitespace.  We handle this by folding them all to a simple
# "xxx". Fortunately, this script doesn't need to look at any
# fields that might need quoting, so this simple hack is
# sufficient.
$row->{bki_values} =~ s/"[^"]*"/"xxx"/g;
@{$row}{@attnames} = split /\s+/, $row->{bki_values};

We would need a bullet-proof, non-hack, preferably not too slow way to
split DATA lines into fields properly.  I'm one of the world's worst
Perl programmers, but surely there's a way?
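
For concreteness, something along these lines might be close (untested and
hypothetical, and assuming quoted values never contain embedded quote
marks, which I believe is true of our DATA lines today):

    # Split a DATA line's value list into fields, keeping quoted
    # strings intact and stripping the quotes afterwards.
    sub split_data_fields
    {
        my ($line) = @_;
        my @fields;
        while ($line =~ /\G\s*("[^"]*"|\S+)/g)
        {
            my $field = $1;
            $field =~ s/^"(.*)"$/$1/s;    # strip surrounding quotes
            push @fields, $field;
        }
        return @fields;
    }

Or there's Text::ParseWords in core Perl, whose parse_line() splits on a
delimiter while honoring quotes, if hand-rolled regexes seem too fragile.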

2. Alternatively, we could teach bootstrap mode to build a hashtable
mapping proname to OID while it reads pg_proc data from postgres.bki,
and then make regprocin's bootstrap path consult the hashtable instead
of looking directly at the pg_proc file.  That I'm quite sure is do-able,
but it seems like it's leaving money on the table compared to doing
the transformation earlier.

Thoughts?

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] index-only count(*) for indexes supporting bitmap scans

2017-04-12 Thread Tom Lane
Alexander Kuzmenkov  writes:
> With planner, the changes are more complex. Two things must be done 
> there. First, when the tlist is empty, we must use a different cost 
> function for bitmap heap scan, because the heap access pattern is 
> different. Second, choose_bitmap_and() must use a different algorithm 
> for choosing the right combination of paths. A standard algorithm 
> chooses the combination based on cost. For count(*) purposes, the 
> decisive factor is that the path has to check all the restrictions, or 
> else we'll need heap access to recheck some of them, which defeats the 
> purpose of having this optimization. The planner code that builds and 
> costs the index path is fairly complex, and integrating this additional 
> behavior into it didn't look good to me.

TBH, I'm not sure you need to do any of that work.  Have you got evidence
that the planner will fail to choose the right plan regardless?  I'm
particularly unconvinced that choose_bitmap_and is a critical problem,
because once you're into having to AND multiple indexes, you're talking
about an expensive query anyhow.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH v1] Add and report the new "in_hot_standby" GUC pseudo-variable.

2017-04-12 Thread Magnus Hagander
On Wed, Apr 12, 2017 at 2:36 PM, Robert Haas  wrote:

> On Tue, Apr 11, 2017 at 4:05 AM, Magnus Hagander 
> wrote:
> > On Tue, Apr 11, 2017 at 3:26 AM, Michael Paquier <
> michael.paqu...@gmail.com> wrote:
> >> On Mon, Apr 10, 2017 at 5:47 PM, Magnus Hagander 
> >> wrote:
> >> > Based on that we seem to agree here, should we add this as an open
> item?
> >> > Clearly if we want to change this, we should do so before 10.
> >>
> >> This really is a new feature, so as the focus is to stabilize things I
> >> think that we should not make the code more complicated because...
> >
> > The part I'm talking about is the potential adjustment of the patch
> that's
> > already committed. That's not a new feature, that's exactly the sort of
> > thing we'd want to adjust before we get to release. Because once
> released we
> > really can't change it.
>
> I don't really agree.  I think if we go and install a GUC_REPORT GUC
> now, we're much less likely to flush out the bugs in the 'show
> transaction_read_only' mechanism.  Also, now that I think about, the
> reason why we settled on 'show transaction_read_only' against
> alternate queries is because there's some ability for the DBA to make
> that return 'false' rather than 'true' even when not in recovery, so
> that if for example you are using logical replication rather than
> physical replication, you have a way to make
> target_session_attrs=read-write still do something useful.  If you add
> an in_hot_standby GUC that's used instead, you lose that.
>
> Now, we can decide what we want to do about that, but I don't see that
> a change in this area *must* go into v10.  Maybe the answer is that
> target_session_attrs grows additional values like 'primary' and
> 'standby' alongside 'read-write' (and Simon's suggested 'read-only').
> Or maybe we have another idea.  But I don't really see the urgency of
> whacking this around right this minute.  There's nothing broken here;
> there's just more stuff people would like to have.  It can be added
> next time around.
>
>
Fair enough, sounds reasonable. I wasn't engaged in the original thread, so
you clearly have thought about this more than I have. I just wanted to make
sure we're not creating something that's going to cause a head-ache for
such a feature in the future.

(And this is why I was specifically asking you if you wanted it on the open
items list or not!)


-- 
 Magnus Hagander
 Me: https://www.hagander.net/ 
 Work: https://www.redpill-linpro.com/ 


Re: [HACKERS] error handling in RegisterBackgroundWorker

2017-04-12 Thread Robert Haas
On Tue, Apr 11, 2017 at 10:13 PM, Noah Misch  wrote:
> On Tue, Apr 11, 2017 at 11:33:34AM -0400, Tom Lane wrote:
>> Peter Eisentraut  writes:
>> > I think there is no clear agreement here, and no historically consistent
>> > behavior.  I'm prepared to let it go and cross it off the list of open
>> > items.  I believe we should keep thinking about it, but it's not
>> > something that has to hold up beta.
>>
>> Agreed, this doesn't seem like a must-fix-for-beta consideration.
>
> Definitely not a beta blocker, agreed.  Would it be okay to release v10 final
> with the logical replication launcher soft-failing this way?

I'm not really in favor of making a behavior change here, so it would
be fine with me.  Obviously people who think the behavior should be
changed are more likely to disagree with that idea.  Maybe in the long
run we should have a command-line option (not a GUC) that's like...

postgres --soldier-on-valiantly

...and then when that's not supplied we can die but when it is
supplied we persist in spite of all failures that are in any way
recoverable.  However, I think that's really a new development effort,
not a cleanup item for v10.  I share Tom's concern to the effect that
single-user mode is a really awful way to try to recover from
failures, so we should avoid decisions that force people into that
recovery mode more often.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] snapbuild woes

2017-04-12 Thread Simon Riggs
On 3 March 2017 at 00:30, Petr Jelinek  wrote:

>> 0004 - Changes handling of the xl_running_xacts in initial snapshot
>> build to what I wrote above and removes the extra locking from
>> LogStandbySnapshot introduced by logical decoding.

This seems OK and unlikely to have wider impact.

The "race condition" we're speaking about is by design, not a bug.

I think the initial comment could be slightly better worded; if I
didn't already know what is being discussed, I wouldnt be too much
further forwards from those comments.

-- 
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] the need to finish

2017-04-12 Thread Simon Riggs
On 12 April 2017 at 16:26, Tom Lane  wrote:
> Erik Rijkers  writes:
>> Logical replication emits logmessages like these:
>> DETAIL:  90 transactions need to finish.
>> DETAIL:  87 transactions need to finish.
>> DETAIL:  70 transactions need to finish.
>
>> Could we get rid of that 'need'?   It strikes me as a bit off; something
>> that people would say but not a mechanical message by a computer. I
>> dislike it strongly.
>
>> I would prefer the line to be more terse:
>
>> DETAIL:  90 transactions to finish.
>
>> Am I the only one who is annoyed by this phrase?
>
> Our style guidelines say that detail messages should be complete
> sentences, so I don't like your proposal too much.
>
> Maybe "N transactions remain to finish." ?

waiting for N transactions to complete

-- 
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Add pgstathashindex() to get hash index table statistics.

2017-04-12 Thread Robert Haas
On Wed, Apr 12, 2017 at 2:28 AM, Noah Misch  wrote:
> [Action required within three days.  This is a generic notification.]
>
> The above-described topic is currently a PostgreSQL 10 open item.  Robert,
> since you committed the patch believed to have created it, you own this open
> item.  If some other commit is more relevant or if this does not belong as a
> v10 open item, please let us know.  Otherwise, please observe the policy on
> open item ownership[1] and send a status update within three calendar days of
> this message.  Include a date for your subsequent status update.  Testers may
> discover new open items at any time, and I want to plan to get them all fixed
> well in advance of shipping v10.  Consequently, I will appreciate your efforts
> toward speedy resolution.  Thanks.

Oops.  I had forgotten about this one; committed now.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Some thoughts about SCRAM implementation

2017-04-12 Thread Stephen Frost
Tom, all,

* Tom Lane (t...@sss.pgh.pa.us) wrote:
> ... which the user can't tell apart from having fat-fingered the password,
> I suppose?  Doesn't sound terribly friendly.  A report of a certificate
> mismatch is far more likely to lead people to realize there's a MITM.

We might be able to improve on that.

> So this seems more like a hack than like a feature we need so desperately
> as to push it into v10 post-freeze.

Channel binding certainly isn't a 'hack' and is something we should
support, but I agree that it doesn't need to go into v10.

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] Partitioned tables and relfilenode

2017-04-12 Thread Robert Haas
On Tue, Apr 11, 2017 at 10:15 PM, Amit Langote
 wrote:
> Alright.  So I made it into two patches instead: 0001 fixes the bug that
> validateCheckConstraint() tries to scan partitioned tables and 0002 makes
> trying to convert a partitioned table to a view a user error.

Committed together, after updating the regression test outputs to make
the tests pass.  You forgot to update the expected output file in
0002.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Some thoughts about SCRAM implementation

2017-04-12 Thread Heikki Linnakangas

On 04/12/2017 06:26 PM, Bruce Momjian wrote:
> On Wed, Apr 12, 2017 at 12:13:03PM +0300, Heikki Linnakangas wrote:
>>> That said, I stand by my comment that I don't think it's the enterprises
>>> that need or want the channel binding. If they care about it, they have
>>> already put certificate validation in place, and it won't buy them anything.
>>>
>>> Because channel binding also only secures the authentication (SCRAM), not
>>> the actual contents and commands that are then sent across the channel,
>>> AFAIK?
>>
>> TLS protects the contents and the commands. The point of channel binding is
>> to defeat a MITM attack, where the client connects to a malicious server,
>> using TLS, which then connects to the real server, using another TLS
>> connection. Channel binding will detect that the client and the real server
>> are not communicating over the same TLS connection, but two different TLS
>> connections, and make the authentication fail.
>>
>> SSL certificates, with validation, achieve the same, but channel binding
>> achieves it without the hassle of certificates.
>
> How does it do that?


Good question, crypto magic? I don't know the details, but the basic 
idea is that you extract a blob of data that uniquely identifies the TLS 
connection. Using some OpenSSL functions, in this case. I think it's a 
hash of some of the TLS handshake messages that were used when the TLS 
connection was established (that's what "tls-unique" means). That data 
is then incorporated in the hash calculations of the SCRAM 
authentication. If the client and the server are not speaking over the 
same TLS connection, they will use different values for the TLS data, 
and the SCRAM computations will not match, and you get an authentication 
failure.
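
To make that a bit more concrete, here is a toy sketch (not the real
protocol flow; the variable contents are made up and the real messages
have more fields) of how the TLS data ends up inside the SCRAM math,
using only core Perl modules:

    use Digest::SHA qw(sha256 hmac_sha256);
    use MIME::Base64 qw(encode_base64);

    # Stand-ins for the real protocol pieces:
    my $tls_finished = "bytes-unique-to-this-TLS-connection";  # the "tls-unique" data
    my $gs2_header   = "p=tls-unique,,";
    my $stored_key   = sha256("key derived from the password");

    # The TLS data is embedded in the client-final-message ...
    my $cbind = "c=" . encode_base64($gs2_header . $tls_finished, "");

    # ... which is part of the message both sides sign:
    my $auth_message = "client-first,server-first," . $cbind . ",r=nonce";
    my $signature    = hmac_sha256($auth_message, $stored_key);

A MITM sits on two different TLS connections, so the client and the real
server compute $cbind from different TLS data, the signatures disagree,
and authentication fails.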


- Heikki



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Possible problem in Custom Scan API

2017-04-12 Thread Tom Lane
Dmitry Ivanov  writes:
>> Uh, no, construction of a custom plan node is entirely driven by the
>> PlanCustomPath method as far as I can see.  You're free to ignore what
>> create_scan_plan did and insert your own tlist.

> Are you sure? Even if it's true, this targetlist should still contain each 
> and every Var that's been requested. If I'm correct, the only way to ensure 
> that is to call build_path_tlist(), which is static (oops!). Perhaps I 
> could make my own, but it uses replace_nestloop_params() (again, static), 
> and the problem goes further and further.

True.  We could expose build_path_tlist(), perhaps, but then you soon
reach the conclusion that PlanCustomPath should also be given access to
the tlist-related "flags" that are being passed around in createplan.c,
and that's a conclusion I don't want to reach.  That whole business is
a bit of a hack that I hope to redesign someday, but if we embed it in
the custom-scan API it will get very hard to change.

I'm coming around to the idea that it'd be better to disable physical
tlists for custom scans.  That optimization is built on the assumption
that providing all the columns is free (or at least, cheaper than doing
ExecProject), but it's easy to think of custom-scan applications where
that's not true.  It's a disastrous choice for a custom-scan provider
that's trying to interface to a column store, for instance.

However, I'm hesitant to make such a change in the back branches; if
there's anyone using custom scans who's negatively affected, they'd be
rightfully unhappy.  Are you satisfied if we change this in HEAD?

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Possible problem in Custom Scan API

2017-04-12 Thread Dmitry Ivanov

Tom Lane wrote:

> I'm coming around to the idea that it'd be better to disable physical
> tlists for custom scans.


I've been thinking about this all along, and it seems that this is a decent
decision. However, I've made a tiny bugfix patch which allows CustomScans
to notify the core code that they cannot handle physical targetlists.
Extension authors won't have to change anything provided that CustomPath is
allocated using palloc0().



> However, I'm hesitant to make such a change in the back branches; if
> there's anyone using custom scans who's negatively affected, they'd be
> rightfully unhappy.  Are you satisfied if we change this in HEAD?


It's kind of bittersweet. I'm really glad that you've changed your mind and 
this is going to be fixed in HEAD, but this change is crucial for our 
extension (if we use it with vanilla postgres).


Maybe you'll like my patch more.


--
Dmitry Ivanov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company

diff --git a/src/backend/optimizer/plan/createplan.c b/src/backend/optimizer/plan/createplan.c
index 71678d08dc..4b0322530b 100644
--- a/src/backend/optimizer/plan/createplan.c
+++ b/src/backend/optimizer/plan/createplan.c
@@ -761,6 +761,9 @@ use_physical_tlist(PlannerInfo *root, Path *path, int flags)
 	if (rel->reloptkind != RELOPT_BASEREL)
 		return false;
 
+	if (IsA(path, CustomPath) && ((CustomPath *) path)->forbid_physical_tlist)
+		return false;
+
 	/*
 	 * Can't do it if any system columns or whole-row Vars are requested.
 	 * (This could possibly be fixed but would take some fragile assumptions
diff --git a/src/include/nodes/relation.h b/src/include/nodes/relation.h
index 48862d93ae..5a1f6cee47 100644
--- a/src/include/nodes/relation.h
+++ b/src/include/nodes/relation.h
@@ -1089,6 +1089,7 @@ typedef struct CustomPath
  * nodes/extensible.h */
 	List	   *custom_paths;	/* list of child Path nodes, if any */
 	List	   *custom_private;
+	bool		forbid_physical_tlist;
 	const struct CustomPathMethods *methods;
 } CustomPath;
 

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Some thoughts about SCRAM implementation

2017-04-12 Thread Robert Haas
On Tue, Apr 11, 2017 at 9:20 AM, Álvaro Hernández Tortosa
 wrote:
> LOL. Do you really want a half-baked Java programmer writing a patch in
> C for PostgreSQL? I once tried that and Magnus said my code was Java code
> that happened to compile with a C compiler ^_^
>
> Having said that, I take the bait, I like challenges and putting my
> words behind my code :)

Excellent, because that's how stuff gets done around here.  Saying
that you want something and hoping other people will do it is fine,
but being willing to put some effort into it is a lot more likely to get
it done.  Not to be harsh, but showing up 3 days after feature freeze
to complain that a feature pending commit for 14 months is missing
something that you really need isn't really the right way to make
something happen.  I'm pretty sure that the lack of channel binding
was discussed quite a bit earlier than now, so I think there was
adequate opportunity to protest, contribute a patch, etc.  It's not
that I don't have sympathy for the way you feel about this: seeing
features you care about fall out of a release sucks, and I've
experienced a lot of that suckage quite recently, so the pain is fresh
in my mind.  But it's something we all have to live with for the
overall good of the product.  We used to not have a firm feature
freeze, and that was way worse.

In this case, I think it is abundantly clear that SCRAM without
channel binding is still a good feature.  One piece of evidence for
that is that the standard uses the suffix -PLUS to denote the
availability of channel binding.  That clearly conveys that channel
binding has value, but also that not having it does not make the whole
thing useless.  Otherwise, they would have required it to be part of
every implementation, or they would have made you add -CRAPPY if you
didn't have it.  The discussion elsewhere on this thread has
adequately underlined the value of what we've already got, so I won't
further belabor the point here.

Furthermore, I think that the state of this feature as it currently
exists in the tree is actually kind of concerning.  There are
currently four open items pertaining to SCRAM, at least two of which
look to my mind an awful lot like stuff that should have ideally been
handled pre-feature-freeze: \password support, and protocol
negotiation.  I'm grateful for the hard work that has gone into this
feature, but these are pretty significant loose ends.  \password
support is a basic usability issue.  Protocol negotiation affects
anyone who may want to make their PG driver work with this feature,
and certainly can't be changed after final release, and ideally not
even after beta.  We really, really need to get that stuff nailed down
ASAP or we're going to have big problems.  So I think we should focus
on those things, not this.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] GCC 7 warnings

2017-04-12 Thread Tom Lane
Peter Eisentraut  writes:
> On 4/12/17 00:12, Tom Lane wrote:
>> Now a human can see that saved_timeval.tv_usec must be 0..99, so
>> that the %d format item must always emit exactly 3 characters, which
>> means that really 5 bytes would be enough.  I wouldn't expect a
>> compiler to know that, but if it's making a generic assumption about
>> the worst-case width of %d, shouldn't it conclude that we might need
>> as many as 13 bytes for the buffer?  Why does msbuf[10] satisfy it
>> if msbuf[8] doesn't?

> Because the /1000 takes off three digits?

Huh ... I would not really have expected it to figure that out, but
this makes it pretty clear that it did:

> test.c:11:15: note: directive argument in the range [-2147483, 2147483]


> (This is with -O2.  With -O0 it only asks for 5 bytes.)

But that reinforces my suspicion that the intelligence brought to bear
here will be variable.  I'd still go for msbuf[16]; it's not like
anybody's going to notice the extra stack consumption.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Some thoughts about SCRAM implementation

2017-04-12 Thread Tom Lane
Heikki Linnakangas  writes:
> On 04/12/2017 06:26 PM, Bruce Momjian wrote:
>> How does it do that?

> Good question, crypto magic? I don't know the details, but the basic 
> idea is that you extract a blob of data that uniquely identifies the TLS 
> connection. Using some OpenSSL functions, in this case. I think it's a 
> hash of some of the TLS handshake messages that were used when the TLS 
> connection was established (that's what "tls-unique" means). That data 
> is then incorporated in the hash calculations of the SCRAM 
> authentication. If the client and the server are not speaking over the 
> same TLS connection, they will use different values for the TLS data, 
> and the SCRAM computations will not match, and you get an authentication 
> failure.

... which the user can't tell apart from having fat-fingered the password,
I suppose?  Doesn't sound terribly friendly.  A report of a certificate
mismatch is far more likely to lead people to realize there's a MITM.

So this seems more like a hack than like a feature we need so desperately
as to push it into v10 post-freeze.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] index-only count(*) for indexes supporting bitmap scans

2017-04-12 Thread Tom Lane
Andrew Gierth  writes:
> "Alexander" == Alexander Kuzmenkov  writes:
>  Alexander> Structurally, the patch consists of two major parts: a
>  Alexander> specialized executor node

> Why?

> It strikes me that the significant fact here is not that we're doing
> count(*), but that we don't need any columns from the bitmap heap scan
> result.  Rather than creating a whole new node, can't the existing
> bitmap heapscan be taught to skip fetching the actual table page in
> cases where it's all-visible, not lossy, and no columns are needed?

+1 ... while I hadn't actually looked at the code, it seemed to me that
anything like the optimization-as-described would be impossibly klugy
from the planner's standpoint.  Your formulation sounds lots nicer.

Detecting that no columns are needed in the executor might be a bit tricky
because of the planner's habit of inserting a "physical tlist" to avoid a
projection step.  (See also nearby discussion about custom scan planning.)
But we could fix that.  I think one rule that would make sense is to
just disable the physical-tlist substitution if the relation's targetlist
is empty --- it wouldn't be buying much in such a case anyhow.  Then the
runtime tlist for the scan node would also be empty, and away you go.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH v1] Add and report the new "in_hot_standby" GUC pseudo-variable.

2017-04-12 Thread Robert Haas
On Tue, Apr 11, 2017 at 4:05 AM, Magnus Hagander  wrote:
> On Tue, Apr 11, 2017 at 3:26 AM, Michael Paquier  
> wrote:
>> On Mon, Apr 10, 2017 at 5:47 PM, Magnus Hagander 
>> wrote:
>> > Based on that we seem to agree here, should we add this as an open item?
>> > Clearly if we want to change this, we should do so before 10.
>>
>> This really is a new feature, so as the focus is to stabilize things I
>> think that we should not make the code more complicated because...
>
> The part I'm talking about is the potential adjustment of the patch that's
> already committed. That's not a new feature, that's exactly the sort of
> thing we'd want to adjust before we get to release. Because once released we
> really can't change it.

I don't really agree.  I think if we go and install a GUC_REPORT GUC
now, we're much less likely to flush out the bugs in the 'show
transaction_read_only' mechanism.  Also, now that I think about, the
reason why we settled on 'show transaction_read_only' against
alternate queries is because there's some ability for the DBA to make
that return 'false' rather than 'true' even when not in recovery, so
that if for example you are using logical replication rather than
physical replication, you have a way to make
target_session_attrs=read-write still do something useful.  If you add
an in_hot_standby GUC that's used instead, you lose that.
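
(For illustration, with made-up host names, the committed feature lets a
client say

    psql "host=pg1,pg2 target_session_attrs=read-write"

and libpq will probe each server with 'show transaction_read_only' until
one reports false.)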

Now, we can decide what we want to do about that, but I don't see that
a change in this area *must* go into v10.  Maybe the answer is that
target_session_attrs grows additional values like 'primary' and
'standby' alongside 'read-write' (and Simon's suggested 'read-only').
Or maybe we have another idea.  But I don't really see the urgency of
whacking this around right this minute.  There's nothing broken here;
there's just more stuff people would like to have.  It can be added
next time around.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Foreign Join pushdowns not working properly for outer joins

2017-04-12 Thread Ashutosh Bapat
Sorry, here's the right one.

On Wed, Apr 12, 2017 at 6:27 PM, David Rowley
 wrote:
> On 12 April 2017 at 21:45, Ashutosh Bapat
>  wrote:
>> On Wed, Apr 12, 2017 at 12:18 PM, David Rowley
>>  wrote:
>>> On 10 March 2017 at 17:33, Ashutosh Bapat
>>>  wrote:
 Yes and I also forgot to update the function prologue to refer to the
 fpinfo_o/i instead of inner and outer relations. Attached patch
 corrects it.
>>>
>>> Hi Ashutosh,
>>>
>>> This seems to have conflicted with 28b04787. Do you want to rebase, or 
>>> should I?
>>
>> Here you go.
>
> Looks like the wrong patch.
>
>
> --
>  David Rowley   http://www.2ndQuadrant.com/
>  PostgreSQL Development, 24x7 Support, Training & Services



-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company


foreign_outerjoin_pushdown_fix_v5.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Reversed sync check in pg_receivewal

2017-04-12 Thread Magnus Hagander
On Tue, Apr 11, 2017 at 6:51 PM, Tom Lane  wrote:

> Magnus Hagander  writes:
> > Something like the attached?
>
> Not sure about
>
> + * All methods that have a failure path will set errno on failure.
>
> Given that you've got a getlasterror method, I don't think that's really
> the API invariant is it?  If it were, you'd just have the callers doing
> strerror(errno) for themselves.  I think maybe a more accurate statement
> would be
>

Hmm. You're right, this is what I get for rushing a patch before plane
departure :/


>  * All methods that have a failure return indicator will set state
>  * allowing the getlasterror() method to return a suitable message.
>  * Commonly, errno is this state (or part of it); so callers must take
>  * care not to clobber errno between a failed method call and use of
>  * getlasterror() to retrieve the message.
>


Yeah, that's much better.



> Also:
>
> * the arguments of open_for_write() could stand to be explained
>
> * what's the return value of finish()?
>

Both fixed.


>
> * wouldn't it be better to declare getlasterror() as returning
>   "const char *"?
>

Yeah, probably is. Will do and push.

Thanks!

-- 
 Magnus Hagander
 Me: https://www.hagander.net/ 
 Work: https://www.redpill-linpro.com/ 


[HACKERS] TAP: fix undefined var warnings in PostgresNode with timeout

2017-04-12 Thread Craig Ringer
If PostgresNode::psql (from the TAP framework) is called with a
timeout set and a timed_out reference, it will attempt to do bitwise
AND and bitshifts on the $ret value from IPC::Run, which is undef if
the command timed out.

This produces annoying errors in the logs.

Fix attached. Should be applied to 9.6 and pg 10.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
From 2745f52455c4a0c187ec879f009aac8f7d0e2ce1 Mon Sep 17 00:00:00 2001
From: Craig Ringer 
Date: Wed, 12 Apr 2017 20:04:34 +0800
Subject: [PATCH] Fix undefined var warnings in PostgresNode.pm with timeout

---
 src/test/perl/PostgresNode.pm | 33 ++---
 1 file changed, 18 insertions(+), 15 deletions(-)

diff --git a/src/test/perl/PostgresNode.pm b/src/test/perl/PostgresNode.pm
index bd627b2..84515ad 100644
--- a/src/test/perl/PostgresNode.pm
+++ b/src/test/perl/PostgresNode.pm
@@ -1151,22 +1151,25 @@ sub psql
 	# We don't use IPC::Run::Simple to limit dependencies.
 	#
 	# We always die on signal.
-	my $core = $ret & 128 ? " (core dumped)" : "";
-	die "psql exited with signal "
-	  . ($ret & 127)
-	  . "$core: '$$stderr' while running '@psql_params'"
-	  if $ret & 127;
-	$ret = $ret >> 8;
+	if (defined($ret))
+	{
+		my $core = $ret & 128 ? " (core dumped)" : "";
+		die "psql exited with signal "
+		  . ($ret & 127)
+		  . "$core: '$$stderr' while running '@psql_params'"
+		  if $ret & 127;
+		$ret = $ret >> 8;
 
-	if ($ret && $params{on_error_die})
-	{
-		die "psql error: stderr: '$$stderr'\nwhile running '@psql_params'"
-		  if $ret == 1;
-		die "connection error: '$$stderr'\nwhile running '@psql_params'"
-		  if $ret == 2;
-		die "error running SQL: '$$stderr'\nwhile running '@psql_params'"
-		  if $ret == 3;
-		die "psql returns $ret: '$$stderr'\nwhile running '@psql_params'";
+		if ($ret && $params{on_error_die})
+		{
+			die "psql error: stderr: '$$stderr'\nwhile running '@psql_params'"
+			  if $ret == 1;
+			die "connection error: '$$stderr'\nwhile running '@psql_params'"
+			  if $ret == 2;
+			die "error running SQL: '$$stderr'\nwhile running '@psql_params'"
+			  if $ret == 3;
+			die "psql returns $ret: '$$stderr'\nwhile running '@psql_params'";
+		}
 	}
 
 	if (wantarray)
-- 
2.5.5

From 80a2b93b4e3ab3ad132fdcbf68b027d4e1726539 Mon Sep 17 00:00:00 2001
From: Craig Ringer 
Date: Wed, 12 Apr 2017 20:08:54 +0800
Subject: [PATCH] Fix undefined var warnings in PostgresNode.pm with timeout

---
 src/test/perl/PostgresNode.pm | 33 ++---
 1 file changed, 18 insertions(+), 15 deletions(-)

diff --git a/src/test/perl/PostgresNode.pm b/src/test/perl/PostgresNode.pm
index 0eacd41..5bb4f40 100644
--- a/src/test/perl/PostgresNode.pm
+++ b/src/test/perl/PostgresNode.pm
@@ -1217,22 +1217,25 @@ sub psql
 	# We don't use IPC::Run::Simple to limit dependencies.
 	#
 	# We always die on signal.
-	my $core = $ret & 128 ? " (core dumped)" : "";
-	die "psql exited with signal "
-	  . ($ret & 127)
-	  . "$core: '$$stderr' while running '@psql_params'"
-	  if $ret & 127;
-	$ret = $ret >> 8;
-
-	if ($ret && $params{on_error_die})
+	if (defined($ret))
 	{
-		die "psql error: stderr: '$$stderr'\nwhile running '@psql_params'"
-		  if $ret == 1;
-		die "connection error: '$$stderr'\nwhile running '@psql_params'"
-		  if $ret == 2;
-		die "error running SQL: '$$stderr'\nwhile running '@psql_params' with sql '$sql'"
-		  if $ret == 3;
-		die "psql returns $ret: '$$stderr'\nwhile running '@psql_params'";
+		my $core = $ret & 128 ? " (core dumped)" : "";
+		die "psql exited with signal "
+		  . ($ret & 127)
+		  . "$core: '$$stderr' while running '@psql_params'"
+		  if $ret & 127;
+		$ret = $ret >> 8;
+
+		if ($ret && $params{on_error_die})
+		{
+			die "psql error: stderr: '$$stderr'\nwhile running '@psql_params'"
+			  if $ret == 1;
+			die "connection error: '$$stderr'\nwhile running '@psql_params'"
+			  if $ret == 2;
+			die "error running SQL: '$$stderr'\nwhile running '@psql_params' with sql '$sql'"
+			  if $ret == 3;
+			die "psql returns $ret: '$$stderr'\nwhile running '@psql_params'";
+		}
 	}
 
 	if (wantarray)
-- 
2.5.5


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [sqlsmith] ERROR: badly formatted node string "RESTRICTINFO...

2017-04-12 Thread Tom Lane
Amit Kapila  writes:
> On Wed, Apr 12, 2017 at 12:40 AM, Tom Lane  wrote:
>> Anyone want to draft a patch for this?

> Please find patch attached based on above discussion.

Thanks, I'll look at this later today.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [pgsql-www] [HACKERS] Small issue in online devel documentation build

2017-04-12 Thread Magnus Hagander
On Tue, Apr 11, 2017 at 4:30 PM, Peter Eisentraut <
peter.eisentr...@2ndquadrant.com> wrote:

> On 4/11/17 08:49, Magnus Hagander wrote:
> > At the risk of being proven wrong again, won't this affect <table> tags in
> > the old documentation as well? And if so, is that something we actually
> > want?
>
> Right.  New patch with more refined selectors attached.
>
> > It does? The output on my laptop for that produces <table
> > summary="Backslash Escape Sequences" border="1"> (for example). It's
> > wrapped in a div with class=table. But the old code had <table
> > border="1" class="CALSTABLE">. And AFAICT, it's the CALSTABLE that the
> > css is actually matching on?
> >
> > Could this be a toolchain version thing? I'm using jessie on my laptop,
> > which is the same one that the website is built with.
>
> Seems so.  Patch for that attached as well.
>
>
Thanks, I've applied all three patches. Your version of the color patch
also seemed slightly more appropriate than Bruce's since it only hit the
docs.css and not the global css.

One difference I notice is that for example the "note boxes" are no longer
centered, but they do now get the new formatting. Tables now look a lot
better.

I think that only leaves the change to the javascript code that Bruce sent.
Let's see if we can figure out a way to do that one without requiring
javascript, but after that we have covered all listed issues I think?

-- 
 Magnus Hagander
 Me: https://www.hagander.net/ 
 Work: https://www.redpill-linpro.com/ 


Re: [HACKERS] TAP tests take a long time

2017-04-12 Thread Tom Lane
Craig Ringer  writes:
> With
> make PROVE_FLAGS="--timer -j 9" check
> I don't see much difference with/without caching initdb results -
> saves about 4 seconds, from 74 to 70 seconds, but hard to tell with
> the error margins.

> So if we're going to do anything, defaulting to parallel prove seems a
> good start, and I'm not sure caching initdb results is worth it.

Yeah, the results shown in this thread are not very good :-(.  I'm
suspicious that the results are basically I/O-bound in your environment.
There might be some value in the cached-initdb-tree solution for slower,
more CPU-bound machines, but cutting out the regprocin overhead would
probably do just as much good there.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] TAP tests take a long time

2017-04-12 Thread Andrew Dunstan


On 04/12/2017 01:27 AM, Craig Ringer wrote:
>
> BTW, I suggest adding --timer to our default PROVE_FLAGS, so we can
> collect more data from the buildfarm on what takes a while. Sample
> output:
>


I'll add that to the commandline the buildfarm uses in the upcoming release.

cheers

andrew

-- 
Andrew Dunstan             https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH v1] Add and report the new "in_hot_standby" GUC pseudo-variable.

2017-04-12 Thread Magnus Hagander
On Tue, Apr 11, 2017 at 2:38 PM, Simon Riggs  wrote:

> On 11 April 2017 at 09:05, Magnus Hagander  wrote:
> > On Tue, Apr 11, 2017 at 3:26 AM, Michael Paquier <
> michael.paqu...@gmail.com>
> > wrote:
> >>
> >> On Mon, Apr 10, 2017 at 5:47 PM, Magnus Hagander 
> >> wrote:
> >> > Based on that we seem to agree here, should we add this as an open
> item?
> >> > Clearly if we want to change this, we should do so before 10.
> >>
> >> This really is a new feature, so as the focus is to stabilize things I
> >> think that we should not make the code more complicated because...
> >
> >
> > The part I'm talking about is the potential adjustment of the patch
> that's
> > already committed. That's not a new feature, that's exactly the sort of
> > thing we'd want to adjust before we get to release. Because once
> released we
> > really can't change it.
>
> I agree if we introduce target_session_attrs it would be better to
> have a complete feature in one release.
>
> It does seem quite strange to have
>   target_session_attrs=read-write
> but not
>   target_session_attrs=read-only
>
> And it would be even better to have these session attrs as well
>   notify-on-promote - sent when standby is promoted
>   notify-on-write - sent when an xid is assigned
>

Well, one of those could come automatically with a GUC_REPORT variable of
the correct type, no? So if we were to use the transaction_read_only one,
you'd get a notification on promotion because your transaction became
read/write, wouldn't it?



>
> "notify-on-promotion" being my suggested name for the feature being
> discussed here. In terms of the feature as submitted, I wonder whether
> having a GUC parameter like this makes sense, but I think its ok for
> us to send a protocol message, maybe a NotificationResponse, but there
> isn't any material difference between those two protocol messages.
>
> Rather than the special case code in the patch, I imagine more generic
> code like this...
>
> if (sessionInterruptPending)
>   ProcessSessionInterrupt();
>
> I'm happy to work on the patch, if that's OK.
>

I think going through all those steps is moving the goalposts a bit too far
for the 10 release.

But if adjustment to the already applied patch is needed to make sure we
can improve on it to get to this point in 11, that's more on topic I think.

-- 
 Magnus Hagander
 Me: https://www.hagander.net/ 
 Work: https://www.redpill-linpro.com/ 


Re: [HACKERS] Adding support for Default partition in partitioning

2017-04-12 Thread Robert Haas
On Thu, Apr 6, 2017 at 1:17 AM, Rushabh Lathia  wrote:
> I like the idea of having a DEFAULT partition for range partitioning. With
> the way partitioning is designed, a range can have holes in it. I think a
> DEFAULT for range partitioning is a good idea, mainly when the range has
> holes. When the range is contiguous then of course a DEFAULT partition
> doesn't make much sense.

Yes, I like that idea, too.  I think the DEFAULT partition should be
allowed to be created for either range or list partitioning regardless
of whether we think there are any holes, but if you create a DEFAULT
partition when there are no holes, you just won't be able to put any
data into it.  It's silly, but it's not worth the code that it would
take to try to prevent it.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Inadequate parallel-safety check for SubPlans

2017-04-12 Thread Tom Lane
While poking at the question of parallel_safe marking for Plans,
I noticed that max_parallel_hazard_walker() does this:

    /* We can push the subplans only if they are parallel-safe. */
    else if (IsA(node, SubPlan))
        return !((SubPlan *) node)->parallel_safe;

This is 100% wrong.  It's failing to recurse into the subexpressions of
the SubPlan, which could very easily include parallel-unsafe function
calls.  Example:

regression=# explain select * from tenk1 where int4(unique1 + random()) not in 
(select ten from tenk2);
                                  QUERY PLAN
------------------------------------------------------------------------------
 Gather  (cost=470.00..884.25 rows=5000 width=244)
   Workers Planned: 4
   ->  Parallel Seq Scan on tenk1  (cost=470.00..884.25 rows=1250 width=244)
 Filter: (NOT (hashed SubPlan 1))
 SubPlan 1
   ->  Seq Scan on tenk2  (cost=0.00..445.00 rows=1 width=4)

random() is parallel-restricted so it's not cool that the SubPlan was
allowed to be passed down to workers.

I tried to fix it like this:

    else if (IsA(node, SubPlan))
    {
        if (!((SubPlan *) node)->parallel_safe &&
            max_parallel_hazard_test(PROPARALLEL_RESTRICTED, context))
            return true;
    }

but got some failures in the regression tests due to things not getting
parallelized when expected.  Poking into that, I remembered that for some
SubPlans, the testexpr contains Params representing the output columns
of the sub-select.  So the recursive call of max_parallel_hazard_walker
visits those and deems the expression parallel-restricted.

Basically, therefore, this logic is totally inadequate.  I think what
we need to do is improve matters so that while looking at the testexpr
of a SubPlan, we pass down a whitelist saying that the Params having
the numbers indicated by the SubLink's paramIds list are OK.
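A sketch of what that might look like (hypothetical code, not part of the tree: it assumes a new safe_param_ids list added to max_parallel_hazard_context, and relies on the paramIds list being available on the SubPlan node by the time the walker runs):

/* Hypothetical fragment for max_parallel_hazard_walker() */
else if (IsA(node, SubPlan))
{
    SubPlan    *subplan = (SubPlan *) node;
    List       *save_safe_param_ids;

    if (!subplan->parallel_safe &&
        max_parallel_hazard_test(PROPARALLEL_RESTRICTED, context))
        return true;
    /* whitelist the output Params of this sub-select for the testexpr */
    save_safe_param_ids = context->safe_param_ids;
    context->safe_param_ids = list_concat(list_copy(subplan->paramIds),
                                          context->safe_param_ids);
    if (expression_tree_walker(node, max_parallel_hazard_walker, context))
        return true;
    context->safe_param_ids = save_safe_param_ids;
    return false;               /* already recursed, don't do it again */
}
else if (IsA(node, Param))
{
    Param      *param = (Param *) node;

    if (param->paramkind != PARAM_EXEC ||
        !list_member_int(context->safe_param_ids, param->paramid))
    {
        if (max_parallel_hazard_test(PROPARALLEL_RESTRICTED, context))
            return true;
    }
    return false;               /* nothing to recurse into */
}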

I'm also suspicious that the existing dumb treatment of Params here
may be blocking recognition that correlated subplans are parallel-safe.
But that would only be a failure to apply a possible optimization,
not a failure to generate a valid plan.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Query fails when SRFs are part of FROM clause (Commit id: 69f4b9c85f)

2017-04-12 Thread Andres Freund
On 2017-04-11 17:42:42 -0400, Tom Lane wrote:
> Now, that old behavior matches what you got in the RangeFunction case:
> 
> regression96=# select * from int4_tbl, cast(case when f1>0 then
> generate_series(1,2) else null end as int);
>      f1      | int4 
> -------------+------
>            0 |     
>       123456 |    1
>       123456 |    2
>      -123456 |     
>   2147483647 |    1
>   2147483647 |    2
>  -2147483647 |     
> (7 rows)
> 
> So it would make sense to me for our new behavior to still match the
> targetlist case.
> 
> Not sure if that's exactly the same as what you're saying or not.

The patch now indeed returns

regression[20994][1]=# select * from int4_tbl, cast(case when f1>0 then 
generate_series(1,2) else null end as int);
WARNING:  01000: replacing
LOCATION:  frobble_rtefunc, createplan.c:3102
(as you can see, this ain't quite ready)
┌─────────────┬────────┐
│     f1      │  int4  │
├─────────────┼────────┤
│           0 │ (null) │
│           0 │ (null) │
│      123456 │      1 │
│      123456 │      2 │
│     -123456 │ (null) │
│     -123456 │ (null) │
│  2147483647 │      1 │
│  2147483647 │      2 │
│ -2147483647 │ (null) │
│ -2147483647 │ (null) │
└─────────────┴────────┘
(10 rows)

The basic approach seems quite workable.  It's not super extensible
towards allowing SRFs deeper inside generic ROWS FROM arguments, but I'm
not sure there's any need to work towards that; I've not heard any demand
so far.  Any arguments against leaving it at that?

One other thing where it'd currently affect behaviour is something like:
SELECT * FROM CAST(generate_series(1,0) * 5 as int);

which, in < v10 would return 1 row, but with my patch returns no rows.
That makes a lot more sense in my opinion, given that a plain FROM
generate_series(1,0) doesn't return any rows in either version.

Right now I'm mopping up corner cases where it'd *expand* the set of
currently valid commands in an inconsistent manner. Namely FROM
int4mul(generate_series(..), 5) works, but FROM
composite_returning(somesrf()) wouldn't without additional work.  I plan
to continue to error out in either case...

Greetings,

Andres Freund


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Cutting initdb's runtime (Perl question embedded)

2017-04-12 Thread Andreas Karlsson

On 04/12/2017 05:00 PM, Andreas Karlsson wrote:

Looked at this and option 1 seems simple enough if I am not missing
something. I might hack something up later tonight. Either way I think
this improvement can be done separately from the proposed replacement of
the catalog header files. Trying to fix everything at once often leads
to nothing being fixed at all.


Here is my proof of concept patch. It does basically the same thing as 
Andres's patch except that it handles quoted values a bit better and 
does not try to support anything other than the regproc type.


The patch speeds up initdb without fsync from 0.80 seconds to 0.55 
seconds, which is a nice speedup, while adding a negligible amount of 
extra work on compilation.


Two things which could be improved in this patch if people deem it 
important:


- Refactor the code to be more generic, like Andres's patch, so it is 
easier to add lookups for other data types.


- Write something closer to a real .bki parser to actually understand 
quoted values and _null_ when parsing the data. Right now this patch 
only splits each row into its fields (while being aware of quotes) but 
does not attempt to remove quotes. The PoC patch treats "foo" and foo as 
different.


Andreas
commit bc8ede661b12407c0b6a9e4c0932d21028f87fc1
Author: Andreas Karlsson 
Date:   Wed Apr 12 21:00:49 2017 +0200

WIP: Resolve regproc entries when generating .bki

diff --git a/src/backend/catalog/genbki.pl b/src/backend/catalog/genbki.pl
index 6e9d57aa8d..f918c9ef8a 100644
--- a/src/backend/catalog/genbki.pl
+++ b/src/backend/catalog/genbki.pl
@@ -102,6 +102,7 @@ print $bki "# PostgreSQL $major_version\n";
 # vars to hold data needed for schemapg.h
 my %schemapg_entries;
 my @tables_needing_macros;
+my %procs;
 our @types;
 
 # produce output, one catalog at a time
@@ -164,20 +165,41 @@ foreach my $catname (@{ $catalogs->{names} })
 			$row->{bki_values} =~ s/\bPGUID\b/$BOOTSTRAP_SUPERUSERID/g;
 			$row->{bki_values} =~ s/\bPGNSP\b/$PG_CATALOG_NAMESPACE/g;
 
+			# Split values into tokens without interpreting their meaning.
+			my %bki_values;
+			@bki_values{@attnames} = $row->{bki_values} =~ /"[^"]*"|\S+/g;
+
+			# Substitute regproc entries with oids
+			foreach my $att (keys %bki_values)
+			{
+next if $bki_attr{$att}->{type} ne 'regproc';
+next if $bki_values{$att} =~ /\A(\d+|-|_null_)\z/;
+
+$bki_values{$att} = $procs{$bki_values{$att}};
+			}
+
+			# Save pg_proc oids for use by later catalogs. This relies on the
+			# order we process the files, but the bootstrap code also relies on
+			# pg_proc being loaded first.
+			if ($catname eq 'pg_proc')
+			{
+$procs{$bki_values{proname}} = $row->{oid};
+			}
+
 			# Save pg_type info for pg_attribute processing below
 			if ($catname eq 'pg_type')
 			{
-my %type;
+my %type = %bki_values;
 $type{oid} = $row->{oid};
-@type{@attnames} = split /\s+/, $row->{bki_values};
 push @types, \%type;
 			}
 
 			# Write to postgres.bki
 			my $oid = $row->{oid} ? "OID = $row->{oid} " : '';
-			printf $bki "insert %s( %s)\n", $oid, $row->{bki_values};
+			printf $bki "insert %s( %s)\n", $oid,
+			  join(' ', @bki_values{@attnames});
 
-		   # Write comments to postgres.description and postgres.shdescription
+			# Write comments to postgres.description and postgres.shdescription
 			if (defined $row->{descr})
 			{
 printf $descr "%s\t%s\t0\t%s\n", $row->{oid}, $catname,
diff --git a/src/backend/utils/Gen_fmgrtab.pl b/src/backend/utils/Gen_fmgrtab.pl
index 2af9b355e7..10d9c6abc7 100644
--- a/src/backend/utils/Gen_fmgrtab.pl
+++ b/src/backend/utils/Gen_fmgrtab.pl
@@ -58,16 +58,9 @@ foreach my $column (@{ $catalogs->{pg_proc}->{columns} })
 my $data = $catalogs->{pg_proc}->{data};
 foreach my $row (@$data)
 {
-
-	# To construct fmgroids.h and fmgrtab.c, we need to inspect some
-	# of the individual data fields.  Just splitting on whitespace
-	# won't work, because some quoted fields might contain internal
-	# whitespace.  We handle this by folding them all to a simple
-	# "xxx". Fortunately, this script doesn't need to look at any
-	# fields that might need quoting, so this simple hack is
-	# sufficient.
-	$row->{bki_values} =~ s/"[^"]*"/"xxx"/g;
-	@{$row}{@attnames} = split /\s+/, $row->{bki_values};
+	# Split values into tokens without interpreting their meaning.
+	# This is safe because we are going to use the values as-is.
+	@{$row}{@attnames} = $row->{bki_values} =~ /"[^"]*"|\S+/g;
 
 	# Select out just the rows for internal-language procedures.
 	# Note assumption here that INTERNALlanguageId is 12.
diff --git a/src/backend/utils/adt/regproc.c b/src/backend/utils/adt/regproc.c
index 702924a958..a4237b0661 100644
--- a/src/backend/utils/adt/regproc.c
+++ b/src/backend/utils/adt/regproc.c
@@ -87,51 +87,11 @@ regprocin(PG_FUNCTION_ARGS)
 	/* Else it's a name, possibly schema-qualified */
 
 	/*
-	 * In bootstrap mode we assume the given name is not schema-qualified, and
-	 * just 

Re: [HACKERS] Adding support for Default partition in partitioning

2017-04-12 Thread Robert Haas
On Thu, Apr 6, 2017 at 7:30 AM, Rahila Syed  wrote:
> Thanks a lot for testing and reporting this. Please find attached an updated
> patch with the fix. The patch also contains a fix
> regarding operator used at the time of creating expression as default
> partition constraint. This was notified offlist by Amit Langote.

I think that the syntax for this patch should probably be revised.
Right now the proposal is for:

CREATE TABLE .. PARTITION OF ... FOR VALUES IN (DEFAULT);

But that's not a good idea for several reasons.  For one thing, you
can also write FOR VALUES IN (DEFAULT, 5), which isn't sensible.
For another thing, this kind of syntax won't generalize to range
partitioning, which we've talked about making this feature support.
Maybe something like:

CREATE TABLE .. PARTITION OF .. DEFAULT;

This patch makes the assumption throughout that any DefElem represents
the word DEFAULT, which is true in the patch as written but doesn't
seem very future-proof.  I think the "def" in "DefElem" stands for
"definition" or "define" or something like that, so this is actually
pretty confusing.  Maybe we should introduce a dedicated node type to
represent a default-specification in the parser grammar.  If not, then
let's at least encapsulate the test a little better, e.g. by adding
isDefaultPartitionBound() which tests not only IsA(..., DefElem) but
also whether the name is DEFAULT as expected.  BTW, we typically use
lower-case internally, so if we stick with this representation it
should really be "default" not "DEFAULT".
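For instance, a minimal sketch of such a helper (hypothetical; the exact
DefElem name to compare against depends on what the grammar produces):

/* Hypothetical helper: does this partition bound datum mean DEFAULT? */
static bool
isDefaultPartitionBound(Node *value)
{
    if (IsA(value, DefElem))
    {
        DefElem    *defvalue = (DefElem *) value;

        return strcmp(defvalue->defname, "DEFAULT") == 0;
    }
    return false;
}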

Useless hunk:

+    bool        has_def;        /* Is there a default partition?
+                                 * Currently false for a range
+                                 * partitioned table */
+    int         def_index;      /* Index of the default list partition.
+                                 * -1 for range partitioned tables */

Why abbreviate "default" to def here?  Seems pointless.

+    if (found_def)
+    {
+        if (mapping[def_index] == -1)
+            mapping[def_index] = next_index++;
+    }

Consider &&

@@ -717,7 +754,6 @@ check_new_partition_bound(char *relname, Relation
parent, Node *bound)
 }
 }
 }
-
 break;
 }

+ * default partiton for rows satisfying the new partition

Spelling.

+ * constraint. If found dont allow addition of a new partition.

Missing apostrophe.

+    defrel = heap_open(defid, AccessShareLock);
+    tupdesc = CreateTupleDescCopy(RelationGetDescr(defrel));
+
+    /* Build expression execution states for partition check quals */
+    partqualstate = ExecPrepareCheck(partConstraint, estate);
+
+    econtext = GetPerTupleExprContext(estate);
+    snapshot = RegisterSnapshot(GetLatestSnapshot());

Definitely not safe against concurrency, since AccessShareLock won't
exclude somebody else's update.  In fact, it won't even cover somebody
else's already-in-flight transaction.

+                     errmsg("new default partition constraint is violated by some row")));

Normally in such cases we try to give more detail using
ExecBuildSlotValueDescription.

+    bool        is_def = true;

This variable starts out true and is never set to any value other than
true.  Just get rid of it and, in the one place where it is currently
used, write "true".  That's shorter and clearer.

+    inhoids = find_inheritance_children(RelationGetRelid(parent), NoLock);

If it's actually safe to do this with no lock, there ought to be a
comment with a very compelling explanation of why it's safe.

+    boundspec = (Node *) stringToNode(TextDatumGetCString(datum));
+    bspec = (PartitionBoundSpec *) boundspec;

There's not really a reason to cast the result of stringToNode() to
Node * and then turn around and cast it to PartitionBoundSpec *.  Just
cast it directly to whatever it needs to be.  And use the new castNode
macro.

+        foreach(cell1, bspec->listdatums)
+        {
+            Node   *value = lfirst(cell1);
+            if (IsA(value, DefElem))
+            {
+                def_elem = true;
+                *defid = inhrelid;
+            }
+        }
+        if (def_elem)
+        {
+            ReleaseSysCache(tuple);
+            continue;
+        }
+        foreach(cell3, bspec->listdatums)
+        {
+            Node   *value = lfirst(cell3);
+            boundspecs = lappend(boundspecs, value);
+        }
+        ReleaseSysCache(tuple);
+    }
+    foreach(cell4, spec->listdatums)
+    {
+        Node   *value = lfirst(cell4);
+        boundspecs = lappend(boundspecs, value);
+    }

cell1, cell2, cell3, and cell4 are not very clear variable names.
Between that and the lack of comments, this is not easy to understand.
It's sort of spaghetti logic, too.  The if (def_elem) test continues
early, but if the point is that the loop using 

Re: [HACKERS] [sqlsmith] ERROR: badly formatted node string "RESTRICTINFO...

2017-04-12 Thread Tom Lane
Amit Kapila  writes:
> On Wed, Apr 12, 2017 at 12:40 AM, Tom Lane  wrote:
>> Anyone want to draft a patch for this?

> Please find patch attached based on above discussion.

This patch seems fairly incomplete: you can't just whack around major data
structures like PlannedStmt and PlannerGlobal without doing the janitorial
work of cleaning up support logic such as outfuncs/readfuncs.

Also, when you think about what will happen when doing a copyObject()
on a completed plan, it seems like a pretty bad idea to link subplans
into two independent lists.  We'll end up with two separate copies of
those subtrees in places like the plan cache.

I'm inclined to think the other approach of adding a parallel_safe
flag to struct Plan is a better answer in the long run.  It's basically
free to have it there because of alignment considerations, and it's
hard to believe that we're not going to need that labeling at some
point in the future anyway.

I had been a bit concerned about having to have some magic in outfuncs
and/or readfuncs to avoid transferring the unsafe subplan(s), but I see
from your patch there's a better way: we can have ExecSerializePlan modify
the subplan list as it transfers it into its private PlannedStmt struct.
But I think iterating over the list and examining each subplan's
parallel_safe marking is a better way to do that.
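Concretely, something like this (a sketch only, assuming the parallel_safe
flag has been added to struct Plan):

/* In ExecSerializePlan(): copy the subplan list into the serialized
 * PlannedStmt, but substitute NULL for any parallel-unsafe subplan so a
 * worker can never execute it.  The list must keep its length so that
 * subplan indexes (plan_id) still line up. */
ListCell   *lc;

pstmt->subplans = NIL;
foreach(lc, estate->es_plannedstmt->subplans)
{
    Plan       *subplan = (Plan *) lfirst(lc);

    if (subplan && !subplan->parallel_safe)
        subplan = NULL;
    pstmt->subplans = lappend(pstmt->subplans, subplan);
}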

Will work on this approach.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Some thoughts about SCRAM implementation

2017-04-12 Thread Tom Lane
Stephen Frost  writes:
> * Tom Lane (t...@sss.pgh.pa.us) wrote:
>> ... which the user can't tell apart from having fat-fingered the password,
>> I suppose?  Doesn't sound terribly friendly.  A report of a certificate
>> mismatch is far more likely to lead people to realize there's a MITM.

> We might be able to improve on that.

I'd sure be happier about this being a useful feature if we can.  But
unless someone has a design for that ready-to-go, that's another good
reason to put it off.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Some thoughts about SCRAM implementation

2017-04-12 Thread Álvaro Hernández Tortosa



On 12/04/17 18:09, Tom Lane wrote:

Heikki Linnakangas  writes:

On 04/12/2017 06:26 PM, Bruce Momjian wrote:

How does it do that?

Good question, crypto magic? I don't know the details, but the basic
idea is that you extract a blob of data that uniquely identifies the TLS
connection. Using some OpenSSL functions, in this case. I think it's a
hash of some of the TLS handshake messages that were used when the TLS
connection was established (that's what "tls-unique" means). That data
is then incorporated in the hash calculations of the SCRAM
authentication. If the client and the server are not speaking over the
same TLS connection, they will use different values for the TLS data,
and the SCRAM computations will not match, and you get an authentication
failure.


I believe the above is not correct. Channel binding data is *not* 
used for any hash computations. It is simply a byte array that is 
received as an extra user parameter; the server then obtains it on its 
own (from its own TLS data) and does a byte comparison. That's what the 
RFCs say about it.
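For reference, the tls-unique value both ends compare is the first TLS
Finished message of the handshake. A hedged fragment of how it might be
fetched with OpenSSL (the helper name is invented, but
SSL_get_finished()/SSL_get_peer_finished() are real OpenSSL calls):

#include <stdbool.h>
#include <openssl/ssl.h>

/* On a full handshake the client sends the first Finished message, so
 * the client grabs its own Finished and the server grabs the peer's.
 * The client then sends the blob in SCRAM's channel-binding attribute,
 * and the server simply byte-compares it against its own copy. */
static size_t
get_tls_unique(SSL *ssl, bool am_client, void *buf, size_t buflen)
{
    if (am_client)
        return SSL_get_finished(ssl, buf, buflen);
    else
        return SSL_get_peer_finished(ssl, buf, buflen);
}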

... which the user can't tell apart from having fat-fingered the password,
I suppose?  Doesn't sound terribly friendly.  A report of a certificate
mismatch is far more likely to lead people to realize there's a MITM.


So given what I said before, that won't happen. Indeed, SCRAM RFC 
contains a specific error code for this: "channel-bindings-dont-match".



Álvaro


--

Álvaro Hernández Tortosa





--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] GSOC'17 project introduction: Parallel COPY execution with errors handling

2017-04-12 Thread Nicolas Barbier
2017-04-11 Robert Haas :

> There's a nasty trade-off here between XID consumption (and the
> aggressive vacuums it eventually causes) and preserving performance in
> the face of errors - e.g. if you make k = 100,000 you consume 100x
> fewer XIDs than if you make k = 1000, but you also have 100x the work
> to redo (on average) every time you hit an error.

You could make it dynamic: Commit the subtransaction even when not
encountering any error after N lines (N starts out at 1), then double
N and continue. When encountering an error, roll the current
subtransaction back and re-insert all the known good rows that have
been rolled back (plus maybe the erroneous row into a separate table
or whatever) in one new subtransaction and commit; then reset N to 1
and continue processing the rest of the file.

That would work reasonably well whenever the ratio of erroneous rows
is not extremely high: whether the erroneous rows are all clumped
together, entirely randomly spread out over the file, or a combination
of both.
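To make the XID accounting concrete, here is a standalone toy model of
that scheme (plain C, nothing to do with the real COPY code; each
begin/commit pair stands in for one subtransaction, and rows marked bad
simulate lines that would error out):

#include <stdio.h>
#include <stdbool.h>

#define NROWS 1000000

int
main(void)
{
    static bool bad[NROWS];     /* all false: a clean file ... */
    int         pos = 0;        /* next row to load */
    int         n = 1;          /* current batch size */
    long        subxacts = 0;   /* stand-in for XIDs consumed */

    bad[123456] = true;         /* ... with two errors sprinkled in */
    bad[777777] = true;

    while (pos < NROWS)
    {
        int     end = (pos + n > NROWS) ? NROWS : pos + n;
        int     i;

        subxacts++;             /* BEGIN the batch subtransaction */
        for (i = pos; i < end && !bad[i]; i++)
            ;                   /* "insert" row i */

        if (i == end)
        {
            pos = end;          /* COMMIT: the whole batch was good */
            n *= 2;
        }
        else
        {
            /* ROLLBACK: re-insert the good prefix in one fresh
             * subtransaction, skip the bad row, start over small. */
            subxacts++;
            pos = i + 1;
            n = 1;
        }
    }
    printf("processed %d rows (2 bad) using %ld subtransactions\n",
           NROWS, subxacts);
    return 0;
}

With two bad rows in a million, this stays at a few dozen subtransactions;
with every other row bad, it degenerates to roughly one subtransaction per
row, which is the runaway-XID case discussed just below.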

> If the data quality is poor (say, 50% of lines have errors) it's
> almost impossible to avoid runaway XID consumption.

Yup, that seems difficult to work around with anything similar to the
proposed approach. So the docs might need to suggest not to insert a 300 GB
file with 50% erroneous lines :-).

Greetings,

Nicolas


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] the need to finish

2017-04-12 Thread Robert Haas
On Wed, Apr 12, 2017 at 11:41 AM, Simon Riggs  wrote:
> On 12 April 2017 at 16:26, Tom Lane  wrote:
>> Erik Rijkers  writes:
>>> Logical replication emits logmessages like these:
>>> DETAIL:  90 transactions need to finish.
>>> I would prefer the line to be more terse:
>>> DETAIL:  90 transactions to finish.
>> Our style guidelines say that detail messages should be complete
>> sentences, so I don't like your proposal too much.
>> Maybe "N transactions remain to finish." ?
> waiting for N transactions to complete

+1 for something along those lines, but perhaps it needs a capital
letter and a period.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] index-only count(*) for indexes supporting bitmap scans

2017-04-12 Thread Alexander Kuzmenkov

On 12.04.2017 17:24, Tom Lane wrote:

TBH, I'm not sure you need to do any of that work.  Have you got evidence
that the planner will fail to choose the right plan regardless? I'm
particularly unconvinced that choose_bitmap_and is a critical problem,
because once you're into having to AND multiple indexes, you're talking
about an expensive query anyhow.
The most expensive part would probably be accessing the heap in the 
bitmap heap scan. It may be worth trying to choose an index path that 
checks all the restriction and therefore allows us to skip this heap 
access. This path might not be the cheapest one, though. The standard 
AND selection procedure would have discarded it based on cost.
I've seen this happen on the regression database. Somehow I can't seem 
to reproduce it on my earlier full-text search example.


An example:

regression=# explain select count(*) from tenk1 where hundred < 90 and 
thousand < 31;

                                        QUERY PLAN
-------------------------------------------------------------------------------------------
 Bitmap Count on tenk1  (cost=182.69..185.56 rows=1 width=8)
   Recheck Cond: ((thousand < 31) AND (hundred < 90))
   ->  BitmapAnd  (cost=182.69..182.69 rows=287 width=0)
         ->  Bitmap Index Scan on tenk1_thous_tenthous  (cost=0.00..6.68 rows=319 width=0)
               Index Cond: (thousand < 31)
         ->  Bitmap Index Scan on tenk1_hundred  (cost=0.00..175.62 rows=8978 width=0)
               Index Cond: (hundred < 90)
(7 rows)

regression=# set enable_bitmapcount to off;
SET
regression=# explain select count(*) from tenk1 where hundred < 90 and 
thousand < 31;

                                        QUERY PLAN
-------------------------------------------------------------------------------------------
 Aggregate  (cost=375.34..375.35 rows=1 width=8)
   ->  Bitmap Heap Scan on tenk1  (cost=6.75..374.62 rows=287 width=0)
         Recheck Cond: (thousand < 31)
         Filter: (hundred < 90)
         ->  Bitmap Index Scan on tenk1_thous_tenthous  (cost=0.00..6.68 rows=319 width=0)
               Index Cond: (thousand < 31)
(6 rows)

--

Alexander Kuzmenkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch: Write Amplification Reduction Method (WARM)

2017-04-12 Thread Robert Haas
On Tue, Apr 11, 2017 at 1:20 PM, Pavan Deolasee
 wrote:
> I don't know why you say that regressions are not addressed. Here are a few
> things I did to address the regressions/reviews/concerns, apart from fixing
> all the bugs discovered, but please let me know if there are things I've not
> addressed.

I'm making statements based on my perception of the discussion on the
thread.  Perhaps you did some work which you either didn't mention or
I missed you mentioning it, but it sure didn't feel like all of the
things reported got addressed.

> 1. Improved the interesting attrs patch that Alvaro wrote to address the
> regression discovered in fetching more heap attributes. The patch that got
> committed in fact improved certain synthetic workloads over the then-master.

Yep, though it was not clear that all of the regressing cases were
actually addressed, at least not to me.

> 2. Based on Petr and your feedback, disabled WARM on toasted attributes to
> reduce overhead of fetching/decompressing the attributes.

But that's not necessarily the right fix, as per
http://postgr.es/m/ca+tgmoyufxy1lsedzsw8uuulujhh0r8ncd-up-hzmc1fydp...@mail.gmail.com
and subsequent discussion.  It's not clear to me from that discussion
that we've got to a place where the method used to identify whether a
WARM update happened during a scan is exactly identical to the method
used to decide whether to perform one in the first place.

> 3. Added code to avoid doing a second index scan when the index does not
> contain any WARM pointers. This should address the situation Amit brought up
> where only one of the indexes receives WARM inserts
> 4. Added code to kill wrong index pointers to do online cleanup.

Good changes.

> 5. Added code to set a CLEAR pointer to a WARM pointer when we know that the
> entire chain is WARM. This should address the workload Dilip ran and found
> regression (I don't think he got a chance to confirm that)

Which is clearly a thing that should happen before commit, and really,
you ought to be leading the effort to confirm that, not him.  It's
good for him to verify that your fix worked, but you should test it
first.

> 6. Enhanced stats collector to collect information about candidate WARM
> chains and added mechanism to control WARM cleanup at the heap as well as
> index level, based on configurable parameters. This gives user better
> control over the additional work that is required for WARM cleanup.

I haven't seen previous discussion of this; therefore I doubt whether
we have agreement on these parameters.

> 7. Added table level option to disable WARM if nothing else works.

-1 from me.

> 8. Added a mechanism to disable WARM when more than 50% of indexes are being
> updated. I ran some benchmarks with different percentages of indexes getting
> updated and thought this is a good threshold.

+1 from me.

> I may have missed something, but there is no intention to ignore known
> regressions/reviews. Of course, I don't think that every regression will be
> solvable, like if you run a CPU-bound workload, setting it up in a way such
> that you repeatedly exercise the area where WARM is doing additional work,
> without providing any benefit, may be you can still find regression. I am
> willing to fix them as long as they are fixable and we are comfortable with
> the additional code complexity. IMHO certain trade-offs are good, but I
> understand that not everybody will agree with my views and that's ok.

The point here is that we can't make intelligent decisions about
whether to commit this feature unless we know which situations get
better and which get worse and by how much.  I don't accept as a
general principle the idea that CPU-bound workloads don't matter.
Obviously, I/O-bound workloads matter too, but we can't throw
CPU-bound workloads under the bus.  Now, avoiding index bloat does
also save CPU, so it is easy to imagine that WARM could come out ahead
even if each update consumes slightly more CPU when actually updating,
so we might not actually regress.  If we do, I guess I'd want to know
why.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] snapbuild woes

2017-04-12 Thread Andres Freund
On 2017-04-12 11:03:57 -0400, Peter Eisentraut wrote:
> On 4/12/17 02:31, Noah Misch wrote:
> >>> But I hope you mean to commit these snapbuild patches before the postgres 
> >>> 10
> >>> release?  As far as I know, logical replication is still very broken 
> >>> without
> >>> them (or at least some of that set of 5 patches - I don't know which ones
> >>> are essential and which may not be).
> >>
> >> Yes, these should go into 10 *and* earlier releases, and I don't plan to
> >> wait for 2017-07.
> > 
> > [Action required within three days.  This is a generic notification.]
> 
> I'm hoping for a word from Andres on this.

Feel free to reassign to me.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Some thoughts about SCRAM implementation

2017-04-12 Thread Álvaro Hernández Tortosa



On 12/04/17 18:38, Robert Haas wrote:

On Tue, Apr 11, 2017 at 9:20 AM, Álvaro Hernández Tortosa
 wrote:

 LOL. Do you really want a half-baked Java programmer writing a patch in
C for PostgreSQL? I once tried that and Magnus said my code was Java code
that happened to compile with a C compiler ^_^

 Having said that, I take the bait, I like challenges and putting my
words behind my code :)

Excellent, because that's how stuff gets done around here.  Saying
that you want something and hoping other people will do it is fine,
but being willing to put some effort into it is a lot more likely to get
it done.  Not to be harsh, but showing up 3 days after feature freeze
to complain that a feature pending commit for 14 months is missing
something that you really need isn't really the right way to make
something happen.  I'm pretty sure that the lack of channel binding
was discussed quite a bit earlier than now, so I think there was
adequate opportunity to protest, contribute a patch, etc.  It's not
that I don't have sympathy for the way you feel about this: seeing
features you care about fall out of a release sucks, and I've
experienced a lot of that suckage quite recently, so the pain is fresh
in my mind.  But it's something we all have to live with for the
overall good of the product.  We used to not have a firm feature
freeze, and that was way worse.


Hi Robert. Not harsh, no worries, but please note a couple of 
things; let me give you a bit of context about my recent comments:


- I haven't complained about not having channel binding support. I was 
just giving my own recommendations for the PostgreSQL community, in the 
belief that contributing this opinion could let -hackers make a more 
informed decision about whether or not to include a given feature.


- I haven't said either that I need it. I don't use SCRAM, much less 
with channel binding. Rather than looking after myself here, I'm trying 
to put myself in the shoes of potential users and speak up for them. I 
might be wrong, of course, but this is the only role I'm trying to play 
here.


- I know how PostgreSQL release cycles work and I don't mean to game them 
anyway. I just thought it worth raising that something I believe might be 
requested by enterprise users was, on the other hand, a minor addition to 
a feature, and thus a really high value-to-effort ratio. Indeed, I bet 
adding channel binding is an order of magnitude easier than adding 
saslprep (which is optional too, btw). Ultimately I'm happy with SCRAM in 
PG 10, whether it has cbinding or not, as it is a great addition and 
makes PG10 an even better piece of software.


- Even though I don't really care about SCRAM, and without having any 
prior knowledge about SCRAM, I volunteered some time ago to study SCRAM, 
give a lightning talk about SCRAM and later write a client 
implementation for the jdbc driver. And I have already devoted a very 
fair amount of time to doing so, and will keep doing that until all the code 
is done. Code WIP is here FYI: https://github.com/ahachete/scram. So 
it's not that I haven't already put my code behind my words.


- Given all that, I still want to volunteer to not only do the client 
jdbc part and consequently help debug the server implementation (as 
I believe it is so far the only non-libpq implementation), but also to 
stand by my words by writing channel binding code for the PostgreSQL 
server. This is a huge effort for me; I last did C programming in 
2001. But still, I want to help within my limited capabilities as 
much as I can.


- Having thoroughly studied the RFC and companion documentation, I 
thought I was in a position to give some recommendations that other 
hackers may not know about (unless they had made a similarly deep study 
of SCRAM). That's it: recommendations, ideas.





In this case, I think it is abundantly clear that SCRAM without
channel binding is still a good feature.


I agree. I may have exaggerated before, downplaying SCRAM without 
channel binding. I think it is great. Period. I also still think channel 
binding is a very small code addition that provides even better value. I 
apologize for not picking my previous words more carefully.



  One piece of evidence for
that is that the standard uses the suffix -PLUS to denote the
availability of channel binding.  That clearly conveys that channel
binding has value, but also that not having it does not make the whole
thing useless.  Otherwise, they would have required it to be part of
every implementation, or they would have made you add -CRAPPY if you
didn't have it.  The discussion elsewhere on this thread has
adequately underlined the value of what we've already got, so I won't
further belabor the point here.

Furthermore, I think that the state of this feature as it currently
exists in the tree is actually kind of concerning.  There are
currently four open items pertaining to SCRAM at least two of which
look to my mind an awful lot like stuff that should have ideally been
handled pre-feature-freeze: \password support, and protocol
negotiation.

Re: [HACKERS] GSOC'17 project introduction: Parallel COPY execution with errors handling

2017-04-12 Thread Stas Kelvich

> On 12 Apr 2017, at 20:23, Robert Haas  wrote:
> 
> On Wed, Apr 12, 2017 at 1:18 PM, Nicolas Barbier
>  wrote:
>> 2017-04-11 Robert Haas :
>>> If the data quality is poor (say, 50% of lines have errors) it's
>>> almost impossible to avoid runaway XID consumption.
>> 
>> Yup, that seems difficult to work around with anything similar to the
>> proposed approach. So the docs might need to suggest not to insert a 300 GB
>> file with 50% erroneous lines :-).
> 
> Yep.  But it does seem reasonably likely that someone might shoot
> themselves in the foot anyway.  Maybe we just live with that.
> 

Moreover, if that file consists of one-byte lines (plus one byte of newline char)
then during its import xid wraparound will happen 18 times =)

I think it's reasonable at least to have something like a max_errors
parameter for COPY, set by default to 1000, for example. If the user hits
that limit then it is a good moment to put a warning about possible xid
consumption in case of a bigger limit.

However, I think it is worth a quick bit of research into whether it is
possible to create a special code path for COPY in which errors don't
cancel the transaction, at least when COPY is called outside of a
transaction block.


Stas Kelvich
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] GSOC'17 project introduction: Parallel COPY execution with errors handling

2017-04-12 Thread Robert Haas
On Wed, Apr 12, 2017 at 1:18 PM, Nicolas Barbier
 wrote:
> 2017-04-11 Robert Haas :
>> There's a nasty trade-off here between XID consumption (and the
>> aggressive vacuums it eventually causes) and preserving performance in
>> the face of errors - e.g. if you make k = 100,000 you consume 100x
>> fewer XIDs than if you make k = 1000, but you also have 100x the work
>> to redo (on average) every time you hit an error.
>
> You could make it dynamic: Commit the subtransaction even when not
> encountering any error after N lines (N starts out at 1), then double
> N and continue. When encountering an error, roll the current
> subtransaction back and re-insert all the known good rows that have
> been rolled back (plus maybe the erroneous row into a separate table
> or whatever) in one new subtransaction and commit; then reset N to 1
> and continue processing the rest of the file.
>
> That would work reasonably well whenever the ratio of erroneous rows
> is not extremely high: whether the erroneous rows are all clumped
> together, entirely randomly spread out over the file, or a combination
> of both.

Right.  I wouldn't suggest the exact algorithm you proposed; I think
you ought to vary between some lower limit >1, maybe 10, and some
upper limit, maybe 1,000,000, ratcheting up and down based on how
often you hit errors in some way that might not be as simple as
doubling.  But something along those lines.

>> If the data quality is poor (say, 50% of lines have errors) it's
>> almost impossible to avoid runaway XID consumption.
>
> Yup, that seems difficult to work around with anything similar to the
> proposed approach. So the docs might need to suggest not to insert a 300 GB
> file with 50% erroneous lines :-).

Yep.  But it does seem reasonably likely that someone might shoot
themselves in the foot anyway.  Maybe we just live with that.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] index-only count(*) for indexes supporting bitmap scans

2017-04-12 Thread Alexander Kuzmenkov


On 12.04.2017 12:29, Alexander Korotkov wrote:

That's a cool feature for FTS users! Please, register this patch to 
the next commitfest.
I've added this to the 2017-07 commitfest: 
https://commitfest.postgresql.org/14/1117/



Also, what is the planning overhead of this patch? That's worth
testing too.

The planning overhead is about 0.07 - 0.1 ms on the samples from my 
first letter.


Another thing caught my eye.  The estimated number of rows in the Bitmap 
Count node is the same as in the Bitmap Index Scan node.  Should it be 1 
because it always returns a single row?

You're right, I'll fix this in the next version of the patch.


Does this limitation cause a performance drawback?  When bitmap
index scan returns all rechecks, alternative to Bitmap Count is
still Aggregate + Bitmap Heap Scan.  Thus, I think Bitmap Count
would have the same performance or even slightly faster.  That's
worth testing.

Bitmap heap scan can indeed be faster, because it prefetches heap pages, 
and can be run in parallel. When the table data is not cached, the 
difference is not big on my machine. It could probably be significant if 
I used storage that supported parallel reads. When the data is cached 
in memory, the parallel bitmap heap scan can be significantly faster.
We could use the standard bitmap heap scan node with some tweaks, as 
discussed in the other subthread, to avoid this regression.


Here are some test queries:

--------------- not cached ---------------
test1=# explain analyze select count(*) from pglist where fts @@ 
to_tsquery( 'tom & lane' );
                                                               QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------
 Bitmap Count on pglist  (cost=542.65..1087.68 rows=54503 width=8) (actual time=30264.174..30264.177 rows=1 loops=1)
   Recheck Cond: (fts @@ to_tsquery('tom & lane'::text))
   Rows Removed by Index Recheck: 270853
   Heap Fetches: 66138
   Heap Blocks: exact=39854 lossy=66138
   ->  Bitmap Index Scan on idx_pglist_fts  (cost=0.00..529.02 rows=54503 width=0) (actual time=525.341..525.341 rows=222813 loops=1)
         Index Cond: (fts @@ to_tsquery('tom & lane'::text))
 Planning time: 128.238 ms
 Execution time: 30264.299 ms
(9 rows)

test1=# set enable_bitmapcount to off;
SET
test1=# explain analyze select count(*) from pglist where fts @@ 
to_tsquery( 'tom & lane' );

                                                              QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------
 Finalize Aggregate  (cost=119989.73..119989.74 rows=1 width=8) (actual time=31699.829..31699.830 rows=1 loops=1)
   ->  Gather  (cost=119989.52..119989.73 rows=2 width=8) (actual time=31698.699..31699.819 rows=3 loops=1)
         Workers Planned: 2
         Workers Launched: 2
         ->  Partial Aggregate  (cost=118989.52..118989.53 rows=1 width=8) (actual time=31689.289..31689.290 rows=1 loops=3)
               ->  Parallel Bitmap Heap Scan on pglist  (cost=542.65..118932.74 rows=22710 width=0) (actual time=608.532..31634.692 rows=74271 loops=3)
                     Recheck Cond: (fts @@ to_tsquery('tom & lane'::text))
                     Rows Removed by Index Recheck: 90284
                     Heap Blocks: exact=13242 lossy=21960
                     ->  Bitmap Index Scan on idx_pglist_fts  (cost=0.00..529.02 rows=54503 width=0) (actual time=552.136..552.136 rows=222813 loops=1)
                           Index Cond: (fts @@ to_tsquery('tom & lane'::text))
 Planning time: 160.055 ms
 Execution time: 31724.468 ms
(13 rows)


--------------- cached ---------------
test1=# explain analyze select count(*) from pglist where fts @@ 
to_tsquery( 'tom & lane' );
                                                              QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------
 Finalize Aggregate  (cost=119989.73..119989.74 rows=1 width=8) (actual time=1250.973..1250.973 rows=1 loops=1)
   ->  Gather  (cost=119989.52..119989.73 rows=2 width=8) (actual time=1250.588..1250.964 rows=3 loops=1)
         Workers Planned: 2
         Workers Launched: 2
         ->  Partial Aggregate  (cost=118989.52..118989.53 rows=1 width=8) (actual time=1246.144..1246.144 rows=1 loops=3)
               ->  Parallel Bitmap Heap Scan on pglist  (cost=542.65..118932.74 rows=22710 width=0) (actual time=82.781..1237.585 rows=74271 loops=3)
                     Recheck Cond: (fts @@ to_tsquery('tom & lane'::text))
                     Rows Removed by Index Recheck: 90284
                     Heap Blocks: exact=13221 lossy=22217
                     ->  Bitmap Index Scan on 

Re: [HACKERS] Letting the client choose the protocol to use during a SASL exchange

2017-04-12 Thread Heikki Linnakangas

On 04/11/2017 02:32 PM, Álvaro Hernández Tortosa wrote:

 So I still see your proposal as more awkward and less clear, mixing
things that are separate. But again, your choice  :)


So, here's my more full-fledged proposal.

The first patch refactors libpq code, by moving the responsibility of 
reading the GSS/SSPI/SASL/MD5 specific data from the authentication 
request packet, from the enormous switch-case construct in 
PQConnectPoll(), into pg_fe_sendauth(). This isn't strictly necessary, 
but I think it's useful cleanup anyway, and now that there's a bit more 
structure in the AuthenticationSASL message, the old way was getting 
awkward.


The second patch contains the protocol changes, and adds the 
documentation for it.


- Heikki

From 29c958dab9f8ece5d855b335c09cc9125606774a Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas 
Date: Wed, 12 Apr 2017 16:01:18 +0300
Subject: [PATCH 1/2] Refactor libpq authentication request processing.

Move the responsibility of reading the data from the authentication request
message from PQconnectPoll() to pg_fe_sendauth(). This way, PQconnectPoll()
doesn't need to know about the different authentication request types,
and we don't need the extra fields in the pg_conn struct to pass the data
from PQconnectPoll() to pg_fe_sendauth() anymore.

In the backend, free each SASL message after sending it. It's not a lot
of wasted memory, but a leak nevertheless. Adding the pfree() revealed
a little bug in build_server_first_message(): It attempts to keeps a copy
of the sent message, but it was missing a pstrdup(), so the pointer
started to dangle, after adding the pfree() into CheckSCRAMAuth().
---
 src/backend/libpq/auth-scram.c    |  10 +-
 src/backend/libpq/auth.c          |  10 +-
 src/interfaces/libpq/fe-auth.c    | 214 +-
 src/interfaces/libpq/fe-auth.h    |   2 +-
 src/interfaces/libpq/fe-connect.c | 113 +++-
 src/interfaces/libpq/fe-misc.c    |  13 +++
 src/interfaces/libpq/libpq-int.h  |  11 +-
 7 files changed, 204 insertions(+), 169 deletions(-)

diff --git a/src/backend/libpq/auth-scram.c b/src/backend/libpq/auth-scram.c
index 5077ff33b1..a47c48d980 100644
--- a/src/backend/libpq/auth-scram.c
+++ b/src/backend/libpq/auth-scram.c
@@ -161,10 +161,10 @@ static char *scram_MockSalt(const char *username);
  * needs to be called before doing any exchange.  It will be filled later
  * after the beginning of the exchange with verifier data.
  *
- * 'username' is the provided by the client.  'shadow_pass' is the role's
- * password verifier, from pg_authid.rolpassword.  If 'shadow_pass' is NULL, we
- * still perform an authentication exchange, but it will fail, as if an
- * incorrect password was given.
+ * 'username' is the username provided by the client in the startup message.
+ * 'shadow_pass' is the role's password verifier, from pg_authid.rolpassword.
+ * If 'shadow_pass' is NULL, we still perform an authentication exchange, but
+ * it will fail, as if an incorrect password was given.
  */
 void *
 pg_be_scram_init(const char *username, const char *shadow_pass)
@@ -984,7 +984,7 @@ build_server_first_message(scram_state *state)
  state->client_nonce, state->server_nonce,
  state->salt, state->iterations);
 
-	return state->server_first_message;
+	return pstrdup(state->server_first_message);
 }
 
 
diff --git a/src/backend/libpq/auth.c b/src/backend/libpq/auth.c
index 66ead9381d..681b93855f 100644
--- a/src/backend/libpq/auth.c
+++ b/src/backend/libpq/auth.c
@@ -872,6 +872,8 @@ CheckSCRAMAuth(Port *port, char *shadow_pass, char **logdetail)
 	strlen(SCRAM_SHA256_NAME) + 1);
 
 	/*
+	 * Initialize the status tracker for message exchanges.
+	 *
 	 * If the user doesn't exist, or doesn't have a valid password, or it's
 	 * expired, we still go through the motions of SASL authentication, but
 	 * tell the authentication method that the authentication is "doomed".
@@ -880,8 +882,6 @@ CheckSCRAMAuth(Port *port, char *shadow_pass, char **logdetail)
 	 * This is because we don't want to reveal to an attacker what usernames
 	 * are valid, nor which users have a valid password.
 	 */
-
-	/* Initialize the status tracker for message exchanges */
 	scram_opaq = pg_be_scram_init(port->user_name, shadow_pass);
 
 	/*
@@ -918,7 +918,7 @@ CheckSCRAMAuth(Port *port, char *shadow_pass, char **logdetail)
 			return STATUS_ERROR;
 		}
 
-		elog(DEBUG4, "Processing received SASL token of length %d", buf.len);
+		elog(DEBUG4, "Processing received SASL response of length %d", buf.len);
 
 		/*
 		 * we pass 'logdetail' as NULL when doing a mock authentication,
@@ -936,9 +936,11 @@ CheckSCRAMAuth(Port *port, char *shadow_pass, char **logdetail)
 			/*
 			 * Negotiation generated data to be sent to the client.
 			 */
-			elog(DEBUG4, "sending SASL response token of length %u", outputlen);
+			elog(DEBUG4, "sending SASL challenge of length %u", outputlen);
 
 			sendAuthRequest(port, 

Re: [HACKERS] Cutting initdb's runtime (Perl question embedded)

2017-04-12 Thread Andres Freund
On 2017-04-12 10:12:47 -0400, Tom Lane wrote:
> Andres mentioned, and I've confirmed locally, that a large chunk of
> initdb's runtime goes into regprocin's brute-force lookups of function
> OIDs from function names.  The recent discussion about cutting TAP test
> time prompted me to look into that question again.  We had had some
> grand plans for getting genbki.pl to perform the name-to-OID conversion
> as part of a big rewrite, but since that project is showing few signs
> of life, I'm thinking that a more localized performance fix would be
> a good thing to look into.  There seem to be a couple of plausible
> routes to a fix:
> 
> 1. The best thing would still be to make genbki.pl do the conversion,
> and write numeric OIDs into postgres.bki.  The core stumbling block
> here seems to be that for most catalogs, Catalog.pm and genbki.pl
> never really break down a DATA line into fields --- and we certainly
> have got to do that, if we're going to replace the values of regproc
> fields.  The places that do need to do that approximate it like this:
> 
>   # To construct fmgroids.h and fmgrtab.c, we need to inspect some
>   # of the individual data fields.  Just splitting on whitespace
>   # won't work, because some quoted fields might contain internal
>   # whitespace.  We handle this by folding them all to a simple
>   # "xxx". Fortunately, this script doesn't need to look at any
>   # fields that might need quoting, so this simple hack is
>   # sufficient.
>   $row->{bki_values} =~ s/"[^"]*"/"xxx"/g;
>   @{$row}{@attnames} = split /\s+/, $row->{bki_values};
> 
> We would need a bullet-proof, non-hack, preferably not too slow way to
> split DATA lines into fields properly.  I'm one of the world's worst
> Perl programmers, but surely there's a way?

I've done something like 1) before:
http://archives.postgresql.org/message-id/20150221230839.GE2037%40awork2.anarazel.de

I don't think the speeds matters all that much, because we'll only do it
when generating the .bki file - a couple ms more or less won't matter
much.

I IIRC spent some more time to also load the data files from a different
format:
https://git.postgresql.org/gitweb/?p=users/andresfreund/postgres.git;a=shortlog;h=refs/heads/sane-catalog
although that's presumably heavily outdated now.

- Andres


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Some thoughts about SCRAM implementation

2017-04-12 Thread Heikki Linnakangas

On 04/12/2017 08:38 PM, Álvaro Hernández Tortosa wrote:

- Even though I don't really care about SCRAM, and without having any
prior knowledge about SCRAM, I volunteered some time ago to study SCRAM,
give a lightning talk about SCRAM and later write a client
implementation for the jdbc driver. And I have already devoted a very
fair amount of time to doing so, and will keep doing that until all the code
is done. Code WIP is here FYI: https://github.com/ahachete/scram. So
it's not that I haven't already put my code behind my words.


That is very much appreciated! You writing a second implementation of 
the client-side support (libpq being the first) is very, very helpful, 
to validate that the protocol is sane, unambiguous, and adequately 
documented.



On 12/04/17 18:38, Robert Haas wrote:

Furthermore, I think that the state of this feature as it currently
exists in the tree is actually kind of concerning.  There are
currently four open items pertaining to SCRAM at least two of which
look to my mind an awful lot like stuff that should have ideally been
handled pre-feature-freeze: \password support, and protocol
negotiation.  I'm grateful for the hard work that has gone into this
feature, but these are pretty significant loose ends.  \password
support is a basic usability issue.  Protocol negotiation affects
anyone who may want to make their PG driver work with this feature,
and certainly can't be changed after final release, and ideally not
even after beta.  We really, really need to get that stuff nailed down
ASAP or we're going to have big problems.  So I think we should focus
on those things, not this.


Yes, we need to nail down the protocol and \password before beta. I am 
working on them now.


- Heikki



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_dump emits ALTER TABLE ONLY partitioned_table

2017-04-12 Thread Robert Haas
On Wed, Apr 12, 2017 at 6:29 AM, Amit Langote
 wrote:
> Actually, p1 is a partitioned table, so the error.  And I realize that
> that's a wrong behavior.  Currently the check is performed using only the
> relkind, which is bogus.  Specifying ONLY should cause an error only when
> the table has partitions.

That sounds like a REALLY bad idea, because now you're going to end up
with a constraint that can never be enforced against any actual data
rows ... or else you're going to later pretend that ONLY wasn't
specified.  I think the rule that partitioned tables can't have
non-inherited constraints is absolutely right, and relaxing it is
quite wrong.

I think you had the right idea upthread when you suggested dumping it this way:

CREATE TABLE p1 PARTITION OF p (
b NOT NULL
)
FOR VALUES IN (1)
PARTITION BY RANGE (b);

That looks absolutely right to me, and very much principled.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Some thoughts about SCRAM implementation

2017-04-12 Thread Robert Haas
On Wed, Apr 12, 2017 at 2:09 PM, Heikki Linnakangas  wrote:
> That is very much appreciated! You writing a second implementation of the
> client-side support (libpq being the first) is very, very helpful, to
> validate that the protocol is sane, unambiguous, and adequately documented.

+1.

> Yes, we need to nail down the protocol and \password before beta. I am
> working on them now.

Good to hear.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Undefined psql variables

2017-04-12 Thread Pavel Stehule
2017-04-12 17:05 GMT+02:00 Robert Haas :

> On Sun, Apr 2, 2017 at 3:56 PM, Tom Lane  wrote:
> > So my view of this is that "send the expression to the server" ought
> > to be just one option for \if, not the only way to do it.
>
> I heartily agree.  There should be some kind of client-side expression
> language, and one thing it should allow is calling out to the server.
> Then people who only want to call out to the server can do that, but
> people who want to do something else have the option.  Insisting that
> this facility isn't allowed to do anything other than consult the
> server is (1) inconsistent with what we've already got in v10 and (2)
> boxing ourselves into a corner for no very good reason.
>
> Now, the optimal shape for that client-side expression language is not
> very clear to me.  Do we want to invent our own language, or maybe
> consider using something that already exists?  It's been previously
> suggested that we should somehow embed Lua, and this might not be a
> bad place to consider doing something like that.  That might be a way
> to add a lot of power without having to invent an entirely new
> programming language one bit at a time.  If we want to invent our own
> expression language, what kind of syntax should it use?  Upon what
> kind of design principles should it be based?  There's likely to be
> vigorous debate on these topics, and probably also complaints that the
> good designs are too much work and the easy-to-implement designs are
> too limiting.  (Regular readers of this mailing list will likely be
> able to guess which side of those debates I'll be on, but we need to
> have them all the same.)
>
>
Integrating a Lua engine could help a lot - and it could change the design
significantly. For this purpose it is maybe overkill, but it could be fresh
air in psql customisation and usage.

\setlua varname one line lua expression

or

\lua
  ...
  lua code

  psqlvar.set("", somevalue)

\endlua

I like this idea. We can use Math libraries, random generators, ...

If a Lua engine and its dependency are too strong a coffee - a very basic
calculator like
https://www.l2f.inesc-id.pt/~david/w/pt/The_YACC_Parser_Generator/Example:_Calculator_with_Variables
can be good enough (we don't need variables there)


> Regarding the ostensible topic of this thread, one thought I had while
> reading through these various responses is that the original need
> would be well-served by the (somewhat dubious) syntax that bash uses
> for variable substitution.  Obviously, we aren't going to change the
> interpolate-this-variable character from : to $, but bash has
> ${parameter:-word} to substitute a default for an unset parameter,
> ${parameter:=word} to substitute a default for an unset parameter and
> also set the parameter to that value, ${parameter:?word} to error out
> with word as the error mesage if parameter is not set, and so forth.
> If we decide to roll our own, we might consider taking inspiration
> from those constructs.
>
>
It is great and it can work

\set varname :{varname?some error message} ..
\set varname :{varname:-0} ..

Good ideas

Regards

Pavel




> I think that one of the general problems of language design is, as
> Larry Wall once said, that a good language should make simple things
> simple and complex things possible.  But simple is not an absolute; it
> depends on context.  The things which a language needs to make simple
> are those things which will be done frequently *in that language*.  So
> for example in this case, out-calls to SQL need to be very easy to
> write.  Maybe the empty-parameter thing needs to be easy; not sure.
> Coming up with a good solution here will involve understanding what
> people typically want to do with a language of this type and then
> making sure that stuff can be done succinctly - and ideally also
> making sure that other stuff is also possible if you're willing to put
> in more legwork.
>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>


Re: [HACKERS] Adding support for Default partition in partitioning

2017-04-12 Thread Robert Haas
On Tue, Apr 11, 2017 at 9:41 AM, Jeevan Ladhe
 wrote:
> I have checked for NULLs too, and the default partition can be created even
> when there are partitions for each TRUE, FALSE and NULL.
>
> Consider the example below:
>
> postgres=# CREATE TABLE list_partitioned (
> a bool
> ) PARTITION BY LIST (a);
> CREATE TABLE
> postgres=# CREATE TABLE part_1 PARTITION OF list_partitioned FOR VALUES IN
> ('false');
> CREATE TABLE
> postgres=# CREATE TABLE part_2 PARTITION OF list_partitioned FOR VALUES IN
> ('true');
> CREATE TABLE
> postgres=# CREATE TABLE part_3 PARTITION OF list_partitioned FOR VALUES IN
> (null);
> CREATE TABLE
> postgres=# CREATE TABLE part_default PARTITION OF list_partitioned FOR
> VALUES IN (DEFAULT);
> CREATE TABLE

In my opinion, that's absolutely fine, and it would be very strange to
try to prevent it.  The partitioning method shouldn't have specific
knowledge of the properties of individual data types.
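
To illustrate with the example above (a hypothetical session, assuming
the proposed patch's tuple routing): every row is routed to one of the
three explicit partitions, so part_default simply stays empty:

postgres=# INSERT INTO list_partitioned VALUES (true), (false), (NULL);
INSERT 0 3
postgres=# SELECT count(*) FROM part_default;
 count
-------
     0
(1 row)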

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_dump emits ALTER TABLE ONLY partitioned_table

2017-04-12 Thread Robert Haas
On Wed, Apr 12, 2017 at 3:29 PM, Stephen Frost  wrote:
> I'm not following what you're getting at here.
>
> There's already a constraint on the table, and ALTER TABLE ONLY doesn't
> say anything about what happens later on (certainly it doesn't make new
> tables created with 'LIKE' have bits omitted, if that's what you were
> thinking).  Lastly, the error being thrown certainly seems to imply that
> one needs to go fix all the child tables to have the constraint first
> and then the constraint can be added to the parent (presumably using the
> same ALTER TABLE ONLY command).  If there aren't any child tables, then
> it should work, if there *are* child tables and they've got the
> necessary constraint, then this should be allowed, so that future child
> tables create will have the constraint.

So I think I was indeed confused before, and I think you're basically
right here, but on one point I think you are not right -- ALTER TABLE
ONLY .. CHECK () doesn't work on a table with inheritance children
regardless of whether the children already have the matching
constraint:

rhaas=# create table foo (a int, b text);
CREATE TABLE
rhaas=# create table bar () inherits (foo);
CREATE TABLE
rhaas=# alter table only foo add check (a = 1);
ERROR:  constraint must be added to child tables too
rhaas=# alter table only bar add check (a = 1);
ALTER TABLE
rhaas=# alter table only foo add check (a = 1);
ERROR:  constraint must be added to child tables too

It looks like ALTER TABLE ONLY works fine on a table with no children,
but once it's got children it no longer works, period.  However,
you're right that you can add the constraint to the as-yet-childless
table and then future children will inherit the constraint properly.
Continuing the previous example:

rhaas=# drop table bar;
DROP TABLE
rhaas=# alter table only foo add check (a = 1);
ALTER TABLE
rhaas=# create table bar () inherits (foo);
CREATE TABLE

So, regarding Amit's 0001:

- I think we should update the relevant hunk of the documentation
rather than just removing it.

- Should we similarly allow TRUNCATE ONLY foo and ALTER TABLE ONLY foo
.. to work on a partitioned table without partitions, or is that just
pointless tinkering?  That seems to be the only case where, after this
patch, an ONLY operation will fail on a childless partitioned table.

Do you want to be responsible for committing these or should I do it?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Foreign Join pushdowns not working properly for outer joins

2017-04-12 Thread Peter Eisentraut
Is this patch considered ready for review as a backpatch candidate?

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_upgrade vs extension upgrades

2017-04-12 Thread Robert Haas
On Wed, Apr 12, 2017 at 6:30 PM, Peter Eisentraut
 wrote:
> On 4/10/17 11:30, Magnus Hagander wrote:
>> After you've run pg_upgrade, you have to loop through all your databases
>> and do an "ALTER EXTENSION abc UPDATE" once for each extension.
>>
>> Is there a reason we shouldn't have pg_upgrade emit a script that does
>> this, similar to how it emits a script to run ANALYZE?
>
> Shouldn't pg_dump do this, and perhaps by default?
>
> If I restore a dump into another instance, I need to upgrade all my
> extensions to that installations's versions, no?  That's not particular
> to pg_upgrade.

No, it's an optional step.  If we did the upgrade by default, it would
greatly increase the chance of something failing.  For example,
suppose the upgrade does a DROP and then CREATE of a function whose
signature has changed.  If there's anything that depends on that
function, this will fail.  But if we don't do it, there's no actual
problem: the shared library is supposed to be prepared to cope with
people still using the old SQL definition.  It is probably desirable
to update where possible to gain access to new features, etc., but it
shouldn't be required.
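
To make the failure described above concrete, with invented names (a
sketch, not any particular extension's script):

-- the old extension version provided myext_distance(point, point),
-- and the user has built on that signature:
CREATE VIEW v AS SELECT myext_distance(a, b) FROM tbl;
-- the new version's update script then tries:
DROP FUNCTION myext_distance(point, point);
ERROR:  cannot drop function myext_distance(point,point) because other objects depend on it
DETAIL:  view v depends on function myext_distance(point,point)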

I do think there might be some value in a tool that looked for old
extensions and tried to update them, but I'm not sure it should be
pg_dump.
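
A minimal sketch of what such a tool might run in each database to find
and update stale extensions (hypothetical - pg_upgrade emits nothing like
this today):

SELECT 'ALTER EXTENSION ' || quote_ident(extname) || ' UPDATE;'
  FROM pg_extension e
 WHERE extversion IS DISTINCT FROM
       (SELECT default_version
          FROM pg_available_extensions
         WHERE name = e.extname);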

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: Query fails when SRFs are part of FROM clause (Commit id: 69f4b9c85f)

2017-04-12 Thread Andres Freund
On 2017-04-05 09:39:37 -0400, Tom Lane wrote:
> Andres Freund  writes:
> > On 2017-04-05 02:47:55 -0400, Noah Misch wrote:
> >> [Action required within three days.  This is a generic notification.]
>
> > I've a very preliminary patch.  I'd like to only start polishing it up
> > once the code freeze is over, so I can work on getting some patches in -
> > note that I myself have no pending patches.  Once frozen I'll polish it
> > up and send that within a few days.
>
> FWIW, I'm willing to help out on this.

Help would be appreciated.  I've pondered, and partially implemented,
several approaches so far, and my conclusion is that we should just do
nothing.  The number of corner cases is just too big, and the utility of
the feature too small.

To recap, the issue is that previous versions, by accident (there's no
test, the code comments "supporting" the feature talk about a different
corner case, and the behaviour isn't correct in some cases), allowed
something like:
SELECT * FROM CAST(generate_series(1,3) * 5 AS int);
while
SELECT * FROM generate_series(1,3) * 5;
is not allowed.

The reason that works from the gram.y perspective is that CAST etc. are
forms of func_expr.  The reason that it worked from a code perspective
is that execQual.c:ExecMakeTableFunctionResult() has the following:
/*
 * Normally the passed expression tree will be a FuncExprState, since the
 * grammar only allows a function call at the top level of a table
 * function reference.  However, if the function doesn't return set then
 * the planner might have replaced the function call via constant-folding
 * or inlining.  So if we see any other kind of expression node, execute
 * it via the general ExecEvalExpr() code; the only difference is that we
 * don't get a chance to pass a special ReturnSetInfo to any functions
 * buried in the expression.
 */
if (funcexpr && IsA(funcexpr, FuncExprState) &&
    IsA(funcexpr->expr, FuncExpr))
and back then ExecEvalExpr() was allowed to return sets.  It also
depends on some default initializations (e.g. rsinfo.returnMode =
SFRM_ValuePerCall).

This yields plenty of weird behaviour in < v10. E.g. the following is
disallowed:
  SELECT * FROM int4mul(generate_series(1,3), 1);
  ERROR:  0A000: set-valued function called in context that cannot accept a set
as is
  SELECT * FROM CAST(int4mul(generate_series(1,3), 1) AS bigint);
  ERROR:  0A000: set-valued function called in context that cannot accept a set
because the cast is implemented as int8(expr) which avoids the fallback
path as it's a FuncExpr, but
  SELECT * FROM CAST(int4mul(generate_series(1,3), 1) AS text);
works because the cast is implemented as an I/O coercion, which is not a
FuncExpr. Therefore it uses the fallback ExecEvalExpr().

The mismatch between ExecEvalExpr() and nodeFunctionscan.c SRF behaviour
back then also yields odd behaviour, e.g.:
  SELECT * FROM generate_series(1,0);
returns zero rows, but
  SELECT * FROM CAST(generate_series(1,0) * 5 AS INT);
returns one NULL row.


In v10, as it stands, these all error out, because the SRFs are now only
to be evaluated via either nodeFunctionscan.c (FROM) or via
nodeProjectSet.c (targetlist), not ExecEvalExpr() anymore.


I've basically pondered three different methods of implementing
something akin to the old behaviour:

1) Move the non-SRF part into nodeFunctionscan.c's targetlist, and let
   it be evaluated there.  E.g. if the expression is
   CAST(generate_series(1,5) AS text), evaluate the generate_series(1,5)
   using nodeFunctionscan's FunctionScanPerFuncState machinery, but
   implement the CAST as CAST(Var(whatever, 1) AS Text).
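
   To illustrate the intended transformation (a sketch, not actual
   planner output):

     SELECT * FROM CAST(generate_series(1,5) AS text);
     -- would be evaluated roughly as if it were written
     SELECT CAST(f AS text) FROM generate_series(1,5) AS f;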

   That doesn't work well for composite/record returning rows, because
   RangeTblFunction's returning composites are expanded into multiple
   columns. E.g.
   SELECT * FROM CAST(CAST(twocol_noset_outline(generate_series(1,3), 'text') AS text) AS twocol);
   returns all the columns of twocol, not a single wholerow datum.

   There are also some issues around what to do with cases involving
   volatile functions when the output is not referenced, or referenced
   multiple times, e.g.
 SELECT f, f FROM CAST(generate_series(1,3) * nextval(...) AS int) AS f;
   would evaluate nextval() twice with such an approach...

2) During planning, split of the SRF bit from the !SRF bit and have
   nodeFunctionscan.c evaluate both separately, like 1).  That allows to
   avoid the volatility issue, because the targetlist is still projected
   separately.

   I've prototyped this to a reasonable point, by having
   create_functionscan_plan() process each RangeTblFunction and split
   the expression into SRF and non-SRF parts, and then have
   FunctionNext() evaluate the non-SRF part.

   At the current state of my prototype this happens to allow simple
   SRF-in-arguments cases like SELECT * FROM int4mul(generate_series(1,3), 1)
   which were previously disallowed.

Re: [pgsql-www] [HACKERS] Small issue in online devel documentation build

2017-04-12 Thread Peter Eisentraut
On 4/12/17 09:50, Bruce Momjian wrote:
> On Wed, Apr 12, 2017 at 01:31:51PM +0200, Magnus Hagander wrote:
>> I think that only leaves the change to the javascript code that Bruce sent.
>> Let's see if we can figure out a way to do that one without requiring
>> javascript, but after that we have covered all listed issues I think?
> 
> Well, we have been using JavaScript for years, so my patch only fixes
> the JavaScript we already have.  Seems applying the fix now makes sense
> and then we can ponder other methods.

I have googled a bit.  These two articles describe what I think is the
issue we are trying to address:

http://code.stephenmorley.org/html-and-css/fixing-browsers-broken-monospace-font-handling/

http://meyerweb.com/eric/thoughts/2010/02/12/fixed-monospace-sizing/

The recipe in the first one seems to work somewhat, but not quite as
well as the current javascript.  (The recipe in the second article is
ultimately equivalent but more verbose.)

I tend to agree with Bruce that we could just fix this within the
current framework.  But if we want to make a change, there is the
information.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_upgrade vs extension upgrades

2017-04-12 Thread Peter Eisentraut
On 4/12/17 18:59, Robert Haas wrote:
> I do think there might be some value in a tool that looked for old
> extensions and tried to update them, but I'm not sure it should be
> pg_dump.

This reminds me a bit of the problem of upgrading all collations after
an upgrade.  Perhaps we can find similar solutions to both problems.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_statistic_ext.staenabled might not be the best column name

2017-04-12 Thread Tomas Vondra

On 04/12/2017 03:36 PM, David Rowley wrote:
> "stakind" seems like a better name. I'd have personally gone with
> "statype" but pg_statistic already thinks stakind is better.


+1 to stakind

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] logical replication and PANIC during shutdown checkpoint in publisher

2017-04-12 Thread Peter Eisentraut
On 4/12/17 09:55, Fujii Masao wrote:
> To fix this issue, we should terminate walsender for logical replication
> before shutdown checkpoint starts. Of course walsender for physical
> replication still needs to keep running until shutdown checkpoint ends,
> though.

Can we turn it into a kind of read-only or no-new-commands mode instead,
so it can keep streaming but not accept any new actions?

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch: Write Amplification Reduction Method (WARM)

2017-04-12 Thread Peter Geoghegan
On Wed, Apr 12, 2017 at 10:12 AM, Robert Haas  wrote:
>> I may have missed something, but there is no intention to ignore known
>> regressions/reviews. Of course, I don't think that every regression will be
>> solvable, like if you run a CPU-bound workload, setting it up in a way such
>> that you repeatedly exercise the area where WARM is doing additional work,
>> without providing any benefit, may be you can still find regression. I am
>> willing to fix them as long as they are fixable and we are comfortable with
>> the additional code complexity. IMHO certain trade-offs are good, but I
>> understand that not everybody will agree with my views and that's ok.
>
> The point here is that we can't make intelligent decisions about
> whether to commit this feature unless we know which situations get
> better and which get worse and by how much.  I don't accept as a
> general principle the idea that CPU-bound workloads don't matter.
> Obviously, I/O-bound workloads matter too, but we can't throw
> CPU-bound workloads under the bus.  Now, avoiding index bloat does
> also save CPU, so it is easy to imagine that WARM could come out ahead
> even if each update consumes slightly more CPU when actually updating,
> so we might not actually regress.  If we do, I guess I'd want to know
> why.

I myself wonder if this CPU overhead is at all related to LP_DEAD
recycling during page splits. I have my suspicions that the recycling
has some relationship to locality, which leads me to want to
investigate how Claudio Freire's patch to consistently treat heap TID
as part of the B-Tree sort order could help, both in general, and for
WARM.

Bear in mind that the recycling has to happen with an exclusive buffer
lock held on a leaf page, which could hold up rather a lot of scans
that need to visit the same value even if it's on some other,
relatively removed leaf page.

This is just a theory.

-- 
Peter Geoghegan



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] SUBSCRIPTIONS and pg_upgrade

2017-04-12 Thread Peter Eisentraut
On 4/11/17 23:41, Noah Misch wrote:
> On Tue, Apr 11, 2017 at 11:21:24PM -0400, Peter Eisentraut wrote:
>> On 4/9/17 22:16, Noah Misch wrote:
>>> [Action required within three days.  This is a generic notification.]
>>
>> Patches have been posted.  Discussion is still going on a bit.
> 
> By what day should the community look for your next update?

tomorrow

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

