Re: [HACKERS] increasing the default WAL segment size

2017-03-13 Thread Beena Emerson
Hello,

Attached is the updated patch. It fixes the issues and also updates a few
code comments.

On Fri, Mar 10, 2017 at 1:09 PM, tushar 
wrote:
>
>
> 1) At the time of initdb, we set "--wal-segsize 4", so all the WAL
> files should be 4 MB each, but in the postgresql.conf file it is
> mentioned:
>
> #wal_keep_segments = 0  # in logfile segments,* 16MB each*; 0
> disables
>

> so the comment (16MB) mentioned against the parameter 'wal_keep_segments'
> looks wrong; we should either remove or modify it.
>

Removed.


>
> 2) Getting an "Aborted (core dumped)" error at the time of running
> pg_basebackup.
> *(This issue only occurs on Linux32, not on Linux64.)* We have
> double-checked to confirm it.
>

 Can you please check with the new patch?

-- 
Thank you,

Beena Emerson

EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


02-initdb-walsegsize-v4.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Performance improvement for joins where outer side is unique

2017-03-13 Thread David Rowley
On 14 March 2017 at 11:35, David Rowley 
wrote:

> On 14 March 2017 at 07:50, Tom Lane  wrote:
>
>> [ getting back to this patch finally... ]
>>
>> David Rowley  writes:
>> > I've attached a patch which implements this, though only for
>> > MergeJoin, else I'd imagine we'd also need to ensure all proofs used
>> > for testing the uniqueness were also hash-able too. I added some XXX
>> > comments in analyzejoin.c around the mergeopfamilies == NIL tests to
>> > mention that Merge Join depends on all the unique proof quals having
>> > mergeopfamilies.  This also assumes we'll never use some subset of
>> > mergejoin-able quals for a merge join, which could be an interesting
>> > area in the future, as we might have some btree index on a subset of
>> > those columns to provide pre-sorted input. In short, it's a great
>> > optimisation, but I'm a little scared we might break it one day.
>>
>> Umm ... it's broken already isn't it?  See the comment for
>> generate_mergejoin_paths:
>>
>> Thanks for looking at this again.
>
> Yeah, confirmed. It's broken. I guess I just need to remember in the Path
> whether we got all the join quals, although I've not looked in detail at what
> the best fix is. I imagine it'll require storing something else in the JoinPath.
>

OK, so I've spent some more time on this and I've come up with a solution.

Basically the solution is to not skip mark and restore when joinquals
contains any items.  This is a requirement for SEMI joins, but overly
cautious for unique joins. However, I think it'll apply in most cases when
Merge Join will be a win, and that's when there's a btree index on both
sides of the join which covers all columns in the join condition.  I
carefully commented this part of the code to explain what can be done to
have it apply in more cases.

This caused me to go and change the following code too:

@@ -2676,6 +2688,9 @@ final_cost_mergejoin(PlannerInfo *root, MergePath
*path,
  * it off does not entitle us to deliver an invalid plan.
  */
  else if (innersortkeys == NIL &&
+ !((extra->inner_unique || path->jpath.jointype == JOIN_SEMI) &&
+ list_length(path->jpath.joinrestrictinfo) ==
+ list_length(path->path_mergeclauses)) &&
  !ExecSupportsMarkRestore(inner_path))
  path->materialize_inner = true;

I've been staring at this for a while and I think it's correct. If it's
wrong then we'd get "ERROR: unrecognized node type: " from ExecMarkPos().

Here we must make sure never to skip materializing the inner side if
we're not going to skip mark and restore in ExecInitMergeJoin().

I've attached a patch which fixes the issue.

Also added a regression test to try to make sure it stays fixed. There's a
few other mostly cosmetic fixes in there too.

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


unique_joins_2017-03-14a.patch
Description: Binary data



[HACKERS] ToDo: listagg is in ANSI/SQL:2016

2017-03-13 Thread Pavel Stehule
Hi

It looks like Oracle has a pretty strong position in the standard. ANSI SQL
has a new aggregate function, listagg. It is supported by DB2 too.

Unfortunately, one supported syntax is not possible in Postgres due to our
design of ordered aggregates.

Syntax:

1. listagg(expr) FROM ... non-deterministic result
2. listagg(expr, separator) FROM ... the previous form with a separator -
when the separator is NULL, it is ignored

3. listagg(expr [, sep]) WITHIN GROUP (ORDER BY expr_list)

The last syntax is not available in Postgres, because our ordered aggregates
expect immutable arguments. The first two are not supported either, because
an ORDER BY clause is required every time.

Regards

Pavel


Re: [HACKERS] PATCH: pageinspect / add page_checksum and bt_page_items(bytea)

2017-03-13 Thread Ashutosh Sharma
Hi,

>
>> 2) It seems like you have chosen the wrong datatype for page_checksum. I
>> am getting a negative checksum value when trying to run the query below. I
>> think the return type for the SQL function page_checksum should be
>> 'integer' instead of 'smallint'.
>>
>> postgres=# SELECT page_checksum(get_raw_page('pg_class', 0), 100);
>>  page_checksum
>> ---------------
>> -19532
>> (1 row)
>>
>> The above problem also exists in the case of page_header. I am facing a
>> similar problem with page_header as well.
>>
>> postgres=# SELECT page_header(get_raw_page('pg_class', 0));
>>  page_header
>> ----------------------------------------------
>>  (0/2891538,-28949,1,220,7216,8192,8192,4,0)
>> (1 row)
>>
>
> No. This is consistent with page_header() which is also using smallint for
> the checksum value.
>

Yes. But, as I said earlier, I am getting a negative checksum value for
page_header as well. Isn't that wrong? For example, when I debug the
following query, I see the pd_checksum value as '40074' in gdb, whereas
page_header shows it as '-25462'.

SELECT page_header(get_raw_page('pg_class', 0));

(gdb) p page->pd_checksum
$2 = 40074

postgres=# SELECT page_header(get_raw_page('pg_class', 0));
 page_header
----------------------------------------------
 (0/304EDE0,-25462,1,220,7432,8192,8192,4,0)
(1 row)

I think pd_checksum in PageHeaderData is defined as uint16 (0 to
65,535), whereas in the SQL function for page_header it is defined as
smallint (-32768 to +32767).

--
With Regards,
Ashutosh Sharma
EnterpriseDB:http://www.enterprisedb.com




Re: [HACKERS] scram and \password

2017-03-13 Thread Michael Paquier
On Tue, Mar 14, 2017 at 12:34 PM, Tom Lane  wrote:
> Michael Paquier  writes:
> If some other practice becomes better in v12, then we teach it about that
> one.  It's not like psql hasn't got many other server-version-dependent
> behaviors.

Okay. Looking at the code, scram_build_verifier is only available on
the backend side, so we need to rethink it a bit to be able to have
libpq build a SCRAM verifier:
- The allocation of the verifier buffer needs to happen outside it,
and the caller needs to provide an allocated buffer. Its size had
better be calculated by a macro.
- The routine needs to be moved to scram-common.c.
- A copy of hex_encode, say pg_hex_encode, needs to be provided in its
own file in src/common/. That's 6 lines of code; we could just have it
in scram-common.c.
Does that sound correct?

> Alternatively, if what you mean by that is you don't trust SCRAM at all,
> maybe we'd better revert the feature as not being ready for prime time.

FWIW, I am confident in the code. Based on years of watching this
mailing list, switching long-term behaviors is usually done at a more
gradual pace; that's the only basis for my opinion.
-- 
Michael




Re: [HACKERS] Write Ahead Logging for Hash Indexes

2017-03-13 Thread Amit Kapila
On Mon, Mar 13, 2017 at 6:26 PM, Amit Kapila  wrote:
> On Thu, Mar 9, 2017 at 3:11 AM, Robert Haas  wrote:
>> On Tue, Mar 7, 2017 at 6:41 PM, Robert Haas  wrote:
> Great, thanks.  0001 looks good to me now, so committed.

 Committed 0002.
>>>
>>> Here are some initial review thoughts on 0003 based on a first read-through.
>>
>
>> It seems like a good test to do with this patch would be to set up a
>> pgbench test on the master with a hash index replacing the usual btree
>> index.  Then, set up a standby and run a read-only pgbench on the
>> standby while a read-write pgbench test runs on the master.  Maybe
>> you've already tried something like that?
>>
>
> I also think so, and apart from that I think it makes sense to perform
> recovery tests with Jeff Janes' tool and probably tests with
> wal_consistency_check. These tests have already been running for the past
> seven hours or so, and I will keep them running for the whole night to see
> if there is any discrepancy.
>

We didn't find any issues with the above testing.


-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: [HACKERS] [COMMITTERS] pgsql: Improve postmaster's logging of listen socket creation.

2017-03-13 Thread Tom Lane
Andres Freund  writes:
> I don't mind the new output, but I kinda wonder whether it's a good idea
> to include the '.s.PGSQL.5432' bit in the host and/or whether we
> shouldn't include the port in the TCP cases as well

Yeah, I've been thinking that maybe it should look like

2017-03-13 10:08:49.399 EDT [90059] LOG:  listening on IPv6 address "::1", port 5432
2017-03-13 10:08:49.399 EDT [90059] LOG:  listening on IPv4 address "127.0.0.1", port 5432
2017-03-13 10:08:49.400 EDT [90059] LOG:  listening on Unix address "/tmp/.s.PGSQL.5432"

It would take a couple more lines of code to make that happen, but
it would future-proof the messages against the day we decide to
allow one server to respond to more than one port number ...

regards, tom lane




Re: [HACKERS] [COMMITTERS] pgsql: Improve postmaster's logging of listen socket creation.

2017-03-13 Thread Andres Freund
On 2017-03-13 10:12:42 -0400, Robert Haas wrote:
> So now on every startup I get this:
> 
> 2017-03-13 10:08:49.399 EDT [90059] LOG:  listening on IPv6 address "::1"
> 2017-03-13 10:08:49.399 EDT [90059] LOG:  listening on IPv4 address "127.0.0.1"
> 2017-03-13 10:08:49.400 EDT [90059] LOG:  listening on Unix address "/tmp/.s.PGSQL.5432"

I don't mind the new output, but I kinda wonder whether it's a good idea
to include the '.s.PGSQL.5432' bit in the host and/or whether we
shouldn't include the port in the TCP cases as well

One cannot use -h /tmp/.s.PGSQL.5432, making this a bit confusing...




Re: [HACKERS] scram and \password

2017-03-13 Thread Tom Lane
Michael Paquier  writes:
> On Tue, Mar 14, 2017 at 11:47 AM, Tom Lane  wrote:
>> Seems to me the intended behavior of \password is to use the best
>> available practice.  So my guess is that it ought to use SCRAM when
>> talking to a >= 10.0 server.  What the previous password was ought
>> to be irrelevant, even if it could find that out which it shouldn't
>> be able to IMO.

> And in a release or two? SCRAM being a fresh feature, switching the
> hashing now is not a very conservative approach.

If some other practice becomes better in v12, then we teach it about that
one.  It's not like psql hasn't got many other server-version-dependent
behaviors.

Alternatively, if what you mean by that is you don't trust SCRAM at all,
maybe we'd better revert the feature as not being ready for prime time.

regards, tom lane




Re: [HACKERS] scram and \password

2017-03-13 Thread Michael Paquier
On Tue, Mar 14, 2017 at 11:47 AM, Tom Lane  wrote:
> Robert Haas  writes:
>> I'm not talking about changing the default, just having it be possible
>> to use \password with the new system as it was with the old, whatever
>> exactly we think that means.

I think that this means looking at password_encryption within
PQencryptPassword(), something that could silently break some
applications. That's why Joe and I suggested upthread extending
PQencryptPassword() with a hashing method, and adding a function to
allow retrieval of the password type for a given user.

> Seems to me the intended behavior of \password is to use the best
> available practice.  So my guess is that it ought to use SCRAM when
> talking to a >= 10.0 server.  What the previous password was ought
> to be irrelevant, even if it could find that out which it shouldn't
> be able to IMO.

And in a release or two? SCRAM being a fresh feature, switching the
hashing now is not a very conservative approach.
-- 
Michael




[HACKERS] Unacccented Fractions

2017-03-13 Thread David E. Wheeler
Hello Hackers,

I noticed that unaccent.rules has spaces in front of the unaccented 
representation of fraction glyphs:

¼1/4
½1/2
¾3/4

Note the space after the tab. In case my client kills what I’ve pasted, those 
lines match

¼\t[ ]1/4
½\t[ ]1/2
¾\t[ ]3/4

This makes sense to me, as I’d like “1¼”, for example, to become “1 1/4”.
However, that’s not what seems to happen:

=# SELECT unaccent('1¼');
 unaccent 
----------
 11/4

Should that space from the rules file be preserved, so that the text doesn’t 
become eleven fourths?

Thanks,

David

smime.p7s
Description: S/MIME cryptographic signature


Re: [HACKERS] scram and \password

2017-03-13 Thread Tom Lane
Robert Haas  writes:
> I'm not talking about changing the default, just having it be possible
> to use \password with the new system as it was with the old, whatever
> exactly we think that means.

Seems to me the intended behavior of \password is to use the best
available practice.  So my guess is that it ought to use SCRAM when
talking to a >= 10.0 server.  What the previous password was ought
to be irrelevant, even if it could find that out which it shouldn't
be able to IMO.

regards, tom lane




Re: [HACKERS] scram and \password

2017-03-13 Thread Robert Haas
On Sun, Mar 12, 2017 at 11:14 PM, Michael Paquier
 wrote:
> On Mon, Mar 13, 2017 at 9:15 AM, Robert Haas  wrote:
>> On Fri, Mar 10, 2017 at 5:43 PM, Michael Paquier
>>  wrote:
>>> On Sat, Mar 11, 2017 at 2:53 AM, Jeff Janes  wrote:
 Should the \password tool in psql inspect password_encryption and act on it
 being 'scram'?
>>>
>>> Not sure if it is wise to change the default for this release.
>>
>> Seems like an odd way to phrase it.  Aren't we talking about making a
>> feature that worked in previous releases continue to work?
>
> Considering how fresh SCRAM is, it seems clear to me that we do not
> want to just switch the default value of password_encryption and the
> default behaviors of PQencryptPassword() and \password over to scram,
> but to have something else. Actually, if we change nothing for default
> deployments of Postgres using md5, PQencryptPassword() and \password
> would still work properly.

I'm not talking about changing the default, just having it be possible
to use \password with the new system as it was with the old, whatever
exactly we think that means.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] [PATCH] Transaction traceability - txid_status(bigint)

2017-03-13 Thread Craig Ringer
On 14 March 2017 at 05:43, Robert Haas  wrote:

> For example, suppose vacuum #1 comes along, advances the limits,
> truncates clog, and then gets descheduled.  Now vacuum #2 comes along,
> advances the limits further, and then gets descheduled.  Now vacuum #1
> wakes up and calls SetTransactionIdLimit() and prematurely advances
> xidWrapLimit.  Oops.

Mm, right. And without a lock held from when oldestXid advances
through to completion of clog truncation, then taking the same lock in
SetTransactionIdLimit, there's not really a way around it.

I'm embarrassed not to have seen that.

Doing things the other way around, per the earlier patch, can cause
SetTransactionIdLimit not to advance as far as it should.

OK, I'm convinced, a new field is safer, even if it's redundant most
of the time.

I'll introduce a new LWLock, ClogTruncationLock, which will be held
from when we advance the new clogOldestXid field through to when clog
truncation completes.

Most of the rest can stay the same. In particular, the expanded xlog
record for clog truncation will still be required.


-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Patch: Write Amplification Reduction Method (WARM)

2017-03-13 Thread Alvaro Herrera
> @@ -234,6 +236,21 @@ index_beginscan(Relation heapRelation,
>   scan->heapRelation = heapRelation;
>   scan->xs_snapshot = snapshot;
>  
> + /*
> +  * If the index supports recheck, make sure that index tuple is saved
> +  * during index scans.
> +  *
> +  * XXX Ideally, we should look at all indexes on the table and check if
> +  * WARM is at all supported on the base table. If WARM is not supported
> +  * then we don't need to do any recheck. RelationGetIndexAttrBitmap() 
> does
> +  * do that and sets rd_supportswarm after looking at all indexes. But we
> +  * don't know if the function was called earlier in the session when 
> we're
> +  * here. We can't call it now because there exists a risk of causing
> +  * deadlock.
> +  */
> + if (indexRelation->rd_amroutine->amrecheck)
> + scan->xs_want_itup = true;
> +
>   return scan;
>  }

I didn't like this comment very much.  But it's not necessary: you have
already given relcache responsibility for setting rd_supportswarm.  The
only problem seems to be that you set it in RelationGetIndexAttrBitmap
instead of RelationGetIndexList, but it's not clear to me why.  I think
if the latter function is in charge, then we can trust the flag more
than the current situation.  Let's set the value to false on relcache
entry build, for safety's sake.

I noticed that nbtinsert.c and nbtree.c have a bunch of new includes
that they don't actually need.  Let's remove those.  nbtutils.c does
need them because of btrecheck().  Speaking of which:

I have already commented about the executor involvement in btrecheck();
that doesn't seem good.  I previously suggested to pass the EState down
from caller, but that's not a great idea either since you still need to
do the actual FormIndexDatum.  I now think that a workable option would
be to compute the values/isnulls arrays so that btrecheck gets them
already computed.  With that, this function would be no more of a
modularity violation than HeapSatisfiesHOTAndKey() itself.

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS][REVIEW] macaddr 64 bit (EUI-64) datatype support

2017-03-13 Thread Haribabu Kommi
On Mon, Mar 13, 2017 at 6:52 AM, Stephen Frost  wrote:

> Greetings,
>
> * Stephen Frost (sfr...@snowman.net) wrote:
> > * Stephen Frost (sfr...@snowman.net) wrote:
> > > * Haribabu Kommi (kommi.harib...@gmail.com) wrote:
> > > > On Wed, Feb 1, 2017 at 6:27 AM, Vitaly Burovoy <
> vitaly.buro...@gmail.com> wrote:
> > > > > The new status of this patch is: Ready for Committer
> > > >
> > > > Thanks for the review.
> > >
> > > I've started taking a look at this with an eye towards committing it
> > > soon.
> >
> > I've spent a good bit of time going over this, possibly even more than
> > it was worth, but hopefully we'll see people making use of this data
> > type with PG10 and as more IPv6 deployment happens.
>
> And, naturally, re-reading the email as it hit the list made me realize
> that the documentation/error-message incorrectly said "3rd and 4th"
> bytes were being set to FF and FE, when it's actually the 4th and 5th
> byte.  The code was correct, just the documentation and the error
> message had the wrong numbers.  The commit message is also correct.
>

Thanks for the review and corrections.

I found some small corrections.

s/4rd/4th/g -- Typo correction.

+ Input is accepted in the following formats:

As we support many different input variants and not all combinations are
listed, how about changing the above statement as below:

"The following are some of the input formats that are accepted:"

I didn't find any other problems during testing and review. The patch
is fine.

Regards,
Hari Babu
Fujitsu Australia


Re: [HACKERS] asynchronous execution

2017-03-13 Thread Amit Langote
On 2017/03/14 10:08, Corey Huinker wrote:
>> I don't think the plan itself will change as a result of applying this
>> patch. You might however be able to observe some performance improvement.
> 
> I could see no performance improvement, even with 16 separate queries
> combined with UNION ALL. Query performance was always within +/- 10% of a
> 9.6 instance given the same script. I must be missing something.

Hmm, maybe I'm missing something too.

Anyway, here is an older message on this thread from Horiguchi-san where
he shared some of the test cases that this patch improves performance for:

https://www.postgresql.org/message-id/20161018.103051.30820907.horiguchi.kyotaro%40lab.ntt.co.jp

From that message:


I measured performance and had the following result.

t0  - SELECT sum(a) FROM ;
pl  - SELECT sum(a) FROM <4 local children>;
pf0 - SELECT sum(a) FROM <4 foreign children on single connection>;
pf1 - SELECT sum(a) FROM <4 foreign children on dedicate connections>;

The result is written as "time (std dev )"

sync
  t0: 3820.33 (  1.88)
  pl: 1608.59 ( 12.06)
 pf0: 7928.29 ( 46.58)
 pf1: 8023.16 ( 26.43)

async
  t0: 3806.31 (  4.49)0.4% faster (should be error)
  pl: 1629.17 (  0.29)1.3% slower
 pf0: 6447.07 ( 25.19)   18.7% faster
 pf1: 1876.80 ( 47.13)   76.6% faster


IIUC, pf0 and pf1 are the same test case (all 4 foreign tables target the
same server) measured with different implementations of the patch.

Thanks,
Amit






Re: [HACKERS][REVIEW] macaddr 64 bit (EUI-64) datatype support

2017-03-13 Thread Haribabu Kommi
On Mon, Mar 13, 2017 at 6:38 AM, Stephen Frost  wrote:

> Greetings,
>
> * Stephen Frost (sfr...@snowman.net) wrote:
> > * Haribabu Kommi (kommi.harib...@gmail.com) wrote:
> > > On Wed, Feb 1, 2017 at 6:27 AM, Vitaly Burovoy <
> vitaly.buro...@gmail.com> wrote:
> > > > The new status of this patch is: Ready for Committer
> > >
> > > Thanks for the review.
> >
> > I've started taking a look at this with an eye towards committing it
> > soon.
>
> I've spent a good bit of time going over this, possibly even more than
> it was worth, but hopefully we'll see people making use of this data
> type with PG10 and as more IPv6 deployment happens.
>

Thanks for the review.


> Of particular note, I rewrote macaddr8_in to not use sscanf().
> sscanf(), at least on my system, would accept negative values even for
> '%2x', leading to slightly odd errors with certain inputs, including
> with our existing macaddr type:
>
> =# select '00-203040506'::macaddr;
> ERROR:  invalid octet value in "macaddr" value: "00-203040506"
> LINE 1: select '00-203040506'::macaddr;
>
> With my rewrite, the macaddr8 type will throw a clearer (in my view, at
> least) error:
>
> =# select '00-203040506'::macaddr8;
> ERROR:  invalid input syntax for type macaddr8: "00-203040506"
> LINE 1: select '00-203040506'::macaddr8;
>
> One other point is that the previously disallowed format with just two
> colons ('0800:2b01:0203') is now allowed.  Given that both the two dot
> format ('0800.2b01.0203') and the two dash format ('0800-2b01-0203')
> were accepted, this seemed alright to me.  Is there really a good reason
> to disallow the two colon format?
>

No. There is no special reason to disallow.
The rewrite of macaddr8_in will allow all possible combinations of spacers.


> I also thought about what we expect the usage of macaddr8 to be and
> realized that we should really have a function to help go from EUI-48 to
> the IPv6 Modified EUI-64 format, since users will almost certainly want
> to do exactly that.  As such, I added a macaddr8_set7bit() function
> which will perform the EUI-64 -> Modified EUI-64 change (which is just
> setting the 7th bit) and added associated documentation and reasoning
> for why that function exists.
>

I checked and it is working as expected.

Regards,
Hari Babu
Fujitsu Australia


[HACKERS] Updating the "tool sets" documentation for modern FreeBSD

2017-03-13 Thread Thomas Munro
Hi,

For several operating systems we give handy package manager one-liners
to install all the requirements for building our documentation.  All
current production FreeBSD releases have a friendly new package
manager a bit like apt/yum, so here's a documentation patch to give
the one line command.  Also, a brief note about gmake vs make in the
doc subdir.

-- 
Thomas Munro
http://www.enterprisedb.com


freebsd-pkg-openjade-docs.patch
Description: Binary data



Re: [HACKERS] tuplesort_gettuple_common() and *should_free argument

2017-03-13 Thread Peter Geoghegan
On Wed, Jan 25, 2017 at 3:11 PM, Peter Geoghegan  wrote:
> On Wed, Jan 25, 2017 at 3:11 PM, Tom Lane  wrote:
>> Please.  You might want to hit the existing ones with a separate patch,
>> but it doesn't much matter; I'd be just as happy with a patch that did
>> both things.
>
> Got it.

Attached is a patch that does both things at once. The internal
function tuplesort_gettuple_common() still mentions repeated fetches,
since that context matters to its internal callers. The various
external-facing functions have a simpler, stricter contract, as
discussed.

I didn't end up adding a line like "copy=FALSE is recommended only
when the next tuplesort manipulation will be another
tuplesort_gettupleslot fetch into the same slot", which you suggested.
When the tuplesort state machine is in any state following
"performing" a sort, there are very few remaining sane manipulations.
Just things like skipping or seeking around for tuples, and actually
ending the tuplesort and releasing resources.

-- 
Peter Geoghegan
From 5351b5db257cb39832647d9117465c0217e6268b Mon Sep 17 00:00:00 2001
From: Peter Geoghegan 
Date: Thu, 13 Oct 2016 10:54:31 -0700
Subject: [PATCH 1/2] Avoid copying within tuplesort_gettupleslot().

Add a "copy" argument to make it optional to receive a copy of caller
tuple that is safe to use following a subsequent manipulating of
tuplesort's state.  This is a performance optimization.  Most existing
tuplesort_gettupleslot() callers are made to opt out of copying.
Existing callers that happen to rely on the validity of tuple memory
beyond subsequent manipulations of the tuplesort request their own copy.

This brings tuplesort_gettupleslot() in line with
tuplestore_gettupleslot().  In the future, a "copy" tuplesort_getdatum()
argument may be added, that similarly allows callers to opt out of
receiving their own copy of tuple.

In passing, clarify assumptions that callers of other tuplesort fetch
routines may make about tuple memory validity, per gripe from Tom Lane.
---
 src/backend/executor/nodeAgg.c | 10 +++---
 src/backend/executor/nodeSort.c|  5 +++--
 src/backend/utils/adt/orderedsetaggs.c |  5 +++--
 src/backend/utils/sort/tuplesort.c | 28 +---
 src/include/utils/tuplesort.h  |  2 +-
 5 files changed, 31 insertions(+), 19 deletions(-)

diff --git a/src/backend/executor/nodeAgg.c b/src/backend/executor/nodeAgg.c
index aa08152..b650f71 100644
--- a/src/backend/executor/nodeAgg.c
+++ b/src/backend/executor/nodeAgg.c
@@ -570,6 +570,10 @@ initialize_phase(AggState *aggstate, int newphase)
  * Fetch a tuple from either the outer plan (for phase 0) or from the sorter
  * populated by the previous phase.  Copy it to the sorter for the next phase
  * if any.
+ *
+ * Callers cannot rely on memory for tuple in returned slot remaining valid
+ * past any subsequent manipulation of the sorter, such as another fetch of
+ * input from sorter.  (The sorter may recycle memory.)
  */
 static TupleTableSlot *
 fetch_input_tuple(AggState *aggstate)
@@ -578,8 +582,8 @@ fetch_input_tuple(AggState *aggstate)
 
 	if (aggstate->sort_in)
 	{
-		if (!tuplesort_gettupleslot(aggstate->sort_in, true, aggstate->sort_slot,
-	NULL))
+		if (!tuplesort_gettupleslot(aggstate->sort_in, true, false,
+	aggstate->sort_slot, NULL))
 			return NULL;
 		slot = aggstate->sort_slot;
 	}
@@ -1272,7 +1276,7 @@ process_ordered_aggregate_multi(AggState *aggstate,
 		ExecClearTuple(slot2);
 
 	while (tuplesort_gettupleslot(pertrans->sortstates[aggstate->current_set],
-  true, slot1, ))
+  true, true, slot1, ))
 	{
 		/*
 		 * Extract the first numTransInputs columns as datums to pass to the
diff --git a/src/backend/executor/nodeSort.c b/src/backend/executor/nodeSort.c
index 591a31a..d901034 100644
--- a/src/backend/executor/nodeSort.c
+++ b/src/backend/executor/nodeSort.c
@@ -132,12 +132,13 @@ ExecSort(SortState *node)
 
 	/*
 	 * Get the first or next tuple from tuplesort. Returns NULL if no more
-	 * tuples.
+	 * tuples. Note that we rely on memory for tuple in slot remaining valid
+	 * only until some subsequent manipulation of tuplesort.
 	 */
 	slot = node->ss.ps.ps_ResultTupleSlot;
 	(void) tuplesort_gettupleslot(tuplesortstate,
   ScanDirectionIsForward(dir),
-  slot, NULL);
+  false, slot, NULL);
 	return slot;
 }
 
diff --git a/src/backend/utils/adt/orderedsetaggs.c b/src/backend/utils/adt/orderedsetaggs.c
index e462fbd..8502fcf 100644
--- a/src/backend/utils/adt/orderedsetaggs.c
+++ b/src/backend/utils/adt/orderedsetaggs.c
@@ -1190,7 +1190,7 @@ hypothetical_rank_common(FunctionCallInfo fcinfo, int flag,
 	tuplesort_performsort(osastate->sortstate);
 
 	/* iterate till we find the hypothetical row */
-	while (tuplesort_gettupleslot(osastate->sortstate, true, slot, NULL))
+	while (tuplesort_gettupleslot(osastate->sortstate, true, true, slot, NULL))
 	{
 		bool		isnull;
 		Datum		d = 

Re: [HACKERS] WIP: Faster Expression Processing v4

2017-03-13 Thread Andres Freund
Hi,

On 2017-03-13 01:03:51 -0700, Andres Freund wrote:
> What's basically missing here is:
> - pgindent run and minimizing resulting damage

Running into a bit of an issue here - pgindent mangles something like

EEO_SWITCH (op->opcode)
{
EEO_CASE(EEO_DONE):
goto out;

EEO_CASE(EEO_INNER_FETCHSOME):
/* XXX: worthwhile to check tts_nvalid inline first? */
slot_getsomeattrs(innerslot, op->d.fetch.last_var);
EEO_DISPATCH(op);

EEO_CASE(EEO_OUTER_FETCHSOME):
slot_getsomeattrs(outerslot, op->d.fetch.last_var);
EEO_DISPATCH(op);

EEO_CASE(EEO_SCAN_FETCHSOME):
slot_getsomeattrs(scanslot, op->d.fetch.last_var);
EEO_DISPATCH(op);

EEO_CASE(EEO_INNER_VAR):
{
int attnum = op->d.var.attnum;

/*
 * Can't assert tts_nvalid, as wholerow var evaluation or such
 * could have materialized the slot - but the contents are
 * still valid :/
 */
Assert(op->d.var.attnum >= 0);
*op->resnull = innerslot->tts_isnull[attnum];
*op->resvalue = innerslot->tts_values[attnum];
EEO_DISPATCH(op);
}

into

EEO_SWITCH(op->opcode)
{
EEO_CASE(EEO_DONE):
goto out;

EEO_CASE(EEO_INNER_FETCHSOME):
/* XXX: worthwhile to check tts_nvalid inline first? */
slot_getsomeattrs(innerslot, op->d.fetch.last_var);
EEO_DISPATCH(op);

EEO_CASE(EEO_OUTER_FETCHSOME):
slot_getsomeattrs(outerslot, op->d.fetch.last_var);
EEO_DISPATCH(op);

EEO_CASE(EEO_SCAN_FETCHSOME):
slot_getsomeattrs(scanslot, op->d.fetch.last_var);
EEO_DISPATCH(op);

EEO_CASE(EEO_INNER_VAR):
{
int attnum = op->d.var.attnum;

/*
 * Can't assert tts_nvalid, as wholerow var evaluation or such
 * could have materialized the slot - but the contents are still
 * valid :/
 */
Assert(op->d.var.attnum >= 0);
*op->resnull = innerslot->tts_isnull[attnum];
*op->resvalue = innerslot->tts_values[attnum];
EEO_DISPATCH(op);
}


which is a bit annoying.  (the EEO_CASE is either a jump label or a case
statement, depending on computed goto availability).

It seems we could either:
1) live with the damage
2) disable pgindent
3) move the : inside EEO_CASE's definition, and only use {} blocks.

I'm inclined to go for 3).

Opinions?

- Andres


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] asynchronous execution

2017-03-13 Thread Corey Huinker
>
> I don't think the plan itself will change as a result of applying this
> patch. You might however be able to observe some performance improvement.
>
> Thanks,
> Amit
>

I could see no performance improvement, even with 16 separate queries
combined with UNION ALL. Query performance was always within +/- 10% of a 9.6
instance given the same script. I must be missing something.


Re: [HACKERS] scram and \password

2017-03-13 Thread Michael Paquier
On Tue, Mar 14, 2017 at 9:54 AM, Joe Conway  wrote:
> On 03/12/2017 08:10 PM, Michael Paquier wrote:
>> OK, so what about doing the following then:
>> 1) Create pg_role_password_type(oid), taking a role OID in input and
>> returning the type.
>
> That version would make sense for administrative use. I still think we
> might want a version of this that takes no argument, works on the
> current_user, and is executable by anyone.

Sure.

>> 2) Extend PQencryptPassword with a method, and document. Plaintext is 
>> forbidden.
>
> Check, although if "plain" were allowed as a method for the sake of
> consistency/completeness the function could just immediately return the
> argument.
>
>> 3) Extend \password in psql with a similar -method=scram|md5 argument,
>> and forbid the use of "plain" format.
>
> Not sure why we would forbid "plain" here unless we remove it entirely
> elsewhere.

OK. I don't mind about those two, as long as documentation is clear
enough about what using plain leads to.

>> After thinking about it, extending PQencryptPassword() would lead to
>> future breakage, which makes it clear to not fall into the trap of
>> having password_encryption set to scram in the server's as well as in
>> pg_hba.conf and PQencryptPassword() enforcing things to md5.
>
> I'm not grokking this statement

To grok: "To understand profoundly through intuition or empathy.".
Learning a new thing every day.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Partition-wise join for join between (declaratively) partitioned tables

2017-03-13 Thread Amit Langote
On 2017/03/14 9:17, Robert Haas wrote:
> On Mon, Mar 13, 2017 at 3:24 PM, Robert Haas  wrote:
>> Haven't looked at 0007 yet.
> 
> Overall this one looks pretty good and straightforward.

In the following code of find_partition_scheme():

+   /* Did not find matching partition scheme. Create one. */
+   part_scheme = (PartitionScheme) palloc0(sizeof(PartitionSchemeData));
+
+   /* Copy partition bounds/lists. */
+   part_scheme->nparts = part_desc->nparts;
+   part_scheme->strategy = part_key->strategy;
+   part_scheme->boundinfo = part_desc->boundinfo;
+
+   /* Store partition key information. */
+   part_scheme->partnatts = part_key->partnatts;
+
+   part_scheme->partopfamily = (Oid *) palloc(sizeof(Oid) * partnatts);
+   memcpy(part_scheme->partopfamily, part_key->partopfamily,
+  sizeof(Oid) * partnatts);
+
+   part_scheme->partopcintype = (Oid *) palloc(sizeof(Oid) * partnatts);
+   memcpy(part_scheme->partopcintype, part_key->partopcintype,
+  sizeof(Oid) * partnatts);
+
+   part_scheme->key_types = (Oid *) palloc(sizeof(Oid) * partnatts);
+   memcpy(part_scheme->key_types, part_key->parttypid,
+  sizeof(Oid) * partnatts);
+
+   part_scheme->key_typmods = (int32 *) palloc(sizeof(int32) * partnatts);
+   memcpy(part_scheme->key_typmods, part_key->parttypmod,
+  sizeof(int32) * partnatts);
+
+   part_scheme->key_collations = (Oid *) palloc(sizeof(Oid) * partnatts);
+   memcpy(part_scheme->key_collations, part_key->parttypcoll,
+  sizeof(Oid) * partnatts);

Couldn't we avoid the memcpy() on individual members of part_key?  After
all, RelationData.rd_partkey is guarded just like rd_partdesc by
relcache.c in face of invalidations (see keep_partkey logic in
RelationClearRelation).

Thanks,
Amit




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] scram and \password

2017-03-13 Thread Joe Conway
On 03/12/2017 08:10 PM, Michael Paquier wrote:
> OK, so what about doing the following then:
> 1) Create pg_role_password_type(oid), taking a role OID in input and
> returning the type.

That version would make sense for administrative use. I still think we
might want a version of this that takes no argument, works on the
current_user, and is executable by anyone.

> 2) Extend PQencryptPassword with a method, and document. Plaintext is 
> forbidden.

Check, although if "plain" were allowed as a method for the sake of
consistency/completeness the function could just immediately return the
argument.

> 3) Extend \password in psql with a similar -method=scram|md5 argument,
> and forbid the use of "plain" format.

Not sure why we would forbid "plain" here unless we remove it entirely
elsewhere.

> After thinking about it, extending PQencryptPassword() would lead to
> future breakage, which makes it clear to not fall into the trap of
> having password_encryption set to scram in the server's as well as in
> pg_hba.conf and PQencryptPassword() enforcing things to md5.

I'm not grokking this statement

Joe

-- 
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development



signature.asc
Description: OpenPGP digital signature


Re: [HACKERS] Should buffer of initialization fork have a BM_PERMANENT flag

2017-03-13 Thread Michael Paquier
On Tue, Mar 14, 2017 at 4:46 AM, Robert Haas  wrote:
> On Wed, Jan 25, 2017 at 7:14 PM, Michael Paquier
>  wrote:
>> (Adding Robert in CC.)
>>
>> On Thu, Jan 26, 2017 at 4:34 AM, Wang Hao  wrote:
>>> An unlogged table has an initialization fork. The initialization fork does
>>> does not get a BM_PERMANENT flag when a buffer is acquired for it.
>>> During a checkpoint (not a shutdown or end-of-recovery checkpoint), it will
>>> not be written to disk. After crash recovery, the page of the initialization
>>> fork will not be correct, which then makes the main fork incorrect too.
>>
>> For init forks the flush need absolutely to happen, so that's really
>> not good. We ought to fix BufferAlloc() appropriately here.
>
> I agree with that, but I propose the attached version instead.  It
> seems cleaner to have the entire test for setting BM_PERMANENT in one
> place rather than splitting it up as you did.

Fine for me. You may want to update the comment of BM_PERMANENT in
buf_internals.h as Artur has mentioned upthread. For example by just
adding "and init forks".

> I believe this sets a record for the longest-lived data corruption bug
> in a commit made by me.

Really? I'll need to double-check the git history here.

> Six years and change, woohoo.  :-(

And that much for someone to report it.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] tuplesort_gettuple_common() and *should_free argument

2017-03-13 Thread Peter Geoghegan
On Wed, Jan 25, 2017 at 3:16 PM, Tom Lane  wrote:
> Um, I didn't find it all that self-explanatory.  Why wouldn't we want
> to avoid writing undefined data?

For roughly the same reason we'd want to avoid it in existing cases
that are next to the proposed new suppression. We happen to not need
to initialize the data, but the interface we're using still requires
that we write at a coarser granularity than bytes. logtape.c always
writes whole blocks at a time.

> Also, the suppression seems far too broad.  It would for instance
> block any complaint about a write() invoked via an elog call from
> any function invoked from any LogicalTape* function, no matter
> how far removed.

Writing blocks doesn't just happen in what tuplesort.c or even
logtape.c would consider to be a write path -- it happens when
logtape.c has a dirty buffer that it cleans. Plus you have double
buffering here -- buffile.c manages its own BLCKSZ buffer at the end
of the BufFile struct.

IIRC the reason I did things this way is because both
LogicalTapeRead() and LogicalTapeWrite() both appeared in offending
stack traces, and ltsWriteBlock() would not have plugged that, because
ltsReadBlock() could be involved instead. I don't remember the exact
details offhand, so I will have to look into it again.
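
For reference, the suppressions in question use valgrind's suppression-file syntax; a sketch of the broad vs. narrow scoping being debated (suppression names are invented here, and the frame patterns are only illustrative):

```
# Broad: silences any write() reached with some LogicalTape* function
# anywhere on the stack -- the over-broad shape objected to above.
{
	padded_logtape_write_broad
	Memcheck:Param
	write(buf)
	...
	fun:LogicalTape*
}

# Narrow: pinned to ltsWriteBlock only.  This can miss blocks that are
# actually flushed later from buffile.c's own buffer, where the read
# path (ltsReadBlock) rather than ltsWriteBlock is on the stack.
{
	padded_logtape_write_narrow
	Memcheck:Param
	write(buf)
	...
	fun:ltsWriteBlock
}
```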

-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] \h tab-completion

2017-03-13 Thread Andreas Karlsson

On 03/13/2017 03:56 PM, David Steele wrote:

Do you know when you will have a new patch available for review that
incorporates Peter's request?


I believe I will find the time to finish it some time in a couple of days.

Andreas



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] asynchronous execution

2017-03-13 Thread Amit Langote
On 2017/03/14 6:31, Corey Huinker wrote:
> On Mon, Mar 13, 2017 at 1:06 AM, Corey Huinker 
> wrote:
> 
>>
>>> I think it will, because Append itself has been made async-capable by one
>>> of the patches and UNION ALL uses Append.  But as mentioned above, only
>>> the postgres_fdw foreign tables will be able to utilize this for now.
>>>
>>>
>> Ok, I'll re-run my test from a few weeks back and see if anything has
>> changed.
>>
> 
> 
> I'm not able to discern any difference in plan between a 9.6 instance and
> this patch.
> 
> The basic outline of my test is:
> 
> EXPLAIN ANALYZE
> SELECT c1, c2, ..., cN FROM tab1 WHERE date = '1 day ago'
> UNION ALL
> SELECT c1, c2, ..., cN FROM tab2 WHERE date = '2 days ago'
> UNION ALL
> SELECT c1, c2, ..., cN FROM tab3 WHERE date = '3 days ago'
> UNION ALL
> SELECT c1, c2, ..., cN FROM tab4 WHERE date = '4 days ago'
> 
> 
> I've tried this test where tab1 through tab4 all are the same postgres_fdw
> foreign table.
> I've tried this test where tab1 through tab4 all are different foreign
> tables pointing to the same remote table sharing a the same server
> definition.
> I've tried this test where tab1 through tab4 all are different foreign
> tables pointing each with it's own foreign server definition, all of which
> happen to point to the same remote cluster.
> 
> Are there some postgresql.conf settings I should set to get a decent test?

I don't think the plan itself will change as a result of applying this
patch. You might however be able to observe some performance improvement.

Thanks,
Amit




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Partition-wise join for join between (declaratively) partitioned tables

2017-03-13 Thread Robert Haas
On Mon, Mar 13, 2017 at 3:24 PM, Robert Haas  wrote:
> Haven't looked at 0007 yet.

+   if (rel->part_scheme)
+   {
+   int cnt_parts;
+
+   for (cnt_parts = 0; cnt_parts < nparts; cnt_parts++)
+   {
+   if (rel->part_oids[cnt_parts] ==
childRTE->relid)
+   {
+   Assert(!rel->part_rels[cnt_parts]);
+   rel->part_rels[cnt_parts] = childrel;
+   }
+   }
+   }

It's not very appealing to use an O(n^2) algorithm here.  I wonder if
we could arrange things so that inheritance expansion expands
partitions in the right order, and then we could just match them up
one-to-one.  This would probably require an alternate version of
find_all_inheritors() that expand_inherited_rtentry() would call only
for partitioned tables.  Failing that, another idea would be to use
qsort() or qsort_arg() to put the partitions in the right order.

+   if (relation->rd_rel->relkind != RELKIND_PARTITIONED_TABLE ||
+   !inhparent ||
+   !(rel->part_scheme = find_partition_scheme(root, relation)))

Maybe just don't call this function in the first place in the
!inhparent case, instead of passing down an argument that must always
be true.

+   /* Match the partition key types. */
+   for (cnt_pks = 0; cnt_pks < partnatts; cnt_pks++)
+   {
+   /*
+* For types, it suffices to match the type
id, mod and collation;
+* len, byval and align are dependent on the first two.
+*/
+   if (part_key->partopfamily[cnt_pks] !=
part_scheme->partopfamily[cnt_pks] ||
+   part_key->partopcintype[cnt_pks] !=
part_scheme->partopcintype[cnt_pks] ||
+   part_key->parttypid[cnt_pks] !=
part_scheme->key_types[cnt_pks] ||
+   part_key->parttypmod[cnt_pks] !=
part_scheme->key_typmods[cnt_pks] ||
+   part_key->parttypcoll[cnt_pks] !=
part_scheme->key_collations[cnt_pks])
+   break;
+   }

I think memcmp() might be better than a for-loop.

Overall this one looks pretty good and straightforward.  Of course, I
haven't looked at the main act (0009) yet.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Radix tree for character conversion

2017-03-13 Thread Michael Paquier
On Tue, Mar 14, 2017 at 4:07 AM, Heikki Linnakangas  wrote:
> On 03/13/2017 08:53 PM, Tom Lane wrote:
>> Heikki Linnakangas  writes:
>>>
>>> It would be nice to run the map_checker tool one more time, though, to
>>> verify that the mappings match those from PostgreSQL 9.6.
>>
>> +1

Nice to login and see that committed!
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Measuring replay lag

2017-03-13 Thread Thomas Munro
Hi,

Please see separate replies to Simon and Craig below.

On Sun, Mar 5, 2017 at 8:38 PM, Simon Riggs  wrote:
> On 1 March 2017 at 10:47, Thomas Munro  wrote:
>> I do see why a new user trying this feature for the first time might
>> expect it to show a lag of 0 just as soon as sent LSN =
>> write/flush/apply LSN or something like that, but after some
>> reflection I suspect that it isn't useful information, and it would be
>> smoke and mirrors rather than real data.
>
> Perhaps I am misunderstanding the way it works.
>
> If the last time WAL was generated the lag was 14 secs, then nothing
> occurs for 2 hours aftwards AND all changes have been successfully
> applied then it should not continue to show 14 secs for the next 2
> hours.
>
> IMHO the lag time should drop to zero in a reasonable time and stay at
> zero for those 2 hours because there is no current lag.
>
> If we want to show historical lag data, I'm supportive of the idea,
> but we must report an accurate current value when the system is busy
> and when the system is quiet.

Ok, I thought about this for a bit and have a new idea that I hope
will be more acceptable.  Here are the approaches considered:

1.  Show the last measured lag times on a completely idle system until
such time as the standby eventually processes more lag, as I had it in
the v5 patch.  You don't like that and I admit that it is not really
satisfying (even though I know that real Postgres systems always
generate more WAL fairly soon even without user sessions, it's not
great that it depends on an unknown future event to clear the old
data).

2.  Recognise when the last reported write/flush/apply LSN from the
standby == end of WAL on the sending server, and show lag times of
00:00:00 in all three columns.  I consider this entirely bogus: it's
not an actual measurement that was ever made, and on an active system
it would flip-flop between real measurements and the artificial
00:00:00.  I do not like this.

3.  Recognise the end of WAL as above, but show SQL NULL in the
columns.  Now we don't show an entirely bogus number like 00:00:00 but
we still have the flickering/flip-flopping between nothing and a
measured number during typical use (ie during short periods of
idleness between writes).  I do not like this.

4.  Somehow attempt to measure the round trip time for a keep-alive
message or similar during idle periods.  This means that we would be
taking something that reflects one component of the usual lag
measurements, namely network transfer, but I think we would making
false claims when we show that in columns that report measured write
time, flush time and apply time.  I do not like this.

5.  The new proposal:  Show only true measured write/flush/apply data,
as in 1, but with a time limit.  To avoid the scenario where we show
the same times during prolonged periods of idleness, clear the numbers
like in option 3 after a period of idleness.  This way we avoid the
dreaded flickering/flip-flopping.  A natural time to do that is when
wal_receiver_status_interval expires on idle systems and defaults to
10 seconds.

Done using approach 5 in the attached version.  Do you think this is a
good compromise?  No bogus numbers, only true measured
write/flush/apply times, but a time limit on 'stale' lag information.

On Mon, Mar 6, 2017 at 3:22 AM, Craig Ringer  wrote:
> On 5 March 2017 at 15:31, Simon Riggs  wrote:
>> What we want from this patch is something that works for both, as much
>> as that is possible.
>
> If it shows a sawtooth pattern for flush lag, that's good, because
> it's truthful. We can only flush after we replay commit, therefore lag
> is always going to be sawtooth, with tooth size approximating xact
> size and the baseline lag trend representing any sustained increase or
> decrease in lag over time.
>
> This would be extremely valuable to have.

Thanks for your detailed explanation Craig.  (I also had a chat with
Craig about this off-list.)  Based on your feedback, I've added
support for reporting lag from logical replication, warts and all.

Just a thought:  perhaps logical replication could consider
occasionally reporting a 'write' position based on decoded WAL written
to reorder buffers (rather than just reporting the apply LSN as write
LSN at commit time)?  I think that would be interesting information in
its own right, but would also provide more opportunities to
interpolate the flush/apply sawtooth for large transactions.

Please find a new version attached.

-- 
Thomas Munro
http://www.enterprisedb.com


replication-lag-v6.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Need a builtin way to run all tests faster manner

2017-03-13 Thread Andrew Dunstan


On 03/13/2017 07:21 PM, Andrew Dunstan wrote:
>
> On 03/13/2017 12:35 AM, Tom Lane wrote:
>> Andres Freund  writes:
>>> On 2017-03-11 22:14:07 -0500, Tom Lane wrote:
 This looks generally sane to me, although I'm not very happy about folding
 the "$(MKDIR_P) output_iso" call into pg_isolation_regress_check --- that
 seems weird and unlike the way it's done for the regular regression test
 case.
>>> Yea, not super happy about that either - alternatively we could fold it
>>> into pg_regress.
>> Yeah, teaching pg_regress to auto-create the --temp-instance directory
>> seems perfectly sane from here.
>
> w.r.t. $subject, I thought it might be useful to get some recent stats
> from the buildfarm. Results are below. The bin checks dwarf everything
> else. Upgrade checks and isolation check are other items of significant
> cost. Upgrade checks could be significantly shortened if we could avoid
> rerunning the regression tests.


Sorry for line wrapping. Better formatted here:


cheers

andrew

-- 
Andrew Dunstan                https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Need a builtin way to run all tests faster manner

2017-03-13 Thread Andrew Dunstan


On 03/13/2017 12:35 AM, Tom Lane wrote:
> Andres Freund  writes:
>> On 2017-03-11 22:14:07 -0500, Tom Lane wrote:
>>> This looks generally sane to me, although I'm not very happy about folding
>>> the "$(MKDIR_P) output_iso" call into pg_isolation_regress_check --- that
>>> seems weird and unlike the way it's done for the regular regression test
>>> case.
>> Yea, not super happy about that either - alternatively we could fold it
>> into pg_regress.
> Yeah, teaching pg_regress to auto-create the --temp-instance directory
> seems perfectly sane from here.


w.r.t. $subject, I thought it might be useful to get some recent stats
from the buildfarm. Results are below. The bin checks dwarf everything
else. Upgrade checks and isolation check are other items of significant
cost. Upgrade checks could be significantly shortened if we could avoid
rerunning the regression tests.

cheers

andrew


pgbfprod=> select s.branch, s.log_stage, count(*),
       avg(extract(epoch from stage_duration)::numeric(15,2))::numeric(15,1),
       stddev(extract(epoch from stage_duration)::numeric(15,2))::numeric(15,1)
   from public.build_status_log s
   where sysname <> 'crake'
     and snapshot > now() - interval '10 days'
     and log_stage !~ 'start|stop'
   group by s.branch, s.log_stage
   having count(*) > 20
      and avg(extract(epoch from stage_duration)::numeric(15,2)) > 20
   order by log_stage, branch <> 'HEAD', branch desc;
    branch     |                 log_stage                 | count |  avg   | stddev
---------------+-------------------------------------------+-------+--------+--------
 HEAD          | bin-check.log                             |   388 | 1739.0 | 1758.5
 REL9_6_STABLE | bin-check.log                             |    91 | 1430.5 | 1287.9
 REL9_5_STABLE | bin-check.log                             |    87 | 1140.0 |  994.1
 REL9_4_STABLE | bin-check.log                             |    72 |  751.0 |  666.8
 HEAD          | check.log                                 |  2305 |  263.1 | 1197.0
 REL9_6_STABLE | check.log                                 |   610 |  294.7 | 1369.6
 REL9_5_STABLE | check.log                                 |   627 |  170.1 |  819.6
 REL9_4_STABLE | check.log                                 |   512 |  140.4 |  535.3
 REL9_3_STABLE | check.log                                 |   449 |  112.0 |  446.0
 REL9_2_STABLE | check.log                                 |   406 |  109.2 |  380.9
 HEAD          | check-pg_upgrade.log                      |  1785 |  319.4 | 1310.5
 REL9_6_STABLE | check-pg_upgrade.log                      |   482 |  571.3 | 2811.0
 REL9_5_STABLE | check-pg_upgrade.log                      |   484 |  350.5 | 2160.3
 REL9_4_STABLE | check-pg_upgrade.log                      |   385 |  240.8 | 1278.9
 REL9_3_STABLE | check-pg_upgrade.log                      |   353 |  214.0 | 1188.3
 REL9_2_STABLE | check-pg_upgrade.log                      |   314 |  195.6 | 1016.6
 HEAD          | config.log                                |  2216 |   84.5 |  101.5
 REL9_6_STABLE | config.log                                |   576 |   90.0 |   90.8
 REL9_5_STABLE | config.log                                |   584 |  114.0 |  358.8
 REL9_4_STABLE | config.log                                |   495 |   84.5 |   85.3
 REL9_3_STABLE | config.log                                |   431 |   97.9 |  100.7
 REL9_2_STABLE | config.log                                |   391 |   93.1 |   94.6
 HEAD          | contrib-install-check-C.log               |  2250 |  122.9 |  474.5
 REL9_6_STABLE | contrib-install-check-C.log               |   606 |  124.6 |  410.3
 REL9_5_STABLE | contrib-install-check-C.log               |   622 |   84.7 |  348.8
 REL9_4_STABLE | contrib-install-check-C.log               |   508 |  105.9 |  434.1
 REL9_3_STABLE | contrib-install-check-C.log               |   445 |   61.8 |  273.4
 REL9_2_STABLE | contrib-install-check-C.log               |   403 |   54.3 |  205.7
 HEAD          | contrib-install-check-cs_CZ.UTF-8.log     |   184 |   25.7 |   11.8
 REL9_6_STABLE | contrib-install-check-cs_CZ.UTF-8.log     |    46 |   25.7 |   14.0
 REL9_5_STABLE | contrib-install-check-cs_CZ.UTF-8.log     |    42 |   23.8 |   16.6
 REL9_4_STABLE | contrib-install-check-cs_CZ.UTF-8.log     |    36 |   22.9 |   12.4
 HEAD          | contrib-install-check-en_US.8859-15.log   |    37 |  173.9 |   32.3
 HEAD          | contrib-install-check-en_US.ISO8859-1.log |    33 |  244.7 |   35.7
 HEAD          | contrib-install-check-en_US.log           |   171 |   65.9 |  101.6
 REL9_6_STABLE | contrib-install-check-en_US.log           |    42 |   61.7 |   89.3
 REL9_5_STABLE | contrib-install-check-en_US.log           |    37 |   54.3 |   79.3
 REL9_4_STABLE | contrib-install-check-en_US.log           |    33 |   53.7 |   71.5
 REL9_3_STABLE | 

Re: [HACKERS] bytea_output vs make installcheck

2017-03-13 Thread Tom Lane
Robert Haas  writes:
> On Thu, Mar 9, 2017 at 6:45 PM, Neha Khatri  wrote:
>> With this, if an installcheck is done, that might also have been done with
>> the expectation that the output will be in 'escape' format. In that case,
>> how much is it justified to hard code the format for regression database?
>> However, I agree that there are not many bytea outputs in the current
>> regression suite

> I don't understand this.  People don't run the regression tests to get
> the output.  They run the regression tests to see whether they pass.
> While it may not be possible to make them pass with arbitrarily-crazy
> settings, that's not a reason not to patch up the cases we can handle
> sanely.

I think the question that has to be settled to move this forward is
whether we're content with hard-wiring something for bytea_output
(as in Jeff's installcheck_bytea_fix_2.patch, which I concur with
Peter is superior to installcheck_bytea_fix_1.patch), or whether
we want to hold out for a more flexible solution, probably about like
what I sketched in
https://www.postgresql.org/message-id/30246.1487202663%40sss.pgh.pa.us

I think the more flexible solution is definitely a desirable place to
get to, but somehow I doubt it's going to happen for v10.  Meanwhile
the question is whether adding more hard-wired behavior in pg_regress
is desirable or not.

I tend to vote with Andres that it's not worth the trouble, but
considering that it's only a 2-line change, I won't stand in the
way if some other committer is convinced this is an improvement.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Performance improvement for joins where outer side is unique

2017-03-13 Thread David Rowley
On 14 March 2017 at 07:50, Tom Lane  wrote:

> [ getting back to this patch finally... ]
>
> David Rowley  writes:
> > I've attached a patch which implements this, though only for
> > MergeJoin, else I'd imagine we'd also need to ensure all proofs used
> > for testing the uniqueness were also hash-able too. I added some XXX
> > comments in analyzejoin.c around the mergeopfamilies == NIL tests to
> > mention that Merge Join depends on all the unique proof quals having
> > mergeopfamilies.  This also assumes we'll never use some subset of
> > mergejoin-able quals for a merge join, which could be an interesting
> > area in the future, as we might have some btree index on a subset of
> > those columns to provide pre-sorted input. In short, it's a great
> > optimisation, but I'm a little scared we might break it one day.
>
> Umm ... it's broken already isn't it?  See the comment for
> generate_mergejoin_paths:
>
>  * We generate mergejoins if mergejoin clauses are available.  We have
>  * two ways to generate the inner path for a mergejoin: sort the cheapest
>  * inner path, or use an inner path that is already suitably ordered for
> the
>  * merge.  If we have several mergeclauses, it could be that there is no
> inner
>  * path (or only a very expensive one) for the full list of mergeclauses,
> but
>  * better paths exist if we truncate the mergeclause list (thereby
> discarding
>  * some sort key requirements).  So, we consider truncations of the
>  * mergeclause list as well as the full list.  (Ideally we'd consider all
>  * subsets of the mergeclause list, but that seems way too expensive.)
>
> There's another, more subtle, way in which it could fail in
> sort_inner_and_outer():
>
>  * Each possible ordering of the available mergejoin clauses will generate
>  * a differently-sorted result path at essentially the same cost.  We have
>  * no basis for choosing one over another at this level of joining, but
>  * some sort orders may be more useful than others for higher-level
>  * mergejoins, so it's worth considering multiple orderings.
>  *
>  * Actually, it's not quite true that every mergeclause ordering will
>  * generate a different path order, because some of the clauses may be
>  * partially redundant (refer to the same EquivalenceClasses).  Therefore,
>  * what we do is convert the mergeclause list to a list of canonical
>  * pathkeys, and then consider different orderings of the pathkeys.
>
> I'm fairly sure that it's possible to end up with fewer pathkeys than
> there are mergeclauses in this code path.  Now, you might be all right
> anyway given that the mergeclauses must refer to the same ECs in such a
> case --- maybe they're fully redundant and we can take testing the
> included clause as proving the omitted one(s) too.  I'm not certain
> right now what I meant by "partially redundant" in this comment.
> But in any case, it's moot for the present purpose because
> generate_mergejoin_paths certainly breaks your assumption.
>

Thanks for looking at this again.

Yeah confirmed. It's broken. I guess I just need to remember in the Path if
we got all the join quals, although I've not looked in detail what the best
fix is. I imagine it'll require storing something else in the JoinPath.

Here's my test case:

select 'create tablespace slow_disk LOCATION ''' ||
current_setting('data_directory') || ''' WITH (random_page_cost=1000);';
\gexec
create table ab (a int not null, b int not null);
create unique index ab_pkey on ab (a,b) tablespace slow_disk;
alter table ab add constraint ab_pkey primary key using index ab_pkey;

create index ab_a_idx on ab (a);

insert into ab values(1,1);
insert into ab values(1,2);

set enable_nestloop to 0;
set enable_hashjoin to 0;
set enable_sort to 0;

explain verbose select * from ab ab1 inner join ab ab2 on ab1.a = ab2.a and
ab1.b = ab2.b;
                                 QUERY PLAN
-----------------------------------------------------------------------------
 Merge Join  (cost=0.26..24.35 rows=1 width=16)
   Output: ab1.a, ab1.b, ab2.a, ab2.b
   Inner Unique: Yes
   Merge Cond: (ab1.a = ab2.a)
   Join Filter: (ab1.b = ab2.b)
   ->  Index Scan using ab_a_idx on public.ab ab1  (cost=0.13..12.16 rows=2
width=8)
 Output: ab1.a, ab1.b
   ->  Index Scan using ab_a_idx on public.ab ab2  (cost=0.13..12.16 rows=2
width=8)
 Output: ab2.a, ab2.b
(9 rows)

select * from ab ab1 inner join ab ab2 on ab1.a = ab2.a and ab1.b = ab2.b;
-- wrong results
 a | b | a | b
---+---+---+---
 1 | 2 | 1 | 2
(1 row)

drop index ab_a_idx;

-- same query again. This time it'll use the PK index on the slow_disk
tablespace
select * from ab ab1 inner join ab ab2 on ab1.a = ab2.a and ab1.b = ab2.b;
 a | b | a | b
---+---+---+---
 1 | 1 | 1 | 1
 1 | 2 | 1 | 2
(2 rows)

I had to fix up some conflicts before testing this again on a recent
master, so I've attached my rebased version. There is no fix yet for the
above issue.


-- 
 David 

Re: [HACKERS] tuplesort_gettuple_common() and *should_free argument

2017-03-13 Thread Peter Geoghegan
On Mon, Mar 13, 2017 at 8:23 AM, David Steele  wrote:
> It's been a while since there was a new patch or any activity on this
> thread.
>
> If you need more time to produce a patch, please post an explanation for
> the delay and a schedule for the new patch.  If no patch or explanation
> is posted by 2017-03-16 AoE I will mark this submission
> "Returned with Feedback".

Apologies for the delay on this. I intend to get back to it before that time.


-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Use $ parameters as replacement characters for pg_stat_statements

2017-03-13 Thread Tom Lane
Robert Haas  writes:
> On Sat, Mar 4, 2017 at 1:52 PM, Peter Geoghegan  wrote:
>> I'd be in favor of a change
>> that makes it easier to copy and paste a query, to run EXPLAIN and so
>> on. Lukas probably realizes that there are no guarantees that the
>> query text that appears in pg_stat_statements will even appear as
>> normalized in all cases. The "sticky entry" stuff is intended to
>> maximize the chances of that happening, but it's still generally quite
>> possible (e.g. pg_stat_statements never swaps constants in a query
>> like "SELECT 5, pg_stat_statements_reset()"). This means that we
>> cannot really say that this buys us a machine-readable query text
>> format, at least not without adding some fairly messy caveats.

> Well, Lukas's original suggestion of using $n for a placeholder would
> do that, unless there's already a $n with the same numerical value,
> but Andres's proposal to use $-n or $:n would not.

I don't much like the $-n or $:n proposals, as those are infringing on
syntax space we might wish we had back someday.  $:n also could cause
confusion with psql variable substitution.

I wonder if it would improve matters to use $n, but starting with the
first number after the actual external Param symbols in the query.
(This presumes that we can identify the last in-use external Param number
reasonably efficiently.  struct Query doesn't really expose that ---
although it might not be unreasonable to add a field to do so.
In practice pg_stat_statements could perhaps look aside to find
that out, say from EState.es_param_list_info->numParams.)

In a situation where you wanted to try to actually execute the query,
this might be fairly convenient since you could do PREPARE with the
original external Params followed by the ones pg_stat_statements had
abstracted from constants.  Of course, guessing the types of those
might be nontrivial.  I wonder whether we should change the printout
to look like '$n::type' not just '$n'.

A variant that might make it easier to read would be to start the
numbering of the abstracted params from $100, or some other number
much larger than the last external Param, thus creating a pretty
clear distinction between the two.  But that would break the
hypothetical use-case of building a prepared statement directly
from the query text, or at least make it mighty inconvenient.
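To make that concrete with a hypothetical example (table, column, and parameter types invented for illustration): if the submitted query already used one external Param, the abstracted constant would pick up the next number, along these lines:

```sql
-- submitted:  SELECT * FROM t WHERE a = $1 AND b = 42
-- stored as:  SELECT * FROM t WHERE a = $1 AND b = $2
-- or, with the type-annotated variant:
--             SELECT * FROM t WHERE a = $1 AND b = $2::integer

-- which would allow replaying the stored text directly:
PREPARE q (text, integer) AS SELECT * FROM t WHERE a = $1 AND b = $2;
EXECUTE q ('x', 42);
DEALLOCATE q;
```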

regards, tom lane




Re: [HACKERS] [PATCH] Transaction traceability - txid_status(bigint)

2017-03-13 Thread Robert Haas
On Sat, Mar 11, 2017 at 1:32 AM, Craig Ringer  wrote:
> On 11 March 2017 at 05:09, Robert Haas  wrote:
>> On the other
>> hand, there really are two separate notions of the "oldest" XID.
>> There's the oldest XID that we can safely look up, and then there's
>> the oldest XID that we can't reuse.  These two are the same when no
>> truncation is in progress, but when a truncation is in progress then
>> they're different: the oldest XID that's safe to look up is the first
>> one after whatever we're truncating away, but the oldest XID that we
>> can't reuse is the newest one preceding the stuff that we're busy
>> truncating.
>
> Right.
>
> My view here is that the oldest xid we cannot reuse is already guarded
> by xidWrapLimit, which we advance after clog truncation. Whether as
> this advances at the same time as or after we advance oldestXid and
> truncate clog doesn't actually matter, we must just ensure that it
> never advances _before_.
>
> So tracking a second copy of oldestXid whose only purpose is to
> recalculate xidWrapLimit serves no real purpose.

Hmm, so what this patch is doing changed quite a bit between January
23rd and January 25th.  In the January 23rd version, oldestXid and
oldestXidDB are changed to track the oldest XID that we can safely
look up, and the remaining related fields are still relative to the
oldest XID that can be reused.  That seems, as I said before, scary.
But in the January 25th version, *all* of the related fields have been
changed to track the oldest XID that we can safely look up, because
SetTransactionIdLimit() now uses the values set by AdvanceOldestXid()
to compute all of the other values, which seems flat-out incorrect.
AdvanceOldestXid() is called to advance that limit before clog
truncation happens, and if somebody then calls SetTransactionIdLimit()
before clog truncation is complete, we'll advanced those derived
limits prematurely.

For example, suppose vacuum #1 comes along, advances the limits,
truncates clog, and then gets descheduled.  Now vacuum #2 comes along,
advances the limits further, and then gets descheduled.  Now vacuum #1
wakes up and calls SetTransactionIdLimit() and prematurely advances
xidWrapLimit.  Oops.

The way you put it is that having a second copy of oldestXid whose
only purpose is to recompute xidWrapLimit is pointless, but the way
I'd say is that you're trying to make one variable do two jobs.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] WIP: Faster Expression Processing v4

2017-03-13 Thread Tomas Vondra

On 03/13/2017 09:03 AM, Andres Freund wrote:

Hi,

On 2017-03-12 05:40:51 +0100, Tomas Vondra wrote:

I wanted to do a bit of testing and benchmarking on this, but 0004 seems to
be a bit broken.


Well, "broken" in the sense that it's already outdated, because other
stuff that got merged.



Yes, that's what I meant. Sorry for not being quite clear.




The patch does not apply anymore - there are some conflicts
in execQual.c, but I think I fixed those. But then I ran into a bunch of
compile-time errors, because some of the executor nodes still reference bits
that were moved elsewhere.


Updated patch attached.  Note that this patch has two changes I've not
yet evaluated performance-wise.



And which are those?

I plan to do a bit of testing / benchmarking on this later this week, 
depending on other testing already happening on my machine.


regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] asynchronous execution

2017-03-13 Thread Corey Huinker
On Mon, Mar 13, 2017 at 1:06 AM, Corey Huinker 
wrote:

>
>> I think it will, because Append itself has been made async-capable by one
>> of the patches and UNION ALL uses Append.  But as mentioned above, only
>> the postgres_fdw foreign tables will be able to utilize this for now.
>>
>>
> Ok, I'll re-run my test from a few weeks back and see if anything has
> changed.
>


I'm not able to discern any difference in plan between a 9.6 instance and
this patch.

The basic outline of my test is:

EXPLAIN ANALYZE
SELECT c1, c2, ..., cN FROM tab1 WHERE date = '1 day ago'
UNION ALL
SELECT c1, c2, ..., cN FROM tab2 WHERE date = '2 days ago'
UNION ALL
SELECT c1, c2, ..., cN FROM tab3 WHERE date = '3 days ago'
UNION ALL
SELECT c1, c2, ..., cN FROM tab4 WHERE date = '4 days ago'


I've tried this test where tab1 through tab4 all are the same postgres_fdw
foreign table.
I've tried this test where tab1 through tab4 all are different foreign
tables pointing to the same remote table, sharing the same server
definition.
I've tried this test where tab1 through tab4 all are different foreign
tables, each with its own foreign server definition, all of which
happen to point to the same remote cluster.

Are there some postgresql.conf settings I should set to get a decent test?


Re: [HACKERS] \if, \elseif, \else, \endif (was Re: PSQL commands: \quit_if, \quit_unless)

2017-03-13 Thread Tom Lane
"Daniel Verite"  writes:
> Tom Lane wrote:
>> when we see \if is that we do nothing but absorb text
>> until we see the matching \endif.  At that point we could bitch and throw
>> everything away if, say, there's \elif after \else, or anything else you
>> want to regard as a "compile time error".  Otherwise we start execution,
>> and from there on it probably has to behave as we've been discussing.
>> But this'd be pretty unfriendly from an interactive standpoint, and I'm
>> not really convinced that it makes for significantly better error
>> reporting.

> On the whole, isn't that a reasonable model to follow for psql?

One thing that occurs to me after more thought is that with such a model,
we could not have different lexing rules for live vs not-live branches,
since we would not have made those decisions before scanning the input.
This seems problematic.  Even if you discount the question of whether
variable expansion is allowed to change command-boundary decisions, we'd
still not want backtick execution to happen everywhere in the block, ISTM.

Maybe we could fix things so that backtick execution happens later, but
it would be a pretty significant and invasive change to backslash command
execution, I'm afraid.

regards, tom lane




Re: [HACKERS] \if, \elseif, \else, \endif (was Re: PSQL commands: \quit_if, \quit_unless)

2017-03-13 Thread Tom Lane
Corey Huinker  writes:
>> Barring objection I'll push this so that Corey can rebase over it.

> Seems straightforward, and I appreciate you doing it for me!

Hearing no objections, pushed.

regards, tom lane




Re: [HACKERS] [COMMITTERS] pgsql: Add amcheck extension to contrib.

2017-03-13 Thread Andres Freund
Hi,

On 2017-03-13 15:45:01 -0400, Tom Lane wrote:
> I could be wrong, but the most obvious explanation for this failure is
> that autovacuum had a lock on the table or index when we looked.
> Even if that isn't why axolotl failed in this particular case, I think
> it's dead certain that we will see such failures from time to time
> if this test script isn't tightened up.  IIUC what the test is trying
> to look for, I think adding "AND pid = pg_backend_pid()" to this query
> would be an appropriate fix.

Yes, that sounds reasonable.  Will do in a bit.  Thanks for noticing.

- Andres




[HACKERS] Silence perl warning in check-world

2017-03-13 Thread Jeff Janes
In some older versions of perl (like v5.10), I get a warning that:

Use of implicit split to @_ is deprecated at
src/test/recovery/t/006_logical_decoding.pl line 26.

Splitting into a dummy variable silences that warning, as in the attached.
There may be a better way to silence the warning.  (Presumably it actually
is clobbering @_ in those older versions of Perl, but that doesn't seem to
cause problems in this particular case.)

Cheers,

Jeff


silence_perl_warning.patch
Description: Binary data



Re: [HACKERS] make check-world output

2017-03-13 Thread Jeff Janes
On Fri, Mar 10, 2017 at 6:19 PM, Tom Lane  wrote:

> Alvaro Herrera  writes:
> > Jeff Janes wrote:
> >> There was some recent discussion about making "make check-world" faster.
> >> I'm all for that, but how about making it quieter?  On both machines
> I've
> >> run it on (CentOS6.8 and Ubuntu 16.04.2), it dumps some gibberish to
> >> stderr, example attached.
>
> > I think you're complaining about the test code added by commit
> > fcd15f13581f.  Together with behavior introduced by 2f227656076a, it is
> > certainly annoying.  I would vote for redirecting that output to a log
> > file which can be ignored/removed unless there is a failure.
>
> What about just reverting 2f227656076a?
>

That takes care of most of it, but leaves:

WARNING: enabling "trust" authentication for local connections
You can change this by editing pg_hba.conf or using the option -A, or
--auth-local and --auth-host, the next time you run initdb.
NOTICE:  database "regression" does not exist, skipping

Cheers,

Jeff


Re: [HACKERS] Allow pg_dumpall to work without pg_authid

2017-03-13 Thread Peter Eisentraut
Why this?

+   if (no_role_passwords && binary_upgrade)
+   {
+       fprintf(stderr, _("%s: options --no-role-passwords and --binary-upgrade cannot be used together\n"),
+               progname);
+       fprintf(stderr, _("Try \"%s --help\" for more information.\n"),
+               progname);
+       exit_nicely(1);
+   }


-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] ANALYZE command progress checker

2017-03-13 Thread Robert Haas
On Fri, Mar 10, 2017 at 2:57 PM, Jim Nasby  wrote:
> Oh, I wasn't suggesting a single SRF for everything. Hopefully users will
> eventually figure out a good formula to drive a "progress bar" for each type
> of monitor, which is what you really want anyway (at least 99% of the time).
> If we got there we could have a single view that gave the % complete for
> every command that was providing feedback. If someone wanted details they
> could hit the individual SRF.

So, the point, at least with the VACUUM progress indicator and perhaps
here also, is that we really can't just give a single measure of
progress -- we are 58.32% done.  That's almost impossible to estimate
accurately.  For example, consider VACUUM.  maintenance_work_mem is
69% used, and we are 74% done vacuuming the table.  What percentage
done are we?  Well, it depends.  If the latter part of the table has
exactly the same density of dead tuples we've already witnessed, we
will need only one index vac pass, but there is a good chance that
there's more activity toward the tail end of the table and we will
need two index vac passes.  If the table has several large indexes,
that's a big difference, so if we guess wrong about whether we're
going to run out of memory, we will be way off in estimating overall
progress.  That's why the vacuum progress checker reports the facts we
actually know -- like how many blocks of the relation we've scanned,
and how many dead tuples we've accumulated so far, and how many dead
tuples we are able to store before triggering index vacuuming --
rather than just a percentage.  You can try to derive a percentage
from that if you want, and you can use any formula you like to derive
that, but what the system reports are straight-up facts, not
predictions.
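For anyone who does want a rough number anyway, here is a sketch of such a derivation against the pg_stat_progress_vacuum view (heap-scan progress only; it deliberately ignores the index-pass uncertainty described above, which is exactly why it can be badly off):

```sql
SELECT pid,
       phase,
       round(100.0 * heap_blks_scanned / nullif(heap_blks_total, 0), 2)
         AS heap_scan_pct,        -- % of heap blocks scanned so far
       num_dead_tuples,           -- dead tuples accumulated this pass
       max_dead_tuples,           -- capacity before an index vac pass is forced
       index_vacuum_count         -- index vac passes completed so far
FROM pg_stat_progress_vacuum;
```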

I don't really want to get into the prediction business.  The in-core
progress reporting structure can spit out enough data for people to
write queries or tools that attempt to predict stuff, and *maybe*
someday somebody will write one of those tools that is enough unlike
https://xkcd.com/612/ that we'll feel moved to put it in core.  But my
guess is most likely not.  It's easy to get the easy cases right in
such a prediction tool, but AFAICS getting the hard cases right
requires a crystal ball.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] background sessions

2017-03-13 Thread Robert Haas
On Sat, Mar 11, 2017 at 10:11 AM, Pavel Stehule  wrote:
> 2017-03-09 14:52 GMT+01:00 Peter Eisentraut
> :
>>
>> On 3/8/17 14:22, Pavel Stehule wrote:
>> > 1. will be background session process closed automatically when parent
>> > process is closed?
>>
>> If the communications queue goes away the process will eventually die.
>> This is similar to how a backend process will eventually die if the
>> client goes away.  Some more testing would be good here.
>
>
> What does "eventually die" mean?
>
> I called pg_sleep() in the called subprocess.
>
> Cancelling or terminating the parent process has no effect. Maybe it is an
> artificial test.
>
> A little more realistic: waiting on a table lock in the background worker
> was successful, and when the parent was cancelled, the worker process was
> destroyed too.
>
> But when the parent was terminated, the background worker process continued.
>
> What is worse, the background worker sat at 100% CPU and I had to restart my
> notebook.
>
> CREATE OR REPLACE FUNCTION public.foo()
>  RETURNS void
>  LANGUAGE plpythonu
> AS $function$
> with plpy.BackgroundSession() as a:
>   a.execute('update foo2 set a = 30')
>   a.execute('insert into foo2 values(10)')
> $function$
> postgres=#
>
>
> I blocked foo2 in another session.

I'm not sure what's going on with this patch set, but in general a
background process can't just go away when the foreground process goes
away.  We could arrange to kill it, a la pg_terminate_backend(), or we
can let it keep running, and either of those things might be what
somebody wants, depending on the situation.  But it can't just vanish
into thin air.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Two phase commit in ECPG

2017-03-13 Thread Michael Meskes
> The previous 002 patch lacked a description of PREPARE TRANSACTION.
> Attached is the updated 002 patch.

I just committed both patches and a backport of the bug fix itself.
Thanks again for finding and fixing.

Michael
-- 
Michael Meskes
Michael at Fam-Meskes dot De, Michael at Meskes dot (De|Com|Net|Org)
Meskes at (Debian|Postgresql) dot Org
Jabber: michael at xmpp dot meskes dot org
VfL Borussia! Força Barça! SF 49ers! Use Debian GNU/Linux, PostgreSQL




Re: [HACKERS] Should buffer of initialization fork have a BM_PERMANENT flag

2017-03-13 Thread Robert Haas
On Wed, Jan 25, 2017 at 7:14 PM, Michael Paquier
 wrote:
> (Adding Robert in CC.)
>
> On Thu, Jan 26, 2017 at 4:34 AM, Wang Hao  wrote:
>> An unlogged table has an initialization fork. The initialization fork does
>> not have the BM_PERMANENT flag set when a buffer is obtained for it.
>> During a checkpoint (as opposed to shutdown or end of recovery), it will
>> not be written to disk. After crash recovery, the page of the
>> initialization fork will not be correct, which then makes the main fork
>> incorrect too.
>
> For init forks the flush need absolutely to happen, so that's really
> not good. We ought to fix BufferAlloc() appropriately here.

I agree with that, but I propose the attached version instead.  It
seems cleaner to have the entire test for setting BM_PERMANENT in one
place rather than splitting it up as you did.

I believe this sets a record for the longest-lived data corruption bug
in a commit made by me.  Six years and change, woohoo.  :-(

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


unlogged-flush-fix-rmh.patch
Description: Binary data



Re: [HACKERS] [COMMITTERS] pgsql: Add amcheck extension to contrib.

2017-03-13 Thread Tom Lane
Andres Freund  writes:
> Add amcheck extension to contrib.

axolotl just failed on this:
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=axolotl&dt=2017-03-13%2017%3A49%3A24

***************
*** 78,86 ****
  
  -- make sure we don't have any leftover locks
  SELECT * FROM pg_locks WHERE relation IN ('bttest_a_idx'::regclass, 'bttest_b_idx'::regclass);
!  locktype | database | relation | page | tuple | virtualxid | transactionid | classid | objid | objsubid | virtualtransaction | pid | mode | granted | fastpath 
! ----------+----------+----------+------+-------+------------+---------------+---------+-------+----------+--------------------+-----+------+---------+----------
! (0 rows)
  
  COMMIT;
  -- cleanup
--- 78,87 ----
  
  -- make sure we don't have any leftover locks
  SELECT * FROM pg_locks WHERE relation IN ('bttest_a_idx'::regclass, 'bttest_b_idx'::regclass);
!  locktype | database | relation | page | tuple | virtualxid | transactionid | classid | objid | objsubid | virtualtransaction |  pid  |      mode       | granted | fastpath 
! ----------+----------+----------+------+-------+------------+---------------+---------+-------+----------+--------------------+-------+-----------------+---------+----------
!  relation |    57562 |    57573 |      |       |            |               |         |       |          | 4/29               | 20342 | AccessShareLock | t       | t
! (1 row)
  
  COMMIT;
  -- cleanup


I could be wrong, but the most obvious explanation for this failure is
that autovacuum had a lock on the table or index when we looked.
Even if that isn't why axolotl failed in this particular case, I think
it's dead certain that we will see such failures from time to time
if this test script isn't tightened up.  IIUC what the test is trying
to look for, I think adding "AND pid = pg_backend_pid()" to this query
would be an appropriate fix.
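Concretely, the tightened query in the test would become something like:

```sql
-- only look at locks held by the current backend, so concurrent
-- autovacuum activity can't make the test flap
SELECT * FROM pg_locks
WHERE relation IN ('bttest_a_idx'::regclass, 'bttest_b_idx'::regclass)
  AND pid = pg_backend_pid();
```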

regards, tom lane




Re: [HACKERS] delta relations in AFTER triggers

2017-03-13 Thread Alvaro Herrera
Kevin Grittner wrote:

> >> What is necessary to indicate an additional SQL feature covered?
> >
> > I assume you're talking about information_schema.sql_features
> 
> I had forgotten we had that in a table.  I was thinking more of the docs:
> 
> https://www.postgresql.org/docs/current/static/features.html
> 
> I guess if we change one, we had better change the other.  (Or are
> they generated off a common source?)

See src/backend/catalog/sql_features.txt

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] Partition-wise join for join between (declaratively) partitioned tables

2017-03-13 Thread Robert Haas
On Fri, Mar 10, 2017 at 5:43 AM, Ashutosh Bapat
 wrote:
> PFA the zip containing all the patches rebased on
> 56018bf26eec1a0b4bf20303c98065a8eb1b0c5d, containing the patch to free the
> memory consumed by paths using a separate path context.

Some very high-level thoughts based on a look through these patches:

In 0001, you've removed a comment about how GEQO needs special
handling, but it doesn't look as if you've made any compensating
change elsewhere.  That seems unlikely to be correct.  If GEQO needs
some paths to survive longer than others, how can it be right for this
code to create them all in the same context?  Incidentally,
geqo_eval() seems to be an existing precedent for the idea of throwing
away paths and RelOptInfos, so we might want to use similar code for
partitionwise join.

0002 and 0003 look OK.

Probably 0004 is OK too, although that seems to be adding some
overhead to existing callers for the benefit of new ones.  Might be
insignificant, though.

0005 looks OK, except that add_join_rel's definition is missing a
"static" qualifier.  That's not just cosmetic; based on previous
experience, this will break the BF.

0006 seems to be unnecessary; the new function isn't used in later patches.

Haven't looked at 0007 yet.

0008 is, as previously mentioned, more than we probably want to commit.

Haven't looked at 0009 yet.

0010 - 0012 seem to be various fixes which would need to be done
before or along with 0009, rather than afterward, so I am confused
about the ordering of those patches in the patch series.

The commit message for 0013 is a bit unclear about what it's doing,
although I can guess, a bit, based on the commit message for 0007.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] SQL/JSON in PostgreSQL

2017-03-13 Thread Sven R. Kunze

On 13.03.2017 07:24, Nico Williams wrote:

On Thu, Mar 09, 2017 at 07:12:07PM +0100, Sven R. Kunze wrote:

 From my day-to-day work I can tell, the date(time) type is the only missing
piece of JSON to make it perfect for business applications (besides, maybe,
a "currency" type).

And a binary type.  And a chunked-string type (to avoid having to escape
strings).  And an interval type.  And...


YMMV but I tend to say that those aren't the usual types of a business 
application where I come from.


Answering questions like "how many" (integer), "what" (text) and "when" 
(date) is far more common than "give me that binary blob" at least in 
the domain where I work. Never had the necessity for an interval type; 
usually had a start and end value where the "interval" was derived from 
those values.





Re: [HACKERS] \if, \elseif, \else, \endif (was Re: PSQL commands: \quit_if, \quit_unless)

2017-03-13 Thread Daniel Verite
Tom Lane wrote:

> when we see \if is that we do nothing but absorb text
> until we see the matching \endif.  At that point we could bitch and throw
> everything away if, say, there's \elif after \else, or anything else you
> want to regard as a "compile time error".  Otherwise we start execution,
> and from there on it probably has to behave as we've been discussing.
> But this'd be pretty unfriendly from an interactive standpoint, and I'm
> not really convinced that it makes for significantly better error
> reporting.

This is basically what bash does. In an if/else/fi block
in an interactive session, the second prompt is displayed at every new
line and nothing gets executed until it recognizes the end of the
block and it's valid as a whole. Otherwise, nothing of the block
gets executed. That doesn't strike me as unfriendly.

When non-interactive, in addition to the block not being executed,
the fact that it fails implies that the execution of the current script
is ended, independently of the errexit setting.
If errexit is set, the interpreter terminates. If it was
an included script and errexit is not set, the execution resumes
after the point of the inclusion.

On the whole, isn't that a reasonable model to follow for psql?

Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite




Re: [HACKERS] SQL/JSON in PostgreSQL

2017-03-13 Thread Sven R. Kunze

On 10.03.2017 20:28, Josh Berkus wrote:

On 03/09/2017 10:12 AM, Sven R. Kunze wrote:


SaltStack uses YAML for their tools, too. I personally can empathize
with them (as a user of configuration management) about this, as writing
JSON would be a nightmare with all the quoting, commas, curly braces, etc.
But that's my own preference maybe.

Yes, but automated tools can easily convert between JSON and
newline-delimited YAML and back.


Sure. That wasn't point, though.


Sven




Re: [HACKERS] Radix tree for character conversion

2017-03-13 Thread Heikki Linnakangas

On 03/13/2017 08:53 PM, Tom Lane wrote:

Heikki Linnakangas  writes:

It would be nice to run the map_checker tool one more time, though, to
verify that the mappings match those from PostgreSQL 9.6.


+1


Just to be sure, and after that the map checker can go to the dustbin.


Hm, maybe we should keep it around for the next time somebody has a bright
idea in this area?


The map checker compares old-style maps with the new radix maps. The 
next time 'round, we'll need something that compares the radix maps with 
the next great thing. Not sure how easy it would be to adapt.


Hmm. A somewhat different approach might be more suitable for testing 
across versions, though. We could modify the perl scripts slightly to 
print out SQL statements that exercise every mapping. For every 
supported conversion, the SQL script could:


1. create a database in the source encoding.
2. set client_encoding=''
3. SELECT a string that contains every character in the source encoding.

You could then run those SQL statements against old and new server 
version, and verify that you get the same results.
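In sketch form (the database name and the single-byte LATIN1 case are just illustrative), the generated script for one conversion might look like:

```sql
-- generated once per supported conversion
CREATE DATABASE mapcheck_latin1 ENCODING 'LATIN1'
  TEMPLATE template0 LC_COLLATE 'C' LC_CTYPE 'C';
\connect mapcheck_latin1
SET client_encoding = 'UTF8';
-- one row per code point of the source encoding; converting each byte to
-- the client encoding exercises the mapping for every character
SELECT i, convert_from(decode(lpad(to_hex(i), 2, '0'), 'hex'), 'LATIN1')
FROM generate_series(1, 255) AS i;
```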


- Heikki





Re: [HACKERS] WIP: [[Parallel] Shared] Hash

2017-03-13 Thread Thomas Munro
On Mon, Mar 13, 2017 at 8:40 PM, Rafia Sabih
 wrote:
> In an attempt to test v7 of this patch on TPC-H 20 scale factor I found a
> few regressions,
> Q21: 52 secs on HEAD and  400 secs with this patch

Thanks Rafia.  Robert just pointed out off-list that there is a bogus
0 row estimate in here:

->  Parallel Hash Semi Join  (cost=1006599.34..1719227.30 rows=0
width=24) (actual time=38716.488..100933.250 rows=7315896 loops=5)

Will investigate, thanks.

> Q8: 8 secs on HEAD to 14 secs with patch

Also looking into this one.

-- 
Thomas Munro
http://www.enterprisedb.com




Re: [HACKERS] PATCH: Configurable file mode mask

2017-03-13 Thread Tom Lane
David Steele  writes:
> On 3/13/17 2:16 PM, Tom Lane wrote:
>> I also don't especially want to have to analyze cases like "what
>> happens if user initdb'd with mask X but then changes the GUC and
>> restarts the postmaster?".  Maybe the right thing is to not expose
>> this as a GUC at all, but drive it off the permissions observed on
>> the data directory at postmaster start.  Viz:
>> 
>> * $PGDATA has permissions 0700: adopt umask 077
>> 
>> * $PGDATA has permissions 0750: adopt umask 027
>> 
>> * anything else: fail

> How about a GUC, allow_group_access, that when set will enforce
> permissions and set the umask accordingly, and when not set will follow
> $PGDATA as proposed above?

Seems overcomplicated ...

> Not much we can do for Windows, though.  I think it would have to WARN
> if the GUC is set and then continue as usual.

Yeah, the Windows port has been weak in this area all along.  I don't
think it's incumbent on you to make it better.

>> That way, a "chmod -R" would be the necessary and sufficient procedure
>> for switching from one case to the other.

> I'm OK with that if you think it's the best course, but perhaps the GUC
> would be better because it can detect accidental permission changes.

If we're only checking file permissions at postmaster start, I think it's
illusory to suppose that we're offering very much protection against
accidental changes.  A chmod applied while the postmaster is running
could still break things, and we'd not notice till the next restart.

But it might be worth thinking about whether we want to encourage people
to do manual chmod's at all; that's fairly easy to get wrong, particularly
given the difference in X bits that should be applied to files and
directories.  Another approach that could be worth considering is
a PGC_POSTMASTER GUC with just two states (group access or not) and
make it the postmaster's responsibility to do the equivalent of chmod -R
to make the file tree match the GUC.  I think we do a tree scan anyway
for other purposes, so correcting any wrong file permissions might not
be much added work in the normal case.
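The file-versus-directory X-bit distinction is exactly what a naive "chmod -R" gets wrong. A sketch of the kind of tree walk the postmaster could do — an illustration of the policy only, not the proposed backend implementation:

```python
import os
import stat
import tempfile

def set_tree_permissions(root, group_access):
    """Walk a data directory and apply 0750/0640 (group access) or
    0700/0600 (owner only) -- note the differing X bits for
    directories versus files."""
    dir_mode = 0o750 if group_access else 0o700
    file_mode = 0o640 if group_access else 0o600
    os.chmod(root, dir_mode)
    for dirpath, dirnames, filenames in os.walk(root):
        for d in dirnames:
            os.chmod(os.path.join(dirpath, d), dir_mode)
        for f in filenames:
            os.chmod(os.path.join(dirpath, f), file_mode)

# tiny demo on a throwaway tree standing in for $PGDATA
pgdata = tempfile.mkdtemp()
os.makedirs(os.path.join(pgdata, "base"))
open(os.path.join(pgdata, "base", "12345"), "w").close()
set_tree_permissions(pgdata, group_access=True)
dir_bits = stat.S_IMODE(os.stat(os.path.join(pgdata, "base")).st_mode)
file_bits = stat.S_IMODE(os.stat(os.path.join(pgdata, "base", "12345")).st_mode)
```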

regards, tom lane




Re: [HACKERS] Radix tree for character conversion

2017-03-13 Thread Tom Lane
Heikki Linnakangas  writes:
> I did some more kibitzing here and there, and committed. Thanks everyone!

111 files changed, 147742 insertions(+), 367346 deletions(-)

Nice.

> It would be nice to run the map_checker tool one more time, though, to 
> verify that the mappings match those from PostgreSQL 9.6.

+1

> Just to be sure, and after that the map checker can go to the dustbin.

Hm, maybe we should keep it around for the next time somebody has a bright
idea in this area?

regards, tom lane




Re: [HACKERS] delta relations in AFTER triggers

2017-03-13 Thread Kevin Grittner
On Sun, Mar 12, 2017 at 4:08 PM, Thomas Munro
 wrote:

> I found a new way to break it: run the trigger function so
> that the plan is cached by plpgsql, then ALTER TABLE incompatibly,
> then run the trigger function again.  See attached.

The first part doesn't seem so bad.  Using the transition table in a
FOR EACH STATEMENT trigger you get:

test=# update hoge set name = (name::text || name::text)::integer;
ERROR:  attribute 2 has wrong type
DETAIL:  Table has type integer, but query expects text.
CONTEXT:  SQL statement "SELECT (SELECT string_agg(id || '=' || name,
',') FROM d)"
PL/pgSQL function hoge_upd_func() line 3 at RAISE

... while putting each row on its own line from a FOR EACH ROW
trigger you get:

test=# update hoge set name = (name::text || name::text)::integer;
ERROR:  type of parameter 15 (integer) does not match that when
preparing the plan (text)
CONTEXT:  SQL statement "SELECT (SELECT string_agg(old.id || '=' ||
old.name, ','))"
PL/pgSQL function hoge_upd_func() line 3 at RAISE

Does anyone think the first message needs improvement?  If so, to
what?

Obviously the next part is a problem.  With the transition table we
get a segfault:

test=# -- now drop column 'name'
test=# alter table hoge drop column name;
ALTER TABLE
test=# update hoge set id = id;
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.

While the row-oriented query manages to dodge it:

test=# alter table hoge drop column name;
ALTER TABLE
test=# update hoge set id = id;
ERROR:  record "old" has no field "name"
CONTEXT:  SQL statement "SELECT (SELECT string_agg(old.id || '=' ||
old.name, ','))"
PL/pgSQL function hoge_upd_func() line 3 at RAISE

I expected that existing mechanisms would have forced re-planning of
a trigger function if the table the function was attached to was
altered.  Either that was a bit "optimistic", or the old TupleDesc
is used for the new plan.  Will track down which it is, and fix it.

>> Do you suppose we should have all PLs that are part of the base
>> distro covered?
>
> I vote for doing that in Postgres 11.  My pl/python patch[1] may be a
> useful starting point, but I haven't submitted it in this CF and
> nobody has shown up with pl/tcl or pl/perl versions.

OK, but we'd better add something to the docs saying that only C and
plpgsql triggers are supported in v10.

>> What is necessary to indicate an additional SQL feature covered?
>
> I assume you're talking about information_schema.sql_features

I had forgotten we had that in a table.  I was thinking more of the docs:

https://www.postgresql.org/docs/current/static/features.html

I guess if we change one, we had better change the other.  (Or are
they generated off a common source?)

> a couple of thoughts occurred to me when looking for
> references to transition tables in an old draft standard I have.
> These are both cases where properties of the subject table should
> probably also affect access to the derived transition tables:
>
> * What privileges implications are there for transition tables?  I'm
> wondering about column and row level privileges; for example, if you
> can't see a column in the subject table, I'm guessing you shouldn't be
> allowed to see it in the transition table either, but I'm not sure.

I'll see how that works in FOR EACH ROW triggers.  We should match
that, I think.  Keep in mind that not just anyone can put a trigger
on a table.

> * In future we could consider teaching it about functional
> dependencies as required by the spec; if you can SELECT id, name FROM
>  GROUP BY id, I believe you should be able to SELECT
> id, name FROM  GROUP BY id, but currently you can't.

Interesting idea.

I'll post a new patch once I figure out the dropped column on the
cached function plan.

--
Kevin Grittner




Re: [HACKERS] Performance improvement for joins where outer side is unique

2017-03-13 Thread Tom Lane
[ getting back to this patch finally... ]

David Rowley  writes:
> I've attached a patch which implements this, though only for
> MergeJoin, else I'd imagine we'd also need to ensure all proofs used
> for testing the uniqueness were also hash-able too. I added some XXX
> comments in analyzejoin.c around the mergeopfamilies == NIL tests to
> mention that Merge Join depends on all the unique proof quals having
> mergeopfamilies.  This also assumes we'll never use some subset of
> mergejoin-able quals for a merge join, which could be an interesting
> area in the future, as we might have some btree index on a subset of
> those columns to provide pre-sorted input. In short, it's a great
> optimisation, but I'm a little scared we might break it one day.

Umm ... it's broken already isn't it?  See the comment for
generate_mergejoin_paths:

 * We generate mergejoins if mergejoin clauses are available.  We have
 * two ways to generate the inner path for a mergejoin: sort the cheapest
 * inner path, or use an inner path that is already suitably ordered for the
 * merge.  If we have several mergeclauses, it could be that there is no inner
 * path (or only a very expensive one) for the full list of mergeclauses, but
 * better paths exist if we truncate the mergeclause list (thereby discarding
 * some sort key requirements).  So, we consider truncations of the
 * mergeclause list as well as the full list.  (Ideally we'd consider all
 * subsets of the mergeclause list, but that seems way too expensive.)

There's another, more subtle, way in which it could fail in
sort_inner_and_outer():

 * Each possible ordering of the available mergejoin clauses will generate
 * a differently-sorted result path at essentially the same cost.  We have
 * no basis for choosing one over another at this level of joining, but
 * some sort orders may be more useful than others for higher-level
 * mergejoins, so it's worth considering multiple orderings.
 *
 * Actually, it's not quite true that every mergeclause ordering will
 * generate a different path order, because some of the clauses may be
 * partially redundant (refer to the same EquivalenceClasses).  Therefore,
 * what we do is convert the mergeclause list to a list of canonical
 * pathkeys, and then consider different orderings of the pathkeys.

I'm fairly sure that it's possible to end up with fewer pathkeys than
there are mergeclauses in this code path.  Now, you might be all right
anyway given that the mergeclauses must refer to the same ECs in such a
case --- maybe they're fully redundant and we can take testing the
included clause as proving the omitted one(s) too.  I'm not certain
right now what I meant by "partially redundant" in this comment.
But in any case, it's moot for the present purpose because
generate_mergejoin_paths certainly breaks your assumption.
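To illustrate the "partially redundant" case in miniature — this is a toy model, not the planner's actual data structures: mergeclauses that refer to the same EquivalenceClass collapse to a single canonical pathkey, so the pathkey list can end up shorter than the mergeclause list:

```python
def canonical_pathkeys(mergeclauses):
    """Toy model of canonicalization: each mergeclause is tagged with
    the equivalence class it references; keep one pathkey per EC,
    preserving first-seen order."""
    pathkeys, seen = [], set()
    for clause, ec in mergeclauses:
        if ec not in seen:
            seen.add(ec)
            pathkeys.append(ec)
    return pathkeys

# Hypothetical example: a.x and a.y fall in the same EC (say, because
# of an extra a.x = a.y qual), so two of the three clauses share EC1.
clauses = [("a.x = b.x", "EC1"), ("a.y = b.y", "EC1"), ("a.z = b.z", "EC2")]
keys = canonical_pathkeys(clauses)  # fewer pathkeys than mergeclauses
```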

regards, tom lane




Re: [HACKERS] PATCH: Configurable file mode mask

2017-03-13 Thread David Steele
On 3/13/17 2:16 PM, Tom Lane wrote:
> David Steele  writes:
>> On 3/13/17 1:03 PM, Tom Lane wrote:
>>> TBH, the fact that we're relying on 0600 mode for considerations such
>>> as these makes me tremendously afraid of this whole patch.  I think that
>>> the claimed advantages are not anywhere near worth the risk that somebody
>>> is going to destroy their database because we weakened some aspect of the
>>> protection against starting multiple postmasters in a database directory.
> 
>> I don't see a risk if the user uses umask 0027 which is the example
>> given in the docs.  I'm happy to change this example to a strong
>> recommendation.

<...>

> Anyway, given that we do that analysis, I'd rather not expose this
> as a "here, set the umask you want" variable.  I think a bool saying
> "allow group access" (translating to exactly two supported umasks,
> 027 and 077) would be simpler from the user's standpoint and much
> easier to reason about.  I don't see the value in having to think
> about what happens if the user supplies a mask like 037 or 067.

We debated a flag vs a umask and came down on the side of flexibility.
I'm perfectly happy with using a flag instead.

> I also don't especially want to have to analyze cases like "what
> happens if user initdb'd with mask X but then changes the GUC and
> restarts the postmaster?".  Maybe the right thing is to not expose
> this as a GUC at all, but drive it off the permissions observed on
> the data directory at postmaster start.  Viz:
> 
> * $PGDATA has permissions 0700: adopt umask 077
> 
> * $PGDATA has permissions 0750: adopt umask 027
> 
> * anything else: fail

How about a GUC, allow_group_access, that when set will enforce
permissions and set the umask accordingly, and when not set will follow
$PGDATA as proposed above?

Not much we can do for Windows, though.  I think it would have to WARN
if the GUC is set and then continue as usual.

> That way, a "chmod -R" would be the necessary and sufficient procedure
> for switching from one case to the other.

I'm OK with that if you think it's the best course, but perhaps the GUC
would be better because it can detect accidental permission changes.

-- 
-David
da...@pgmasters.net




Re: [HACKERS] Radix tree for character conversion

2017-03-13 Thread Heikki Linnakangas

On 03/06/2017 10:16 AM, Kyotaro HORIGUCHI wrote:

At Fri, 3 Mar 2017 12:53:04 +0900, Michael Paquier  wrote 
in 

On Thu, Mar 2, 2017 at 2:20 PM, Kyotaro HORIGUCHI
 wrote:

5) Just remove plain map files and all related code. Addition to
   that, Makefile stores hash digest of authority files in
   Unicode/authority_hashes.txt or something that is managed by
   git.


That may be an idea to check for differences across upstream versions.
But that sounds like a separate discussion to me.


Either is fine with me.


I did some more kibitzing here and there, and committed. Thanks everyone!

I agree the new maps should just replace the old maps altogether, so 
committed that way. I also moved the combined map files to the same .map 
files as the main radix trees. Seems more clear that way to me.


I changed the to/from_unicode properties back to a single direction 
property, with Perl constants BOTH, TO_UNICODE and FROM_UNICODE, per 
your alternative suggestion upthread. Seems more clear to me.


It would be nice to run the map_checker tool one more time, though, to 
verify that the mappings match those from PostgreSQL 9.6. Just to be 
sure, and after that the map checker can go to the dustbin.


- Heikki





Re: [HACKERS] PATCH: pageinspect / add page_checksum and bt_page_items(bytea)

2017-03-13 Thread Tomas Vondra

On 03/13/2017 06:49 AM, Ashutosh Sharma wrote:

Hi,

I had a look into this patch and would like to share some review
comments that require the author's attention.

1) The comment for page_checksum() needs to be corrected. It seems
like it has been copied from page_header() and not edited further.

/*
 * page_header
 *
 * Allows inspection of page header fields of a raw page
 */

PG_FUNCTION_INFO_V1(page_checksum);

Datum
page_checksum(PG_FUNCTION_ARGS)



True, will fix.


2) It seems like you have chosen the wrong datatype for page_checksum. I
am getting a negative checksum value when trying to run the query below. I
think the return type of the SQL function page_checksum should be
'integer' instead of 'smallint'.

postgres=# SELECT page_checksum(get_raw_page('pg_class', 0), 100);
 page_checksum
---
-19532
(1 row)

The above problem also exists for page_header; I am seeing a similar
negative value there as well.

postgres=# SELECT page_header(get_raw_page('pg_class', 0));
 page_header
-
 (0/2891538,-28949,1,220,7216,8192,8192,4,0)
(1 row)



No. This is consistent with page_header() which is also using smallint 
for the checksum value.
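As an aside, the negative number is only the signed presentation of the 16-bit checksum; a reader can recover the unsigned value with two's-complement arithmetic. A quick sketch of that arithmetic, outside any PostgreSQL code:

```python
def smallint_to_uint16(v):
    """Reinterpret a signed 16-bit (smallint) value as the unsigned
    16-bit checksum it actually carries (two's complement)."""
    return v & 0xFFFF

# the -19532 seen in the review comment corresponds to:
checksum = smallint_to_uint16(-19532)  # 46004, i.e. 65536 - 19532
```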




3) Any reason for not marking bt_page_items or page_checksum as PARALLEL_SAFE?



No, not really. It's an oversight.


4) Error messages in the new bt_page_items are not consistent with the old
bt_page_items. For example, if I pass the meta page block number as input,
the error messages thrown by the new and old bt_page_items differ.

postgres=# SELECT * FROM bt_page_items('btree_index', 0) LIMIT 1;
ERROR:  block 0 is a meta page

postgres=# SELECT * FROM bt_page_items(get_raw_page('btree_index', 0)) LIMIT 1;
ERROR:  block is a meta page



Well, the new function does not actually know the block number, so how 
could it include it in the error message? You only get the page itself, 
and it might be read from anywhere. Granted, the meta page should only 
be stored in block 0, but when the storage mixes up the pages somehow, 
that's not reliable.




postgres=# SELECT * FROM bt_page_items(get_raw_page('btree_index',
1024)) LIMIT 1;
ERROR:  block number 1024 is out of range for relation "btree_index"


postgres=# SELECT * FROM bt_page_items('btree_index', 1024) LIMIT 1;
ERROR:  block number out of range



Well, that error message does not come from the new function, it comes 
from get_raw_page(), so I'm not sure what I'm supposed to do with that? 
Similarly to the previous case, the new function does not actually know 
the block number at all.



5) Code duplication in bt_page_items() and bt_page_items_bytea() needs
to be handled.



Yes. If we adopt the approach proposed by Peter Eisentraut (redirecting 
the old bt_page_items using a SQL function calling the new one), it will 
also make the error messages consistent.


regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] Bogus tab completion tweak for UPDATE ... SET ... =

2017-03-13 Thread Tom Lane
Thomas Munro  writes:
> Even though the following is coincidentally meaningful, I don't think
> it was intentional or is useful:

> postgres=# update foo set x = DEFAULT

Uh, seems perfectly sane to me.  Why should we assume that can't be
what the user wants?

regards, tom lane




Re: [HACKERS] PATCH: Configurable file mode mask

2017-03-13 Thread Tom Lane
David Steele  writes:
> On 3/13/17 1:03 PM, Tom Lane wrote:
>> TBH, the fact that we're relying on 0600 mode for considerations such
>> as these makes me tremendously afraid of this whole patch.  I think that
>> the claimed advantages are not anywhere near worth the risk that somebody
>> is going to destroy their database because we weakened some aspect of the
>> protection against starting multiple postmasters in a database directory.

> I don't see a risk if the user uses umask 0027 which is the example
> given in the docs.  I'm happy to change this example to a strong
> recommendation.

I do not want a "strong recommendation".  I want "we don't let you
break this".  We don't let you run the postmaster as root either,
even though there have been repeated requests to remove that safety
check.

It might be all right if we forcibly or'd 027 into whatever umask
the user tries to provide; not sure.  The existing safety analysis,
such as the cited comment, has all been based on the assumption of
file mode 600 not mode 640.  I'm not certain that we always attempt
to open-for-writing files that we expect to be exclusively accessible
by the postmaster.

Anyway, given that we do that analysis, I'd rather not expose this
as a "here, set the umask you want" variable.  I think a bool saying
"allow group access" (translating to exactly two supported umasks,
027 and 077) would be simpler from the user's standpoint and much
easier to reason about.  I don't see the value in having to think
about what happens if the user supplies a mask like 037 or 067.

I also don't especially want to have to analyze cases like "what
happens if user initdb'd with mask X but then changes the GUC and
restarts the postmaster?".  Maybe the right thing is to not expose
this as a GUC at all, but drive it off the permissions observed on
the data directory at postmaster start.  Viz:

* $PGDATA has permissions 0700: adopt umask 077

* $PGDATA has permissions 0750: adopt umask 027

* anything else: fail

That way, a "chmod -R" would be the necessary and sufficient procedure
for switching from one case to the other.
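The proposed mapping is small enough to state as a sketch — an illustration of the policy, not backend code:

```python
import stat

def umask_for_datadir(mode):
    """Map the $PGDATA permissions observed at postmaster start to a
    umask, per the proposal above; anything else is rejected."""
    perms = stat.S_IMODE(mode)
    if perms == 0o700:
        return 0o077
    if perms == 0o750:
        return 0o027
    raise ValueError("data directory has unsupported permissions: %o" % perms)
```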

regards, tom lane




[HACKERS] Bogus tab completion tweak for UPDATE ... SET ... =

2017-03-13 Thread Thomas Munro
Hi,

Even though the following is coincidentally meaningful, I don't think
it was intentional or is useful:

postgres=# update foo set x = DEFAULT

Shouldn't that completion should be suppressed, like in the attached?

-- 
Thomas Munro
http://www.enterprisedb.com


fix-update-set-completion.patch
Description: Binary data



Re: [HACKERS] PATCH: Configurable file mode mask

2017-03-13 Thread David Steele
Hi Tom,

On 3/13/17 1:03 PM, Tom Lane wrote:
> David Steele  writes:
>> At miscinit.c:893:
> 
>> /* We can treat the EPERM-error case as okay because that error implies
>> that the existing process has a different userid than we do, which means
>> it cannot be a competing postmaster.  A postmaster cannot successfully
>> attach to a data directory owned by a userid other than its own.  (This
>> is now checked directly in checkDataDir(), but has been true for a long
>> time because of the restriction that the data directory isn't group- or
>> world-accessible.)  Also, since we create the lockfiles mode 600, we'd
>> have failed above if the lockfile belonged to another userid --- which
>> means that whatever process kill() is reporting about isn't the one that
>> made the lockfile.  (NOTE: this last consideration is the only one that
>> keeps us from blowing away a Unix socket file belonging to an instance
>> of Postgres being run by someone else, at least on machines where /tmp
>> hasn't got a stickybit.) */
> 
> TBH, the fact that we're relying on 0600 mode for considerations such
> as these makes me tremendously afraid of this whole patch.  I think that
> the claimed advantages are not anywhere near worth the risk that somebody
> is going to destroy their database because we weakened some aspect of the
> protection against starting multiple postmasters in a database directory.

I don't see a risk if the user uses umask 0027 which is the example
given in the docs.  I'm happy to change this example to a strong
recommendation.

> At the very least, I'd want to see much closer analysis of the safety
> issues than I've seen so far in this thread.  

I think it's clear that there would be safety risks with unwise umask
choices.  I also think the example/recommended umask is safe.

Running external processes as the postgres user carries serious risks as
well.  Not only with regards to data access but the danger of corruption
due to bugs.  If a process does not require write access to do its job
then why take that risk?

To (hopefully) address your concerns, I'll perform an analysis of
starting multiple postmasters with a variety of umask choices and report
the outcomes here.

> And since the proposed
> patch falsifies the above-quoted comment (and probably others), why
> hasn't it updated it?

That was an oversight on my part.  I'll update it in the next patch.

Thanks,
-- 
-David
da...@pgmasters.net




Re: [HACKERS] PATCH: Configurable file mode mask

2017-03-13 Thread David Steele
Hi Tom,

On 3/13/17 1:13 PM, Tom Lane wrote:
> ... oh, and now that I've actually looked at the patch, I think it's
> a seriously bad idea to proceed by removing the mode parameter to
> PathNameOpenFile et al.  That's basically doubling down on an assumption
> that there are NO places in the backend, and never will be any, in which
> we want to create files with nondefault permissions.  That assumption
> seems broken on its face.  It also makes the patch exceedingly invasive
> for extensions.

I think it's a bad idea to have the same parameters copied over and over
throughout the code with slight variations (e.g. 0600 vs S_IRUSR |
S_IWUSR) but the same intent.

In all cases there is another version of the function (added by this
patch) that accepts a mode parameter.  In practice this was only needed
in one place, be_lo_export().  I think this makes a pretty good argument
for standardization/simplification in other areas.
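For the record, the two spellings really are the same bit pattern, which is the argument for picking one canonical form. A quick sanity check using Python's stat module, whose constants mirror the C macros:

```python
import stat

# "0600" and "S_IRUSR | S_IWUSR" are two spellings of the same mode
# bits; adding S_IRGRP yields the 0640 used for group read access.
OWNER_RW = stat.S_IRUSR | stat.S_IWUSR
OWNER_RW_GROUP_R = OWNER_RW | stat.S_IRGRP
```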

Thanks,
-- 
-David
da...@pgmasters.net




Re: [HACKERS] Gather Merge

2017-03-13 Thread Robert Haas
On Fri, Mar 10, 2017 at 7:59 AM, Rushabh Lathia
 wrote:
> Error coming from create_gather_merge_plan() from below condition:
>
> if (memcmp(sortColIdx, gm_plan->sortColIdx,
>numsortkeys * sizeof(AttrNumber)) != 0)
> elog(ERROR, "GatherMerge child's targetlist doesn't match
> GatherMerge");
>
> The above condition checks the sort column numbers explicitly, to ensure
> the tlists really do match up. This was copied from
> create_merge_append_plan(). That makes sense for MergeAppend, since it is
> not a projection-capable plan (see is_projection_capable_plan()). But like
> Gather, GatherMerge is projection-capable and its target list can differ
> from the subplan's, so I don't think this condition makes sense for
> GatherMerge.
>
> Here is some of the debugging info through which I reached this conclusion:
>
> - targetlist for the GatherMerge and subpath is same during
> create_gather_merge_path().
>
> - targetlist for the GatherMerge is getting changed into
> create_gather_merge_plan().
>
> postgres=# explain (analyze, verbose) select t2.j from t1 JOIN t2 ON
> t1.k=t2.k where t1.i=1 order by t1.j desc;
> NOTICE:  path pathtarget: {PATHTARGET :exprs ({VAR :varno 2 :varattno 2
> :vartype 23 :vartypmod -1 :varcollid 0 :varlevelsup 0 :varnoold 2 :varoattno
> 2 :location 34} {VAR :varno 1 :varattno 2 :vartype 23 :vartypmod -1
> :varcollid 0 :varlevelsup 0 :varnoold 1 :varoattno 2 :location 90})
> :sortgrouprefs 0 1 :cost.startup 0.00 :cost.per_tuple 0.00 :width 8}
>
> NOTICE:  subpath pathtarget: {PATHTARGET :exprs ({VAR :varno 1 :varattno 2
> :vartype 23 :vartypmod -1 :varcollid 0 :varlevelsup 0 :varnoold 1 :varoattno
> 2 :location 90} {VAR :varno 2 :varattno 2 :vartype 23 :vartypmod -1
> :varcollid 0 :varlevelsup 0 :varnoold 2 :varoattno 2 :location 34})
> :cost.startup 0.00 :cost.per_tuple 0.00 :width 8}
>
> - Attached a memory watchpoint and found that the target list for
> GatherMerge is getting changed in grouping_planner() ->
> apply_projection_to_path().
>
> PFA patch to fix this issue.

I don't think this fix is correct, partly on theoretical grounds and
partly because I managed to make it crash.  The problem is that
prepare_sort_for_pathkeys() actually alters the output tlist of Gather
Merge, which is inconsistent with the idea that Gather Merge can do
projection.  It's going to produce whatever
prepare_sort_for_pathkeys() says it's going to produce, which may or
may not be what was there before.  Using Kuntal's dump file and your
patch:

set min_parallel_table_scan_size = 0;
set parallel_setup_cost = 0;
set parallel_tuple_cost = 0;
set enable_sort = false;
set enable_bitmapscan = false;
alter table t1 alter column j type text;
select t2.i from t1 join t2 on t1.k=t2.k where t1.i=1 order by t1.j desc;

Crash.   Abbreviated stack trace:

#0  pg_detoast_datum_packed (datum=0xbc) at fmgr.c:2176
#1  0x00010160e707 in varstrfastcmp_locale (x=188, y=819,
ssup=0x7fe1ea06a568) at varlena.c:1997
#2  0x0001013efc73 in ApplySortComparator [inlined] () at
/Users/rhaas/pgsql/src/include/utils/sortsupport.h:225
#3  0x0001013efc73 in heap_compare_slots (a=, b=, arg=0x7fe1ea04e590) at sortsupport.h:681
#4  0x0001014057b2 in sift_down (heap=0x7fe1ea079458,
node_off=) at
binaryheap.c:274
#5  0x00010140573a in binaryheap_build (heap=0x7fe1ea079458) at
binaryheap.c:131
#6  0x0001013ef771 in gather_merge_getnext [inlined] () at
/Users/rhaas/pgsql/src/backend/executor/nodeGatherMerge.c:421
#7  0x0001013ef771 in ExecGatherMerge (node=0x7fe1ea04e590) at
nodeGatherMerge.c:240

Obviously, this is happening because we're trying to apply a
comparator for text to a value of type integer.  I propose the
attached, slightly more involved fix, which rips out the first call to
prepare_sort_from_pathkeys() altogether.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


gm_plan_fix_rmh.patch
Description: Binary data



Re: [HACKERS] PATCH: Batch/pipelining support for libpq

2017-03-13 Thread Daniel Verite
Vaishnavi Prabakaran wrote:

> >  while (QbatchQueueProcess(conn)) {
> >r = PQsetSingleRowMode(conn);
> >if (r!=1) {
> >   fprintf(stderr, "PQsetSingleRowMode() failed");
> >}
> >..

> Thanks for investigating the problem, and could you kindly explain what
> "next iteration" you mean here? Because I don't see any problem in
> following sequence of calls - PQbatchQueueProcess(),PQsetSingleRowMode()
> , PQgetResult()

I mean the next iteration of the above while statement. Referring
to the doc, that would be the "next batch entry":

  " To get the result of the first batch entry the client must call
   PQbatchQueueProcess. It must then call PQgetResult and handle the
   results until PQgetResult returns null (or would return null if
   called). The result from the next batch entry may then be retrieved
   using PQbatchQueueProcess and the cycle repeated"

Attached is a bare-bones testcase showing the problem.
As it is, it works, retrieving results for three "SELECT 1"
in the same batch.  Now in order to use the single-row
fetch mode, consider adding this:

r = PQsetSingleRowMode(conn);
if (r!=1) {
  fprintf(stderr, "PQsetSingleRowMode() failed for i=%d\n", i);
}

When inserted after the call to PQbatchQueueProcess,
which is what I understand you're saying works for you,
it fails for me when starting to get the results of the 2nd query
and after. 


Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite


test-singlerow-batch.c
Description: Binary data



Re: [HACKERS] PATCH: Configurable file mode mask

2017-03-13 Thread Tom Lane
... oh, and now that I've actually looked at the patch, I think it's
a seriously bad idea to proceed by removing the mode parameter to
PathNameOpenFile et al.  That's basically doubling down on an assumption
that there are NO places in the backend, and never will be any, in which
we want to create files with nondefault permissions.  That assumption
seems broken on its face.  It also makes the patch exceedingly invasive
for extensions.

regards, tom lane




Re: [HACKERS] PATCH: Configurable file mode mask

2017-03-13 Thread Tom Lane
David Steele  writes:
> At miscinit.c:893:

> /* We can treat the EPERM-error case as okay because that error implies
> that the existing process has a different userid than we do, which means
> it cannot be a competing postmaster.  A postmaster cannot successfully
> attach to a data directory owned by a userid other than its own.  (This
> is now checked directly in checkDataDir(), but has been true for a long
> time because of the restriction that the data directory isn't group- or
> world-accessible.)  Also, since we create the lockfiles mode 600, we'd
> have failed above if the lockfile belonged to another userid --- which
> means that whatever process kill() is reporting about isn't the one that
> made the lockfile.  (NOTE: this last consideration is the only one that
> keeps us from blowing away a Unix socket file belonging to an instance
> of Postgres being run by someone else, at least on machines where /tmp
> hasn't got a stickybit.) */

TBH, the fact that we're relying on 0600 mode for considerations such
as these makes me tremendously afraid of this whole patch.  I think that
the claimed advantages are not anywhere near worth the risk that somebody
is going to destroy their database because we weakened some aspect of the
protection against starting multiple postmasters in a database directory.

At the very least, I'd want to see much closer analysis of the safety
issues than I've seen so far in this thread.  And since the proposed
patch falsifies the above-quoted comment (and probably others), why
hasn't it updated it?

regards, tom lane




Re: [HACKERS] WIP: Faster Expression Processing v4

2017-03-13 Thread Konstantin Knizhnik



On 13.03.2017 11:03, Andres Freund wrote:

Hi,

On 2017-03-12 05:40:51 +0100, Tomas Vondra wrote:

I wanted to do a bit of testing and benchmarking on this, but 0004 seems to
be a bit broken.

Well, "broken" in the sense that it's already outdated, because other
stuff that got merged.



The patch does not apply anymore - there are some conflicts
in execQual.c, but I think I fixed those. But then I ran into a bunch of
compile-time errors, because some of the executor nodes still reference bits
that were moved elsewhere.

Updated patch attached.  Note that this patch has two changes I've not
yet evaluated performance-wise.



I got the following results at my system with Intel(R) Core(TM) i7-4770 
CPU @ 3.40GHz, 16Gb RAM,
TPC-H Q1/Q6 scale 10, sharedBuffers=8Gb, pg_prewarm on lineitem table 
projection:



                  Q1        Q6
Master            7503 ms   1171 ms
Your patch        6420 ms   1034 ms
VOPS               396 ms    249 ms
VOPS + patch       367 ms    233 ms



--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



Re: [HACKERS] scram and \password

2017-03-13 Thread Jeff Janes
On Fri, Mar 10, 2017 at 2:43 PM, Michael Paquier 
wrote:

> On Sat, Mar 11, 2017 at 2:53 AM, Jeff Janes  wrote:
> > Should the \password tool in psql inspect password_encryption and act on
> it
> > being 'scram'?
>
> Not sure if it is wise to change the default fot this release.
>

I'm not proposing that we change the default value of password_encryption,
only that \password respect whatever value it currently finds there.  But
after thinking about it a bit more, I reached the same conclusion that Joe
did, that it should use the same hashing method as the current password
does, and only consult password_encryption if there is no password
currently set.
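
The rule settled on here -- reuse the method of the existing verifier, and
consult password_encryption only when no password is currently set -- can be
sketched as follows. The function name and the verifier prefixes are
illustrative stand-ins, not psql's actual code:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical sketch of the selection rule discussed above: keep the
 * hashing method of the current verifier if one exists, and fall back
 * to the password_encryption setting only when no password is set.
 */
static const char *
choose_hash_method(const char *current_verifier, const char *password_encryption)
{
	if (current_verifier == NULL || *current_verifier == '\0')
		return password_encryption;		/* no password set yet */
	if (strncmp(current_verifier, "md5", 3) == 0)
		return "md5";
	if (strncmp(current_verifier, "SCRAM-SHA-256", 13) == 0)
		return "scram-sha-256";
	return password_encryption;			/* unrecognized: fall back */
}
```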


> A patch among those lines would be a simple, do people feel that this
> should be part of PG 10?
>

I think it is pretty important to have some way of setting the password
that doesn't risk it ending up in the server log file, or .psql_history, or
having someone shoulder-surf it.

Cheers,

Jeff


Re: [HACKERS] PATCH: Configurable file mode mask

2017-03-13 Thread David Steele
On 3/10/17 8:34 AM, Stephen Frost wrote:
> Greetings,
> 
> * Tsunakawa, Takayuki (tsunakawa.ta...@jp.fujitsu.com) wrote:
>> From: pgsql-hackers-ow...@postgresql.org
>>> [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of David Steele
>>> PostgreSQL currently requires the file mode mask (umask) to be 0077.
>>> However, this precludes the possibility of a user in the postgres group
>>> performing a backup (or whatever).  Now that
>>> pg_start_backup()/pg_stop_backup() privileges can be delegated to an
>>> unprivileged user, it makes sense to also allow a (relatively) unprivileged
>>> user to perform the backup at the file system level as well.
>>
>> I'd like to help review this.  First, let me give some questions and 
>> comments.

Much appreciated!

>> 3.The default location of the SSL key file is $PGDATA, so the permission of 
>> the key file is likely to become 0640.  But the current postgres requires it 
>> to be 0600.  See src/backend/libpq/be-secure-openssl.c.
> 
> Yes, that needs to be addressed.  There was discussion on another thread
> that it would be useful to support the SSL key file having group read
> access, but since this patch is handling the other files it seems like
> it would make sense to do that change here also.

Perhaps, but since these files are not setup by initdb I'm not sure if
we should be handling their permissions.  This seems to be a
distro-specific issue.

It seems to me that it would be best to advise in the docs that these
files should be relocated if they won't be readable by the backup user.
In any event, I'm not convinced that backing up server private keys is a
good idea.
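
As background for the group-access question, here is a minimal, hypothetical
illustration (not PostgreSQL code) of why a relaxed mask produces the 0640
files at issue: a 0027 process umask yields group-readable 0640 files, while
the traditional 0077 mask yields the 0600 that be-secure-openssl.c insists on
for the key file. The helper name and paths are made up for the example:

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Create a file under the given umask and report the resulting mode bits.
 * open() is called with 0666; the kernel clears the bits set in the mask.
 */
static int
make_file_with_umask(const char *path, mode_t mask)
{
	mode_t		old = umask(mask);
	int			fd = open(path, O_CREAT | O_WRONLY | O_EXCL, 0666);
	struct stat st;

	umask(old);					/* restore the previous mask */
	if (fd < 0)
		return -1;
	close(fd);
	if (stat(path, &st) < 0)
		return -1;
	return (int) (st.st_mode & 0777);
}
```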

>> 5.I think some explanation about the concept of multiple OS users is 
>> necessary, such as here:
>>
>> 16.1. Short Version
>> https://www.postgresql.org/docs/devel/static/install-short.html
>>
>> 18.2. Creating a Database Cluster
>> https://www.postgresql.org/docs/devel/static/creating-cluster.html
> 
> I agree that we should update the documention for this, including those.

We'll add that to the next patch.

Thanks,
-- 
-David
da...@pgmasters.net



signature.asc
Description: OpenPGP digital signature


Re: [HACKERS] PATCH: Configurable file mode mask

2017-03-13 Thread David Steele
On 3/10/17 8:12 AM, Stephen Frost wrote:
> Peter,
> 
> * Peter Eisentraut (peter.eisentr...@2ndquadrant.com) wrote:
>> On 2/28/17 20:58, David Steele wrote:
>>> This patch introduces a new initdb param, -u/--file-mode-mask, and a new
>>> GUC, file_mode_mask, to allow the default mode of files and directories
>>> in the $PGDATA directory to be modified.
>>
>> The postmaster.pid file appears not to observe the configured mask.
> 
> Good point, it should.

Leaving the mask on this file as-is was intentional.  At miscinit.c:829:

/* Think not to make the file protection weaker than 0600.  See comments
below. */

At miscinit.c:893:

/* We can treat the EPERM-error case as okay because that error implies
that the existing process has a different userid than we do, which means
it cannot be a competing postmaster.  A postmaster cannot successfully
attach to a data directory owned by a userid other than its own.  (This
is now checked directly in checkDataDir(), but has been true for a long
time because of the restriction that the data directory isn't group- or
world-accessible.)  Also, since we create the lockfiles mode 600, we'd
have failed above if the lockfile belonged to another userid --- which
means that whatever process kill() is reporting about isn't the one that
made the lockfile.  (NOTE: this last consideration is the only one that
keeps us from blowing away a Unix socket file belonging to an instance
of Postgres being run by someone else, at least on machines where /tmp
hasn't got a stickybit.) */

I can't see why this explanation does not continue to hold even if
permissions for other files are changed.  For the use cases I envision,
I don't think being able to read/manipulate postmaster.pid is important,
only to detect that it is present.

>> There ought to be a test, perhaps under src/bin/initdb/, to check for
>> that kind of thing.
> 
> Good idea.

Agreed, will add to next patch.

> >> There is no documentation update for initdb.

The --file-mode-mask option was added to the option list, but you are
probably referring to a paragraph under description.  Will add to the
next patch.

-- 
-David
da...@pgmasters.net



signature.asc
Description: OpenPGP digital signature


Re: [HACKERS] \if, \elseif, \else, \endif (was Re: PSQL commands: \quit_if, \quit_unless)

2017-03-13 Thread Corey Huinker
>
>
> Barring objection I'll push this so that Corey can rebase over it.
>
> regards, tom lane
>

Seems straightforward, and I appreciate you doing it for me!


Re: [HACKERS] PATCH: Make pg_stop_backup() archive wait optional

2017-03-13 Thread David Steele
Hi Robert,

On 3/6/17 12:48 PM, Robert Haas wrote:
> On Sat, Mar 4, 2017 at 9:12 AM, David Steele  wrote:
>> Yes, that makes sense.  Attached are two patches as requested:
>>
>> 01 - Just marks pg_stop_backup() variants as parallel restricted
>> 02 - Add the wait_for_archive param to pg_stop_backup().
>>
>> These apply cleanly on 272adf4.
> 
> Committed 01.  

Thanks!

> Nobody's offered an opinion about 02 yet, so I'm not
> going to commit that, but one minor nitpick:
> 
> +WAL to be archived.  This behavior is only useful for backup
> +software which independently monitors WAL archiving, otherwise WAL
> +required to make the backup consistent might be missing and make the 
> backup
> 
> I think this should really say "...which independently monitors WAL
> archiving.  Otherwise, WAL..."

The attached patch updates the docs per your suggestion and has been
rebased on master at d69fae2.

Thanks,
-- 
-David
da...@pgmasters.net
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 583b3b2..ddf3298 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -18299,7 +18299,7 @@ SELECT set_config('log_statement_stats', 'off', false);
   
   

-pg_stop_backup(exclusive boolean)
+pg_stop_backup(exclusive boolean , wait_for_archive boolean )
 
setof record
Finish performing exclusive or non-exclusive on-line backup 
(restricted to superusers by default, but other users can be granted EXECUTE to 
run the function)
@@ -18383,7 +18383,13 @@ postgres=# select pg_start_backup('label_goes_here');
 pg_start_backup. In a non-exclusive backup, the contents of
 the backup_label and tablespace_map are returned
 in the result of the function, and should be written to files in the
-backup (and not in the data directory).
+backup (and not in the data directory).  There is an optional second
+parameter of type boolean.  If false, the pg_stop_backup
+will return immediately after the backup is completed without waiting for
+WAL to be archived.  This behavior is only useful for backup
+software which independently monitors WAL archiving. Otherwise, WAL
+required to make the backup consistent might be missing and make the backup
+useless.

 

diff --git a/src/backend/access/transam/xlogfuncs.c 
b/src/backend/access/transam/xlogfuncs.c
index 96aa15e..1ef9f2b 100644
--- a/src/backend/access/transam/xlogfuncs.c
+++ b/src/backend/access/transam/xlogfuncs.c
@@ -190,6 +190,7 @@ pg_stop_backup_v2(PG_FUNCTION_ARGS)
	bool		nulls[3];
 
	bool		exclusive = PG_GETARG_BOOL(0);
+	bool		wait_for_archive = PG_GETARG_BOOL(1);
XLogRecPtr  stoppoint;
 
/* check to see if caller supports us returning a tuplestore */
@@ -232,7 +233,7 @@ pg_stop_backup_v2(PG_FUNCTION_ARGS)
 * Stop the exclusive backup, and since we're in an exclusive 
backup
 * return NULL for both backup_label and tablespace_map.
 */
-   stoppoint = do_pg_stop_backup(NULL, true, NULL);
+   stoppoint = do_pg_stop_backup(NULL, wait_for_archive, NULL);
exclusive_backup_running = false;
 
nulls[1] = true;
@@ -250,7 +251,7 @@ pg_stop_backup_v2(PG_FUNCTION_ARGS)
 * Stop the non-exclusive backup. Return a copy of the backup 
label
 * and tablespace map so they can be written to disk by the 
caller.
 */
-   stoppoint = do_pg_stop_backup(label_file->data, true, NULL);
+   stoppoint = do_pg_stop_backup(label_file->data, 
wait_for_archive, NULL);
nonexclusive_backup_running = false;
cancel_before_shmem_exit(nonexclusive_base_backup_cleanup, 
(Datum) 0);
 
diff --git a/src/backend/catalog/system_views.sql 
b/src/backend/catalog/system_views.sql
index 0bce209..27ddaf0 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -988,6 +988,12 @@ CREATE OR REPLACE FUNCTION
   RETURNS pg_lsn STRICT VOLATILE LANGUAGE internal AS 'pg_start_backup'
   PARALLEL RESTRICTED;
 
+CREATE OR REPLACE FUNCTION pg_stop_backup (
+exclusive boolean, wait_for_archive boolean DEFAULT true,
+OUT lsn pg_lsn, OUT labelfile text, OUT spcmapfile text)
+  RETURNS SETOF record STRICT VOLATILE LANGUAGE internal as 'pg_stop_backup'
+  PARALLEL RESTRICTED;
+
 -- legacy definition for compatibility with 9.3
 CREATE OR REPLACE FUNCTION
   json_populate_record(base anyelement, from_json json, use_json_as_text 
boolean DEFAULT false)
@@ -1088,7 +1094,7 @@ AS 'jsonb_insert';
 -- available to superuser / cluster owner, if they choose.
 REVOKE EXECUTE ON FUNCTION pg_start_backup(text, boolean, boolean) FROM public;
 REVOKE EXECUTE ON FUNCTION pg_stop_backup() FROM public;
-REVOKE EXECUTE ON FUNCTION pg_stop_backup(boolean) FROM public;
+REVOKE 

Re: [HACKERS] [COMMITTERS] pgsql: Improve postmaster's logging of listen socket creation.

2017-03-13 Thread Stephen Frost
* Alvaro Herrera (alvhe...@2ndquadrant.com) wrote:
> Tom Lane wrote:
> > Robert Haas  writes:
> > > So now on every startup I get this:
> > 
> > > 2017-03-13 10:08:49.399 EDT [90059] LOG:  listening on IPv6 address "::1"
> > > 2017-03-13 10:08:49.399 EDT [90059] LOG:  listening on IPv4 address 
> > > "127.0.0.1"
> > > 2017-03-13 10:08:49.400 EDT [90059] LOG:  listening on Unix address
> > > "/tmp/.s.PGSQL.5432"
> > 
> > > I think the idea that this is worth three lines of log output (out of
> > > a total of six) is hard to stomach.
> > 
> > You were in the minority before on the usefulness of this output, and
> > I think you still are.  Personally I've already found it useful to be
> > able to check that buildfarm runs are binding to (only) the addresses
> > they're supposed to.
> 
> I think it's good to have it, but I would argue that it should be a
> single line that lists all the addresses instead.

So, I'm fine with it as-is.  Tom's point that it'd get to be a pretty
long line is correct, imv, and other daemons that I've seen also tend to
put them on independent lines and I don't hear many people complaining
about those.

This is NTP's output, for example:

Mar 12 10:13:41 beorn ntpd[28719]: Listen and drop on 0 v6wildcard [::]:123
Mar 12 10:13:41 beorn ntpd[28719]: Listen and drop on 1 v4wildcard 0.0.0.0:123
Mar 12 10:13:41 beorn ntpd[28719]: Listen normally on 2 lo 127.0.0.1:123
Mar 12 10:13:41 beorn ntpd[28719]: Listen normally on 3 wlan0 192.168.1.191:123
Mar 12 10:13:41 beorn ntpd[28719]: Listen normally on 4 lo [::1]:123
Mar 12 10:13:41 beorn ntpd[28719]: Listen normally on 5 wlan0 
[fe80::bc2a:e1cf:2545:d08e%3]:123
Mar 12 10:13:41 beorn ntpd[28719]: Listening on routing socket on fd #22 for 
interface updates

And bind9 (aka named):

Mar 10 10:53:05 tangmo named[7618]: listening on IPv6 interfaces, port 53
Mar 10 10:53:05 tangmo named[7618]: listening on IPv4 interface lo, 127.0.0.1#53
Mar 10 10:53:05 tangmo named[7618]: listening on IPv4 interface eth0, 
10.10.5.10#53
Mar 10 10:53:05 tangmo named[7618]: listening on IPv4 interface eth0:ext, 
172.16.231.123#53

PowerDNS (aka pdns):

Mar 12 17:15:57 dunmer pdns[26094]: UDP server bound to 0.0.0.0:53
Mar 12 17:15:57 dunmer pdns[26094]: UDPv6 server bound to [::]:53
Mar 12 17:15:57 dunmer pdns[26094]: TCP server bound to 0.0.0.0:53
Mar 12 17:15:57 dunmer pdns[26094]: TCPv6 server bound to [::]:53

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] [PATCH] Suppress Clang 3.9 warnings

2017-03-13 Thread Aleksander Alekseev
Hi David,

Thank you for reminding about this patch!

Here is a new patch. I tried to make as few changes as possible. This
is no doubt not the most beautiful patch on Earth, but it removes all
the warnings. If anyone can suggest an approach that would be significantly
better, please don't hesitate to share your ideas.

Tested on Clang 3.9.1 and GCC 6.3.1.

> Why not update the target type to "unsigned char" instead, so that no
> cast is needed and the value compatibility is checked by the compiler? I
> guess there would be some more changes (question is how much), but it
> would be cleaner.

I tried this way as well. After rebuilding PostgreSQL for the 5th time I
discovered that I would also have to redefine the Pointer typedef. I don't
think we can avoid using type casts. There would just be different type
casts in other places, or we'd have to break most existing PostgreSQL
extensions.
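
For reference, the conversion being silenced can be reproduced in isolation.
BLOCK_ID below is a stand-in with the same value (253, i.e. 0xFD) as
XLR_BLOCK_ID_ORIGIN; everything else is illustrative:

```c
/*
 * 253 does not fit in a signed char, so the implicit int->char conversion
 * yields -3 on platforms where plain char is signed -- hence Clang's
 * -Wconstant-conversion warning. The stored bit pattern (0xFD) is what
 * the WAL format cares about, which is why the explicit (char) cast in
 * the patch is safe: it keeps the bits and merely silences the warning.
 */
#define BLOCK_ID 253			/* stand-in for XLR_BLOCK_ID_ORIGIN */

static int
demo(void)
{
	char		buf[1];

	buf[0] = (char) BLOCK_ID;	/* explicit cast: same bits, no warning */
	/* Reading the byte back as unsigned recovers 253 either way. */
	return (unsigned char) buf[0];
}
```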

On Mon, Mar 13, 2017 at 10:19:57AM -0400, David Steele wrote:
> Hi Aleksander,
> 
> On 2/22/17 9:43 AM, Fabien COELHO wrote:
> > 
> > Hello Aleksander,
> > 
> >> ```
> >> xloginsert.c:742:18: warning: implicit conversion from 'int' to 'char'
> >> changes value from 253 to -3 [-Wconstant-conversion]
> >> ```
> > 
> > There is a bunch of these in "xlog.c" as well, and the same code is used
> > in "pg_resetwal.c".
> > 
> >> Patch that fixes these warnings is attached to this email.
> > 
> > My 0.02€:
> > 
> > I'm not at ease at putting the thing bluntly under the carpet with a cast.
> > 
> > Why not update the target type to "unsigned char" instead, so that no
> > cast is needed and the value compatibility is checked by the compiler? I
> > guess there would be some more changes (question is how much), but it
> > would be cleaner.
> 
> There's been no discussion or new patch on this thread recently.  If you
> are planning to address the issues raised please plan to do so by
> Thursday, March 16th.
> 
> If no new patch is submitted by that date I will mark this patch
> "Returned with Feedback".
> 
> Thanks,
> -- 
> -David
> da...@pgmasters.net

-- 
Best regards,
Aleksander Alekseev
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 744360c769..6aa93b5ff7 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -5018,7 +5018,7 @@ BootStrapXLOG(void)
 	record->xl_rmid = RM_XLOG_ID;
 	recptr += SizeOfXLogRecord;
 	/* fill the XLogRecordDataHeaderShort struct */
-	*(recptr++) = XLR_BLOCK_ID_DATA_SHORT;
+	*(recptr++) = (char)XLR_BLOCK_ID_DATA_SHORT;
 	*(recptr++) = sizeof(checkPoint);
 	memcpy(recptr, &checkPoint, sizeof(checkPoint));
 	recptr += sizeof(checkPoint);
diff --git a/src/backend/access/transam/xloginsert.c b/src/backend/access/transam/xloginsert.c
index 03b05f937f..ea8e915029 100644
--- a/src/backend/access/transam/xloginsert.c
+++ b/src/backend/access/transam/xloginsert.c
@@ -739,7 +739,7 @@ XLogRecordAssemble(RmgrId rmid, uint8 info,
 	if ((curinsert_flags & XLOG_INCLUDE_ORIGIN) &&
 		replorigin_session_origin != InvalidRepOriginId)
 	{
-		*(scratch++) = XLR_BLOCK_ID_ORIGIN;
+		*(scratch++) = (char)XLR_BLOCK_ID_ORIGIN;
 		memcpy(scratch, &replorigin_session_origin, sizeof(replorigin_session_origin));
 		scratch += sizeof(replorigin_session_origin);
 	}
@@ -749,13 +749,13 @@ XLogRecordAssemble(RmgrId rmid, uint8 info,
 	{
 		if (mainrdata_len > 255)
 		{
-			*(scratch++) = XLR_BLOCK_ID_DATA_LONG;
+			*(scratch++) = (char)XLR_BLOCK_ID_DATA_LONG;
 			memcpy(scratch, &mainrdata_len, sizeof(uint32));
 			scratch += sizeof(uint32);
 		}
 		else
 		{
-			*(scratch++) = XLR_BLOCK_ID_DATA_SHORT;
+			*(scratch++) = (char)XLR_BLOCK_ID_DATA_SHORT;
 			*(scratch++) = (uint8) mainrdata_len;
 		}
 		rdt_datas_last->next = mainrdata_head;
diff --git a/src/bin/pg_resetwal/pg_resetwal.c b/src/bin/pg_resetwal/pg_resetwal.c
index 27bd9b04e7..ed6968d605 100644
--- a/src/bin/pg_resetwal/pg_resetwal.c
+++ b/src/bin/pg_resetwal/pg_resetwal.c
@@ -1095,7 +1095,7 @@ WriteEmptyXLOG(void)
 	record->xl_rmid = RM_XLOG_ID;
 
 	recptr += SizeOfXLogRecord;
-	*(recptr++) = XLR_BLOCK_ID_DATA_SHORT;
+	*(recptr++) = (char)XLR_BLOCK_ID_DATA_SHORT;
 	*(recptr++) = sizeof(CheckPoint);
 	memcpy(recptr, &ControlFile.checkPointCopy,
 		   sizeof(CheckPoint));
diff --git a/src/include/port.h b/src/include/port.h
index f4546016e7..0f014b72b6 100644
--- a/src/include/port.h
+++ b/src/include/port.h
@@ -395,11 +395,22 @@ extern double rint(double x);
 extern int	inet_aton(const char *cp, struct in_addr * addr);
 #endif
 
-#if !HAVE_DECL_STRLCAT
+/*
+ * Unfortunately in case of strlcat and strlcpy we can't trust tests
+ * executed by Autotools if Clang > 3.6 is used. Clang manages to compile
+ * a program that shouldn't compile which causes wrong values of
+ * HAVE_DECL_STRLCAT and HAVE_DECL_STRLCPY. More details could be found here:
+ *
+ * http://lists.llvm.org/pipermail/cfe-dev/2016-March/048126.html
+ *
+ * This is no doubt a dirty hack but apparently alternative solutions are
+ * not much better.
+ */
+#if !HAVE_DECL_STRLCAT || defined(__clang__)
 extern size_t 
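
For reference, the BSD semantics behind these strlcpy/strlcat fallback
declarations can be sketched as below. PostgreSQL ships its own copy in
src/port; this is just an illustrative reimplementation of the contract:
copy at most siz-1 bytes, always NUL-terminate when siz > 0, and return
strlen(src) so truncation is detectable as return value >= siz.

```c
#include <stddef.h>
#include <string.h>

/* Illustrative reimplementation of BSD strlcpy; not PostgreSQL's code. */
static size_t
my_strlcpy(char *dst, const char *src, size_t siz)
{
	size_t		srclen = strlen(src);

	if (siz > 0)
	{
		size_t		copylen = (srclen >= siz) ? siz - 1 : srclen;

		memcpy(dst, src, copylen);
		dst[copylen] = '\0';	/* destination is always terminated */
	}
	return srclen;				/* length it tried to create */
}
```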

Re: [HACKERS] [COMMITTERS] pgsql: Improve postmaster's logging of listen socket creation.

2017-03-13 Thread Tom Lane
Alvaro Herrera  writes:
> I think it's good to have it, but I would argue that it should be a
> single line that lists all the addresses instead.

I don't think that's a terribly good idea, for the reasons I mentioned in
https://www.postgresql.org/message-id/12776.1489160501%40sss.pgh.pa.us

regards, tom lane




Re: [HACKERS] [COMMITTERS] pgsql: Improve postmaster's logging of listen socket creation.

2017-03-13 Thread Alvaro Herrera
Tom Lane wrote:
> Robert Haas  writes:
> > So now on every startup I get this:
> 
> > 2017-03-13 10:08:49.399 EDT [90059] LOG:  listening on IPv6 address "::1"
> > 2017-03-13 10:08:49.399 EDT [90059] LOG:  listening on IPv4 address 
> > "127.0.0.1"
> > 2017-03-13 10:08:49.400 EDT [90059] LOG:  listening on Unix address
> > "/tmp/.s.PGSQL.5432"
> 
> > I think the idea that this is worth three lines of log output (out of
> > a total of six) is hard to stomach.
> 
> You were in the minority before on the usefulness of this output, and
> I think you still are.  Personally I've already found it useful to be
> able to check that buildfarm runs are binding to (only) the addresses
> they're supposed to.

I think it's good to have it, but I would argue that it should be a
single line that lists all the addresses instead.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] tuplesort_gettuple_common() and *should_free argument

2017-03-13 Thread David Steele
Hi Peter,

On 3/2/17 9:43 AM, David Steele wrote:
> Peter,
> 
> On 2/1/17 12:59 AM, Michael Paquier wrote:
>> On Thu, Jan 26, 2017 at 8:16 AM, Tom Lane  wrote:
>>> [ in the service of closing out this thread... ]
>>>
>>> Peter Geoghegan  writes:
 Finally, 0003-* is a Valgrind suppression borrowed from my parallel
 CREATE INDEX patch. It's self-explanatory.
>>>
>>> Um, I didn't find it all that self-explanatory.  Why wouldn't we want
>>> to avoid writing undefined data?  I think the comment at least needs
>>> to explain exactly what part of the written data might be uninitialized.
>>> And I'd put the comment into valgrind.supp, too, not in the commit msg.
>>>
>>> Also, the suppression seems far too broad.  It would for instance
>>> block any complaint about a write() invoked via an elog call from
>>> any function invoked from any LogicalTape* function, no matter
>>> how far removed.
> 
> It looks like we are waiting on a new patch.  Do you know when you will
> have that ready?

It's been a while since there was a new patch or any activity on this
thread.

If you need more time to produce a patch, please post an explanation for
the delay and a schedule for the new patch.  If no patch or explanation
is posted by 2017-03-16 AoE I will mark this submission
"Returned with Feedback".

Thanks,
-- 
-David
da...@pgmasters.net




Re: [HACKERS] \if, \elseif, \else, \endif (was Re: PSQL commands: \quit_if, \quit_unless)

2017-03-13 Thread Tom Lane
I wrote:
> IIRC, I objected to putting knowledge of ConditionalStack
> into the shared psqlscan.l lexer, and I still think that would be a bad
> idea; but we need some way to get the lexer to shut that off.  Probably
> the best way is to add a passthrough "void *" argument that would let the
> get_variable callback function mechanize the rule about not expanding
> in a false branch.

Here's a proposed patch that adds a passthrough of this sort.

The passthrough argument is passed only to the get_variable callback.
I dithered about whether to also pass it to the write_error callback,
but ultimately decided not to for now.  Neither psql nor pgbench wants it,
and in the case of psql we'd have to invent a separate wrapper function
because we would certainly not want to change the signature of
psql_error().

Barring objection I'll push this so that Corey can rebase over it.

regards, tom lane

diff --git a/src/bin/psql/common.c b/src/bin/psql/common.c
index 1aa56ab..e9d4fe6 100644
*** a/src/bin/psql/common.c
--- b/src/bin/psql/common.c
*** setQFout(const char *fname)
*** 119,127 
   * If "escape" is true, return the value suitably quoted and escaped,
   * as an identifier or string literal depending on "as_ident".
   * (Failure in escaping should lead to returning NULL.)
   */
  char *
! psql_get_variable(const char *varname, bool escape, bool as_ident)
  {
  	char	   *result;
  	const char *value;
--- 119,131 
   * If "escape" is true, return the value suitably quoted and escaped,
   * as an identifier or string literal depending on "as_ident".
   * (Failure in escaping should lead to returning NULL.)
+  *
+  * "passthrough" is the pointer previously given to psql_scan_set_passthrough.
+  * psql currently doesn't use this.
   */
  char *
! psql_get_variable(const char *varname, bool escape, bool as_ident,
!   void *passthrough)
  {
  	char	   *result;
  	const char *value;
diff --git a/src/bin/psql/common.h b/src/bin/psql/common.h
index a83bc69..3d8b8da 100644
*** a/src/bin/psql/common.h
--- b/src/bin/psql/common.h
***
*** 16,22 
  extern bool openQueryOutputFile(const char *fname, FILE **fout, bool *is_pipe);
  extern bool setQFout(const char *fname);
  
! extern char *psql_get_variable(const char *varname, bool escape, bool as_ident);
  
  extern void psql_error(const char *fmt,...) pg_attribute_printf(1, 2);
  
--- 16,23 
  extern bool openQueryOutputFile(const char *fname, FILE **fout, bool *is_pipe);
  extern bool setQFout(const char *fname);
  
! extern char *psql_get_variable(const char *varname, bool escape, bool as_ident,
!   void *passthrough);
  
  extern void psql_error(const char *fmt,...) pg_attribute_printf(1, 2);
  
diff --git a/src/bin/psql/psqlscanslash.l b/src/bin/psql/psqlscanslash.l
index 5b7953b..ba4a08d 100644
*** a/src/bin/psql/psqlscanslash.l
--- b/src/bin/psql/psqlscanslash.l
*** other			.
*** 243,249 
  			 yyleng - 1);
  		value = cur_state->callbacks->get_variable(varname,
     false,
!    false);
  		free(varname);
  
  		/*
--- 243,250 
  			 yyleng - 1);
  		value = cur_state->callbacks->get_variable(varname,
     false,
!    false,
!    cur_state->cb_passthrough);
  		free(varname);
  
  		/*
diff --git a/src/fe_utils/psqlscan.l b/src/fe_utils/psqlscan.l
index 1b29341..19b3e57 100644
*** a/src/fe_utils/psqlscan.l
--- b/src/fe_utils/psqlscan.l
*** other			.
*** 700,706 
  	if (cur_state->callbacks->get_variable)
  		value = cur_state->callbacks->get_variable(varname,
     false,
!    false);
  	else
  		value = NULL;
  
--- 700,707 
  	if (cur_state->callbacks->get_variable)
  		value = cur_state->callbacks->get_variable(varname,
     false,
!    false,
!    cur_state->cb_passthrough);
  	else
  		value = NULL;
  
*** psql_scan_destroy(PsqlScanState state)
*** 923,928 
--- 924,942 
  }
  
  /*
+  * Set the callback passthrough pointer for the lexer.
+  *
+  * This could have been integrated into psql_scan_create, but keeping it
+  * separate allows the application to change the pointer later, which might
+  * be useful.
+  */
+ void
+ psql_scan_set_passthrough(PsqlScanState state, void *passthrough)
+ {
+ 	state->cb_passthrough = passthrough;
+ }
+ 
+ /*
   * Set up to perform lexing of the given input line.
   *
   * The text at *line, extending for line_len bytes, will be scanned by
*** psqlscan_escape_variable(PsqlScanState s
*** 1409,1415 
  	/* Variable lookup. */
  	varname = psqlscan_extract_substring(state, txt + 2, len - 3);
  	if (state->callbacks->get_variable)
! 		value = state->callbacks->get_variable(varname, true, as_ident);
  	else
  		value = NULL;
  	free(varname);
--- 1423,1430 
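
The general shape of the change is a classic void-pointer passthrough: the
scanner owns an opaque pointer set by the application and hands it back to
every callback, so the callback can consult application state (here, whether
a conditional branch is active) without the scanner knowing anything about
that state. Sketched below with illustrative names -- this is not the actual
psqlscan API:

```c
#include <stddef.h>
#include <string.h>

typedef const char *(*get_variable_fn) (const char *name, void *passthrough);

struct scanner
{
	get_variable_fn get_variable;
	void	   *cb_passthrough;	/* opaque, set by the application */
};

struct app_state
{
	int			branch_active;	/* stand-in for a ConditionalStack check */
};

/* Application callback: refuses to expand variables in a false branch. */
static const char *
app_get_variable(const char *name, void *passthrough)
{
	struct app_state *st = passthrough;

	if (!st->branch_active)
		return NULL;			/* suppress expansion in a false branch */
	return strcmp(name, "who") == 0 ? "world" : NULL;
}

/* Scanner side: forwards its passthrough pointer, nothing more. */
static const char *
scan_expand(struct scanner *s, const char *name)
{
	return s->get_variable(name, s->cb_passthrough);
}
```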

Re: [HACKERS] [PATCH] ALTER DEFAULT PRIVILEGES with GRANT/REVOKE ON SCHEMAS

2017-03-13 Thread David Steele
Hi Matheus,

On 3/2/17 8:27 AM, David Steele wrote:
> On 1/18/17 7:18 PM, Petr Jelinek wrote:
>>
>> The patch looks good, the only thing I am missing is tab completion
>> support for psql.
> 
> It looks like this patch is still waiting on an update for tab
> completion in psql.
> 
> Do you know when will have that patch ready?

It's been a while since there was a new patch or any activity on this
thread.

If you need more time to produce a patch, please post an explanation for
the delay and a schedule for the new patch.  If no patch or explanation
is posted by 2017-03-16 AoE I will mark this submission
"Returned with Feedback".

Thanks,
-- 
-David
da...@pgmasters.net




[HACKERS] Re: [GSOC 17] Eliminate O(N^2) scaling from rw-conflict tracking in serializable transactions

2017-03-13 Thread Kevin Grittner
On Sat, Mar 11, 2017 at 8:39 PM, Mengxing Liu
 wrote:

>> The worst problems have been
>> seen with 32 or more cores on 4 or more sockets with a large number
>> of active connections.  I don't know whether you have access to a
>> machine capable of putting this kind of stress on it (perhaps at
>> your university?), but if not, the community has access to various
>> resources we should be able to schedule time on.
>
> There is a NUMA machine ( 120 cores, 8 sockets) in my lab.

Fantastic!  Can you say a bit more about the architecture and OS?

> I think it's enough to put this kind of stress.

The researchers who saw this bottleneck reported that performance
started to dip at 16 cores and the problem was very noticeable at 32
cores.  A stress test with 120 cores on 8 sockets will be great!

I think perhaps the first milestone on the project should be to
develop a set of benchmarks we want to compare to at the end.  That
would need to include a stress test that clearly shows the problem
we're trying to solve, along with some cases involving 1 or two
connections -- just to make sure we don't harm performance for
low-contention situations.

--
Kevin Grittner




Re: [HACKERS] \h tab-completion

2017-03-13 Thread David Steele
Hi Andreas,

On 3/1/17 8:47 AM, Peter Eisentraut wrote:
> On 2/3/17 07:12, Andreas Karlsson wrote:
>> On 01/25/2017 07:13 AM, Michael Paquier wrote:
>>> What I think you should do is making the code path of
>>> \\h smarter with some exceptions by using TailMatchesCS2() for ALTER.
>>> There is as well the case of DROP commands that should be treated by
>>> the way.
>>
>> Yes, I think that is correct approach. I have attached a patch where I 
>> add completion for \h ALTER and \h DROP.
> 
> Instead of creating another copy of list_ALTER, let's use the
> words_after_create list and write a version of
> create_command_generator/drop_command_generator.

Do you know when you will have a new patch available for review that
incorporates Peter's request?

Thanks,
-- 
-David
da...@pgmasters.net




Re: [HACKERS] IF (NOT) EXISTS in psql-completion

2017-03-13 Thread David Steele
Hello,

On 3/1/17 9:38 PM, Kyotaro HORIGUCHI wrote:

> At Tue, 28 Feb 2017 10:39:01 -0500, Stephen Frost  wrote 
> in <20170228153901.gh9...@tamriel.snowman.net>
>> * David Fetter (da...@fetter.org) wrote:
>>> On Mon, Feb 27, 2017 at 11:53:17PM -0500, Stephen Frost wrote:
 * Kyotaro HORIGUCHI (horiguchi.kyot...@lab.ntt.co.jp) wrote:
> I suppose it is for suggesting what kind of word should come
> there, or avoiding silence for a tab. Or for symmetry with other
> types of manipulation, like DROP. Another possibility is creating
> multiple objects with similar names, say CREATE TABLE employee_x1,
> CREATE TABLE employee_x2. Just trying to complete existing
> *schema* is one more another possible objective.

 I don't buy any of these arguments either.  I *really* don't want us
 going down some road where we try to make sure that hitting 'tab'
 never fails...
> 
> These suggestions exist before this patch. Whether to remove them
> would be another discussion. I was going to add some but finally
> I believe I have added no such things in this patchset.
> 
>>> Wouldn't that just be a correct, grammar-aware implementation of tab
>>> completion?  Why wouldn't you want that?
>>
>> No, it wouldn't, it would mean we have to provide something for cases
>> where it doesn't make sense to try and provide an answer, as being
>> discussed here for CREATE TABLE.
>>
>> We can't provide an answer based on tab-completion to what you want to
>> call your new table.
> 
> The difference seems to be that what we take this feature to
> be. If we see it as just a fast-path of entering a word where we
> know what words should come, silence is not a problem. If we see
> it as safety-wheels to guide users to the right way to go,
> silence would be bad. A silence during word completion suggests
> something wrong in the preceding words to me so it is a bit
> annoying.
> 
> As an analogous operation, mkdir on bash suggests existing
> directories. We can suggest existing tables for CREATE TABLE with
> the same basis.
> 
> Another possible way to go would be showing a 'suggestion' not a
> list of possibilities. If readline allows such operation, I
> imagine the following. But this is a quite diferrent discussion.
> 
> =# CREATE TABLE 
> <>
> =# CREATE TABLE table1 
> 
> regards,

It has been a while since this thread has received any comments or a new
patch.  The general consensus seems to be that this feature is too large
a rewrite of tab completion considering a major rewrite was done for 9.6.

Are you considering writing a localized patch for this feature as Tom
suggested?  If so, please post that by 2017-03-16 AoE.

If no new patch is submitted by that date I will mark this submission
"Returned with Feedback".

Thanks,
-- 
-David
da...@pgmasters.net




Re: [HACKERS] WAL Consistency checking for hash indexes

2017-03-13 Thread Ashutosh Sharma
A couple of review comments:

You may also need to update the documentation, as we are now also going
to support WAL consistency checking for hash indexes. The current
documentation does not mention hash indexes.

+only records originating from those resource managers.  Currently,
+the supported resource managers are heap,
+heap2, btree, gin,
+gist, sequence, spgist,
+brin, and generic. Only


The following comment in hash_mask() may require changes if the patch for
'Microvacuum support for Hash Index' [1] gets committed.

+   /*
+* In hash bucket and overflow pages, it is possible to modify the
+* LP_FLAGS without emitting any WAL record. Hence, mask the line
+* pointer flags.
+* See hashgettuple() for details.
+*/


[1] - 
https://www.postgresql.org/message-id/CAE9k0PmXyQpHX8%3DL_hFV7HfPV8qrit19xoUB86waQ87rKYzmYQ%40mail.gmail.com

--
With Regards,
Ashutosh Sharma
EnterpriseDB:http://www.enterprisedb.com

On Wed, Mar 8, 2017 at 2:32 PM, Kuntal Ghosh  wrote:
> On Fri, Mar 3, 2017 at 9:44 AM, Amit Kapila  wrote:
>> On Tue, Feb 28, 2017 at 11:06 AM, Kuntal Ghosh
>>  wrote:
>>> Hello everyone,
>>>
>>> I've attached a patch which implements WAL consistency checking for
>>> hash indexes. This feature is going to be useful for developing and
>>> testing of WAL logging for hash index.
>>>
>>
>> 2.
>> + else if ((opaque->hasho_flag & LH_BUCKET_PAGE) ||
>> + (opaque->hasho_flag & LH_OVERFLOW_PAGE))
>> + {
>> + /*
>> + * In btree bucket and overflow pages, it is possible to modify the
>> + * LP_FLAGS without emitting any WAL record. Hence, mask the line
>> + * pointer flags.
>> + * See hashgettuple() for details.
>> + */
>> + mask_lp_flags(page);
>> + }
>>
>> Again, this mechanism is also modified by patch "Microvacuum support
>> for hash index", so above changes needs to be adjusted accordingly.
>> Comment referring to btree is wrong, you need to refer hash.
> I've corrected the text in the comment and re-based the patch on the
> latest hash index patch for WAL logging[1]. As discussed in the
> thread, Microvacuum patch can be re-based on top of this patch.
>
>
> [1] 
> https://www.postgresql.org/message-id/CAA4eK1%2BmvCucroWQwX3S7aBR%3D0yBJGF%2BjQz4x4Cx9QVsMFTZUw%40mail.gmail.com
> --
> Thanks & Regards,
> Kuntal Ghosh
> EnterpriseDB: http://www.enterprisedb.com
>
>
>




Re: [HACKERS] [PATCH] Suppress Clang 3.9 warnings

2017-03-13 Thread David Steele
Hi Aleksander,

On 2/22/17 9:43 AM, Fabien COELHO wrote:
> 
> Hello Aleksander,
> 
>> ```
>> xloginsert.c:742:18: warning: implicit conversion from 'int' to 'char'
>> changes value from 253 to -3 [-Wconstant-conversion]
>> ```
> 
> There is a bunch of these in "xlog.c" as well, and the same code is used
> in "pg_resetwal.c".
> 
>> Patch that fixes these warnings is attached to this email.
> 
> My 0.02€:
> 
> I'm not at ease at putting the thing bluntly under the carpet with a cast.
> 
> Why not update the target type to "unsigned char" instead, so that no
> cast is needed and the value compatibility is checked by the compiler? I
> guess there would be some more changes (question is how much), but it
> would be cleaner.

There's been no discussion or new patch on this thread recently.  If you
are planning to address the issues raised please plan to do so by
Thursday, March 16th.

If no new patch is submitted by that date I will mark this patch
"Returned with Feedback".

Thanks,
-- 
-David
da...@pgmasters.net




Re: [HACKERS] [COMMITTERS] pgsql: Improve postmaster's logging of listen socket creation.

2017-03-13 Thread Tom Lane
Robert Haas  writes:
> So now on every startup I get this:

> 2017-03-13 10:08:49.399 EDT [90059] LOG:  listening on IPv6 address "::1"
> 2017-03-13 10:08:49.399 EDT [90059] LOG:  listening on IPv4 address 
> "127.0.0.1"
> 2017-03-13 10:08:49.400 EDT [90059] LOG:  listening on Unix address
> "/tmp/.s.PGSQL.5432"

> I think the idea that this is worth three lines of log output (out of
> a total of six) is hard to stomach.

You were in the minority before on the usefulness of this output, and
I think you still are.  Personally I've already found it useful to be
able to check that buildfarm runs are binding to (only) the addresses
they're supposed to.

Besides, unless someone has an objection to what I proposed in
<17211.1489189...@sss.pgh.pa.us> concerning getting rid of the
"MultiXact member wraparound protections are now enabled" message
in the normal case, we'll have saved three other lines relative
to where we were before, so that the net chattiness is the same.

regards, tom lane




Re: [HACKERS] Parallel seq. plan is not coming against inheritance or partition table

2017-03-13 Thread Amit Kapila
On Wed, Mar 8, 2017 at 6:58 PM, Robert Haas  wrote:
> On Wed, Mar 8, 2017 at 7:04 AM, Amit Kapila  wrote:
>> On Wed, Mar 8, 2017 at 8:28 AM, Robert Haas  wrote:
>>> On Tue, Mar 7, 2017 at 9:24 PM, Amit Kapila  wrote:
 I think it can give us benefit in
 such cases as well (especially when we have to discard rows based heap
 rows).  Now, consider it from another viewpoint, what if there are
 enough index pages (> min_parallel_index_scan_size) but not sufficient
 heap pages.  I think in such cases parallel index (only) scans will be
 beneficial especially because in the parallel index only scans
 heap_pages could be very less or possibly could be zero.
>>>
>>> That's a separate problem.  I think we ought to consider having an
>>> index-only scan pass -1 for the number of heap pages, so that the
>>> degree of parallelism in that case is limited only by the number of
>>> index pages.
>>
>> Sure, that sounds sensible, but even after that, I am not sure if for
>> plain index scans it is a good idea to not choose parallelism if the
>> number of heap pages is lesser than min_parallel_table_scan_size even
>> though the number of index pages is greater than
>> min_parallel_index_scan_size.  I think we can try out some tests
>> (maybe TPC-H queries where parallel index scan gets picked up) to see
>> what is right here.  Do you think it will be bad if just commit your
>> patch without this change and then consider changing it separately?
>
> No, I think that would be fine.  I think that we need to commit
> something like what I proposed because the earlier commit broke
> something that used to work.  That's got to get fixed.
>

Agreed, so I have rebased your patch and passed heap_pages as -1 for
index-only scans as discussed.  Also, Rafia has tested with the attached
patch that parallel index and parallel index-only scans are picked for
TPC-H queries.  I have also reviewed and tested your changes with
respect to the reported problem and found that they work as expected.  So,
I think we can go ahead with the attached patch unless you see some
problem with the changes I have made.

The only remaining open item for parallel index scans is a decision
about the storage parameter, which is being discussed on the parallel
index scan thread [1]; if you and/or others can share feedback there,
we can proceed on that aspect as well.


[1] - 
https://www.postgresql.org/message-id/44cd4d75-41f2-8e63-e204-1ecb09127fbf%40archidevsys.co.nz

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


compute-parallel-worker-fix.1.patch
Description: Binary data



Re: [HACKERS] Parallel Append implementation

2017-03-13 Thread Robert Haas
On Mon, Mar 13, 2017 at 7:46 AM, Robert Haas  wrote:
> On Mon, Mar 13, 2017 at 4:59 AM, Amit Khandekar  
> wrote:
>> I agree that we should preferably have the non-partial plans started
>> first. But I am not sure if it is really worth ordering the partial
>> plans by cost. The reason we ended up not keeping track of the
>> per-subplan parallel_worker, is because it would not matter  much ,
>> and we would just equally distribute the workers among all regardless
>> of how big the subplans are. Even if smaller plans get more worker,
>> they will finish faster, and workers would be available to larger
>> subplans sooner.
>
> Imagine that the plan costs are 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, and 10
> and you have 2 workers.
>
> If you move that 10 to the front, this will finish in 10 time units.
> If you leave it at the end, it will take 15 time units.

Oh, never mind.  You were only asking whether we should sort partial
plans.  That's a lot less important, and maybe not important at all.
The only consideration there is whether we might try to avoid having
the leader start in on a plan with a large startup cost.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] [COMMITTERS] pgsql: Improve postmaster's logging of listen socket creation.

2017-03-13 Thread Robert Haas
On Fri, Mar 10, 2017 at 4:32 PM, Tom Lane  wrote:
> Improve postmaster's logging of listen socket creation.
>
> When one of the kernel calls in the socket()/bind()/listen() sequence
> fails, include the specific address we're trying to bind to in the log
> message.  This greatly eases debugging of network misconfigurations.
>
> Also, after successfully setting up a listen socket, report its address
> in the log, to ease verification that the expected addresses were bound.
> There was some debate about whether to print this message at LOG level or
> only DEBUG1, but the majority of votes were for the former.
>
> Discussion: https://postgr.es/m/9564.1489091...@sss.pgh.pa.us

So now on every startup I get this:

2017-03-13 10:08:49.399 EDT [90059] LOG:  listening on IPv6 address "::1"
2017-03-13 10:08:49.399 EDT [90059] LOG:  listening on IPv4 address "127.0.0.1"
2017-03-13 10:08:49.400 EDT [90059] LOG:  listening on Unix address
"/tmp/.s.PGSQL.5432"

I think the idea that this is worth three lines of log output (out of
a total of six) is hard to stomach.  If we did logging this detailed
for everything that happens during startup, we could easily have forty
or fifty lines of log output just from starting the server.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] New CORRESPONDING clause design

2017-03-13 Thread Pavel Stehule
2017-03-13 14:13 GMT+01:00 Surafel Temesgen :

>
>
> On Sat, Mar 11, 2017 at 9:01 AM, Pavel Stehule 
> wrote:
>
>
>> I am sending minor update - cleaning formatting and white spaces, error
>> messages + few more tests
>>
>
> Thank you very much for your help
>
>
>> Maybe correspondingClause needs own node type with attached location.
>> Then context can be much better positioned.
>>
>
>
> I think we can solve it by using your option or using expr_list for
> corresponding column and check the syntax manually.
>
> In my opinion, the last option eliminate the introduction of new node for
> only the sake of error position.
>
>
> What did you think about the second option?
>

I don't like it too much - using an expr only for the location is too much
of a misuse.

Some errors are related to just CORRESPONDING without any columns, so using
an expr doesn't help there. A CORRESPONDING parse node can solve both issues.

Regards

Pavel

>
> Regards
>
> Surafel
>
>
>
>


Re: [HACKERS] Proposal : For Auto-Prewarm.

2017-03-13 Thread Mithun Cy
>> - The error handling loop around load_block() suggests that you're
>> expecting some reads to fail, which I guess is because you could be
>> trying to read blocks from a relation that's been rewritten under a
>> different relfilenode, or partially or entirely truncated.  But I
>> don't think it's very reasonable to just let ReadBufferWhatever() spew
>> error messages into the log and hope users don't panic.  People will
>> expect an automatic prewarm solution to handle any such cases quietly,
>> not bleat loudly in the log.  I suspect that this error-trapping code
>> isn't entirely correct, but there's not much point in fixing it; what
>> we really need to do is get rid of it (somehow).
>
> [Need Reelook] -- Debug and check if block load fails what will happen.

Oops, sorry, this was a note to myself. It is fixed now.

-- 
Thanks and Regards
Mithun C Y
EnterpriseDB: http://www.enterprisedb.com




Re: [HACKERS] Proposal : For Auto-Prewarm.

2017-03-13 Thread Mithun Cy
Hi all, thanks for the feedback. Based on your recent comments I have
implemented a new patch, which is attached below.

On Sun, Feb 26, 2017 at 10:16 PM, Robert Haas  wrote:
> This is not easy to fix.  The lock has to be taken based on the
> relation OID, not the relfilenode, but we don't have the relation OID
> in the dump file, and recording it there won't help, because the
> relfilenode can change under us if the relation is rewritten with
> CLUSTER or VACUUM FULL or relevant forms of ALTER TABLE.  I don't see
> a solution other than launching a separate worker for each database,
> which seems like it could be extremely expensive if there are many
> databases.  Also, I am pretty sure it's no good to take locks before
> recovery reaches a consistent state.  I'm not sure off-hand whether
> crash-recovery will notice conflicting locks, but even if it does,
> blocking crash recovery in order to prewarm stuff is bad news.

On Mon, Feb 27, 2017 at 7:18 PM, Peter Eisentraut
 wrote:
> You don't have to start all these workers at once.  Starting one and not
> starting the next one until the first one is finished should be fine.
> It will have the same serial behavior that the patch is proposing anyway.

I have now implemented similar logic. The prewarm bgworker will launch
one sub-worker per database in the dump file, and each sub-worker will
load its database's block info. Each sub-worker is launched only after
the previous one has finished. None of this starts until the database
has reached a consistent state.

I have also introduced two more utility functions that were requested earlier.
A. launch_autoprewarm_dump() RETURNS int4
This is a SQL-callable function to launch the autoprewarm worker, which
dumps the buffer pool information at regular intervals. Only one
autoprewarm worker can run per server, so if a worker sees another
existing worker it will exit immediately. The return value is the PID of
the worker that has been launched.

B. autoprewarm_dump_now() RETURNS int8
This is a SQL-callable function to dump the buffer pool information
immediately, once, from a backend. It can run in parallel with an
autoprewarm worker while the worker is dumping. The return value is the
number of block entries dumped.

I need some feedback on efficiently dividing the blocks among
sub-workers. The existing dump format would not be very useful for that.

I have chosen the following format. Each entry of block info looks like
this:  and we shall call it a BlockInfoRecord.

The contents of AUTOPREWARM_FILE have been formatted in such a way that
the BlockInfoRecords of each database can be given to a different
prewarm worker.

Format of AUTOPREWARM_FILE
===
[offset position of database map table]
[sorted BlockInfoRecords..]
[database map table]
===

The [database map table] is a sequence of file offsets, each pointing to
the first BlockInfoRecord of one database in the dump. The prewarm
worker will read these offsets one by one in sequence and ask its
sub-worker to seek to that position and then start loading the
BlockInfoRecords one by one, until it sees a BlockInfoRecord of a
database different from the one it is connected to. NOTE: We store
off_t values inside the file, so the dump file will not be portable
across systems where sizeof(off_t) differs.

I also thought of having one dump file per database. The problems I
thought could count against it are that there could be too many dump
files (including stale files for databases that are no longer in the
buffer pool), and that a dump file could get lost due to concurrent
dumps by the bgworker and the autoprewarm_dump_now() SQL utility.
For example, while implementing this, dump file names were chosen as
the number sequence 0,1,2,3..number_of_db (this helps avoid stale files
being chosen ahead of active database dump files). If two processes
dump concurrently there will be a race condition: if one page of a new
database is loaded into the buffer pool at that point, a recent dump of
that database's blocks could be overwritten due to the race and go
missing from the dumps, even though the database is still active in the
buffer pool. Any comments on how to fix this would be very helpful.

On Sun, Feb 26, 2017 at 10:16 PM, Robert Haas  wrote:
> Here are some other review comments (which may not matter unless we
> can think up a solution to the problems above).

> - I think auto_pg_prewarm.c is an unnecessarily long name.  How about
> autoprewarm.c?

Fixed; I am also trying to replace the term "auto pg_prewarm" to
"autoprewarm" in all relevant places.

> - It looks like you ran pgindent over this without adding the new
> typedefs to your typedefs.list, so things like the last line of each
> typedef is formatted incorrectly.

Fixed.

> - ReadBufferForPrewarm isn't a good name for 
