Re: [HACKERS] checkpointer continuous flushing

2015-08-09 Thread Fabien COELHO


Hello Heikki,

Thanks for having a look at the patch.

* I think we should drop the flush part of this for now. It's not as 
clearly beneficial as the sorting part, and adds a great deal more code 
complexity. And it's orthogonal to the sorting patch, so we can deal with it 
separately.


I agree that it is orthogonal and that the two features could be in 
distinct patches. The flush part is the first patch I really submitted 
because it has a significant effect on latency, and I was told to mix it 
with sorting...


The flushing part really helps to keep write stalls under control in 
many cases, for instance:


- 400-tps 1-client (or 4 for medium) max 100-ms latency

 options   | percent of late transactions
  flush | sort | tiny | small | medium
off |  off | 12.0 | 64.28 | 68.6
off |   on | 11.3 | 22.05 | 22.6
 on |  off |  1.1 | 67.93 | 67.9
 on |   on |  0.6 |  3.24 |  3.1

The percentage of late transactions is really the fraction of time the 
database is unreachable because of write stalls... So sorting without 
flushing is clearly not enough.
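For illustration, the benefit of sorting can be sketched outside PostgreSQL: ordering dirty buffers by (tablespace, relation, block) turns scattered writes into mostly sequential ones per file. The tuples below are hypothetical stand-ins, not the actual BufferSync() structures:

```python
# Sketch: sort dirty buffers so checkpoint writes hit each file sequentially.
# (tablespace, relation, block) tuples are illustrative stand-ins only.

def sort_dirty_buffers(buffers):
    """Order tuples by tablespace, then relation, then block number."""
    return sorted(buffers)  # lexicographic tuple order matches that key

dirty = [(1, 10, 7), (2, 5, 0), (1, 10, 2), (1, 3, 9), (2, 5, 1)]
print(sort_dirty_buffers(dirty))
# [(1, 3, 9), (1, 10, 2), (1, 10, 7), (2, 5, 0), (2, 5, 1)]
```

After sorting, consecutive writes land on adjacent blocks of the same relation, which is what lets the OS coalesce them.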


Another thing suggested by Andres is to fsync as early as possible, but 
this is not a simple patch because it intermixes things which are currently 
in distinct parts of checkpoint processing, so I already decided that this 
would be for another submission.


* Is it really necessary to parallelize the I/O among tablespaces? I can see 
the point, but I wonder if it makes any difference in practice.


I think that if someone bothers to set up tablespaces, there is no reason 
to defeat that effort behind her back. Without sorting you may hope that 
tablespaces will be touched randomly enough, but once buffers are sorted 
you can probably find cases where the checkpoint would write to one 
tablespace entirely and only then to the other.


So I think that it really should be kept.
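To illustrate the balancing concern, here is a minimal sketch of writing across tablespaces round-robin instead of draining one tablespace's sorted list before touching the next. It is illustrative only, not the patch's actual selection logic:

```python
# Sketch: interleave per-tablespace write lists round-robin so a sorted
# checkpoint keeps all tablespaces' disks busy. Illustrative only.

def interleave(per_tablespace):
    """Yield one buffer from each tablespace's sorted list in turn."""
    queues = [list(q) for q in per_tablespace]
    while any(queues):
        for q in queues:
            if q:
                yield q.pop(0)

ts_a = ["a1", "a2", "a3"]   # sorted buffers of tablespace A
ts_b = ["b1", "b2"]         # sorted buffers of tablespace B
print(list(interleave([ts_a, ts_b])))
# ['a1', 'b1', 'a2', 'b2', 'a3']
```

Each tablespace still receives its own buffers in sorted order; only the global schedule alternates between them.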

* Is there ever any harm in sorting the buffers? The GUC is useful for 
benchmarking, but could we leave it out of the final patch?


I think that the performance results show that it is basically always 
beneficial, so the GUC may be left out. However, on SSDs it is unclear to 
me whether it is just a waste of time or whether it helps, say with 
wear-leveling. Maybe best to keep it? Anyway, it is definitely needed for 
testing.


* Do we need to worry about exceeding the 1 GB allocation limit in 
AllocateCheckpointBufferIds? That's enough for 2 TB of shared_buffers. That's a 
lot, but it's not totally crazy these days that someone might do that. At the 
very least, we need to lower the maximum of shared_buffers so that you can't 
hit that limit.


Yep.

I ripped out the flushing part, keeping only the sorting. I refactored 
the logic in BufferSync() a bit. There's now a separate function,
nextCheckpointBuffer(), that returns the next buffer ID from the sorted 
list. The tablespace-parallelization behaviour is encapsulated there,


I do not understand the new tablespace-parallelization logic: there is no 
test on the tablespace of the buffer in the selection process... Note 
that I did write a proof for the one I put in, and also did some detailed 
testing on the side, because I'm always wary of proofs, especially mine :-)


I notice that you assume that tablespace numbers are always small and 
contiguous. Is that a fact? I felt more at ease relying on a 
hash table to avoid such an assumption.
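A hash table keyed by tablespace OID avoids any assumption about the OIDs being small or contiguous; here is a toy sketch, with a Python dict standing in for a dynahash table:

```python
# Sketch: group buffers by tablespace OID using a hash table, so no
# assumption is needed about OIDs being small or contiguous.

def group_by_tablespace(buffers):
    """Group (tablespace_oid, block) pairs into per-tablespace lists."""
    groups = {}
    for ts_oid, block in buffers:
        groups.setdefault(ts_oid, []).append(block)
    return groups

# 90001 is a deliberately sparse OID: a hash table handles it as
# cheaply as the default tablespace's 1663.
print(group_by_tablespace([(1663, 4), (90001, 1), (1663, 2)]))
# {1663: [4, 2], 90001: [1]}
```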


keeping the code in BufferSync() much simpler. See attached. Needs some 
minor cleanup and commenting still before committing, and I haven't done 
any testing besides a simple make check.


Hmmm..., just another detail, the patch does not sort:

  + if (checkpoint_sort && num_to_write > 1 && false)


I'll resubmit a patch with only the sorting part, and do the kind of 
restructuring you suggest, which is a good thing.


--
Fabien.


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] make check-world problem

2015-08-09 Thread Vladimir Koković

Hi,

PostgreSQL build failed with current GIT source.

tail make-out-dev.log
gcc -Wall -Wmissing-prototypes -Wpointer-arith  
-Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute  
-Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard  
-g3 -gdwarf-2 isolation_main.o pg_regress.o -L../../../src/port  
-L../../../src/common -Wl,--as-needed  
-Wl,-rpath,'/home/src/postgresql-devel/dev-install/lib',--enable-new-dtags   
-lpgcommon -lpgport -lxslt -lxml2 -lpam -lssl -lcrypto -lgssapi_krb5 -lz  
-lreadline -lrt -lcrypt -ldl -lm  -o pg_isolation_regress
PATH=/home/src/postgresql-devel/dev-build/tmp_install/home/src/postgresql-devel/dev-install/bin:$PATH  
LD_LIBRARY_PATH=/home/src/postgresql-devel/dev-build/tmp_install/home/src/postgresql-devel/dev-install/lib  
./pg_isolation_regress --temp-instance=./tmp_check  
--inputdir=/home/src/postgresql-devel/postgresql-git/postgresql/src/test/isolation  
--bindir=   
--schedule=/home/src/postgresql-devel/postgresql-git/postgresql/src/test/isolation/isolation_schedule

== creating temporary instance==
== initializing database system   ==
== starting postmaster==
running on port 57832 with PID 15786
== creating database isolationtest  ==
CREATE DATABASE
ALTER DATABASE
== running regression test queries==
test simple-write-skew... ok
test receipt-report   ... ok
test temporal-range-integrity ... ok
test project-manager  ... ok
test classroom-scheduling ... ok
test total-cash   ... ok
test referential-integrity... ok
test ri-trigger   ... ok
test partial-index... ok
test two-ids  ... ok
test multiple-row-versions... ok
test index-only-scan  ... ok
test fk-contention... ok
test fk-deadlock  ... ok
test fk-deadlock2 ... ok
test eval-plan-qual   ... ok
test lock-update-delete   ... ok
test lock-update-traversal... ok
test insert-conflict-do-nothing ... ok
test insert-conflict-do-update ... ok
test insert-conflict-do-update-2 ... ok
test insert-conflict-do-update-3 ... ok
test delete-abort-savept  ... ok
test delete-abort-savept-2... ok
test aborted-keyrevoke... ok
test multixact-no-deadlock... ok
test multixact-no-forget  ... ok
test propagate-lock-delete... ok
test tuplelock-conflict   ... ok
test nowait   ... ok
test nowait-2 ... ok
test nowait-3 ... ok
test nowait-4 ... ok
test nowait-5 ... ok
test skip-locked  ... ok
test skip-locked-2... ok
test skip-locked-3... ok
test skip-locked-4... ok
test brin-1   ... FAILED (test process exited with exit  
code 1)

test drop-index-concurrently-1 ... ok
test alter-table-1... ok
test alter-table-2... ok
test alter-table-3... ok
test create-trigger   ... ok
test timeouts ... ok
== shutting down postmaster   ==

===
 1 of 45 tests failed.
===

The differences that caused some tests to fail can be viewed in the
file  
/home/src/postgresql-devel/dev-build/src/test/isolation/regression.diffs.   
A copy of the test summary that you see
above is saved in the file  
/home/src/postgresql-devel/dev-build/src/test/isolation/regression.out.


Makefile:58: recipe for target 'check' failed
make[2]: *** [check] Error 1
make[2]: Leaving directory  
'/home/src/postgresql-devel/dev-build/src/test/isolation'

Makefile:28: recipe for target 'check-isolation-recurse' failed
make[1]: *** [check-isolation-recurse] Error 2
make[1]: Leaving directory '/home/src/postgresql-devel/dev-build/src/test'
GNUmakefile:69: recipe for target 'check-world-src/test-recurse' failed
make: *** [check-world-src/test-recurse] Error 2



cat  
/home/src/postgresql-devel/dev-build/src/test/isolation/results/brin-1.out

Parsed test spec with 2 sessions

starting permutation: s2check s1b s2b s1i s2summ s1c s2c s2check
setup failed: ERROR:  could not open extension control file  
/home/src/postgresql-devel/dev-build/tmp_install/home/src/postgresql-devel/dev-install/share/extension/pageinspect.control:  
No such file or directory



My build environment:
-
dev-build.sh:
#!/bin/bash

set -v
set -e

POSTGRESQL=/home/src/postgresql-devel
BUILD=dev-build

cd $POSTGRESQL
rm -rf $BUILD
mkdir $BUILD
chown postgres:postgres $BUILD
cd $POSTGRESQL/$BUILD
su -c "$POSTGRESQL/dev-build-postgres.sh" postgres

kdiff3 /home/src/pgadmin3-git/pgadmin3/pgadmin/pg_scanners/pg95/scan.l  
$POSTGRESQL/postgresql-git/postgresql/src/backend/parser/scan.l 
kdiff3  

Re: [HACKERS] make check-world problem

2015-08-09 Thread Michael Paquier
On Sun, Aug 9, 2015 at 5:00 PM, Vladimir Koković
vladimir.koko...@a-asoft.com wrote:
 starting permutation: s2check s1b s2b s1i s2summ s1c s2c s2check
 setup failed: ERROR:  could not open extension control file
 /home/src/postgresql-devel/dev-build/tmp_install/home/src/postgresql-devel/dev-install/share/extension/pageinspect.control:

Yes, this is a known issue caused by the recent commit
2834855c, which added a dependency on the contrib module
pageinspect in the isolation test you are seeing fail. I would
imagine that a fix is in the works as well, which will remove
this dependency on pageinspect.
-- 
Michael




Re: [HACKERS] Assert in pg_stat_statements

2015-08-09 Thread Robert Haas
On Sun, Aug 9, 2015 at 1:36 AM, Satoshi Nagayasu sn...@uptime.jp wrote:
 On 2015/08/08 22:32, Robert Haas wrote:
 On Sat, Aug 8, 2015 at 5:00 AM, Satoshi Nagayasu sn...@uptime.jp wrote:

 I just found that pg_stat_statements causes an assertion failure when
 queryId is set by another module loaded before pg_stat_statements in
 the shared_preload_libraries parameter.

 Theoretically, queryId in shared memory could be set by another
 module by design.

 So, IMHO, pg_stat_statements should not trigger an assertion even if
 queryId already has a non-zero value --- my module has this problem now.

 Is my understanding correct? Any comments?


 Seems like an ERROR would be more appropriate than an Assert.


 Well, it's not that simple, and I'm afraid that it may not fix
 the issue.

 Here, let's assume that we have two modules (foo and pg_stat_statements)
 which calculate, use and store Query->queryId independently.

 What we may see when the two are enabled is the following.

 (1) Enable two modules in shared_preload_libraries.

 shared_preload_libraries = 'foo,pg_stat_statements'

 (2) Once a query comes in, a callback function in module foo
 associated with post_parse_analyze_hook calculates and stores a
 queryId in Query->queryId.

 (3) After that, a callback function (pgss_post_parse_analyze) in
 pg_stat_statements associated with post_parse_analyze_hook
 detects that Query->queryId has a non-zero value, and the assertion fails.

 As a result, we cannot have two or more modules that use queryId
 at the same time.

 But I think it should be possible because one of the advantages of
 PostgreSQL is its extensibility.

 So, I think the problem here is that pg_stat_statements treats a
 non-zero value in queryId as an error, even if it is exactly the
 same value the other module computed.

I'm not too excited about supporting the use case where there are two
people using queryId but it just so happens that they always set
exactly the same value.  That seems like a weird setup.  Wouldn't that
mean both modules were applying the same jumble algorithm, and
therefore you're doing the work of computing the jumble twice per
query?  It would be better to have the second module have an option
not to compute the query ID and just do whatever else it does.  Then,
if you want to use that other module with pg_stat_statements, flip the
GUC.

The reason I think an Assert is wrong is that if there are other
modules using queryId, we should catch attempts to use the queryId for
more than one purpose even in non-assert enabled builds.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] statement_timeout affects query results fetching?

2015-08-09 Thread Robert Haas
On Sat, Aug 8, 2015 at 11:30 AM, Shay Rojansky r...@roji.org wrote:
 the entire row in memory (imagine rows with megabyte-sized columns). This
 makes sense to me; Tom, doesn't the libpq behavior you describe of absorbing
 the result set as fast as possible mean that a lot of memory is wasted on
 the client side?

It sure does.

 I can definitely appreciate the complexity of changing this behavior. I
 think that some usage cases (such as mine) would benefit from a timeout on
 the time until the first row is sent, this would allow to put an upper cap
 on stuff like query complexity, for example.

Unfortunately, it would not do any such thing.  It's possible for the
first row to be returned really really fast and then for an arbitrary
amount of time to pass and computation to happen before all the rows
are returned.  A plan can have a startup cost of zero and a run cost
of a billion (or a trillion).  This kind of scenario isn't even
particularly uncommon.  You just need a plan that looks like this:

Nested Loop
-> Nested Loop
   -> Nested Loop
      -> Seq Scan
      -> Index Scan
   -> Index Scan
-> Index Scan

You can just keep pushing more nested loop/index scans on there and
the first row will still pop out quite fast.  But if the seq-scanned
table is large, generating the whole result set can take a long, long
time.

Even worse, you could have a case like this:

SELECT somefunc(a) FROM foo;

Now suppose somefunc is normally very quick, but if a = 42 then it
does pg_sleep() in a loop until the world ends.   You're going to have
however many rows of foo with a != 42 pop out near-instantaneously,
and then it will go into the tank and not come out until the meaning
of life, the universe, and everything is finally revealed.

That second case is a little extreme, and a little artificial, but the
point is this: just as you don't want the client to have to buffer the
results in memory, the server doesn't either.  It's not the case that
the server computes the rows and sends them to you.  Each one is sent
to you as it is computed, and in the normal case, at the time the
first row is sent, only a small percentage of the total work of the
query has been performed.
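The point that rows stream out as they are computed can be made concrete with a small sketch, with a generator standing in for the executor (names and per-row costs are invented for illustration):

```python
# Sketch of why a time-to-first-row cap does not bound total query cost:
# rows stream out as they are produced, so the first row can arrive almost
# instantly while the bulk of the work happens afterwards.
import time

def slow_scan(n, per_row_cost=0.0):
    """Yield rows one at a time, simulating per-row executor work."""
    for i in range(n):
        time.sleep(per_row_cost)  # the work happens per row, after rows start flowing
        yield i

rows = slow_scan(1000)
first = next(rows)                # arrives after only one row's worth of work
remaining = sum(1 for _ in rows)  # the other 999 rows carry the bulk of the cost
print(first, remaining)           # 0 999
```

A timeout measured at the first `next()` call would never see the cost of draining the rest of the generator.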

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] cache invalidation skip logic

2015-08-09 Thread Robert Haas
On Thu, Aug 6, 2015 at 2:19 PM, Qingqing Zhou zhouqq.postg...@gmail.com wrote:
 In cache invalidation logic, we have the following comment:

 /*
 * Now that we have the lock, check for invalidation messages, so that we
 * will update or flush any stale relcache entry before we try to use it.
 * RangeVarGetRelid() specifically relies on us for this.  We can skip
 * this in the not-uncommon case that we already had the same type of lock
 * being requested, since then no one else could have modified the
 * relcache entry in an undesirable way.  (In the case where our own xact
 * modifies the rel, the relcache update happens via
 * CommandCounterIncrement, not here.)
 */
 if (res != LOCKACQUIRE_ALREADY_HELD)
AcceptInvalidationMessages();

 It is true that after we hold the lock nobody will further modify it, but
 there could be some left-over invalidation messages we should accept before
 we can continue. This can be demonstrated with the following invalidation
 sequence:
 sequence:
  {
 1: inval A;
 2: inval B;
 ...;
 10: inval pg_class
 }

 After step 10, another session may encounter a lock and replays this sequence:

 step 1:  RelationBuildDesc(A), it heap_open(pg_class),
  pg_class lock not acquired yet, so it acquires the lock and
  recursively replay the sequence, goto step 2.
 step 2:
  RelationBuildDesc(B), it heap_open(pg_class),
  but this time we already have LOCKACQUIRE_ALREADY_HELD on
  pg_class, so we skip the invalidation processing and access a stale pg_class.

 A user may end up with a "could not open file ..." error.

 Is above sequence possible?

In step 1, AcceptInvalidationMessages() should process all pending
invalidation messages.  So if step 2 did AcceptInvalidationMessages()
again it would be a no-op, because no messages should remain at that
point.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Freeze avoidance of very large table.

2015-08-09 Thread Robert Haas
On Thu, Aug 6, 2015 at 11:33 AM, Jim Nasby jim.na...@bluetreble.com wrote:
 They also provide a level of control over what is and isn't installed in a
 cluster. Personally, I'd prefer that most users not even be aware of the
 existence of things like pageinspect.

+1.

If everybody feels that moving extensions currently stored in contrib
into src/extensions is going to help us somehow, then, uh, OK.  I
can't work up any enthusiasm for that, but I can live with it.

However, I think it's affirmatively bad policy to say that we're going
to put all of our debugging facilities into core because otherwise
some people might not have them installed.  That's depriving users of
the ability to control their environment, and there are good reasons
for some people to want those things not to be installed.  If we
accept the argument it inconveniences hacker X when Y is not
installed as a reason to put Y in core, then we can justify putting
anything at all into core.  And I don't think that's right at all.
Extensions are a useful packaging mechanism for functionality that is
useful but not required, and debugging facilities are definitely very
useful but should not be required.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] WIP: SCRAM authentication

2015-08-09 Thread Josh Berkus
On 08/09/2015 08:09 AM, Robert Haas wrote:
 Why do we need to be able to authenticate using more than one
 mechanism?  If you have some clients that can't support SCRAM yet, you
 might as well continue using MD5 across the board until that changes.
 You're not going to get much real security out of using MD5 for some
 authentication attempts and SCRAM for other ones, 

Speaking as someone who has shepherded several clients through
infrastructure upgrades, I have to disagree with this.

First, people don't upgrade large infrastructures with multiple
applications, ETL processes and APIs which connect with the database all
at once.  They do it one component at a time, verify that component is
working, and then move on to the next one.  Even within a single
application, there could be many servers to upgrade, and you can't do
them all simultaneously.

Now, for shops where they've had the foresight to set up group roles
which own objects so that a new user with SCRAM can be assigned in the
group role, this is no problem.  But for the other 98% of our large-app
users, setting up that kind of infrastructure would itself require a
weekend-long downtime, due to the locking required to reassign object
permissions and all of the app testing required.

Second, you're forgetting hosted PostgreSQL, where there may be only one
user available to each database owner.  So assigning a new login role
for SCRAM isn't even an option.

Plus all of the above requires that some login roles have a SCRAM
verifier, and others have MD5, for some period.  Even if we don't
support multiple verifiers for one login, that still means we need to
deal with what verifier gets created for a new role and the required
support functions and GUCs for that.  Switching across the board on a
per-installation basis is a complete nonstarter for any running application.

Frankly, switching on a per-postmaster basis isn't even worth discussing
in my book, because some languages/platforms will take years longer than
others to support SCRAM.

Overall, it's to the PostgreSQL project's benefit to have users switch
to SCRAM once we have it available.  For that reason, we should try to
make it easy for them to make the switch.

However ...

 and the amount of
 infrastructure we're proposing to introduce to support that is pretty
 substantial.

... during my exchange with Michael, I was thinking about the bug
potential of taking the password field and multiplexing it in some way,
which is significant.  There is a definite risk of making this too
complicated and we'll need to contrast that against ease-of-migration,
because complicated mechanisms tend to be less secure due to user error.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com




Re: [HACKERS] [patch] A \pivot command for psql

2015-08-09 Thread Joe Conway

On 08/09/2015 10:37 AM, Tom Lane wrote:
 I can see the value of a feature like this, but doing it in psql
 sure seems like the wrong place.  It would be unavailable to
 anything except interactive use.
 
 Is there a way to implement pivoting as a set-returning function?

That's exactly what crosstab() in the tablefunc contrib module is.
I've heard rumblings of someone looking to build crosstab/pivot into
the core grammar, but have not yet seen any patches...

Joe

-- 
Joe Conway
Crunchy Data
Enterprise PostgreSQL
http://crunchydata.com




Re: [HACKERS] WIP: SCRAM authentication

2015-08-09 Thread Robert Haas
On Sat, Aug 8, 2015 at 1:23 PM, Heikki Linnakangas hlinn...@iki.fi wrote:
 On 08/08/2015 04:27 PM, Robert Haas wrote:
 I don't see that there's any good reason to allow the same password to
 be stored in the catalog encrypted more than one way,

 Sure there is. If you want to be able to authenticate using different
 mechanism, you need the same password encrypted in different ways. SCRAM
 uses verifier that's derived from the password in one way, MD5
 authentication needs an MD5 hash, and yet other protocols have other
 requirements.

Why do we need to be able to authenticate using more than one
mechanism?  If you have some clients that can't support SCRAM yet, you
might as well continue using MD5 across the board until that changes.
You're not going to get much real security out of using MD5 for some
authentication attempts and SCRAM for other ones, and the amount of
infrastructure we're proposing to introduce to support that is pretty
substantial.

 and I don't think there's any good reason to introduce the PASSWORD
 VERIFIER terminology.  I think we should store (1) your password,
 either encrypted or unencrypted; and (2) the method used to encrypt
 it.  And that's it.


 Like Joe and Stephen, I actually find it highly confusing that we call the
 MD5 hash an "encrypted password". The term "password verifier" is fairly
 common in the specifications of authentication mechanisms. I think we should
 adopt it.

OK, but it sure looked from Michael's syntax description like you
wrote PASSWORD VERIFIER (md5 'the_actual_password').   Or at least
that was my impression from reading it, maybe I got it wrong.  If you
want to introduce ALTER USER ... PASSWORD VERIFIER as alternative
syntax for what we now call ALTER USER ... ENCRYPTED PASSWORD, that
works for me.  But a plaintext password shouldn't be called a password
verifier under the terminology you are using here, IIUC.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] [patch] A \pivot command for psql

2015-08-09 Thread Tom Lane
Daniel Verite dan...@manitou-mail.org writes:
 I want to suggest a client-side \pivot command in psql, implemented
 in the attached patch.

 \pivot takes the current query in the buffer, executes it, and
 displays it pivoted by interpreting the result as:

 column1 = row in pivoted output
 column2 = column in pivoted output
 column3 = value at (row,column) in pivoted output

I can see the value of a feature like this, but doing it in psql sure
seems like the wrong place.  It would be unavailable to anything
except interactive use.

Is there a way to implement pivoting as a set-returning function?

regards, tom lane




[HACKERS] [patch] A \pivot command for psql

2015-08-09 Thread Daniel Verite
  Hi,

I want to suggest a client-side \pivot command in psql, implemented
in the attached patch.

\pivot takes the current query in the buffer, executes it, and
displays it pivoted by interpreting the result as:

column1 = row in pivoted output
column2 = column in pivoted output
column3 = value at (row,column) in pivoted output

The advantage over the server-side crosstab() from contrib is ease
of use, mostly because \pivot doesn't need the list of columns that
result from the pivoting in advance.

Typical use cases are queries that produce values
along two axes:
 SELECT a,b,aggr(something) FROM tbl GROUP BY a,b;
\pivot immediately displays the matrix-like representation

Or displaying pairing between columns, as in
 SELECT a,b,'X' FROM tblA [LEFT|RIGHT|FULL] JOIN tblB...
which once pivoted shows in an easily readable way
what a is/isn't in relation with any b.

Columns are sorted with strcmp(). I think a more adequate sort could
be obtained through a separate query with ORDER BY just for these
values (cast to their original type), but the patch doesn't do that
yet.
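For what it's worth, the row/column/value interpretation is easy to state as a sketch (illustrative Python, not the psql C implementation):

```python
# Sketch of the \pivot interpretation: column1 -> row label,
# column2 -> column label, column3 -> cell value. Illustrative only.

def pivot(rows):
    """Pivot (row, col, value) triples into a matrix keyed by labels."""
    col_names = sorted({c for _, c, _ in rows})   # strcmp-like lexical sort
    row_names = sorted({r for r, _, _ in rows})
    table = {r: {c: "" for c in col_names} for r in row_names}
    for r, c, v in rows:
        table[r][c] = v
    return col_names, table

cols, table = pivot([("r1", "b", "X"), ("r1", "a", "Y"), ("r2", "a", "Z")])
print(cols)          # ['a', 'b']
print(table["r2"])   # {'a': 'Z', 'b': ''}
```

Note that, like the patch, the sketch needs no advance knowledge of the output columns: they are discovered from the result set itself.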

Also, \pivot could optionally take the query as an argument instead
of getting it only from the query buffer.

Anyway, does it look useful enough to be part of psql? 
I guess I should push this to commitfest if that's the case.

Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite
diff --git a/src/bin/psql/Makefile b/src/bin/psql/Makefile
index f1336d5..7e1ac24 100644
--- a/src/bin/psql/Makefile
+++ b/src/bin/psql/Makefile
@@ -23,7 +23,7 @@ override CPPFLAGS := -I. -I$(srcdir) -I$(libpq_srcdir) -I$(top_srcdir)/src/bin/p
 OBJS=	command.o common.o help.o input.o stringutils.o mainloop.o copy.o \
 	startup.o prompt.o variables.o large_obj.o print.o describe.o \
 	tab-complete.o mbprint.o dumputils.o keywords.o kwlookup.o \
-	sql_help.o \
+	sql_help.o pivot.o \
 	$(WIN32RES)
 
 
diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c
index 6181a61..e348028 100644
--- a/src/bin/psql/command.c
+++ b/src/bin/psql/command.c
@@ -48,6 +48,7 @@
 #include "psqlscan.h"
 #include "settings.h"
 #include "variables.h"
+#include "pivot.h"
 
 /*
  * Editable database object types.
@@ -1083,6 +1084,12 @@ exec_command(const char *cmd,
 		free(pw2);
 	}
 
+	/* \pivot -- execute a query and show pivoted results */
+	else if (strcmp(cmd, "pivot") == 0)
+	{
+		success = doPivot(query_buf);
+	}
+
 	/* \prompt -- prompt and set variable */
	else if (strcmp(cmd, "prompt") == 0)
 	{
diff --git a/src/bin/psql/pivot.c b/src/bin/psql/pivot.c
new file mode 100644
index 000..cb8ffdf
--- /dev/null
+++ b/src/bin/psql/pivot.c
@@ -0,0 +1,213 @@
+/*
+ * psql - the PostgreSQL interactive terminal
+ *
+ * Copyright (c) 2000-2015, PostgreSQL Global Development Group
+ *
+ * src/bin/psql/pivot.c
+ */
+
+#include "common.h"
+#include "pqexpbuffer.h"
+#include "settings.h"
+#include "pivot.h"
+
+#include <string.h>
+
+static int
+headerCompare(const void *a, const void *b)
+{
+	return strcmp( ((struct pivot_field*)a)->name,
+   ((struct pivot_field*)b)->name);
+}
+
+static void
+accumHeader(char* name, int* count, struct pivot_field **sorted_tab, int row_number)
+{
+	struct pivot_field *p;
+
+	/*
+	 * Search for name in sorted_tab. If it doesn't exist, insert it,
+	 * otherwise do nothing.
+	 */
+
+	if (*count >= 1)
+	{
+		p = (struct pivot_field *) bsearch(&name,
+			 *sorted_tab,
+			 *count,
+			 sizeof(struct pivot_field),
+			 headerCompare);
+	}
+	else
+		p=NULL;
+
+	if (!p)
+	{
+		*sorted_tab = pg_realloc(*sorted_tab, sizeof(struct pivot_field) * (1+*count));
+		(*sorted_tab)[*count].name = name;
+		(*sorted_tab)[*count].rank = *count;
+		(*count)++;
+
+		qsort(*sorted_tab,
+			  *count,
+			  sizeof(struct pivot_field),
+			  headerCompare);
+	}
+}
+
+static void
+printPivotedTuples(const PGresult *results,
+   int num_columns,
+   struct pivot_field *piv_columns,
+   int num_rows,
+   struct pivot_field *piv_rows)
+{
+	printQueryOpt popt = pset.popt;
+	printTableContent cont;
+	int	i, j, k, rn;
+
+	popt.title = _("Pivoted query results");
+	printTableInit(&cont, &popt.topt, popt.title,
+   num_columns+1, num_rows);
+
+	/* Pivoting keeps the name of the first column */
+	printTableAddHeader(&cont, PQfname(results, 0),
+		popt.translate_header, 'l');
+
+	for (i = 0; i < num_columns; i++)
+	{
+		printTableAddHeader(&cont, piv_columns[i].name,
+			popt.translate_header, 'l');
+	}
+
+	/* Set row names in the first output column */
+	for (i = 0; i < num_rows; i++)
+	{
+		k = piv_rows[i].rank;
+		cont.cells[k*(num_columns+1)] = piv_rows[i].name;
+		for (j = 0; j < num_columns; j++)
+			cont.cells[k*(num_columns+1)+j+1] = "";
+	}
+	cont.cellsadded = num_rows * (num_columns+1);
+
+	/* Set all the inside cells */
+	for (rn = 0; rn < PQntuples(results); rn++)
+	{
+		char* row_name;
+		char* col_name;
+		int row_number;
+		int col_number;
+		struct pivot_field *p;
+
+		row_number = col_number = -1;
+		/* Find target row */
+		if 

Re: [HACKERS] tap tests remove working directories

2015-08-09 Thread Michael Paquier
On Sun, Aug 9, 2015 at 1:40 AM, Andrew Dunstan and...@dunslane.net wrote:

 On 08/08/2015 09:31 AM, Robert Haas wrote:

 On Fri, Aug 7, 2015 at 7:17 PM, Andrew Dunstan and...@dunslane.net
 wrote:

 That certainly isn't what happens, and given the way this is done in
 TestLib.pm, using the CLEANUP parameter of File::Temp's tempdir()
 function,
 it's not clear how we could do that easily.

 <shot-in-the-dark>

 Set cleanup to false and manually remove the directory later in the
 code, so that stuff runs only if we haven't died sooner?

 </shot-in-the-dark>

 Yeah, maybe. I'm thinking of trying to do it more globally, like in
 src/Makefile.global.in. That way we wouldn't have to add new code to every
 test file.

If we rely on END to clean up the temporary data folder, there is no
need to touch each test file; the test functions called in TestLib.pm
could just set a flag to skip the cleanup at the end of the run should
an unexpected result be found. Now I am also curious to see what you
have in mind for manipulating Makefile.global.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Assert in pg_stat_statements

2015-08-09 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 I'm not too excited about supporting the use case where there are two
 people using queryId but it just so happens that they always set
 exactly the same value.  That seems like a weird setup.  Wouldn't that
 mean both modules were applying the same jumble algorithm, and
 therefore you're doing the work of computing the jumble twice per
 query?

Not only that, but you'd need to keep the two modules in sync, which
would be one of the greatest recipes for bugs we've thought of yet.

If there's actually a use case for that sort of thing, I would vote
for moving the jumble-calculation code into core and making both of
the interested extensions do

/* Calculate query hash if it's not been done yet */
if (query->queryId == 0)
    calculate_query_hash(query);

I note also that pg_stat_statements is using query->queryId == 0 as a
state flag, which means that if some other module has already calculated
the hash that would break it, even if the value were identical.
(In particular, it's using the event of calculating the queryId as an
opportunity to possibly make an entry in its own hashtable.)  So you'd
need to do something to replace that logic if you'd like to use
pg_stat_statements concurrently with some other use of queryId.
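The compute-once pattern Tom sketches, and the hazard of overloading the zero sentinel as a state flag, can be illustrated outside C; a hypothetical Python sketch (names and the stand-in "jumble" are not the pg_stat_statements code):

```python
# Compute the hash only if nobody has filled it in yet.  A module that also
# hooks side effects onto "I was the one who computed it" silently loses
# those side effects when another module computes the hash first.
import hashlib

def calculate_query_hash(query):
    """Fill query['queryId'] once; return True only if we computed it."""
    if query['queryId'] == 0:
        jumble = query['text'].encode()  # stand-in for the real jumble
        # 'or 1' keeps the sentinel value 0 reserved for "not computed"
        query['queryId'] = int.from_bytes(
            hashlib.sha256(jumble).digest()[:8], 'big') or 1
        return True   # safe place to hook "new entry" side effects
    return False      # someone else already did; our hook never fires

q = {'text': 'SELECT 1', 'queryId': 0}
print(calculate_query_hash(q))  # True: first caller computes
print(calculate_query_hash(q))  # False: a second module sees it as done
```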

 The reason I think an Assert is wrong is that if there are other
 modules using queryId, we should catch attempts to use the queryId for
 more than one purpose even in non-assert enabled builds.

Yeah, an explicit runtime check and elog() would be safer.

regards, tom lane




Re: [HACKERS] Assert in pg_stat_statements

2015-08-09 Thread Peter Geoghegan
On Sun, Aug 9, 2015 at 10:23 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 If there's actually a use case for that sort of thing, I would vote
 for moving the jumble-calculation code into core

I think that there'd be a good case for doing that for several other
reasons. It would be great to have a per-queryId
log_min_duration_statement, for example.

Most likely, this would work based on a system of grouping queries.
One grouping might be SLA queries, that the user hopes to prevent
ever hobbling loading the application's home page. Outliers in this
group of queries are far more interesting to the user than any other,
or expectations are in some other way quite different from the
average. At the same time, maybe for reporting queries 60 seconds is
considered an unreasonably long time to wait.

-- 
Peter Geoghegan




Re: [HACKERS] Precedence of standard comparison operators

2015-08-09 Thread Tom Lane
Noah Misch n...@leadboat.com writes:
 On Sun, Aug 09, 2015 at 07:16:02PM -0400, Tom Lane wrote:
 Noah Misch n...@leadboat.com writes:
 It does risk that.  Same deal with making = have the same precedence as <
 instead of keeping it slightly lower.

 Agreed, but in that case I think our hand is forced by the SQL standard.

In SQL:2008 and SQL:2011 at least, =, < and BETWEEN are all in the same
 boat.  They have no precedence relationships to each other; SQL sidesteps the
 question by requiring parentheses.  They share a set of precedence
 relationships to other constructs.  SQL does not imply whether to put them in
 one %nonassoc precedence group or in a few, but we can contemplate whether
 users prefer an error or prefer the 9.4 behavior for affected queries.

Part of my thinking was that the 9.4 behavior fails the principle of least
astonishment, because I seriously doubt that people expect '=' to be
either right-associative or lower priority than '<'.  Here's one example:

regression=# select false = true < false;
 ?column? 
--
 t
(1 row)

Not only does that seem unintuitive, but I actually had to experiment
a bit before finding a combination of values in which I got a different
result from what you'd expect if you think the precedence is (x = y) < z.
So it's not hard to imagine that somebody might write a query thinking
that that's how it works, and even have it get through desultory testing
before silently giving unexpected answers in production.
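The two candidate parses can be evaluated side by side; a small illustrative sketch (Python is used only because its booleans order False < True the same way, and this is not PostgreSQL code):

```python
# Sketch: the two possible readings of "false = true < false".

def parse_94(x, y, z):
    # 9.4 grammar: '=' binds looser than '<', giving x = (y < z)
    return x == (y < z)

def parse_expected(x, y, z):
    # the "least astonishment" reading: (x = y) < z
    return (x == y) < z

print(parse_94(False, True, False))        # True: matches the 't' output
print(parse_expected(False, True, False))  # False: what most readers expect
```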

So yeah, I do think that getting a syntax error if you don't use
parentheses is the preferable behavior here.

regards, tom lane




Re: [HACKERS] Precedence of standard comparison operators

2015-08-09 Thread Noah Misch
On Thu, Feb 19, 2015 at 10:48:34AM -0500, Tom Lane wrote:
 To wit, that the precedence of <= >= and <> is neither sane nor standards
 compliant.

 I claim that this behavior is contrary to spec as well as being
 unintuitive.  Following the grammar productions in SQL99:

Between 1999 and 2006, SQL changed its representation of the grammar in this
area; I have appended to this message some of the key productions as of 2013.
I did not notice a semantic change, though.

 We have that right for = < > but not for the other three standard-mandated
 comparison operators.  I think we should change the grammar so that all
 six act like < > do now, that is, they should have %nonassoc precedence
 just above NOT.
 
 Another thought, looking at this closely, is that we have the precedence
 of IS tests (IS NOT NULL etc) wrong as well: they should bind less tightly
 than user-defined ops, not more so.

SQL has two groups of IS tests with different precedence.  The <boolean test>
productions IS [NOT] {TRUE | FALSE | UNKNOWN} have precedence just lower than
<, and the <null predicate> productions IS [NOT] NULL have precedence equal
to <.  (An implementation giving them the same precedence can conform,
because conforming queries cannot notice the difference.)

I attempted to catalog the diverse precedence changes in commit c6b3c93:

 @@ -647,13 +654,11 @@ static Node *makeRecursiveViewSelect(char *relname,
 List *aliases, Node *query);
  %left		OR
  %left		AND
  %right		NOT
 -%right		'='
 -%nonassoc	'<' '>'
 -%nonassoc	LIKE ILIKE SIMILAR
 -%nonassoc	ESCAPE
 +%nonassoc	IS ISNULL NOTNULL	/* IS sets precedence for IS NULL, etc */
 +%nonassoc	'<' '>' '=' LESS_EQUALS GREATER_EQUALS NOT_EQUALS
 +%nonassoc	BETWEEN IN_P LIKE ILIKE SIMILAR NOT_LA
 +%nonassoc	ESCAPE		/* ESCAPE must be just above LIKE/ILIKE/SIMILAR */
  %nonassoc	OVERLAPS
 -%nonassoc	BETWEEN
 -%nonassoc	IN_P
  %left		POSTFIXOP	/* dummy for postfix Op rules */
  /*
   * To support target_el without AS, we must give IDENT an explicit priority
 @@ -678,9 +683,6 @@ static Node *makeRecursiveViewSelect(char *relname, List
 *aliases, Node *query);
  %nonassoc	UNBOUNDED	/* ideally should have same precedence as IDENT */
  %nonassoc	IDENT NULL_P PARTITION RANGE ROWS PRECEDING FOLLOWING
  %left		Op OPERATOR	/* multi-character ops and user-defined operators */
 -%nonassoc	NOTNULL
 -%nonassoc	ISNULL
 -%nonassoc	IS		/* sets precedence for IS NULL, etc */
  %left		'+' '-'
  %left		'*' '/' '%'
  %left		'^'

1. Decrease precedence of <=, >= and <> to match <.

2. Increase precedence of, for example, BETWEEN x AND Y to match precedence
   with BETWEEN keyword instead of AND keyword.  Make similar precedence
   changes to other multiple-keyword productions involving AND, NOT, etc.

3. Decrease precedence of IS [NOT] {TRUE | FALSE | UNKNOWN} to fall between
   NOT and <.

4. Decrease precedence of IS [NOT] NULL and IS[NOT]NULL to match IS [NOT]
   {TRUE | FALSE | UNKNOWN}.

5. Forbid chains of = (make it nonassoc), and increase its precedence to
   match <.

6. Decrease precedence of BETWEEN and IN keywords to match LIKE.

 It's
 definitely weird that the IS tests bind more tightly than multicharacter
 Ops but less tightly than + - * /.

(1), (2) and (3) improve SQL conformance, and that last sentence seems to
explain your rationale for (4).  I've been unable to explain (5) and (6).  Why
in particular the following three precedence groups instead of combining them
as in SQL or subdividing further as in PostgreSQL 9.4?

 +%nonassoc	'<' '>' '=' LESS_EQUALS GREATER_EQUALS NOT_EQUALS
 +%nonassoc	BETWEEN IN_P LIKE ILIKE SIMILAR NOT_LA

  %nonassoc	OVERLAPS

Thanks,
nm


<comparison predicate> ::=
  <row value predicand> <comparison predicate part 2>

<comparison predicate part 2> ::=
  <comp op> <row value predicand>

<row value predicand> ::=
    <row value special case>
  | <row value constructor predicand>

# Syntax Rules
# 1) The declared type of a <row value special case> shall be a row type.
<row value special case> ::=
  <nonparenthesized value expression primary>

# Things with precedence higher than comparison.
<row value constructor predicand> ::=
    <common value expression>
  | <boolean predicand>
  | <explicit row value constructor>

# numeric: addition, multiplication
# string: concat, collate clause
# datetime: addition, AT TIME ZONE
# interval: addition, division
# UDT: value expression primary
# reference: value expression primary
# collection: array/multiset
<common value expression> ::=
    <numeric value expression>
  | <string value expression>
  | <datetime value expression>
  | <interval value expression>
  | <user-defined type value expression>
  | <reference value expression>
  | <collection value expression>

<boolean predicand> ::=
    <parenthesized boolean value 

Re: [HACKERS] Precedence of standard comparison operators

2015-08-09 Thread Tom Lane
I wrote:
 Noah Misch n...@leadboat.com writes:
 Why in particular the following three precedence groups instead of
 combining them as in SQL or subdividing further as in PostgreSQL 9.4?

 +%nonassoc	'<' '>' '=' LESS_EQUALS GREATER_EQUALS NOT_EQUALS
 +%nonassoc	BETWEEN IN_P LIKE ILIKE SIMILAR NOT_LA
 %nonassoc	OVERLAPS

 OVERLAPS is a special case in that it doesn't really need precedence at
 all: both its arguments are required to be parenthesized.  We could
 possibly have removed it from the precedence hierarchy altogether, but
 I didn't bother experimenting with that, just left it alone.  But
 because of that, moving BETWEEN/IN below it doesn't really change
 anything.

I got off my rear and did the experiment, and found that indeed simply
removing %nonassoc OVERLAPS seems to work and not change any behavior
(in fact, the generated gram.c is identical).  Shall we do that and remove
the listing of OVERLAPS in the documentation's precedence table?

regards, tom lane




Re: [HACKERS] WIP: SCRAM authentication

2015-08-09 Thread Stephen Frost
* Sehrope Sarkuni (sehr...@jackdb.com) wrote:
 It'd be nice if the new auth mechanism supports multiple passwords in the
 same format as well (not just one per format).
 
 That way you could have two different passwords for a user that are active
 at the same time. This would simplify rolling database credentials as it
 wouldn't have to be done all at once. You could add the new credentials,
 update your app servers one by one, then disable the old ones.
 
 A lot of systems that use API keys let you see the last time a particular
 set of keys was used. This helps answer the Is this going to break
 something if I disable it? question. Having a last used at timestamp for
 each auth mechanism (per user) would be useful.

Excellent points and +1 to all of these ideas from me.
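Sehrope's rolling-credentials idea can be sketched in miniature; an illustrative Python model (class and field names are hypothetical, and a real implementation would compare SCRAM verifiers, not plaintext secrets):

```python
# Several active credentials per role, each with a last-used timestamp to
# answer "is it safe to disable this one?".
from datetime import datetime, timezone

class CredentialSet:
    def __init__(self):
        self.creds = {}  # name -> {'secret': ..., 'last_used_at': ...}

    def add(self, name, secret):
        self.creds[name] = {'secret': secret, 'last_used_at': None}

    def verify(self, secret):
        for cred in self.creds.values():
            if cred['secret'] == secret:  # real code: SCRAM verifier check
                cred['last_used_at'] = datetime.now(timezone.utc)
                return True
        return False

    def remove(self, name):
        del self.creds[name]

creds = CredentialSet()
creds.add('old', 's3cret')
creds.add('new', 'n3wsecret')   # both active during the rollover
creds.verify('s3cret')          # app server not yet migrated still works
creds.remove('old')             # disable once 'old' last_used_at goes stale
```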

 I'm not sure how updates should work when connecting to a read-only slave
 though. It would need some way of letting the master know that user X
 connected using credentials Y.

That wouldn't be all that hard to add to the protocol..

What would be nice also would be to include slave connections in
pg_stat_activity, so you could figure out what transaction on what slave
is causing your master to bloat...  And then if we could send signals
from the master to those processes, it'd be even nicer..

Thanks!

Stephen




Re: [HACKERS] [COMMITTERS] pgsql: Fix pg_dump to dump shell types.

2015-08-09 Thread Tom Lane
Andrew Dunstan and...@dunslane.net writes:
 Still not quite there. If either 9.0 or 9.1 is upgraded to 9.2 or later, 
 they fail like this:

 pg_restore: creating TYPE public.myshell
 pg_restore: setting owner and privileges for TYPE public.myshell
 pg_restore: setting owner and privileges for ACL public.myshell
 pg_restore: [archiver (db)] Error from TOC entry 4293; 0 0 ACL
 myshell buildfarm
 pg_restore: [archiver (db)] could not execute query: ERROR:  type
 myshell is only a shell
  Command was: REVOKE ALL ON TYPE myshell FROM PUBLIC;
 REVOKE ALL ON TYPE myshell FROM buildfarm;
 GRANT ALL ON TYPE myshell TO PUBL...

 We could just declare that we don't support this for versions older than 
 9.2, in which case I would just remove the type from the test database 
 before testing cross-version upgrade. But if it's easily fixed then 
 let's do it.

It looks to me like the reason for this is that pg_dump forces the
typacl of a type to be '{=U}' when reading the schema data for a
pre-9.2 type, rather than reading it as NULL (ie default permissions)
which would result in not printing any grant/revoke commands for
the object.

I do not see a good reason for that; quite aside from this problem,
it means there is one more place that knows the default permissions
for a type than there needs to be.  Peter, what was the rationale?

regards, tom lane




Re: [HACKERS] [patch] A \pivot command for psql

2015-08-09 Thread Daniel Verite
Tom Lane wrote:

 Is there a way to implement pivoting as a set-returning function?

Not with the same ease of use. We have crosstab functions
in contrib/tablefunc already, but the killer problem with PIVOT
is that truly dynamic columns are never reachable directly.

If we could do this:

  SELECT * FROM crosstab('select a,b,c FROM tbl');

and the result came back pivoted, with a column for each distinct value
of b, there would be no point in a client-side pivot. But postgres (or
I suppose any SQL interpreter) won't execute this, since it cannot know
beforehand what structure * is going to have.

So what is currently required from the user, with dynamic columns,
is more like:

1st pass: identify the columns
 SELECT DISTINCT a FROM tbl;

2nd pass: inject the columns, in a second embedded query
and in a record definition, with careful quoting:

  select * from crosstab(
'SELECT a,b,c FROM tbl ORDER BY 1', 
' VALUES (col1),(col2),(col3)...' -- or 'select distinct...' again
  ) AS ct(b type, col1 type, col2 type, col3 type)


Compared to this, a \pivot limited to the psql interpreter
is a no-brainer; we could just write instead:

= select a,b,c FROM tbl;
= \pivot

This simplicity is the whole point. It's the result of doing
the operation client-side, where the record structure can be
pivoted without the target structure being formally declared.
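A client-side pivot of this kind is a small amount of code; a minimal sketch, assuming rows of (a, b, c) where 'a' names the output row, 'b' names the output column and 'c' is the cell value (function and variable names are illustrative, not psql's):

```python
# Columns are discovered from the data itself, so no record definition or
# second query is needed -- the point of doing the pivot client-side.
def pivot(rows):
    row_names = sorted({a for a, _, _ in rows})
    col_names = sorted({b for _, b, _ in rows})
    cells = {(a, b): c for a, b, c in rows}
    header = [''] + col_names
    table = [header]
    for a in row_names:
        table.append([a] + [cells.get((a, b), '') for b in col_names])
    return table

rows = [('jan', 'sales', 10), ('jan', 'returns', 2), ('feb', 'sales', 12)]
for line in pivot(rows):
    print(line)
```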

Some engines have a built-in PIVOT syntax (Oracle, SQL server).
I have looked only at their documentation.
Their pivot queries look nicer and are possibly more efficient than
with SET functions, but AFAIK one still needs to programmatically 
inject the list of column values into them, when that list
is not static.


Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite




Re: [HACKERS] Precedence of standard comparison operators

2015-08-09 Thread Tom Lane
Noah Misch n...@leadboat.com writes:
 On Sun, Aug 09, 2015 at 04:48:22PM -0400, Tom Lane wrote:
 I'm curious about your rationale for claiming that <null predicate> has
 precedence exactly equal to < according to the spec.

 Both <null predicate> and <comparison predicate> are in the set of productions
 that take <row value predicand> arguments and appear only in <predicate>.
 Passing a production in that set as an argument to a production in that set
 requires parentheses.  To restate (non-associative) precedence equal more
 pedantically, there exists no conforming query whose interpretation hinges on
 the relative precedence of IS NULL and <.

Ah.  So really, according to the spec IS and < could have any precedence
relative to each other as long as there is no other construct between.
Works for me.

 To my knowledge, SQL is agnostic about whether LIKE is an operator.  The six
 comparison operators bind looser than <common value expression> syntax like
 * and ||, tighter than IS TRUE, and indistinguishable from <predicate>
 syntax like IS DISTINCT FROM and LIKE.

Indistinguishable in the same sense as above, right?  So for our
purposes, it's better to keep BETWEEN and friends as binding slightly
tighter than '<' than to make them the same precedence.  Same precedence
risks breaking things that weren't broken before.  Also, while I argued
above that making BETWEEN a potential argument for LIKE wasn't sensible,
that's not true for the comparison operators.  In particular, boolean '<='
is the only SQL-provided spelling for logical implication.

 OVERLAPS is a special case in that it doesn't really need precedence at
 all: both its arguments are required to be parenthesized.  We could
 possibly have removed it from the precedence hierarchy altogether, but
 I didn't bother experimenting with that, just left it alone.  But
 because of that, moving BETWEEN/IN below it doesn't really change
 anything.

 Ah, quite right.  SQL OVERLAPS takes various forms of two-column input, but
 PostgreSQL OVERLAPS is more particular.  I like your subsequent proposal to
 remove OVERLAPS from the order of precedence.

Yeah.  If we ever extend our OVERLAPS syntax, we might have to assign a
precedence to OVERLAPS, but we can do that then --- and it would be good
if we did not at that time have a false impression that the precedence
was already determined by previous decisions.  I'll go make that change.

regards, tom lane




Re: [HACKERS] Precedence of standard comparison operators

2015-08-09 Thread Noah Misch
On Sun, Aug 09, 2015 at 06:44:58PM -0400, Tom Lane wrote:
 Noah Misch n...@leadboat.com writes:
  On Sun, Aug 09, 2015 at 04:48:22PM -0400, Tom Lane wrote:
  I'm curious about your rationale for claiming that <null predicate> has
  precedence exactly equal to < according to the spec.
 
  Both <null predicate> and <comparison predicate> are in the set of
  productions
  that take <row value predicand> arguments and appear only in <predicate>.
  Passing a production in that set as an argument to a production in that set
  requires parentheses.  To restate (non-associative) precedence equal more
  pedantically, there exists no conforming query whose interpretation hinges
  on
  the relative precedence of IS NULL and <.
 
 Ah.  So really, according to the spec IS and < could have any precedence
 relative to each other as long as there is no other construct between.
 Works for me.
 
  To my knowledge, SQL is agnostic about whether LIKE is an operator.  The 
  six
  comparison operators bind looser than <common value expression> syntax like
  * and ||, tighter than IS TRUE, and indistinguishable from <predicate>
  syntax like IS DISTINCT FROM and LIKE.
 
 Indistinguishable in the same sense as above, right?

Right.

 So for our
 purposes, it's better to keep BETWEEN and friends as binding slightly
 tighter than '<' than to make them the same precedence.  Same precedence
 risks breaking things that weren't broken before.

It does risk that.  Same deal with making = have the same precedence as <
instead of keeping it slightly lower.




Re: [HACKERS] Precedence of standard comparison operators

2015-08-09 Thread Tom Lane
Noah Misch n...@leadboat.com writes:
 SQL has two groups of IS tests with different precedence.  The <boolean test>
 productions IS [NOT] {TRUE | FALSE | UNKNOWN} have precedence just lower than
 <, and the <null predicate> productions IS [NOT] NULL have precedence equal
 to <.  (An implementation giving them the same precedence can conform,
 because conforming queries cannot notice the difference.)

I'm curious about your rationale for claiming that <null predicate> has
precedence exactly equal to < according to the spec.  AFAICS the SQL
spec doesn't really tell us much about precedence of different subparts
of the grammar; at best you can infer that some things bind tighter than
others.

 I attempted to catalog the diverse precedence changes in commit c6b3c93:

 1. Decrease precedence of <=, >= and <> to match <.

Check.

 2. Increase precedence of, for example, BETWEEN x AND Y to match precedence
with BETWEEN keyword instead of AND keyword.  Make similar precedence
changes to other multiple-keyword productions involving AND, NOT, etc.

Uh, no, I wouldn't have said that.  I decreased BETWEEN's precedence,
along with IN's, to be less than OVERLAPS' precedence, matching the
precedence of LIKE/ILIKE/SIMILAR.  (But see comment below about OVERLAPS.)
There was not any case where the AND would have determined its precedence
AFAICS.

 3. Decrease precedence of IS [NOT] {TRUE | FALSE | UNKNOWN} to fall between
    NOT and <.

Check.

 4. Decrease precedence of IS [NOT] NULL and IS[NOT]NULL to match IS [NOT]
{TRUE | FALSE | UNKNOWN}.

Check.

 5. Forbid chains of = (make it nonassoc), and increase its precedence to
    match <.

Check.

 6. Decrease precedence of BETWEEN and IN keywords to match LIKE.

Check.

 It's
 definitely weird that the IS tests bind more tightly than multicharacter
 Ops but less tightly than + - * /.

 (1), (2) and (3) improve SQL conformance, and that last sentence seems to
 explain your rationale for (4).

The real reason for (4) is that it would be very difficult to get bison to
consider IS TRUE to have different precedence (against an operator on its
left) than IS NULL; we'd need even more lookahead mess than we have now.
They were not different precedence before, and they aren't now.

I did change ISNULL and NOTNULL to have exactly the same precedence as the
long forms IS NULL and IS NOT NULL, where before they were (for no good
reason AFAICS) slightly different precedence.  I think that's justifiable
on the grounds that they should not act differently from the long forms.
In any case the SQL standard has nothing to teach us on the point, since
it doesn't admit these shorthands.

 I've been unable to explain (5) and (6).

I'm not following your concern about (5).  The spec seems to clearly
put all six basic comparison operators on the same precedence level.
I believe that the reason our grammar had '=' as right-associative is
someone's idea that we might someday consider assignment as a regular
operator a la C, but that never has been true and seems unlikely to
become true in the future.  There's certainly nothing in the spec
suggesting that '=' should be right-associative.

The reason for (6) was mainly that having IN/BETWEEN bind tighter than
LIKE doesn't seem to me to have any justification in the spec; moreover,
if it does bind tighter, we're delivering a boolean result to LIKE which
seems unlikely to be what anyone wants.  So I figured we could simplify
things a bit by having one (or two really) fewer precedence levels.
Also, because said level is %nonassoc, this will now force people to
parenthesize if they do indeed want something like a BETWEEN as argument
of a LIKE, which seems like a good thing all round.
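The parenthesize-or-error behavior of a %nonassoc level can be illustrated with a toy; a hypothetical Python sketch that merges both levels into one set for brevity (this is not bison and not the PostgreSQL grammar, and the toy treats BETWEEN as a plain binary token):

```python
# Two operators drawn from the same nonassociative precedence level cannot
# be chained: the parser rejects the input instead of guessing a grouping.
NONASSOC_LEVEL = {'=', '<', '>', '<=', '>=', '<>', 'BETWEEN', 'LIKE'}

def parse(tokens):
    """Parse  operand (op operand)?  and reject chains like a = b LIKE c."""
    tokens = list(tokens)
    left = tokens.pop(0)
    if tokens and tokens[0] in NONASSOC_LEVEL:
        op = tokens.pop(0)
        right = tokens.pop(0)
        if tokens and tokens[0] in NONASSOC_LEVEL:
            raise SyntaxError('nonassociative operators cannot be chained; '
                              'add parentheses')
        return (op, left, right)
    return left

print(parse(['a', '=', 'b']))                    # ('=', 'a', 'b')
try:
    parse(['a', 'BETWEEN', 'b', 'LIKE', 'c'])    # forced to parenthesize
except SyntaxError as exc:
    print('rejected:', exc)
```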

 Why in particular the following three precedence groups instead of
 combining them as in SQL or subdividing further as in PostgreSQL 9.4?

 +%nonassoc	'<' '>' '=' LESS_EQUALS GREATER_EQUALS NOT_EQUALS
 +%nonassoc	BETWEEN IN_P LIKE ILIKE SIMILAR NOT_LA
 %nonassoc	OVERLAPS

I think that the spec is fairly clear that the six comparison operators
bind looser than other operators.  Now you could argue about whether LIKE
et al are operators but Postgres certainly treats them as such.

OVERLAPS is a special case in that it doesn't really need precedence at
all: both its arguments are required to be parenthesized.  We could
possibly have removed it from the precedence hierarchy altogether, but
I didn't bother experimenting with that, just left it alone.  But
because of that, moving BETWEEN/IN below it doesn't really change
anything.

regards, tom lane




Re: [HACKERS] Precedence of standard comparison operators

2015-08-09 Thread Noah Misch
On Sun, Aug 09, 2015 at 04:48:22PM -0400, Tom Lane wrote:
 Noah Misch n...@leadboat.com writes:
  SQL has two groups of IS tests with different precedence.  The <boolean test>
  productions IS [NOT] {TRUE | FALSE | UNKNOWN} have precedence just lower
  than <, and the <null predicate> productions IS [NOT] NULL have precedence
  equal to <.  (An implementation giving them the same precedence can conform,
  because conforming queries cannot notice the difference.)
 
 I'm curious about your rationale for claiming that <null predicate> has
 precedence exactly equal to < according to the spec.  AFAICS the SQL
 spec doesn't really tell us much about precedence of different subparts
 of the grammar; at best you can infer that some things bind tighter than
 others.

Both <null predicate> and <comparison predicate> are in the set of productions
that take <row value predicand> arguments and appear only in <predicate>.
Passing a production in that set as an argument to a production in that set
requires parentheses.  To restate (non-associative) precedence equal more
pedantically, there exists no conforming query whose interpretation hinges on
the relative precedence of IS NULL and <.

  2. Increase precedence of, for example, BETWEEN x AND Y to match 
  precedence
 with BETWEEN keyword instead of AND keyword.  Make similar precedence
 changes to other multiple-keyword productions involving AND, NOT, 
  etc.
 
 Uh, no, I wouldn't have said that.  I decreased BETWEEN's precedence,
 along with IN's, to be less than OVERLAPS' precedence, matching the
 precedence of LIKE/ILIKE/SIMILAR.  (But see comment below about OVERLAPS.)
 There was not any case where the AND would have determined its precedence
 AFAICS.

Ah.  I read this patch hunk carelessly:

   @@ -11420,7 +11436,7 @@ a_expr:	c_expr		{ $$ = $1; }
   				{
   					$$ = (Node *) makeSimpleA_Expr(AEXPR_OF, "<>", $1, (Node *) $6, @2);
   				}
   -		| a_expr BETWEEN opt_asymmetric b_expr AND b_expr	%prec BETWEEN
   +		| a_expr BETWEEN opt_asymmetric b_expr AND a_expr	%prec BETWEEN

That's allowing additional productions in the final BETWEEN operand, not
changing precedence.

  5. Forbid chains of = (make it nonassoc), and increase its precedence to
  match <.
  6. Decrease precedence of BETWEEN and IN keywords to match LIKE.

  I've been unable to explain (5) and (6).
 
 I'm not following your concern about (5).  The spec seems to clearly
 put all six basic comparison operators on the same precedence level.
 [snipped rest of explanation]

No particular concern beyond wanting to know the rationale; thanks for writing
one.  Getting this wrong a second time would be awfully sad, so I'm being more
cautious than usual.

  Why in particular the following three precedence groups instead of
  combining them as in SQL or subdividing further as in PostgreSQL 9.4?
 
  +%nonassoc	'<' '>' '=' LESS_EQUALS GREATER_EQUALS NOT_EQUALS
  +%nonassoc	BETWEEN IN_P LIKE ILIKE SIMILAR NOT_LA
  %nonassoc	OVERLAPS
 
 I think that the spec is fairly clear that the six comparison operators
 bind looser than other operators.  Now you could argue about whether LIKE
 et al are operators but Postgres certainly treats them as such.

To my knowledge, SQL is agnostic about whether LIKE is an operator.  The six
comparison operators bind looser than <common value expression> syntax like
* and ||, tighter than IS TRUE, and indistinguishable from <predicate>
syntax like IS DISTINCT FROM and LIKE.

 OVERLAPS is a special case in that it doesn't really need precedence at
 all: both its arguments are required to be parenthesized.  We could
 possibly have removed it from the precedence hierarchy altogether, but
 I didn't bother experimenting with that, just left it alone.  But
 because of that, moving BETWEEN/IN below it doesn't really change
 anything.

Ah, quite right.  SQL OVERLAPS takes various forms of two-column input, but
PostgreSQL OVERLAPS is more particular.  I like your subsequent proposal to
remove OVERLAPS from the order of precedence.

Thanks,
nm




Re: [HACKERS] Precedence of standard comparison operators

2015-08-09 Thread Noah Misch
On Sun, Aug 09, 2015 at 07:16:02PM -0400, Tom Lane wrote:
 Noah Misch n...@leadboat.com writes:
  On Sun, Aug 09, 2015 at 06:44:58PM -0400, Tom Lane wrote:
  So for our
  purposes, it's better to keep BETWEEN and friends as binding slightly
  tighter than '<' than to make them the same precedence.  Same precedence
  risks breaking things that weren't broken before.
 
  It does risk that.  Same deal with making = have the same precedence as <
  instead of keeping it slightly lower.
 
 Agreed, but in that case I think our hand is forced by the SQL standard.

In SQL:2008 and SQL:2011 at least, =, < and BETWEEN are all in the same
boat.  They have no precedence relationships to each other; SQL sidesteps the
question by requiring parentheses.  They share a set of precedence
relationships to other constructs.  SQL does not imply whether to put them in
one %nonassoc precedence group or in a few, but we can contemplate whether
users prefer an error or prefer the 9.4 behavior for affected queries.




[HACKERS] Moving SS_finalize_plan processing to the end of planning

2015-08-09 Thread Tom Lane
I've started to work on path-ification of the upper planner (finally),
and since that's going to be a large patch in any case, I've been looking
for pieces that could be bitten off and done separately.  One such piece
is the fact that SS_finalize_plan (computation of extParams/allParams)
currently gets run at the end of subquery_planner; as long as that's true,
we cannot have subquery_planner returning paths rather than concrete
plans.  The attached patch moves that processing to occur just before
set_plan_references is run.

Ideally I'd like to get rid of SS_finalize_plan altogether and merge its
work into set_plan_references, so as to save one traversal of the finished
plan tree.  However, that's a bit difficult because of the fact that the
traversal order is different: in SS_finalize_plan, we must visit subplans
before main plan, whereas set_plan_references wants to do the main plan
first.  (I experimented with changing that, but then the flat rangetable
comes out in a different order, with unpleasant side effects on the way
EXPLAIN prints things.)  Since that would be purely a minor performance
improvement, I set that goal aside for now.

Basically what this patch does is to divide what had been done in 
SS_finalize_plan into three steps:

* SS_identify_outer_params determines which outer-query-level Params will
be available for the current query level to use, and saves that aside in
a new PlannerInfo field root->outer_params.  This processing turns out
to be the only reason that SS_finalize_plan had to be called in
subquery_planner: we have to capture this data before exiting
subquery_planner because the upper levels' plan_params lists may change
as they plan other subplans.

* SS_attach_initplans does the work of attaching initplans to the parent
plan node and adjusting the parent's cost estimate accordingly.

* SS_finalize_plan now *only* does extParam/allParam calculation.

I had to change things around a bit in planagg.c (which was already a
hack, and the law of conservation of cruft applies).  Otherwise it's
pretty straightforward and removes some existing hacks.  One notable
point is that there's no longer an assumption within SS_finalize_plan
that initPlans can only appear in the top plan node.

Any objections?

regards, tom lane

diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index 7609183..a878498 100644
*** a/src/backend/nodes/outfuncs.c
--- b/src/backend/nodes/outfuncs.c
*** _outPlannerInfo(StringInfo str, const Pl
*** 1799,1804 
--- 1799,1805 
  	WRITE_NODE_FIELD(glob);
  	WRITE_UINT_FIELD(query_level);
  	WRITE_NODE_FIELD(plan_params);
+ 	WRITE_BITMAPSET_FIELD(outer_params);
  	WRITE_BITMAPSET_FIELD(all_baserels);
  	WRITE_BITMAPSET_FIELD(nullable_baserels);
  	WRITE_NODE_FIELD(join_rel_list);
diff --git a/src/backend/optimizer/plan/createplan.c b/src/backend/optimizer/plan/createplan.c
index f461586..404c6f5 100644
*** a/src/backend/optimizer/plan/createplan.c
--- b/src/backend/optimizer/plan/createplan.c
*** make_material(Plan *lefttree)
*** 4473,4483 
   * materialize_finished_plan: stick a Material node atop a completed plan
   *
   * There are a couple of places where we want to attach a Material node
!  * after completion of subquery_planner().  This currently requires hackery.
!  * Since subquery_planner has already run SS_finalize_plan on the subplan
!  * tree, we have to kluge up parameter lists for the Material node.
!  * Possibly this could be fixed by postponing SS_finalize_plan processing
!  * until setrefs.c is run?
   */
  Plan *
  materialize_finished_plan(Plan *subplan)
--- 4473,4479 
   * materialize_finished_plan: stick a Material node atop a completed plan
   *
   * There are a couple of places where we want to attach a Material node
!  * after completion of subquery_planner(), without any MaterialPath path.
   */
  Plan *
  materialize_finished_plan(Plan *subplan)
*** materialize_finished_plan(Plan *subplan)
*** 4498,4507 
  	matplan->plan_rows = subplan->plan_rows;
  	matplan->plan_width = subplan->plan_width;
  
- 	/* parameter kluge --- see comments above */
- 	matplan->extParam = bms_copy(subplan->extParam);
- 	matplan->allParam = bms_copy(subplan->allParam);
- 
  	return matplan;
  }
  
--- 4494,4499 
diff --git a/src/backend/optimizer/plan/planagg.c b/src/backend/optimizer/plan/planagg.c
index f0e9c05..a761cfd 100644
*** a/src/backend/optimizer/plan/planagg.c
--- b/src/backend/optimizer/plan/planagg.c
*** build_minmax_path(PlannerInfo *root, Min
*** 416,428 
  	 *		 WHERE col IS NOT NULL AND existing-quals
  	 *		 ORDER BY col ASC/DESC
  	 *		 LIMIT 1)
  	 *--
  	 */
  	subroot = (PlannerInfo *) palloc(sizeof(PlannerInfo));
  	memcpy(subroot, root, sizeof(PlannerInfo));
  	subroot->parse = parse = (Query *) copyObject(root->parse);
! 	/* make sure subroot planning won't change root->init_plans contents */
! 	subroot->init_plans = 

Re: [HACKERS] Precedence of standard comparison operators

2015-08-09 Thread Tom Lane
Noah Misch n...@leadboat.com writes:
 On Sun, Aug 09, 2015 at 06:44:58PM -0400, Tom Lane wrote:
 So for our
 purposes, it's better to keep BETWEEN and friends as binding slightly
 tighter than '<>' than to make them the same precedence.  Same precedence
 risks breaking things that weren't broken before.

 It does risk that.  Same deal with making = have the same precedence as <>
 instead of keeping it slightly lower.

Agreed, but in that case I think our hand is forced by the SQL standard.

regards, tom lane




Re: [HACKERS] [patch] A \pivot command for psql

2015-08-09 Thread Tom Lane
Daniel Verite dan...@manitou-mail.org writes:
 Tom Lane wrote:
 Is there a way to implement pivoting as a set-returning function?

 Not with the same ease of use. We have crosstab functions
 in contrib/tablefunc already, but the killer problem with PIVOT
 is that truly dynamic columns are never reachable directly.

I'm not sure how pushing it out to psql makes that better.  There is
no way to do further processing on something that psql has printed,
so you've punted on solving that issue just as much if not more.

I agree that the crosstab solution leaves a lot to be desired
as far as ease of use goes, but I don't want to define fixing its
ease-of-use problem as being "it's easy for manual use in psql".
psql is a minority API, you know.

Besides which, psql is not all that great for looking at very wide tables,
so I'm not sure that this would be very useful for dynamic column sets
anyway.

regards, tom lane




Re: [HACKERS] WIP: SCRAM authentication

2015-08-09 Thread Michael Paquier
On Mon, Aug 10, 2015 at 6:05 AM, Stephen Frost sfr...@snowman.net wrote:
 * Sehrope Sarkuni (sehr...@jackdb.com) wrote:
 It'd be nice if the new auth mechanism supports multiple passwords in the
 same format as well (not just one per format).

 That way you could have two different passwords for a user that are active
 at the same time. This would simplify rolling database credentials as it
 wouldn't have to be done all at once. You could add the new credentials,
 update your app servers one by one, then disable the old ones.

 A lot of systems that use API keys let you see the last time a particular
 set of keys was used. This helps answer the "Is this going to break
 something if I disable it?" question. Having a "last used at" timestamp for
 each auth mechanism (per user) would be useful.

 Excellent points and +1 to all of these ideas from me.

Interesting. I haven't thought of that and those are nice suggestions.
I am not convinced that this is something to tackle with a first
version of the patch though, I am sure we'll have enough problems to
deal with to get out a nice base usable for future improvements as
well.
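Sehrope's rolling-credentials scheme can be sketched as follows (a toy Python model with invented names; not the proposed catalog layout or wire format):

```python
# Toy sketch of multiple active password verifiers per role, each with a
# "last used" timestamp so stale credentials can be retired safely.
import hashlib
import time


class RoleAuth:
    def __init__(self):
        self.verifiers = []  # list of {"digest": ..., "last_used_at": ...}

    def add_password(self, password):
        digest = hashlib.sha256(password.encode()).hexdigest()
        self.verifiers.append({"digest": digest, "last_used_at": None})

    def authenticate(self, password):
        digest = hashlib.sha256(password.encode()).hexdigest()
        for v in self.verifiers:
            if v["digest"] == digest:
                v["last_used_at"] = time.time()  # answers "is it still used?"
                return True
        return False

    def remove_unused(self):
        """Drop verifiers that were never used since being added."""
        self.verifiers = [v for v in self.verifiers
                          if v["last_used_at"] is not None]


role = RoleAuth()
role.add_password("old-secret")
role.add_password("new-secret")  # rolled credential: both active at once
```

App servers can then be migrated one by one to "new-secret"; once the old verifier's timestamp stops advancing, it can be dropped without breaking anything.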
-- 
Michael




Re: [HACKERS] Priority table or Cache table

2015-08-09 Thread Amit Kapila
On Thu, Aug 6, 2015 at 12:24 PM, Haribabu Kommi kommi.harib...@gmail.com
wrote:

 On Mon, Jun 30, 2014 at 11:08 PM, Beena Emerson memissemer...@gmail.com
wrote:
 
  I also ran the test script after making the same configuration changes
that
  you have specified. I found that I was not able to get the same
performance
  difference that you have reported.
 
  Following table lists the tps in each scenario and the % increase in
  performance.
 
  Threads  Head  Patched  Diff
  1        1669  1718     3%
  2        2844  3195     12%
  4        3909  4915     26%
  8        7332  8329     14%
 


 coming back to this old thread.

 I just tried a new approach for this priority table: instead of an
 entirely separate buffer pool, it tries to dedicate some portion of
 shared buffers to priority tables, using a GUC variable
 buffer_cache_ratio (0-75) to specify what percentage of shared
 buffers should be used.

 Syntax:

 create table tbl(f1 int) with(buffer_cache=true);

 Compared to the earlier approach, I thought this one would be easier to
 implement.
 But during the performance run, it didn't show much improvement in
 performance.
 Here are the test results.


What is the configuration for the test (RAM of m/c, shared_buffers,
scale_factor, etc.)?

  Threads  Head   Patched  Diff
  1        3123   3238     3.68%
  2        5997   6261     4.40%
  4        11102  11407    2.75%

 I suspect that this may be because of buffer lock contention,
 whereas in the older approach of different buffer pools, each buffer pool
 has its own locks.
 I will try to collect the profile output and analyze it.

 Any better ideas?


I think you should try to find out during the test for how many pages
it needs to perform a clock sweep (add some new counter like
numBufferBackendClocksweep in BufferStrategyControl to find out the
same).  In theory your patch should reduce the number of times it needs
to perform a clock sweep.

I think in this approach, even if you make some buffers non-replaceable
(buffers for which BM_BUFFER_CACHE_PAGE is set), the clock sweep still
needs to visit all the buffers.  I think we might want to find some way to
reduce that if this idea helps.
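To make the suggestion concrete, here is a toy Python model of a clock sweep with a per-sweep step counter (not PostgreSQL code; all names are invented). It shows that buffers marked non-replaceable still have to be stepped over by the sweep even though they are never evicted:

```python
# Toy clock-sweep model: each buffer has a usage count and an optional
# "cached" flag (standing in for BM_BUFFER_CACHE_PAGE).  The sweep must
# still pass over cached buffers, which is the overhead noted above;
# sweep_steps plays the role of the proposed numBufferBackendClocksweep.

class BufferPool:
    def __init__(self, usage_counts, cached_flags):
        self.usage = list(usage_counts)
        self.cached = list(cached_flags)
        self.hand = 0          # position of the clock hand
        self.sweep_steps = 0   # counts every buffer the hand passes over

    def evict_one(self):
        """Return the index of a victim buffer, clock-sweep style."""
        while True:
            i = self.hand
            self.hand = (self.hand + 1) % len(self.usage)
            self.sweep_steps += 1
            if self.cached[i]:
                continue            # non-replaceable: skipped, but still visited
            if self.usage[i] > 0:
                self.usage[i] -= 1  # give it another chance
                continue
            return i                # usage count 0 and replaceable: victim


pool = BufferPool(usage_counts=[1, 0, 0, 0],
                  cached_flags=[False, True, True, False])
victim = pool.evict_one()
# Here two of the four sweep steps are spent skipping cached buffers.
```

Counting `sweep_steps` with and without the patch would quantify exactly the effect Amit suggests measuring.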

Another thing is that this idea looks somewhat similar (although not the same)
to the current ring buffer concept, where buffers for particular types of scan
are taken from a ring.  I think it is okay to prototype as you have done
in the patch, and we can consider doing something along those lines if
this patch's idea helps.


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Re: [HACKERS] WIP: SCRAM authentication

2015-08-09 Thread Michael Paquier
On Mon, Aug 10, 2015 at 3:42 AM, Josh Berkus j...@agliodbs.com wrote:
 On 08/09/2015 08:09 AM, Robert Haas wrote:
 Why do we need to be able to authenticate using more than one
 mechanism?  If you have some clients that can't support SCRAM yet, you
 might as well continue using MD5 across the board until that changes.
 You're not going to get much real security out of using MD5 for some
 authentication attempts and SCRAM for other ones,

 Speaking as someone who has shepherded several clients through
 infrastructure upgrades, I have to disagree with this.

 [...]

 and the amount of
 infrastructure we're proposing to introduce to support that is pretty
 substantial.

Maybe. But that's worth it IMO. I think that we should keep in mind as
well that we may not stick with SCRAM forever either and that we may
have to do a similar recommended-protocol switch at some point. Or
that we may have to implement additional authorization protocols in
parallel to what we have which would still require manipulation of
multiple verifiers per role.

 ... during my exchange with Michael, I was thinking about the bug
 potential of taking the password field and multiplexing it in some way,
 which is significant.  There is a definite risk of making this too
 complicated and we'll need to contrast that against ease-of-migration,
 because complicated mechanisms tend to be less secure due to user error.

Sure. That's why I am all in for adding a compatibility GUC or similar
that enforces the removal of old verifier types after marking those as
deprecated for a couple of years, as there's surely a significant risk
in keeping old passwords around or bad pg_hba entries. Still, we need IMO
a way for a user to save multiple verifiers generated from a client to
manage carefully the password verifier aging, deprecations and support
removal.
-- 
Michael




Re: [HACKERS] [patch] A \pivot command for psql

2015-08-09 Thread David Fetter
On Sun, Aug 09, 2015 at 07:29:40PM +0200, Daniel Verite wrote:
   Hi,
 
 I want to suggest a client-side \pivot command in psql, implemented
 in the attached patch.
 
 \pivot takes the current query in the buffer, executes it, and
 displays it pivoted, interpreting the result as:
 
 column1 = row in pivoted output
 column2 = column in pivoted output
 column3 = value at (row,column) in pivoted output
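That interpretation can be sketched in a few lines of Python (a client-side illustration with invented names, independent of the patch's actual C implementation in psql):

```python
# Minimal sketch of the \pivot interpretation: take (row, column, value)
# tuples, as produced by a three-column query, and lay them out as a grid.

def pivot(rows):
    """rows: iterable of (row_key, col_key, value) tuples -> dict of dicts."""
    table = {}
    for row_key, col_key, value in rows:
        table.setdefault(row_key, {})[col_key] = value
    return table


# e.g. the result of: SELECT year, product, sum(amount) ... GROUP BY 1, 2
result = [(2014, 'widget', 10), (2014, 'gadget', 20), (2015, 'widget', 30)]
grid = pivot(result)
# grid[2014] == {'widget': 10, 'gadget': 20}; missing cells simply stay absent
```

The column set of the pivoted output is discovered from the data itself, which is exactly the "truly dynamic columns" property that is hard to express as a set-returning function.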

This is really neat work, and thanks for the patch.

The issue with this one, as with the crosstab extension, is that it's
available only if you are using some particular piece of extra
software, in this case the psql client.

I'm working up a proposal to add (UN)PIVOT support to the back-end.
Would you like to join in on that?

Cheers,
David.
-- 
David Fetter da...@fetter.org http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate




Re: [HACKERS] Freeze avoidance of very large table.

2015-08-09 Thread Michael Paquier
On Mon, Aug 10, 2015 at 12:39 AM, Robert Haas robertmh...@gmail.com wrote:
 On Thu, Aug 6, 2015 at 11:33 AM, Jim Nasby jim.na...@bluetreble.com wrote:
 They also provide a level of control over what is and isn't installed in a
 cluster. Personally, I'd prefer that most users not even be aware of the
 existence of things like pageinspect.

 +1.

 [...]

 Extensions are a useful packaging mechanism for functionality that is
 useful but not required, and debugging facilities are definitely very
 useful but should not be required.

+1.
-- 
Michael




Re: [HACKERS] Mention column name in error messages

2015-08-09 Thread Franck Verrot
On Wed, Jul 1, 2015 at 12:30 AM, Tom Lane t...@sss.pgh.pa.us wrote:

 What seems more likely to lead to a usable patch is to arrange for the
 extra information you want to be emitted as error context, via an error
 context callback that gets installed at the right times.  ...
 ...
 with no need for int8in to be directly aware of the context.  You should
 try adapting that methodology for the cases you're worried about.


Hi Tom (and others),

Sorry it took so long for me to follow up on this; I finally found a
couple of hours today to try writing another patch.

In any case, thanks for reviewing my first attempt and taking time to write
such a detailed critique... I've learned a lot!

I am now using the error context callback stack. The current column name
and column type are passed to the callback packed inside a new structure
of type TransformExprState.
That information is then passed to `errhint` and will be presented to the
user later on (in case of coercion failure).


Please find the WIP patch attached.
(I've pushed the patch on my GH fork[1] too).

Thanks again,
Franck



[1]:
https://github.com/franckverrot/postgres/commit/73dd2cd096c91cee1b501d5f94ba81037de30fd1


0001-Report-column-for-which-type-coercion-fails.patch
Description: Binary data



Re: [HACKERS] Assert in pg_stat_statements

2015-08-09 Thread Satoshi Nagayasu
2015-08-10 0:04 GMT+09:00 Robert Haas robertmh...@gmail.com:
 On Sun, Aug 9, 2015 at 1:36 AM, Satoshi Nagayasu sn...@uptime.jp wrote:
 On 2015/08/08 22:32, Robert Haas wrote:
 On Sat, Aug 8, 2015 at 5:00 AM, Satoshi Nagayasu sn...@uptime.jp wrote:

 I just found that pg_stat_statements causes assert when queryId is
 set by other module, which is loaded prior to pg_stat_statements in
 the shared_preload_libraries parameter.

 Theoretically, queryId in the shared memory could be set by other
 module by design.

 So, IMHO, pg_stat_statements should not cause assert even if queryId
 already has non-zero value --- my module has this problem now.

 Is my understanding correct? Any comments?


 Seems like an ERROR would be more appropriate than an Assert.


 Well, it's not that simple, and I'm afraid that it may not fix
 the issue.

 Here, let's assume that we have two modules (foo and pg_stat_statements)
 which calculate, use and store Query->queryId independently.

 What we may see when two are enabled is following.

 (1) Enable two modules in shared_preload_libraries.

 shared_preload_libraries = 'foo,pg_stat_statements'

 (2) Once a query comes in, a callback function in module foo
 associated with post_parse_analyze_hook calculates and stores the
 queryId in Query->queryId.

 (3) After that, a callback function (pgss_post_parse_analyze) in
 pg_stat_statements associated with post_parse_analyze_hook
 detects that Query->queryId has a non-zero value, and the assertion fails.

 As a result, we can not have two or more modules that use queryId
 at the same time.

 But I think it should be possible because one of the advantages of
 PostgreSQL is its extensibility.

 So, I think the problem here is the fact that pg_stat_statements
 treats a non-zero value in queryId as an error even if
 it is exactly the same value computed by the other module.

 I'm not too excited about supporting the use case where there are two
 people using queryId but it just so happens that they always set
 exactly the same value.  That seems like a weird setup.

I think so too, but unfortunately, I have to do that to work
with 9.4 and 9.5.

 Wouldn't that
 mean both modules were applying the same jumble algorithm, and
 therefore you're doing the work of computing the jumble twice per
 query?

Basically yes, because both modules should work not only together,
 but also solely. So, my module needs to have the capability to calculate
queryId itself.

More precisely, if we have the following configuration,

  shared_preload_libraries = 'foo,pg_stat_statements'

my module foo calculates queryId itself, and pg_stat_statements
would do the same (and crashes when built with --enable-cassert).

If we have the following configuration,

  shared_preload_libraries = 'pg_stat_statements,foo'

my module foo can skip queryId calculation because
pg_stat_statements has already done that.

Of course, my module foo should be able to work solely, as follows:

  shared_preload_libraries = 'foo'
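The compute-only-if-unset convention described above can be modeled in a few lines (a toy Python simulation; not the actual hook API or jumble code):

```python
# Toy model of two post_parse_analyze hooks sharing queryId: whichever
# module runs first computes the ID; later modules reuse it instead of
# asserting that the field is still zero.

class Query:
    def __init__(self, text):
        self.text = text
        self.query_id = 0  # 0 means "not computed yet", as in PostgreSQL


def compute_query_id(query):
    # Stand-in for the jumble hash; any stable non-zero hash works here.
    return hash(query.text) & 0xFFFFFFFF or 1


def module_foo_hook(query):
    if query.query_id == 0:
        query.query_id = compute_query_id(query)


def pgss_hook(query):
    # Cooperative version: accept an ID computed by an earlier module
    # rather than Assert(query->queryId == 0).
    if query.query_id == 0:
        query.query_id = compute_query_id(query)


q = Query("SELECT 1")
for hook in (module_foo_hook, pgss_hook):  # shared_preload_libraries order
    hook(q)
# q.query_id is computed exactly once, by whichever hook ran first.
```

With this convention the load order in shared_preload_libraries only decides who does the work, not whether the combination crashes.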

 It would be better to have the second module have an option
 not to compute the query ID and just do whatever else it does.  Then,
 if you want to use that other module with pg_stat_statements, flip the
 GUC.

Yes, that's exactly what I'm doing in my module, and what I intended
in my first patch for pg_stat_statements on this thread. :)

Regards,
-- 
NAGAYASU Satoshi sn...@uptime.jp




Re: [HACKERS] Assert in pg_stat_statements

2015-08-09 Thread Satoshi Nagayasu
2015-08-10 2:23 GMT+09:00 Tom Lane t...@sss.pgh.pa.us:
 Robert Haas robertmh...@gmail.com writes:
 I'm not too excited about supporting the use case where there are two
 people using queryId but it just so happens that they always set
 exactly the same value.  That seems like a weird setup.  Wouldn't that
 mean both modules were applying the same jumble algorithm, and
 therefore you're doing the work of computing the jumble twice per
 query?

 Not only that, but you'd need to keep the two modules in sync, which
 would be one of the greatest recipes for bugs we've thought of yet.

 If there's actually a use case for that sort of thing, I would vote
 for moving the jumble-calculation code into core and making both of
 the interested extensions do

 /* Calculate query hash if it's not been done yet */
 if (query->queryId == 0)
     calculate_query_hash(query);

I vote for this too.

Jumble calculation is the smartest way to identify query groups,
so several modules can take advantage of it if it's in core.

Regards,
-- 
Satoshi Nagayasu sn...@uptime.jp




Re: [HACKERS] [PROPOSAL] VACUUM Progress Checker.

2015-08-09 Thread Rahila Syed
Hello,

Say, 6 bigint counters, 6 float8
counters, and 3 strings up to 80 characters each.  So we have a
fixed-size chunk of shared memory per backend, and each backend that
wants to expose progress information can fill in those fields however
it likes, and we expose the results.
This would be sorta like the way pg_statistic works: the same columns
can be used for different purposes depending on what estimator will be
used to access them.

After thinking more on this suggestion, I came up with the following generic
structure, which can be used to store the progress of any command per backend
in shared memory.

struct PgBackendProgress
{
    int32 *counter[COMMAND_NUM_SLOTS];
    float8 *counter_float[COMMAND_NUM_SLOTS];

    char *progress_message[COMMAND_NUM_SLOTS];
};

COMMAND_NUM_SLOTS will define the maximum number of slots (phases) for any
command.
The progress of a command will be measured using the progress of each phase
of the command.
For some commands the number of phases can be one, and the rest of the slots
will be NULL.

Each phase will report n integer counters, n float counters and a progress
message.
For some phases, any of the above fields can be NULL.

For VACUUM, there can be 3 phases as discussed in the earlier mails.

Phase 1. Report 2 integer counters: heap pages scanned and total heap
pages, 1 float counter: percentage_complete and progress message.
Phase 2. Report 2 integer counters: index pages scanned and total index
pages(across all indexes) and progress message.
Phase 3. 1 integer counter: heap pages vacuumed.

This structure can be accessed by the statistics collector to display progress
via a new view.
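As a sanity check of the proposed layout, here is a small Python mock-up filling the slots for VACUUM phase 1 (invented names; the real structure would live in shared memory and be written from C):

```python
# Mock-up of the proposed per-backend progress slots: fixed-size arrays of
# integer counters, float counters and messages, one slot per phase.

COMMAND_NUM_SLOTS = 3


class PgBackendProgress:
    def __init__(self):
        self.counter = [None] * COMMAND_NUM_SLOTS
        self.counter_float = [None] * COMMAND_NUM_SLOTS
        self.progress_message = [None] * COMMAND_NUM_SLOTS


def report_vacuum_phase1(progress, pages_scanned, total_pages):
    """Phase 1: two integer counters, one float counter, one message."""
    pct = 100.0 * pages_scanned / total_pages
    progress.counter[0] = (pages_scanned, total_pages)
    progress.counter_float[0] = pct
    progress.progress_message[0] = "scanning heap"
    return pct


p = PgBackendProgress()
report_vacuum_phase1(p, pages_scanned=300, total_pages=1200)
# A monitoring view would read these slots to show 25.0% of the heap scanned;
# slots for phases that have not started remain NULL (None here).
```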

Thank you,
Rahila Syed


Re: [HACKERS] tap tests remove working directories

2015-08-09 Thread Michael Paquier
On Sun, Aug 9, 2015 at 11:19 PM, Andrew Dunstan and...@dunslane.net wrote:


 On 08/09/2015 08:41 AM, Michael Paquier wrote:

 On Sun, Aug 9, 2015 at 1:40 AM, Andrew Dunstan and...@dunslane.net
 wrote:

 On 08/08/2015 09:31 AM, Robert Haas wrote:

 On Fri, Aug 7, 2015 at 7:17 PM, Andrew Dunstan and...@dunslane.net
 wrote:

 That certainly isn't what happens, and given the way this is done in
 TestLib.pm, using the CLEANUP parameter of File::Temp's tempdir()
 function,
 it's not clear how we could do that easily.

 shot-in-the-dark

 Set cleanup to false and manually remove the directory later in the
 code, so that stuff runs only if we haven't died sooner?

 /shot-in-the-dark

 Yeah, maybe. I'm thinking of trying to do it more globally, like in
 src/Makefile.global.in. That way we wouldn't have to add new code to
 every
 test file.

 If we rely on END to clean up the temporary data folder, there is no
 need to impact each test file, just the test functions called in
 TestLib.pm that could switch a flag to not perform any cleanup at the
 end of the run should an unexpected result be found. Now I am as well
 curious to see what you have in mind with manipulating
 Makefile.global.

 See attached. If you have a better idea please post your patch.

Sure. Attached is what I have in mind. Contrary to your version, we
keep temp paths around should a run succeed after one that has failed
when running "make check" multiple times in a row. Perhaps it does not
matter much in practice as log files get removed at each new run but I
think that this allows a finer control in case of failure. Feel free
to have a look.
-- 
Michael
diff --git a/src/test/perl/TestLib.pm b/src/test/perl/TestLib.pm
index 4927d45..11f774e 100644
--- a/src/test/perl/TestLib.pm
+++ b/src/test/perl/TestLib.pm
@@ -35,6 +35,7 @@ our @EXPORT = qw(
 
 use Cwd;
 use File::Basename;
+use File::Path qw(rmtree);
 use File::Spec;
 use File::Temp ();
 use IPC::Run qw(run start);
@@ -107,21 +108,24 @@ $ENV{PGPORT} = int($ENV{PGPORT}) % 65536;
 # Helper functions
 #
 
+my $temp_cleanup = 1;
+my @temp_dirs = ();
 
 sub tempdir
 {
-	return File::Temp::tempdir(
+	my $path = File::Temp::tempdir(
 		'tmp_test',
 		DIR => $ENV{TESTDIR} || cwd(),
-		CLEANUP => 1);
+		CLEANUP => 0);
+	push(@temp_dirs, $path);
+	return $path;
 }
 
 sub tempdir_short
 {
-
 	# Use a separate temp dir outside the build tree for the
 	# Unix-domain socket, to avoid file name length issues.
-	return File::Temp::tempdir(CLEANUP => 1);
+	return File::Temp::tempdir(CLEANUP => 0);
 }
 
 # Initialize a new cluster for testing.
@@ -217,6 +221,13 @@ END
 		system_log('pg_ctl', '-D', $test_server_datadir, '-m',
 		  'immediate', 'stop');
 	}
+
+	# Cleanup any temporary directory created
+	return if (!$temp_cleanup);
+	foreach my $temp_dir (@temp_dirs)
+	{
+		rmtree($temp_dir);
+	}
 }
 
 sub psql
@@ -278,14 +289,16 @@ sub command_ok
 {
 	my ($cmd, $test_name) = @_;
 	my $result = run_log($cmd);
-	ok($result, $test_name);
+	my $err_num = ok($result, $test_name);
+	$temp_cleanup = 0 if (!$err_num);
 }
 
 sub command_fails
 {
 	my ($cmd, $test_name) = @_;
 	my $result = run_log($cmd);
-	ok(!$result, $test_name);
+	my $err_num = ok(!$result, $test_name);
+	$temp_cleanup = 0 if (!$err_num);
 }
 
 sub command_exit_is
@@ -304,7 +317,8 @@ sub command_exit_is
 	# that, use $h->full_result on Windows instead.
 	my $result = ($Config{osname} eq "MSWin32") ?
 		($h->full_results)[0] : $h->result(0);
-	is($result, $expected, $test_name);
+	my $err_num = is($result, $expected, $test_name);
+	$temp_cleanup = 0 if (!$err_num);
 }
 
 sub program_help_ok
@@ -313,9 +327,13 @@ sub program_help_ok
 	my ($stdout, $stderr);
 	print("# Running: $cmd --help\n");
 	my $result = run [ $cmd, '--help' ], '>', \$stdout, '2>', \$stderr;
-	ok($result, "$cmd --help exit code 0");
-	isnt($stdout, '', "$cmd --help goes to stdout");
-	is($stderr, '', "$cmd --help nothing to stderr");
+	my $err_num;
+	$err_num = ok($result, "$cmd --help exit code 0");
+	$temp_cleanup = 0 if (!$err_num);
+	$err_num = isnt($stdout, '', "$cmd --help goes to stdout");
+	$temp_cleanup = 0 if (!$err_num);
+	$err_num = is($stderr, '', "$cmd --help nothing to stderr");
+	$temp_cleanup = 0 if (!$err_num);
 }
 
 sub program_version_ok
@@ -324,9 +342,13 @@ sub program_version_ok
 	my ($stdout, $stderr);
 	print("# Running: $cmd --version\n");
 	my $result = run [ $cmd, '--version' ], '>', \$stdout, '2>', \$stderr;
-	ok($result, "$cmd --version exit code 0");
-	isnt($stdout, '', "$cmd --version goes to stdout");
-	is($stderr, '', "$cmd --version nothing to stderr");
+	my $err_num;
+	$err_num = ok($result, "$cmd --version exit code 0");
+	$temp_cleanup = 0 if (!$err_num);
+	$err_num = isnt($stdout, '', "$cmd --version goes to stdout");
+	$temp_cleanup = 0 if (!$err_num);
+	$err_num = is($stderr, '', "$cmd --version nothing to stderr");
+	$temp_cleanup = 0 if (!$err_num);
 }
 
 sub program_options_handling_ok
@@ -335,20 +357,27 @@ sub program_options_handling_ok
 	my ($stdout, $stderr);
 	print(# Running: 

Re: [HACKERS] Precedence of standard comparison operators

2015-08-09 Thread Noah Misch
On Sun, Aug 09, 2015 at 08:06:11PM -0400, Tom Lane wrote:
 Noah Misch n...@leadboat.com writes:
  In SQL:2008 and SQL:2011 at least, =, <> and BETWEEN are all in the 
  same
  boat.  They have no precedence relationships to each other; SQL sidesteps 
  the
  question by requiring parentheses.  They share a set of precedence
  relationships to other constructs.  SQL does not imply whether to put them 
  in
  one %nonassoc precedence group or in a few, but we can contemplate whether
  users prefer an error or prefer the 9.4 behavior for affected queries.
 
 Part of my thinking was that the 9.4 behavior fails the principle of least
 astonishment, because I seriously doubt that people expect '=' to be
 either right-associative or lower priority than '<'.  Here's one example:
 
 regression=# select false = true < false;
  ?column? 
 ----------
  t
 (1 row)

 So yeah, I do think that getting a syntax error if you don't use
 parentheses is the preferable behavior here.

I agree.




Re: [HACKERS] tap tests remove working directories

2015-08-09 Thread Andrew Dunstan



On 08/09/2015 08:41 AM, Michael Paquier wrote:

On Sun, Aug 9, 2015 at 1:40 AM, Andrew Dunstan and...@dunslane.net wrote:

On 08/08/2015 09:31 AM, Robert Haas wrote:

On Fri, Aug 7, 2015 at 7:17 PM, Andrew Dunstan and...@dunslane.net
wrote:

That certainly isn't what happens, and given the way this is done in
TestLib.pm, using the CLEANUP parameter of File::Temp's tempdir()
function,
it's not clear how we could do that easily.

shot-in-the-dark

Set cleanup to false and manually remove the directory later in the
code, so that stuff runs only if we haven't died sooner?

/shot-in-the-dark


Yeah, maybe. I'm thinking of trying to do it more globally, like in
src/Makefile.global.in. That way we wouldn't have to add new code to every
test file.

If we rely on END to clean up the temporary data folder, there is no
need to impact each test file, just the test functions called in
TestLib.pm that could switch a flag to not perform any cleanup at the
end of the run should an unexpected result be found. Now I am as well
curious to see what you have in mind with manipulating
Makefile.global.



See attached. If you have a better idea please post your patch.

cheers

andrew
diff --git a/src/Makefile.global.in b/src/Makefile.global.in
index 39a7879..b379376 100644
--- a/src/Makefile.global.in
+++ b/src/Makefile.global.in
@@ -349,12 +349,12 @@ endef
 ifeq ($(enable_tap_tests),yes)
 
 define prove_installcheck
-cd $(srcdir) && TESTDIR='$(CURDIR)' PATH="$(bindir):$$PATH" PGPORT='6$(DEF_PGPORT)' top_builddir='$(CURDIR)/$(top_builddir)' PG_REGRESS='$(CURDIR)/$(top_builddir)/src/test/regress/pg_regress' $(PROVE) $(PG_PROVE_FLAGS) $(PROVE_FLAGS) t/*.pl
+cd $(srcdir) && TESTDIR='$(CURDIR)' PATH="$(bindir):$$PATH" PGPORT='6$(DEF_PGPORT)' top_builddir='$(CURDIR)/$(top_builddir)' PG_REGRESS='$(CURDIR)/$(top_builddir)/src/test/regress/pg_regress' $(PROVE) $(PG_PROVE_FLAGS) $(PROVE_FLAGS) t/*.pl && rm -rf $(CURDIR)/tmp_test*
 endef
 
 define prove_check
 rm -rf $(CURDIR)/tmp_check/log
-cd $(srcdir) && TESTDIR='$(CURDIR)' $(with_temp_install) PGPORT='6$(DEF_PGPORT)' PG_REGRESS='$(CURDIR)/$(top_builddir)/src/test/regress/pg_regress' $(PROVE) $(PG_PROVE_FLAGS) $(PROVE_FLAGS) t/*.pl
+cd $(srcdir) && TESTDIR='$(CURDIR)' $(with_temp_install) PGPORT='6$(DEF_PGPORT)' PG_REGRESS='$(CURDIR)/$(top_builddir)/src/test/regress/pg_regress' $(PROVE) $(PG_PROVE_FLAGS) $(PROVE_FLAGS) t/*.pl && rm -rf $(CURDIR)/tmp_test*
 endef
 
 else
diff --git a/src/test/perl/TestLib.pm b/src/test/perl/TestLib.pm
index 4927d45..174566c 100644
--- a/src/test/perl/TestLib.pm
+++ b/src/test/perl/TestLib.pm
@@ -113,7 +113,7 @@ sub tempdir
 	return File::Temp::tempdir(
 		'tmp_test',
 		DIR => $ENV{TESTDIR} || cwd(),
-		CLEANUP => 1);
+		CLEANUP => 0);
 }
 
 sub tempdir_short
