Re: \if, \elseif, \else, \endif (was Re: [HACKERS] PSQL commands: \quit_if, \quit_unless)

2017-01-27 Thread Fabien COELHO


Hello Corey,


And here it is


About the patch v3:

## DOCUMENTATION

I'm wondering what pg would do on "EXISTS(SELECT 1 FROM customer)" if 
there are many customers. EXPLAIN suggests a seq scan, which seems bad. 
With "(SELECT COUNT(*) FROM pgbench_accounts) <> 0" pg seems to generate 
an index-only scan on a large table, so maybe that is a better way to do it. 
It seems strange that there is no better way to do this in a relational 
tool, the relational model being an extension of set theory... and there 
is no easy way (?) to check whether a set is empty.
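For the record, the two formulations under discussion would look something like this (table names are illustrative):

```sql
-- Formulation 1: EXISTS can stop at the first row found, so even
-- though EXPLAIN shows a Seq Scan node, the executor may return
-- after reading a single tuple.
SELECT EXISTS (SELECT 1 FROM customer);

-- Formulation 2: forces a full count just to test emptiness; the
-- comparison with 0 may be satisfied by an index-only scan, but it
-- still visits every index entry.
SELECT (SELECT count(*) FROM customer) <> 0;
```

Note that the Seq Scan shown by EXPLAIN under EXISTS is bounded: execution stops as soon as one row is produced, so it is not necessarily the bad plan it appears to be.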


"""If an EOF is reached on the main file or an 
\include-ed file before all 
\if-\endif are matched, then psql 
will raise an error."""


In sentence above: "before all" -> "before all local"? "then" -> ""?

"other options booleans" -> "other option booleans"? or maybe "options' 
booleans", but the possessive is usually reserved for people?


"unabigous" -> "unambiguous"

I think that the three-paragraph explanation about non-evaluation could be 
factored into one, maybe something like: """Lines within false branches are 
not evaluated in any way: queries are not sent to the server, 
non-conditional commands are not evaluated but bluntly ignored, and nested 
if expressions in such branches are also not evaluated, but are tallied to 
check for nesting."""
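As an illustrative psql script (the \if syntax follows the patch under review), the proposed wording would cover a case like:

```psql
\if false
    SELECT 1;        -- not sent to the server
    \echo hello      -- non-conditional command, bluntly ignored
    \if true         -- not evaluated, but tallied for nesting
    \endif
\endif
```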


I would suggest merging the elif/else constraints by describing what is 
expected rather than what is not expected: """An \if command may contain 
any number of \elif clauses and may end with one \else clause""".



## CODE

In "read_boolean_expression":

 + if (value)

"if (value != NULL)" is usually used, I think.

 + if (value == NULL)
 +   return false; /* not set -> assume "off" */

This is dead code, because value has been checked to be non NULL a few 
lines above.


 + (pg_strncasecmp(value, "on", (len > 2 ? len : 2)) == 0)
 + (pg_strncasecmp(value, "off", (len > 2 ? len : 2)) == 0)

Hmmm, not easy to parse. Maybe it deserves a comment?
"check at least two chars to distinguish on & off"

",action" -> ", action" (space after commas).

The "result" is not set on errors, but maybe it should be set to false 
explicitly anyway, instead of relying on some prior initialization?

Or document that the result is not set on errors.

if command:

  if (is active) {
success = ...
if (success) {
  ...
}
  }
  if (success) {
...
  }

The second test of success may use a value that was not set just above 
(when the branch is not active). That looks very strange. ISTM that the 
state should be pushed whether parsing succeeded or not. Moreover, it 
results in:


  \if ERROR
 \echo X
  \else
 \echo Y
  \endif

having both X & Y printed, and errors reported on \else and \endif. I 
think that an expression error should just put the branch in the ignore 
state.



On \else when already in the ignored state, ISTM that it should remain in 
the ignore state, not switch to else-false.



The comment about "IFSTATE_FALSE" talks about the state being true; maybe 
a copy-paste error?


In comments: "... variables the branch" -> "variables if the branch"

The "if_endifs_balanced" variable is not very useful. Maybe just do the 
test and error reporting in the right place:


 if (... && !psqlscan_branch_empty(scan_state))
   psql_error("found EOF before closing \\endif(s)\n");



 +
  #endif   /* MAINLOOP_H */


 -
 /*
  * Main processing loop for reading lines of input
  * and sending them to the backend.

Please do not add/remove empty lines in places unrelated to the patch.


History saving: I'm wondering whether all read lines should be put into 
history, whether executed or not.


Is it possible to make some of the added functions static? If so, do it.

I checked that it does stop on errors with -v ON_ERROR_STOP=1. However, I 
would be more at ease if this were tested somewhere.


--
Fabien.


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] proposal: EXPLAIN ANALYZE formatting

2017-01-27 Thread Pavel Stehule
Hi

Currently EXPLAIN ANALYZE produces rows that are too wide for use in
presentations.

What do you think about the possibility of implementing >>optional<<
alternative formatting?

Now:

  node name (estimation) (actual)

Alternative:

  node name (estimation)
   (actual)


Regards

Pavel
Now
===

 postgres=# explain analyze select * from obce join okresy on obce.okres_id = okresy.id where okresy.nazev = 'Benešov';
                                                            QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=0.28..15.19 rows=81 width=58) (actual time=0.067..0.246 rows=114 loops=1)
   ->  Seq Scan on okresy  (cost=0.00..1.96 rows=1 width=17) (actual time=0.025..0.053 rows=1 loops=1)
         Filter: (nazev = 'Benešov'::text)
         Rows Removed by Filter: 76
   ->  Index Scan using obce_okres_id_idx on obce  (cost=0.28..12.41 rows=81 width=41) (actual time=0.032..0.116 rows=114 loops=1)
         Index Cond: ((okres_id)::text = okresy.id)
 Planning time: 0.940 ms
 Execution time: 0.359 ms
(8 rows)

Alternative
===


 postgres=# explain analyze select * from obce join okresy on obce.okres_id = okresy.id where okresy.nazev = 'Benešov';
                            QUERY PLAN
------------------------------------------------------------------
 Nested Loop  (cost=0.28..15.19 rows=81 width=58)
  (actual time=0.067..0.246 rows=114 loops=1)
   ->  Seq Scan on okresy  (cost=0.00..1.96 rows=1 width=17)
       (actual time=0.025..0.053 rows=1 loops=1)
         Filter: (nazev = 'Benešov'::text)
         Rows Removed by Filter: 76
   ->  Index Scan using obce_okres_id_idx on obce  (cost=0.28..12.41 rows=81 width=41)
       (actual time=0.032..0.116 rows=114 loops=1)
         Index Cond: ((okres_id)::text = okresy.id)
 Planning time: 0.940 ms
 Execution time: 0.359 ms
(8 rows)



Re: [HACKERS] BUG: pg_stat_statements query normalization issues with combined queries

2017-01-27 Thread Craig Ringer
On 28 Jan. 2017 02:08, "Tom Lane"  wrote:

Kyotaro HORIGUCHI  writes:
> By the way the existing comment for the hook,

>> *
>> * We provide a function hook variable that lets loadable plugins get
>> * control when ProcessUtility is called.  Such a plugin would normally
>> * call standard_ProcessUtility().
>> */

> This is quite a matter of course for developers. planner_hook and
> other ones don't have such a comment, or have a one-liner at
> most. Since this hook gets a good amount of comment, it seems
> better to just remove the existing one.

Hm?  I see pretty much the same wording for planner_hook:

 * To support loadable plugins that monitor or modify planner behavior,
 * we provide a hook variable that lets a plugin get control before and
 * after the standard planning process.  The plugin would normally call
 * standard_planner().

I think other places have copied-and-pasted this as well.

The real problem with this is it isn't a correct specification of the
contract: in most cases, the right thing to do is more like "call the
previous occupant of the ProcessUtility_hook, or standard_ProcessUtility
if there was none."

Not sure if it's worth trying to be more precise in these comments,
but if we do something about it, we need to hit all the places with
this wording.


I'd rather leave it for the hooks documentation work that's being done
elsewhere and have a README.hooks or something that discusses patterns
common across hooks. Then we can just reference it.

Otherwise we'll have a lot of repetitive comments. Especially once we
mention that you can also ERROR out or (potentially) take no action at all.


Re: [HACKERS] [PATCH] Rename pg_switch_xlog to pg_switch_wal

2017-01-27 Thread David Steele

On 1/27/17 6:47 PM, Michael Paquier wrote:

On Sat, Jan 28, 2017 at 8:03 AM, Peter Eisentraut
 wrote:

On 1/26/17 2:05 PM, Robert Haas wrote:

I do not think it can be right to rename the directory and not
anything else.


I think this is the root of the confusion.

A lot of people apparently consented to renaming pg_xlog with the
understanding that that's it, whereas other people understood it as
consensus for renaming everything.

You really have (at least) three options here:

1. Rename nothing
2. Rename directory only
3. Rename everything

and you need to count the votes of each pair separately.


4. Rename everything with aliases.

I would vote for 4., or 3. for consistency if 4. is not an eligible choice.


I vote for 3.

The problem I have with aliases is that they would need to be done 
across the board.  At the least, we would need function aliases, 
symlinks for the binaries (which would need to be done by the 
packagers), aliases for the command line options, and lots of notations 
in the documentation.


All of this would need to be preserved more or less indefinitely, 
because if we can't decide on a break now it's not likely to happen later.


--
-David
da...@pgmasters.net




Re: [HACKERS] WIP: About CMake v2

2017-01-27 Thread Peter Eisentraut
On 1/27/17 6:11 PM, Andres Freund wrote:
> On 2017-01-27 09:09:36 -0500, Peter Eisentraut wrote:
>> My preferred scenario would be to replace the Windows build system by
>> this first, then refine it, then get rid of Autoconf.
>>
>> The ideal timeline would be to have a ready patch to commit early in a
>> development cycle, then get rid of the Windows build system by the end
>> of it.  Naturally, this would need buy-in from Windows developers.
>>
>> I don't foresee replacing the Autoconf build system by this immediately.
> 
> I'm very strongly against this path, it seems way too likely that we'll
> end up with yet another fragile thing that nobody from the *nix side
> will be able to test.

That's a fair concern, but at least with CMake, someone from the *nix
side *can* test it, whereas right now it's completely separate.

What kind of strategy do you have in mind?

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] Removing link-time cross-module refs in contrib

2017-01-27 Thread Noah Misch
On Mon, Oct 03, 2016 at 12:29:18PM -0400, Tom Lane wrote:
> Pursuant to Andres' suggestion in
> https://www.postgresql.org/message-id/20161002223927.57xns3arkdg4h...@alap3.anarazel.de
> attached is a draft patch that gets rid of link-time references
> from hstore_plpython to both hstore and plpython.  I've verified
> that this allows "LOAD 'hstore_plpython'" to succeed in a fresh
> session without having loaded the prerequisite modules first.

I like how that turned out.  However, ...

> *** a/contrib/hstore_plpython/Makefile
> --- b/contrib/hstore_plpython/Makefile

> --- 23,40 
>   include $(top_srcdir)/contrib/contrib-global.mk
>   endif
>   
> ! # We must link libpython explicitly
>   ifeq ($(PORTNAME), aix)
>   rpathdir = $(pkglibdir):$(python_libdir)

... adding $(pkglibdir) to rpath is obsolete, now that this ceased to link to
hstore explicitly.

> ! SHLIB_LINK += $(python_libspec) $(python_additional_libs)
> ! else
>   ifeq ($(PORTNAME), win32)
> ! # ... see silliness in plpython Makefile ...
> ! SHLIB_LINK += $(sort $(wildcard ../../src/pl/plpython/libpython*.a))
> ! else
> ! rpathdir = $(python_libdir)
> ! SHLIB_LINK += $(python_libspec)

For consistency with longstanding src/pl/plpython practice, $(python_libspec)
should always have an accompanying $(python_additional_libs).  This matters on
few configurations.

I propose to clean up both points as attached.  Tested on AIX, which ceases to
be a special case.
diff --git a/contrib/hstore_plperl/Makefile b/contrib/hstore_plperl/Makefile
index 41d3435..34e1e13 100644
--- a/contrib/hstore_plperl/Makefile
+++ b/contrib/hstore_plperl/Makefile
@@ -24,10 +24,6 @@ include $(top_srcdir)/contrib/contrib-global.mk
 endif
 
 # We must link libperl explicitly
-ifeq ($(PORTNAME), aix)
-rpathdir = $(pkglibdir):$(perl_archlibexp)/CORE
-SHLIB_LINK += $(perl_embed_ldflags)
-else
 ifeq ($(PORTNAME), win32)
 # these settings are the same as for plperl
 override CPPFLAGS += -DPLPERL_HAVE_UID_GID -Wno-comment
@@ -37,7 +33,6 @@ else
 rpathdir = $(perl_archlibexp)/CORE
 SHLIB_LINK += $(perl_embed_ldflags)
 endif
-endif
 
 # As with plperl we need to make sure that the CORE directory is included
 # last, probably because it sometimes contains some header files with names
diff --git a/contrib/hstore_plpython/Makefile b/contrib/hstore_plpython/Makefile
index a55c9a1..7ff787a 100644
--- a/contrib/hstore_plpython/Makefile
+++ b/contrib/hstore_plpython/Makefile
@@ -24,17 +24,12 @@ include $(top_srcdir)/contrib/contrib-global.mk
 endif
 
 # We must link libpython explicitly
-ifeq ($(PORTNAME), aix)
-rpathdir = $(pkglibdir):$(python_libdir)
-SHLIB_LINK += $(python_libspec) $(python_additional_libs)
-else
 ifeq ($(PORTNAME), win32)
 # ... see silliness in plpython Makefile ...
 SHLIB_LINK += $(sort $(wildcard ../../src/pl/plpython/libpython*.a))
 else
 rpathdir = $(python_libdir)
-SHLIB_LINK += $(python_libspec)
-endif
+SHLIB_LINK += $(python_libspec) $(python_additional_libs)
 endif
 
 REGRESS_OPTS += --load-extension=hstore
diff --git a/contrib/ltree_plpython/Makefile b/contrib/ltree_plpython/Makefile
index c45b7c2..bc7502b 100644
--- a/contrib/ltree_plpython/Makefile
+++ b/contrib/ltree_plpython/Makefile
@@ -24,17 +24,12 @@ include $(top_srcdir)/contrib/contrib-global.mk
 endif
 
 # We must link libpython explicitly
-ifeq ($(PORTNAME), aix)
-rpathdir = $(pkglibdir):$(python_libdir)
-SHLIB_LINK += $(python_libspec) $(python_additional_libs)
-else
 ifeq ($(PORTNAME), win32)
 # ... see silliness in plpython Makefile ...
 SHLIB_LINK += $(sort $(wildcard ../../src/pl/plpython/libpython*.a))
 else
 rpathdir = $(python_libdir)
-SHLIB_LINK += $(python_libspec)
-endif
+SHLIB_LINK += $(python_libspec) $(python_additional_libs)
 endif
 
 REGRESS_OPTS += --load-extension=ltree



Re: [HACKERS] COPY as a set returning function

2017-01-27 Thread Peter Eisentraut
On 1/27/17 8:07 PM, David Fetter wrote:
> There are still neither regression tests nor SGML documentation.
> 
> Are we at a point where we should add these things?

One could argue about the documentation at this point, since the
function is somewhat self-documenting for the time being.  But surely
there should be tests.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] COPY as a set returning function

2017-01-27 Thread Peter Eisentraut
On 1/25/17 8:51 PM, Corey Huinker wrote:
> # select * from copy_srf('echo "x\ty"',true) as t(x text, y text);

I find these parameters weird.  Just looking at this, one has no idea
what the "true" means.  Why not have a "filename" and a "program"
parameter and make them mutually exclusive?
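For what it's worth, the suggestion would turn the call into something like this (hypothetical parameter names):

```sql
-- program source: runs a shell command
SELECT * FROM copy_srf(program => 'echo "x\ty"') AS t(x text, y text);

-- file source: reads a server-side file; the two parameters would be
-- mutually exclusive, so passing both would raise an error
SELECT * FROM copy_srf(filename => '/tmp/data.tsv') AS t(x text, y text);
```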

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] GSoC 2017

2017-01-27 Thread Greg Stark
On 27 January 2017 at 14:52, Thomas Kellerer  wrote:
>
> I don't have the exact syntax at hand, but it's something like this:
>
> create distinct type customer_id_type as integer;
> create distinct type order_id_type as integer;
>
> create table customers (id customer_id_type primary key);
> create table orders (id order_id_type primary key, customer_id
> customer_id_type not null);

That seems like a useful thing but it's not exactly the same use case.

Measurements with units and currency amounts both have the property
that you are likely to want to have a single column that uses
different units for different rows. You can aggregate across them
without converting as long as you have an appropriate where clause or
group by clause -- GROUP BY units_of(debit_amount) for example.
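A sketch of the kind of query meant here, assuming a hypothetical units_of() function over such a type (ledger and debit_amount are illustrative names):

```sql
-- Sum amounts per currency/unit rather than mixing incompatible units.
SELECT units_of(debit_amount) AS unit,
       sum(debit_amount)      AS total
FROM ledger
GROUP BY units_of(debit_amount);
```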


-- 
greg




Re: [HACKERS] WIP: [[Parallel] Shared] Hash

2017-01-27 Thread Peter Geoghegan
Hi Thomas,

On Fri, Jan 27, 2017 at 5:03 PM, Thomas Munro
 wrote:
> I have broken this up into a patch series, harmonised the private vs
> shared hash table code paths better and fixed many things including
> the problems with rescans and regression tests mentioned upthread.
> You'll see that one of the patches is that throwaway BufFile
> import/export facility, which I'll replace with your code as
> discussed.

I'll try to get back to this ASAP, but expect to be somewhat busy next
week. Next week will be my last week at Heroku.

It was not an easy decision for me to leave Heroku, but I felt it was
time for a change. I am very grateful to have had the opportunity. I
have learned an awful lot during my time at the company. It has been
excellent to have an employer that has been so supportive of my work
on Postgres this whole time.

-- 
Peter Geoghegan




Re: [HACKERS] COPY as a set returning function

2017-01-27 Thread David Fetter
On Wed, Jan 25, 2017 at 08:51:38PM -0500, Corey Huinker wrote:
> I've put in some more work on this patch, mostly just taking Alvaro's
> suggestions, which resulted in big code savings.

The patch applies atop master.

The test (ls) which previously crashed the backend now doesn't, and
works in a reasonable way.

The description of the function still talks about its being a proof of
concept.

There are still neither regression tests nor SGML documentation.

Are we at a point where we should add these things?

Best,
David.
-- 
David Fetter  http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david(dot)fetter(at)gmail(dot)com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate




Re: [HACKERS] WIP: [[Parallel] Shared] Hash

2017-01-27 Thread Thomas Munro
On Fri, Jan 13, 2017 at 2:36 PM, Peter Geoghegan  wrote:
> [...]
> Yeah. That's basically what the BufFile unification process can
> provide you with (or will, once I get around to implementing the
> refcount thing, which shouldn't be too hard). As already noted, I'll
> also want to make it defer creation of a leader-owned segment, unless
> and until that proves necessary, which it never will for hash join.

Hi Peter,

I have broken this up into a patch series, harmonised the private vs
shared hash table code paths better and fixed many things including
the problems with rescans and regression tests mentioned upthread.
You'll see that one of the patches is that throwaway BufFile
import/export facility, which I'll replace with your code as
discussed.

The three 'refactor' patches change the existing hash join code to
work in terms of chunks in more places.  These may be improvements in
their own right, but mainly they pave the way for parallelism.  The
later patches introduce single-batch and then multi-batch shared
tables.

The patches in the attached tarball are:

0001-nail-down-regression-test-row-order-v4.patch:

A couple of regression tests would fail with later refactoring that
changes the order of unmatched rows emitted by hash joins.  So first,
let's fix that by adding ORDER BY in those places, without any code
changes.

0002-hj-add-dtrace-probes-v4.patch:

Before making any code changes, let's add some dtrace probes so that
we can measure time spent doing different phases of hash join work
before and after the later changes.  The main problem with the probes
as I have them here (and the extra probes inserted by later patches in
the series) is that interesting query plans contain multiple hash
joins so these get all mixed up when you're trying to measure stuff,
so perhaps I should pass executor node IDs into all the probes.  More
on this later.  (If people don't want dtrace probes in the executor,
I'm happy to omit them and maintain that kind of thing locally for my
own testing purposes...)

0003-hj-refactor-memory-accounting-v4.patch:

Modify the existing hash join code to work in terms of chunks when
estimating and later tracking memory usage.  This is probably more
accurate than the current tuple-based approach, because it tries to
take into account the space used by chunk headers and the wasted space
in chunks.  In practice the difference is probably small, but it's
arguably more accurate;  I did this because I need chunk-based
accounting in the later patches.  Also, make HASH_CHUNK_SIZE the actual
size of allocated chunks (ie the header information is included in
that size so we allocate exactly 32KB, not 32KB + a bit, for the
benefit of the dsa allocator which otherwise finishes up allocating
36KB).

0004-hj-refactor-batch-increases-v4.patch:

Modify the existing hash join code to detect work_mem exhaustion at
the point where chunks are allocated, instead of checking after every
tuple insertion.  This matches the logic used for estimating, and more
importantly allows for some parallelism in later patches.

0005-hj-refactor-unmatched-v4.patch:

Modifies the existing hash join code to handle unmatched tuples in
right/full joins chunk-by-chunk.  This is probably a cache-friendlier
scan order anyway, but the real goal is to provide a natural grain for
parallelism in a later patch.

0006-hj-barrier-v4.patch:

The patch from a nearby thread previously presented as a dependency of
this project.  It might as well be considered part of this patch
series.

0007-hj-exec-detach-node-v4.patch

By the time ExecEndNode() runs in workers, ExecShutdownNode() has
already run.  That's done on purpose because, for example, the hash
table needs to survive longer than the parallel environment to allow
EXPLAIN to peek at it.  But it means that the Gather node has thrown
out the shared memory before any parallel-aware node below it gets to
run its Shutdown and End methods.  So I invented ExecDetachNode()
which runs before ExecShutdownNode(), giving parallel-aware nodes a
chance to say goodbye before their shared memory vanishes.  Better
ideas?

0008-hj-shared-single-batch-v4.patch:

Introduces hash joins with "Shared Hash" and "Parallel Shared Hash"
nodes, for single-batch joins only.  If the planner has a partial
inner plan, it'll pick a Parallel Shared Hash plan to divide that over
K participants.  Failing that, if the planner has a parallel-safe
inner plan and thinks that it can avoid batching by using work_mem * K
memory (shared by all K participants), it will now use a Shared Hash.
Otherwise it'll typically use a Hash plan as before.  Without the
later patches, it will blow through work_mem * K if it turns out to
have underestimated the hash table size, because it lacks
infrastructure for dealing with batches.

The trickiest thing at this point in the series is that participants
(workers and the leader) can show up at any time, so there are three
places that provide synchronisation with a parallel 

Re: [HACKERS] [PATCH] Rename pg_switch_xlog to pg_switch_wal

2017-01-27 Thread Michael Paquier
On Sat, Jan 28, 2017 at 8:03 AM, Peter Eisentraut
 wrote:
> On 1/26/17 2:05 PM, Robert Haas wrote:
>> I do not think it can be right to rename the directory and not
>> anything else.
>
> I think this is the root of the confusion.
>
> A lot of people apparently consented to renaming pg_xlog with the
> understanding that that's it, whereas other people understood it as
> consensus for renaming everything.
>
> You really have (at least) three options here:
>
> 1. Rename nothing
> 2. Rename directory only
> 3. Rename everything
>
> and you need to count the votes of each pair separately.

4. Rename everything with aliases.

I would vote for 4., or 3. for consistency if 4. is not an eligible choice.
-- 
Michael




Re: [HACKERS] pg_hba_file_settings view patch

2017-01-27 Thread Tom Lane
Haribabu Kommi  writes:
> [ pg_hba_rules_13.patch ]

I spent awhile hacking on this, and made a lot of things better, but
I'm still very unhappy about the state of the comments.  You changed
the APIs of a bunch of functions, often into fairly subtle things,
and you did not touch even one of their API-specification comments.
As an example, next_token() now needs something like

"On error, log a message at ereport level elevel and set *err_msg to
an error string.  Note that the return value might be either true or
false after an error; *err_msg must be checked to determine that.
Hence, *err_msg had better be NULL on entry, or you won't be able
to tell."

Having to write such a thing might even convince you that you should
try a little harder to make the behavior less confusing.  Just adding
arguments, and not changing the result-value specification, is not
necessarily the best way to do this.

I haven't looked at the docs yet.

I'm still not very happy about the choice of view name ...

regards, tom lane

diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index 086fafc..3f4724c 100644
*** a/doc/src/sgml/catalogs.sgml
--- b/doc/src/sgml/catalogs.sgml
***
*** 7804,7809 
--- 7804,7814 
   
  
   
+   pg_hba_rules
+   summary of client authentication configuration file contents
+  
+ 
+  
pg_group
groups of database users
   
***
*** 8352,8357 
--- 8357,8481 
  
  
  
+  
+   pg_hba_rules
+ 
+   
+pg_hba_rules
+   
+ 
+   
+The view pg_hba_rules provides a summary of
+the contents of the client authentication configuration file.  A row 
+appears in this view for each entry appearing in the file, with annotations
+indicating whether the rule could be applied successfully.
+   
+ 
+   
+pg_hba_rules Columns
+ 
+   
+
+ 
+  Name
+  Type
+  Description
+ 
+
+
+ 
+  line_number
+  integer
+  
+   Line number of the client authentication rule in
+   pg_hba.conf file
+  
+ 
+ 
+  type
+  text
+  Type of connection
+ 
+ 
+  database
+  text[]
+  List of database names
+ 
+ 
+  user_name
+  text[]
+  List of user names
+ 
+ 
+  address
+  text
+  
+   The set of hosts the record matches.  It can be a host name,
+   an IP address, or one of the keywords all,
+   samehost or samenet.
+  
+ 
+ 
+  netmask
+  text
+  Address mask, if present
+ 
+ 
+  auth_method
+  text
+  Authentication method
+ 
+ 
+  options
+  text[]
+  Configuration options set for authentication method
+ 
+ 
+  error
+  text
+  
+   If not null, an error message indicating why this
+   rule could not be loaded.
+  
+ 
+
+   
+   
+ 
+   
+The error field, if not NULL, describes a problem
+in the rule on line line_number.
+The following is sample output from the view.
+   
+   
+ 
+ SELECT line_number, type, database, user_name, address, auth_method FROM pg_hba_rules;
+ 
+ 
+ 
+  line_number | type  |  database  | user_name  |   address| auth_method 
+ -+---+++--+-
+   84 | local | {all}  | {all}  |  | trust
+   86 | host  | {sameuser} | {postgres} | all  | trust
+   88 | host  | {postgres} | {postgres} | ::1  | trust
+  111 | host  | {all}  | {all}  | 127.0.0.1| trust
+  121 | host  | {all}  | {all}  | localhost| trust
+  128 | host  | {postgres} | {all}  | samenet  | ident
+  134 | host  | {postgres} | {all}  | samehost | md5
+  140 | host  | {db1,db2}  | {all}  | .example.com | md5
+  149 | host  | {test} | {test} | 192.168.54.1 | reject
+  159 | host  | {all}  | {+support} | 192.168.0.0  | ident
+  169 | local | {sameuser} | {all}  |  | md5
+ (11 rows)
+ 
+ 
+   
+See  for more information about the various
+ways to change client authentication configuration.
+   
+  
+ 
   
pg_group
  
diff --git a/doc/src/sgml/client-auth.sgml b/doc/src/sgml/client-auth.sgml
index dda5891..f20486c 100644
*** a/doc/src/sgml/client-auth.sgml
--- b/doc/src/sgml/client-auth.sgml
***
*** 54,59 
--- 54,66 
database user names and OS user names.
   
  
+  
+   The system view
+   pg_hba_rules
+   can be helpful for pre-testing changes to the client authentication
+   configuration file, or for diagnosing problems if loading of the file
+   did not have the desired effect.
+  
+ 
   
The pg_hba.conf File
  
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 4dfedf8..d920a72 

Re: [HACKERS] [PATCH] Rename pg_switch_xlog to pg_switch_wal

2017-01-27 Thread Andres Freund
On 2017-01-27 18:06:11 -0500, Peter Eisentraut wrote:
> On 1/24/17 4:47 PM, Robert Haas wrote:
> > I'm not excited about starting to change pg_clog before we finish with
> > xlog -> wal.  Then we just have two half-done things, IMO.  But I'm
> > also not the only one with a commit bit.
> 
> I think that depends on which way you slice the overall mission.  You
> appear to think the mission is to have everything named consistently.  I
> thought the mission was to have no directories named *log*.

+1




Re: [HACKERS] WIP: About CMake v2

2017-01-27 Thread Andres Freund
On 2017-01-27 09:09:36 -0500, Peter Eisentraut wrote:
> My preferred scenario would be to replace the Windows build system by
> this first, then refine it, then get rid of Autoconf.
> 
> The ideal timeline would be to have a ready patch to commit early in a
> development cycle, then get rid of the Windows build system by the end
> of it.  Naturally, this would need buy-in from Windows developers.
> 
> I don't foresee replacing the Autoconf build system by this immediately.

I'm very strongly against this path, it seems way too likely that we'll
end up with yet another fragile thing that nobody from the *nix side
will be able to test.

- Andres




Re: [HACKERS] [PATCH] Rename pg_switch_xlog to pg_switch_wal

2017-01-27 Thread Michael Paquier
On Sat, Jan 28, 2017 at 8:06 AM, Peter Eisentraut
 wrote:
> On 1/24/17 4:47 PM, Robert Haas wrote:
>> I'm not excited about starting to change pg_clog before we finish with
>> xlog -> wal.  Then we just have two half-done things, IMO.  But I'm
>> also not the only one with a commit bit.
>
> I think that depends on which way you slice the overall mission.  You
> appear to think the mission is to have everything named consistently.  I
> thought the mission was to have no directories named *log*.

Those are two different missions. Deciding what to do about the renaming
of pg_xlog does not prevent pg_clog and pg_subtrans from being renamed, as
no functions and no binaries use those terms.
-- 
Michael




Re: [HACKERS] [PATCH] Rename pg_switch_xlog to pg_switch_wal

2017-01-27 Thread Peter Eisentraut
On 1/24/17 4:47 PM, Robert Haas wrote:
> I'm not excited about starting to change pg_clog before we finish with
> xlog -> wal.  Then we just have two half-done things, IMO.  But I'm
> also not the only one with a commit bit.

I think that depends on which way you slice the overall mission.  You
appear to think the mission is to have everything named consistently.  I
thought the mission was to have no directories named *log*.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] [PATCH] Rename pg_switch_xlog to pg_switch_wal

2017-01-27 Thread Peter Eisentraut
On 1/26/17 1:10 PM, Tom Lane wrote:
> pg_reset_wal

It's really more pg_reset_controldata, isn't it?

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] increasing the default WAL segment size

2017-01-27 Thread Michael Paquier
On Sat, Jan 28, 2017 at 7:29 AM, Robert Haas  wrote:
> On Thu, Jan 26, 2017 at 8:53 PM, Michael Paquier
>  wrote:
>>> I've not done like the most careful review ever, but I'm in favor of the
>>> general change (provided the byval thing is fixed obviously).
>>
>> Thanks for the review.
>
> Why not use pg_ltoa and pg_lltoa like the output functions for the datatype 
> do?

No particular reason.

> Might use CStringGetTextDatum(blah) instead of
> PointerGetDatum(cstring_to_text(blah)).

Yes, thanks.
-- 
Michael


refactor-repl-cmd-output-v3.patch
Description: Binary data



Re: [HACKERS] [PATCH] Rename pg_switch_xlog to pg_switch_wal

2017-01-27 Thread Peter Eisentraut
On 1/26/17 2:05 PM, Robert Haas wrote:
> I do not think it can be right to rename the directory and not
> anything else.

I think this is the root of the confusion.

A lot of people apparently consented to renaming pg_xlog with the
understanding that that's it, whereas other people understood it as
consensus for renaming everything.

You really have (at least) three options here:

1. Rename nothing
2. Rename directory only
3. Rename everything

and you need to count the votes of each pair separately.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] GSoC 2017

2017-01-27 Thread Peter van Hardenberg
On Fri, Jan 27, 2017 at 2:48 PM, Jim Nasby  wrote:

> On 1/27/17 8:17 AM, Brad DeJong wrote:
>
>> Add the potential for regulatory requirements to change at any time -
>> sort of like timezone information. So no hard coded behavior.
>>
>
> Well, I wish we had support for storing those changing requirements as
> well. If we had that it would greatly simplify having a timestamp type that
> stores the original timezone.
>
> BTW, time itself fits in the multi-unit pattern, since months don't have a
> fixed conversion to days (and technically seconds don't have a fixed
> conversion to anything thanks to leap seconds).


I agree with Jim here.

I think we don't need to solve all the possible currency problems to have a
useful type. I'll reiterate what I think is the key point here:

A currency type should work like a wallet. If I have 20USD in my wallet and
I put 20EUR in the wallet, I have 20USD and 20EUR in the wallet, not 42USD
(or whatever the conversion rate is these days). If I want to convert those
to a single currency, I need to perform an operation.

If we had this as a basic building block, support for some of the major
currency formats, and a function that a user could call (think of the way
justify_interval is applied to sums of intervals to account for the
ambiguities in day lengths and so on), I think we'd have a pretty useful type.
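The wallet semantics described above might be sketched like this (a
hypothetical Python illustration of the idea, not a proposed SQL type
implementation — the class and method names here are invented):

```python
from collections import defaultdict

class Wallet:
    """Holds amounts per currency; never converts implicitly."""
    def __init__(self):
        self._amounts = defaultdict(int)   # currency code -> minor units

    def add(self, currency, minor_units):
        self._amounts[currency] += minor_units

    def balances(self):
        return dict(self._amounts)

    def convert_to(self, rates):
        """Explicit conversion: 'rates' maps each currency to a
        caller-supplied rate into the target currency."""
        return sum(units * rates[cur] for cur, units in self._amounts.items())

w = Wallet()
w.add("USD", 2000)   # 20 USD, in cents
w.add("EUR", 2000)   # 20 EUR, in cents
# The wallet keeps both balances; converting requires an explicit rate table.
print(w.balances())  # {'USD': 2000, 'EUR': 2000}
```

The point of the sketch is that 20USD + 20EUR stays two balances; only an
explicit operation with caller-supplied rates collapses them to one number.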

As to Tom's point, conversion rates do not vary only with time; they vary
with time, space, vendor, whether you're buying or selling, quantity, and
so on. We can give people the tools to more easily and
accurately execute this math without actually building a whole financial
tool suite in the first release.

I'll also note that in the absence of progress here, users continue to get
bad advice about using the existing MONEY type such as here:
http://stackoverflow.com/questions/15726535/postgresql-which-datatype-should-be-used-for-currency

-- 
Peter van Hardenberg
San Francisco, California
"Everything was beautiful, and nothing hurt."—Kurt Vonnegut


Re: [HACKERS] Add support to COMMENT ON CURRENT DATABASE

2017-01-27 Thread Peter Eisentraut
On 1/26/17 1:20 PM, Fabrízio de Royes Mello wrote:
> Ok, but doing in that way the syntax would be:
> 
> COMMENT ON DATABASE CURRENT_DATABASE IS 'comment';

Yes, that's right.  We also have ALTER USER CURRENT_USER already.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] WIP: About CMake v2

2017-01-27 Thread Michael Paquier
On Fri, Jan 27, 2017 at 11:09 PM, Peter Eisentraut
 wrote:
> On 1/24/17 8:37 AM, Tom Lane wrote:
>> Craig Ringer  writes:
>>> Personally I think we should aim to have this in as a non default build
>>> mode in pg10 if it can be made ready, and aim to make it default in pg11 at
>>> least for Windows.
>>
>> AFAIK we haven't committed to accepting this at all, let alone trying
>> to do so on a tight schedule.  And I believe there was general agreement
>> that we would not accept it as something to maintain in parallel with
>> the existing makefiles.  If we have to maintain two build systems, we
>> have that already.
>
> My preferred scenario would be to replace the Windows build system by
> this first, then refine it, then get rid of Autoconf.
>
> The ideal timeline would be to have a ready patch to commit early in a
> development cycle, then get rid of the Windows build system by the end
> of it.  Naturally, this would need buy-in from Windows developers.

This looks like a really good plan to me.
-- 
Michael




Re: [HACKERS] Allow interrupts on waiting standby

2017-01-27 Thread Michael Paquier
On Sat, Jan 28, 2017 at 2:18 AM, Robert Haas  wrote:
> On Fri, Jan 27, 2017 at 8:17 AM, Michael Paquier
>  wrote:
>> There are no default clauses in the pgstat_get_wait_* routines so my
>> compiler is actually complaining...
>
> That's exactly WHY there are no default clauses there.   :-)

And that's exactly why I missed them! ;p
-- 
Michael




Re: [HACKERS] pg_hba_file_settings view patch

2017-01-27 Thread Haribabu Kommi
On Sat, Jan 28, 2017 at 5:47 AM, Tom Lane  wrote:

> Haribabu Kommi  writes:
> > On Fri, Jan 27, 2017 at 1:36 AM, Tom Lane  wrote:
> >> It might make sense to proceed by writing a separate patch that just
> >> refactors the existing code to have an API like that, and then revise
> >> this patch to add an error message field to the per-line struct.  Or
> >> maybe that's just extra work, not sure.
>
> > Here I attached tokenize_file refactor patch to return single linked list
> > that contains a structure and rebased pg_hba_rules patch on top it
> > by adding an error message to that structure to hold the errors occurred
> > during tokenization..
>
> I pushed the first patch with some revisions.  You had the TokenizedLine
> struct containing something that was still a three-level-nesting list,
> whereas it only needs to be two levels, and you hadn't updated any of
> the comments that the patch falsified.  Also, I figured we might as well
> pass the TokenizedLine struct as-is to parse_hba_line and
> parse_ident_line, because that was going to be the next step anyway
> so they could pass back error messages.
>

Sorry for missing the comment updates. I had also thought of reducing the
list nesting level after sending the patch.

> Off to look at the second patch ...

Used the TokenizedLine->err_msg variable only to return the error message
from parse_hba_line.

Attached is a patch rebased on the latest master.

Regards,
Hari Babu
Fujitsu Australia


pg_hba_rules_14.patch
Description: Binary data



Re: [HACKERS] GSoC 2017

2017-01-27 Thread Jim Nasby

On 1/27/17 8:17 AM, Brad DeJong wrote:

Add the potential for regulatory requirements to change at any time - sort of 
like timezone information. So no hard coded behavior.


Well, I wish we had support for storing those changing requirements as 
well. If we had that it would greatly simplify having a timestamp type 
that stores the original timezone.


BTW, time itself fits in the multi-unit pattern, since months don't have 
a fixed conversion to days (and technically seconds don't have a fixed 
conversion to anything thanks to leap seconds).

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)




Re: [HACKERS] Superowners

2017-01-27 Thread Peter Eisentraut
On 1/26/17 1:25 PM, Simon Riggs wrote:
> That should include the ability to dump all objects, yet without any
> security details. And it should allow someone to setup logical
> replication easily, including both trigger based and new logical
> replication. And GRANT ON ALL should work.

This basically sounds like a GRANT $privilege ON ALL $objecttype TO
$user.  So you could have a user that can read everything, for example.

This kind of thing has been asked for many times, but that quieted down
when the default privileges feature appeared.  I think it would still be
useful.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] pageinspect: Hash index support

2017-01-27 Thread Peter Eisentraut
On 1/18/17 10:45 AM, Jesper Pedersen wrote:
> Fixed in this version:
> 
> * verify_hash_page: Display magic in hex, like hash_metapage_info
> * Update header for hash_page_type
> 
> Moving the patch back to 'Needs Review'.

Please include tests in your patch.  I have posted a sample test suite in
an earlier message, which you could use.

Also, as mentioned before, hash_metapage_info result fields spares and
mapp should be arrays of integer, not a text string.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: \if, \elseif, \else, \endif (was Re: [HACKERS] PSQL commands: \quit_if, \quit_unless)

2017-01-27 Thread Corey Huinker
On Thu, Jan 26, 2017 at 4:06 PM, Corey Huinker 
wrote:

>
>
> On Thu, Jan 26, 2017 at 3:55 PM, Fabien COELHO 
> wrote:
>
>>
>> Hello Daniel,
>>
>> A comment about control flow and variables: in branches that are not
>>> taken, variables are expanded nonetheless, in a way that can be surprising.
>>> Case in point:
>>>
>>> \set var 'ab''cd'
>>> -- select :var;
>>> \if false
>>>  select :var ;
>>> \else
>>>  select 1;
>>> \endif
>>>
>>> To avoid that kind of trouble, would it make sense not to expand
>>> variables in untaken branches?
>>>
>>
>> Hmmm. This case is somehow contrived (for a string, :'var' could be used,
>> in which case escaping would avoid the hazard), but I think that what you
>> suggest is a better behavior, if easy to implement.
>>
>> --
>> Fabien.
>>
>
> Good question, Daniel. Variable expansion seems to be done via
> psql_get_variable which is invoked via callback, which means that I might
> have to move branch_block_active into PsqlSettings. That's slightly
> different because the existing boolean is scoped with MainLoop(), but
> there's no way to get to a new MainLoop unless you're in an executing
> branch, and no way to legally exit a MainLoop with an unbalanced if-endif
> state. In short, I think it's better behavior. It does prevent someone from
> setting a variable to '\endif' and expecting that to work, like this:
>
> select case
>  when random() < 0.5 then '\endif'
>  else E'\else\n\echo fooled you\n\endif'
> end as contrived_metaprogramming
> \gset
>
> \if false
>   :contrived_metaprogramming
>
>
> I'm sure someone will argue that preventing that is a good thing. Unless
> someone sees a good reason not to move that PsqlSettings, I'll make that
> change.
>
>
And here it is:
diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml
index 640fe12..fcf265b 100644
--- a/doc/src/sgml/ref/psql-ref.sgml
+++ b/doc/src/sgml/ref/psql-ref.sgml
@@ -2035,6 +2035,78 @@ hello 10
 
   
 
+  
+\if expr
+\elif expr
+\else
+\endif
+
+
+This group of commands implements nestable conditional blocks, like
+this:
+
+
+SELECT
+EXISTS(SELECT 1 FROM customer) as has_customers,
+EXISTS(SELECT 1 FROM employee) as has_employees
+\gset
+\if :has_users
+SELECT * FROM customer ORDER BY creation_date LIMIT 5;
+\elif :has_employees
+\echo 'no customers found'
+SELECT * FROM employee ORDER BY creation_date LIMIT 5;
+\else
+\if yes
+\echo 'No customers or employees'
+\else
+\echo 'this should never print'
+\endif
+\endif
+
+
+Conditional blocks must begin with a \if and end
+with an \endif, and the pairs must be found in
+the same source file. If an EOF is reached on the main file or an
+\include-ed file before all 
+\if-\endif are matched, then
+psql will raise an error.
+
+
+The \if and \elif commands
+read the rest of the line and evaluate it as a boolean expression.
+Currently, expressions are limited to a single unquoted string
+which is evaluated like other options booleans, so the valid values
+are any unabiguous case insensitive matches for one of:
+true, false, 
1,
+0, on, off,
+yes, no.  So 
+t, T, and tR
+will all match true.
+
+
+Queries within a false branch of a conditional block will not be
+sent to the server.
+
+
+Non-conditional \-commands within a false branch
+of a conditional block will not be evaluated for correctness. The
+command will be ignored along with all remaining input to the end
+of the line.
+
+
+Expressions on \if and \elif
+commands within a false branch of a conditional block will not be
+evaluated.
+
+
+A conditional block can at most one \else command.
+
+
+The \elif command cannot follow the
+\else command.
+
+
+  
 
   
 \ir or \include_relative 
filename
diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c
index 0c164a3..8235015 100644
--- a/src/bin/psql/command.c
+++ b/src/bin/psql/command.c
@@ -49,6 +49,7 @@
 #include "psqlscanslash.h"
 #include "settings.h"
 #include "variables.h"
+#include "fe_utils/psqlscan_int.h"
 
 /*
  * Editable database object types.
@@ -132,7 +133,7 @@ HandleSlashCmds(PsqlScanState scan_state,
status = PSQL_CMD_ERROR;
}
 
-   if (status != PSQL_CMD_ERROR)
+   if (status != PSQL_CMD_ERROR && pset.active_branch)
{
/* eat any remaining arguments after a valid command */
/* note we suppress evaluation of backticks here */
@@ -194,6 +195,68 @@ read_connect_arg(PsqlScanState scan_state)
return result;
 }
 
+/*

Re: [HACKERS] increasing the default WAL segment size

2017-01-27 Thread Robert Haas
On Thu, Jan 26, 2017 at 8:53 PM, Michael Paquier
 wrote:
>> I've not done like the most careful review ever, but I'm in favor of the
>> general change (provided the byval thing is fixed obviously).
>
> Thanks for the review.

Why not use pg_ltoa and pg_lltoa like the output functions for the datatype do?

Might use CStringGetTextDatum(blah) instead of
PointerGetDatum(cstring_to_text(blah)).

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Proposal : For Auto-Prewarm.

2017-01-27 Thread Jim Nasby

On 1/26/17 10:11 PM, Beena Emerson wrote:

In that case, we could add the file location parameter.  By default it
would store in the cluster directory else in the location provided. We
can update this parameter in standby for it to access the file.


I don't see file location being as useful in this case. What would be 
more useful is being able to forcibly load blocks into shared buffers so 
that you didn't need to restart.


Hmm, it occurs to me that could be accomplished by providing an SRF that 
returned the contents of the current save file.


In any case, I strongly suggest focusing on the issues that have already 
been identified before trying to add more features.

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)




Re: [HACKERS] Typo in comment in postgres_fdw.c

2017-01-27 Thread Robert Haas
On Thu, Jan 26, 2017 at 10:45 PM, Etsuro Fujita
 wrote:
> I ran into a typo in a comment in contrib/postgres_fdw/postgres_fdw.c.
> Attached is a small patch for fixing that.

Committed, thanks.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Query fails when SRFs are part of FROM clause (Commit id: 69f4b9c85f)

2017-01-27 Thread David G. Johnston
On Fri, Jan 27, 2017 at 3:13 PM, David G. Johnston <
david.g.johns...@gmail.com> wrote:

> In any case the more idiomatic way of writing your query these days (since
> 9.4 came out) is:
>
> SELECT *
> FROM pg_constraint pc
> LEFT JOIN LATERAL generate_series(1, case when contype in ('f','p','u')
> then array_upper(pc.conkey, 1) else 0 end) gs ON true;
>
>
Supposedly this should work back to 9.3; I mis-remembered when LATERAL was
released.

David J.


Re: [HACKERS] Query fails when SRFs are part of FROM clause (Commit id: 69f4b9c85f)

2017-01-27 Thread David G. Johnston
On Fri, Jan 27, 2017 at 5:28 AM, Rushabh Lathia 
wrote:

> Consider the below test;
>
> CREATE TABLE tab ( a int primary key);
>
> SELECT  *
> FROM pg_constraint pc,
> CAST(CASE WHEN pc.contype IN ('f','u','p') THEN generate_series(1,
> array_upper(pc.conkey, 1)) ELSE NULL END AS int) AS position;
>
> Above query is failing with "set-valued function called in context that
> cannot
> accept a set". But if I remove the CASE from the query then it working
> just good.
>
> Like:
>
> SELECT  *
> FROM pg_constraint pc,
> CAST(generate_series(1, array_upper(pc.conkey, 1)) AS int) AS position;
>
> This started failing with 69f4b9c85f168ae006929eec44fc44d569e846b9. It seems
> check_srf_call_placement() sets the hasTargetSRFs flag, but when the SRFs
> are in the rtable, of course, this flag doesn't get set. It seems like
> something is missing there, but I might be completely wrong, as I'm not
> quite aware of this area.
>
>
I'm a bit surprised that your query actually works... and without delving
into the source code it's hard to explain why it should or shouldn't, or
whether the recent SRF work was intended to impact it.

In any case the more idiomatic way of writing your query these days (since
9.4 came out) is:

SELECT *
FROM pg_constraint pc
LEFT JOIN LATERAL generate_series(1, case when contype in ('f','p','u')
then array_upper(pc.conkey, 1) else 0 end) gs ON true;

generate_series is smart enough to return an empty set (instead of erroring
out) when provided with (1,0) as arguments.
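The empty-set behavior mentioned above can be mimicked with a quick Python
analogy (illustrative only; generate_series itself is the authoritative
behavior — this helper just models its inclusive-bounds semantics):

```python
def generate_series(start, stop):
    """Inclusive integer series, modeled on SQL generate_series(start, stop)."""
    return list(range(start, stop + 1))

print(generate_series(1, 3))  # [1, 2, 3]
print(generate_series(1, 0))  # [] -- empty, not an error
```

That empty result is what makes the CASE-to-zero trick in the LATERAL query
above safe: rows with a non-matching contype simply contribute no series rows.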

David J.


Re: [HACKERS] Parallel bitmap heap scan

2017-01-27 Thread Robert Haas
On Fri, Jan 27, 2017 at 1:32 AM, Dilip Kumar  wrote:
> There is just one line change in 0003 compared to older version, all
> other patches are the same.

I spent some time looking at 0001 (and how those changes are used in
0003) and I thought it looked good, so I committed 0001.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Proposal : For Auto-Prewarm.

2017-01-27 Thread Robert Haas
On Fri, Jan 27, 2017 at 3:18 PM, Peter Eisentraut
 wrote:
> On 1/26/17 11:11 PM, Beena Emerson wrote:
>> In that case, we could add the file location parameter.  By default it
>> would store in the cluster directory else in the location provided. We
>> can update this parameter in standby for it to access the file.
>
> I don't see the need for that.

+1.  That seems like over-engineering this.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] pg_hba_file_settings view patch

2017-01-27 Thread Robert Haas
On Fri, Jan 20, 2017 at 4:01 PM, Tom Lane  wrote:
> * I'm not really on board with patches modifying pgindent/typedefs.list
> retail.  To my mind that file represents the typedefs used the last
> time we pgindent'd the whole tree, and if you want an up-to-date list
> you should ask the buildfarm.  Otherwise there's just too much confusion
> stemming from the fact that not everybody updates it when patching.
>
> My own workflow for reindenting patches goes more like
> curl https://buildfarm.postgresql.org/cgi-bin/typedefs.pl -o my-typedefs.list
> ... manually edit my-typedefs.list to add any new typedefs from patch ...
> pgindent --typedefs=my-typedefs.list target-files

Andres and I -- among others -- have been patching typedefs.list
retail for a while now.  I think it makes life a lot easier.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




[HACKERS] privileges for changing schema owner

2017-01-27 Thread Peter Eisentraut
Normally, when changing the owner of an object, we check (among other
things) that the new owner has the same privileges that would be needed
to create the object from scratch.  For example, when changing the
owner of a type, the new owner needs to have CREATE privilege on the
containing schema.  Or when changing the owner of a foreign server, the
new owner needs to have USAGE privilege on the foreign-data wrapper.

The exception is that when changing the owner of a schema or database,
we check CREATE privilege on the database of the *current* user.  There
is even a comment about it in the code:

 * NOTE: This is different from other alter-owner checks in that the
 * current user is checked for create privileges instead of the
 * destination owner.  This is consistent with the CREATE case for
 * schemas.

I don't understand the rationale for this, or what rationale that last
sentence is trying to give.

I'm trying to extrapolate whatever rule this is to new object types, if
appropriate.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] Proposal : For Auto-Prewarm.

2017-01-27 Thread Peter Eisentraut
On 1/26/17 11:11 PM, Beena Emerson wrote:
> In that case, we could add the file location parameter.  By default it
> would store in the cluster directory else in the location provided. We
> can update this parameter in standby for it to access the file.

I don't see the need for that.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] Parallel Index Scans

2017-01-27 Thread Robert Haas
On Mon, Jan 23, 2017 at 1:03 AM, Amit Kapila  wrote:
> parallel_index_opt_exec_support_v6 - Removed the usage of
> GatherSupportsBackwardScan.  Expanded the comments in
> ExecReScanIndexScan.

I looked through this and in general it looks reasonable to me.
However, I did notice one thing that I think is wrong.  In the
parallel bitmap heap scan patch, the second argument to
compute_parallel_worker() is the number of pages that the parallel
scan is expected to fetch from the heap.  In this patch, it's the
total number of pages in the index.  The former seems to me to be
better, because the point of having a threshold relation size for
parallelism is that we don't want to use a lot of workers to scan a
small number of pages -- the distribution of work won't be even, and
the potential savings are limited.  If we've got a big index but are
using a very selective qual to pull out only one or a small number of
rows on a single page or a small handful of pages, we shouldn't
generate a parallel path for that.

Now, against that theory, the GUC that controls the behavior of
compute_parallel_worker() is called min_parallel_relation_size, which
might make you think that the decision is supposed to be based on the
whole size of some relation.  But I think that just means we need to
rename the GUC to something like min_parallel_scan_size.  Possibly we
also ought to consider reducing the default value somewhat, because it
seems like both sequential and index scans can benefit even when
scanning less than 8MB.
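As a rough illustration of the logic at issue (a sketch under assumed
constants, not the actual compute_parallel_worker() code): the worker count
grows by one for each further tripling of the page count past the threshold,
so whether the second argument means pages expected to be fetched or the
total index size determines when parallelism kicks in at all:

```python
def compute_workers(pages, min_parallel_size=1024, max_workers=8):
    """Sketch: one worker once 'pages' reaches the threshold, plus one
    more for each further tripling of the page count (capped)."""
    if pages < min_parallel_size:
        return 0
    workers = 1
    while pages >= min_parallel_size * 3 ** workers and workers < max_workers:
        workers += 1
    return workers

# A big index justifies several workers...
print(compute_workers(100_000))
# ...but a selective qual fetching only a few heap pages should not:
print(compute_workers(10))
```

Under this model, passing the total index pages for a big index with a highly
selective qual produces several workers, while passing the expected fetched
pages correctly produces none.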

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Performance improvement for joins where outer side is unique

2017-01-27 Thread Robert Haas
On Fri, Jan 27, 2017 at 2:00 PM, Tom Lane  wrote:
> Robert Haas  writes:
>> On Fri, Jan 27, 2017 at 1:39 PM, Tom Lane  wrote:
>>> Um ... what's that got to do with the point at hand?
>
>> So I assumed from that that the issue was that you'd have to wait for
>> the first time the irrelevant-joinqual got satisfied before the
>> optimization kicked in.
>
> No, the problem is that that needs to happen for *each* outer row,
> and there's only one chance for it to happen.  Given the previous
> example,
>
> select ... from t1 left join t2 on t1.x = t2.x and t1.y < t2.y
>
> once we've found an x match for a given outer row, there aren't going to
> be any more and we should move on to the next outer row.  But as the patch
> stands, we only recognize that if t1.y < t2.y happens to be true for that
> particular row pair.  Otherwise we'll keep searching and we'll never find
> another match for that outer row.  So if the y condition is, say, 50%
> selective then the optimization only wins for 50% of the outer rows
> (that have an x partner in the first place).
>
> Now certainly that's better than a sharp stick in the eye, and
> maybe we should just press forward anyway.  But it feels like
> this is leaving a lot more on the table than I originally thought.
> Especially for the inner-join case, where *all* the WHERE conditions
> get a chance to break the optimization this way.

OK, now I understand why you were concerned. Given the size of some of
the speedups David's reported on this thread, I'd be tempted to press
forward even if no solution to this part of the problem presents
itself, but I also agree with you that it's leaving quite a bit on the
table if we can't do better.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Performance improvement for joins where outer side is unique

2017-01-27 Thread Tom Lane
Robert Haas  writes:
> On Fri, Jan 27, 2017 at 1:39 PM, Tom Lane  wrote:
>> Um ... what's that got to do with the point at hand?

> So I assumed from that that the issue was that you'd have to wait for
> the first time the irrelevant-joinqual got satisfied before the
> optimization kicked in.

No, the problem is that that needs to happen for *each* outer row,
and there's only one chance for it to happen.  Given the previous
example,

select ... from t1 left join t2 on t1.x = t2.x and t1.y < t2.y

once we've found an x match for a given outer row, there aren't going to
be any more and we should move on to the next outer row.  But as the patch
stands, we only recognize that if t1.y < t2.y happens to be true for that
particular row pair.  Otherwise we'll keep searching and we'll never find
another match for that outer row.  So if the y condition is, say, 50%
selective then the optimization only wins for 50% of the outer rows
(that have an x partner in the first place).

Now certainly that's better than a sharp stick in the eye, and
maybe we should just press forward anyway.  But it feels like
this is leaving a lot more on the table than I originally thought.
Especially for the inner-join case, where *all* the WHERE conditions
get a chance to break the optimization this way.
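The effect described above can be sketched in Python (a hypothetical
illustration of nested-loop behavior, not PostgreSQL code): with the
inner-unique mark, the inner scan should stop at the first x-match, but if
short-circuiting is keyed to all quals passing, a failing y-qual forces a
full scan of the inner side for that outer row:

```python
def comparisons(outer, inner, stop_on_x_only):
    """Count inner-side comparisons for a nested loop whose join quals are
    a unique x-qual (t1.x = t2.x) and an extra y-qual (t1.y < t2.y)."""
    total = 0
    for t1 in outer:
        for t2 in inner:
            total += 1
            x_match = t1[0] == t2[0]
            y_match = t1[1] < t2[1]
            # Ideal: the inner-unique mark lets us stop at the first x-match.
            # As described: we stop only when both quals pass for the pair.
            if x_match and (stop_on_x_only or y_match):
                break
    return total

inner = [(i, 5) for i in range(100)]   # unique x values, constant y
outer = [(50, 9)]                      # an x-partner exists, but 9 < 5 fails
print(comparisons(outer, inner, stop_on_x_only=True))   # stops at the x-match
print(comparisons(outer, inner, stop_on_x_only=False))  # scans all inner rows
```

With the failing y-qual, keying the stop to both quals scans the entire inner
side for that outer row, which is the lost savings being discussed.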

regards, tom lane




Re: [HACKERS] Performance improvement for joins where outer side is unique

2017-01-27 Thread Robert Haas
On Fri, Jan 27, 2017 at 1:39 PM, Tom Lane  wrote:
> Robert Haas  writes:
>> On Fri, Jan 27, 2017 at 11:44 AM, Tom Lane  wrote:
>>> I'm afraid though that we may have to do something about the
>>> irrelevant-joinquals issue in order for this to be of much real-world
>>> use for inner joins.
>
>> Maybe, but it's certainly not the case that all inner joins are highly
>> selective.  There are plenty of inner joins in real-world applications
>> where the join product is 10% or 20% or 50% of the size of the larger
>> input.
>
> Um ... what's that got to do with the point at hand?

I thought it was directly relevant, but maybe I'm confused.  Further
up in that email, you wrote: "If there's a unique index on t2.x, we'll
be able to mark the join inner-unique.  However, short-circuiting
would only occur after finding a row that passes both joinquals.  If
the y condition is true for only a few rows, this would pretty nearly
disable the optimization.  Ideally we would short-circuit after
testing the x condition only, but there's no provision for that."

So I assumed from that that the issue was that you'd have to wait for
the first time the irrelevant-joinqual got satisfied before the
optimization kicked in.  But, if the join is emitting lots of rows,
that'll happen pretty quickly.  I mean, if the join emits even as many
as 20 rows, the time after the first one is, all things being equal, 95%
of the runtime of the join.  There could certainly be bad cases where
it takes a long time to produce the first row, but I wouldn't say
that's a particularly common thing.
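The trade-off being discussed can be sketched with a toy nestloop counter
(hypothetical data and names, not PostgreSQL source): when the join is known
inner-unique, the inner scan may stop only after a row passes *all* the
joinquals, so a rarely-true extra qual on y keeps the scan running to the end
even though the unique x match was found immediately.

```c
#include <assert.h>
#include <stdbool.h>

typedef struct { int x; int y; } Tuple;

/* Count how many inner tuples one outer row examines.  With the
   inner-unique optimization we may break after a row passes ALL quals;
   if the matching x row fails the y qual, no row ever qualifies and
   the scan runs to the end of the inner side. */
static int
inner_tuples_examined(const Tuple *inner, int n, int outer_x, bool check_y)
{
    int examined = 0;

    for (int i = 0; i < n; i++)
    {
        examined++;
        if (inner[i].x == outer_x && (!check_y || inner[i].y == 42))
            break;              /* inner-unique: no second match possible */
    }
    return examined;
}
```

With inner rows `{5,1},{6,42},{7,42}` and outer x = 5, the x-only join stops
after 1 tuple, while adding the y qual forces all 3 to be examined — which is
the "pretty nearly disables the optimization" case.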

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] pg_hba_file_settings view patch

2017-01-27 Thread Tom Lane
Haribabu Kommi  writes:
> On Fri, Jan 27, 2017 at 1:36 AM, Tom Lane  wrote:
>> It might make sense to proceed by writing a separate patch that just
>> refactors the existing code to have an API like that, and then revise
>> this patch to add an error message field to the per-line struct.  Or
>> maybe that's just extra work, not sure.

> Here I attached a tokenize_file refactor patch to return a single linked
> list that contains a structure, and rebased the pg_hba_rules patch on top
> of it by adding an error message to that structure to hold the errors
> that occurred during tokenization.

I pushed the first patch with some revisions.  You had the TokenizedLine
struct containing something that was still a three-level-nesting list,
whereas it only needs to be two levels, and you hadn't updated any of
the comments that the patch falsified.  Also, I figured we might as well
pass the TokenizedLine struct as-is to parse_hba_line and
parse_ident_line, because that was going to be the next step anyway
so they could pass back error messages.

Off to look at the second patch ...

regards, tom lane




Re: [HACKERS] Performance improvement for joins where outer side is unique

2017-01-27 Thread Tom Lane
Robert Haas  writes:
> On Fri, Jan 27, 2017 at 11:44 AM, Tom Lane  wrote:
>> I'm afraid though that we may have to do something about the
>> irrelevant-joinquals issue in order for this to be of much real-world
>> use for inner joins.

> Maybe, but it's certainly not the case that all inner joins are highly
> selective.  There are plenty of inner joins in real-world applications
> where the join product is 10% or 20% or 50% of the size of the larger
> input.

Um ... what's that got to do with the point at hand?

regards, tom lane




Re: [HACKERS] pg_ls_dir & friends still have a hard-coded superuser check

2017-01-27 Thread Robert Haas
On Fri, Jan 27, 2017 at 12:32 PM, Stephen Frost  wrote:
> You're completely ignoring the use-cases for which these are being done.
>
> I've outlined the precise use-case for pgstattuple()'s usage across the
> entire database for which the admin has granted the EXECUTE access in.
> I've not yet seen a use-case for access to pg_ls_dir() across all
> directories pg_ls_dir() can access.  My recollection is that you brought
> up pg_wal, but that's hardly every directory, and some hypothetical tool
> which could intelligently figure out what orphaned files exist.  For the
> former, I would recommend a function for exactly that (or perhaps
> something better which provides more than just a count of files), for
> the latter, that's something that I would be very worried that someone
> trying to implement outside of core would get wrong or which could be
> version-dependent.  We've already got some code in core to find files
> left over from a crash and deal with them and perhaps that could be
> expanded to deal with other cases.

I agree that those things could be done other ways, but I don't think
that's a valid argument against what I'm proposing.  Sure, for any
given use case, you could come up with a way that the use case could
be satisfied without access to pg_ls_dir().  No question about it.
What I don't understand is why we should particularly want to go to
that trouble.  pg_ls_dir() has been serving authors of monitoring
tools well for many years, it presents minimal security risks, and it
could be useful for other purposes.  A new thing won't be immediately
adopted, has no real advantage on the security front because
pg_ls_dir() isn't actually dangerous, and is by design single-purpose.

I am completely flummoxed by the idea that giving out pg_ls_dir()
access to all directories it can access -- namely the data and log
directories -- is some kind of massive security problem, whereas
giving out superuser access for the same purpose apparently is no
problem at all.  And that is CLEARLY what is happening.  The
documentation for check_postgres.pl recommends it!

> With an appropriate use-case for it, I wouldn't be against it.  I linked
> to both use-cases and a provided patch for finer-grained access to
> pg_ls_dir() and friends three years ago, which got shot down.  I'm not
> against it if the community wishes to reconsider that decision and
> support having filesystem-like access where there's an appropriate
> use-case for it, and with fine-grained access control provided to meet
> that use-case.

I hope you're not saying that you plan to hold this patch hostage
until I agree to something you want to do, because that would be
uncool.  At any rate, I don't even think that patch really got shot
down; the major complaint about that patch was that it arrived out of
nowhere, fully formed, with no prior discussion of the design, and
several people - including me - were concerned that you might commit
it before consensus had been reached, as you had recently done with
another large patch that I was in the middle of reviewing.  Aside from
the "please don't commit this because it isn't agreed" objection,
there were comments like (1) DIRALIAS is not a word, but all other SQL
objects have names which are words or phrases and (2) hey, maybe we
could use a couple of GUCs for this instead of inventing a whole new
SQL object type and (3) what about adding some syntax for COPY that
uses these directory alias things, none of which qualify as trying to
stop the project dead in its tracks.  The only person who was dead set
against the whole concept was Tom, and several people including me
expressed at least some doubt about whether the problems with the idea
were so intractable as he suggested.

For the record, I'm not as keen on the GUC idea as I was when we had
this previous discussion.  This is really a privilege and probably
deserves to be handled as part of the privileges system rather than
crammed into a GUC mechanism which isn't intended for that purpose.  I
am also somewhat more in favor of the idea overall than I was back
then; as we work toward giving users as little privilege as possible,
it's certainly useful to be able to let a monitoring user read the
logs but not the data directory, or the toplevel files (i.e. config
files) in the data directory but not files in subdirectories.   But I
really can't see a lick of need for that kind of fine-grained control
for pg_ls_dir() or pg_stat_files(), which are six kinds of harmless,
and I'm not sure that pg_read_file() permissions really need to be as
granular as allowing each directory on the filesystem to have
individual permissions.  There are a couple of broad categories that
might need to be distinguished: log file, config file, data file, file
I want to COPY, but there might be ways to handle those cases that are
simpler than a full-blown new SQL object type.

> Further, as it relates to goal-posts, if your goal is to let an admin
> grant out the 

Re: [HACKERS] pg_background contrib module proposal

2017-01-27 Thread Andrew Borodin
2017-01-27 19:14 GMT+05:00 Peter Eisentraut :
> I suppose we should decide first whether we want pg_background as a
> separate extension or rather pursue extending dblink as proposed elsewhere.
>
> I don't know if pg_background allows any use case that dblink can't
> handle (yet).
pg_background in its current version is just the start of a feature.
The question is: are they coherent in desired features? I do not know.
E.g. will it be possible to copy from stdin both in dblink and in possible
incarnations of pg_background functionality?

2017-01-27 19:38 GMT+05:00 Robert Haas :
> For the record, I have no big problem with extending dblink to allow
> this instead of adding pg_background.  But I think we should try to
> get one or the other done in time for this release.
+1!
That's why I hesitate between not saying my points and making
controversy... I need to settle it somehow.

Parallelism is a "selling" feature, everything has to be parallel for
a decade already (don't we have parallel sequential scan yet?).
It's fine to go with dblink, but the dblink docs start with roughly "this
is an outdated substandard feature" (not a direct quote [0]).
What will we add there? "Do not use dblink for linking to databases.
This is the standard for doing concurrency." ?
Please excuse me for exaggerating. BTW, pg_background does not have docs at all.

[0] https://www.postgresql.org/docs/devel/static/dblink.html

Best regards, Andrey Borodin.




Re: [HACKERS] pg_ls_dir & friends still have a hard-coded superuser check

2017-01-27 Thread Dave Page


> On 27 Jan 2017, at 17:39, Stephen Frost  wrote:
> 
> Greetings,
> 
> * Simon Riggs (si...@2ndquadrant.com) wrote:
>>> On 27 January 2017 at 14:09, Dave Page  wrote:
 On Fri, Jan 27, 2017 at 1:18 PM, Simon Riggs  wrote:
 
 If the monitoring tool requires superuser then that is a problem, so
 it would be helpful if it didn't do that, please. Not much use having
 a cool tool if it don't work with the server.
>>> 
>>> Sure, that's what I want - to provide the management and monitoring
>>> capabilities without requiring superuser. Limiting the capability of
>>> the tools is not an option when you talk to users - however for some
>>> of them, having to use full superuser accounts is a problem as well
>>> (especially for those who are used to other DBMSs that do offer more
>>> fine-grained permissions).
> 
> Right, I'm all about providing fine-grained permissions and granting
> those out to monitoring users, but we need to have an understanding of
> what the monitoring really needs (and doesn't need...) to be able to
> ensure that the fine-grained permission system which is built matches
> those needs and allows the admin to grant out exactly the rights needed.
> 
 The management and monitoring tool could be more specific about what
 it actually needs, rather than simply requesting generic read and
 write against the filesystem. Then we can put those specific things
 into the server and we can all be happy. Again, a detailed list would
 help here.
>>> 
>>> Agreed - I do need to do that, and it's on my (extremely long) list.
>>> I'm just chiming in on this thread as requested!
> 
> That would certainly be really nice to have.  I have some ideas, and
> I've been meaning to try and work towards them, but knowing what other
> monitoring systems do would be great.
> 
>> So I think it would be useful to have two modes in tools, one where
>> they know they have superuser and one where they know we don't have
>> it. At least we'll know we can't do certain things rather than just
>> have them fail.
> 
> Having such a flag in monitoring tools where it makes sense sounds
> reasonable to me, though there isn't really anything different for the
> backend to do to support this (I don't think..?).

No, that's exactly what we don't want, because then users cannot do anything 
that we can currently grant them permissions for - it's the all-or-nothing 
approach.

What we currently do is allow users to try every thing, then let the backend 
complain if it wants, and relay the access denied message to the user.



Re: [HACKERS] One-shot expanded output in psql using \G

2017-01-27 Thread Christoph Berg
Re: Stephen Frost 2017-01-27 <20170127160544.gi9...@tamriel.snowman.net>
> > > Uh, I figured it was more like \g, which just re-runs the last query..
> > > As in, you'd do:
> > > 
> > > table pg_proc; % blargh, I can't read it like this
> > > \G % ahh, much nicer
> > 
> > Sure, that's exactly the same thing.  (You can omit the query in either
> > case which causes the previous query to be re-ran.  \crosstabview,
> > \gexec etc also work like that).
> 
> Right, I agree it's the same thing, but (clearly), not everyone
> discussing this realized that and, well, the \G-by-itself is a lot
> easier for me, at least.  I have a really hard time not ending things
> with a semi-colon. ;)

Heh, tbh even I as the patch author didn't realize that \G-by-itself
just works; my intention was that it replaces the semicolon. :)

So, to clarify, both ways work:

select * from pg_class where relname = 'pg_class';
-- dang, much too wide, press 
select * from pg_class where relname = 'pg_class' \G
-- ah nice!

select * from pg_class where relname = 'pg_class';
-- dang, much too wide
\G
-- ah nice!

Christoph




Re: [HACKERS] sequence data type

2017-01-27 Thread Peter Eisentraut
On 1/25/17 11:57 PM, Michael Paquier wrote:
> @@ -15984,6 +15992,9 @@ dumpSequence(Archive *fout, TableInfo *tbinfo)
>   "CREATE SEQUENCE %s\n",
>   fmtId(tbinfo->dobj.name));
> 
> +   if (strcmp(seqtype, "bigint") != 0)
> +   appendPQExpBuffer(query, "AS %s\n", seqtype);
> Wouldn't it be better to assign that unconditionally? There is no
> reason that a dump taken from pg_dump version X will work on X - 1 (as
> there is no reason to not make the life of users uselessly difficult
> as that's your point), but that seems better to me than rely on the
> sequence type hardcoded in all the pre-10 dump queries for sequences.

Generally, we don't add default values, to keep the dumps from being too
verbose.

(I also think that being able to restore dumps to older versions would
be nice, but that's another discussion.)

> Could you increase the regression test coverage to cover some of the
> new code paths? For example such cases are not tested:
> =# create sequence toto as smallint;
> CREATE SEQUENCE
> =# alter sequence toto as smallint maxvalue 1000;
> ERROR:  22023: RESTART value (2147483646) cannot be greater than MAXVALUE 
> (1000)
> LOCATION:  init_params, sequence.c:1537

Yeah, I had taken some notes along the way to add more test coverage, so
since you're interested, attached is a patch.  It's independent of the
current patch and overlaps slightly.  The example you show isn't really
a new code path, just triggered differently, but the enhanced tests will
cover it nonetheless.

> =# alter sequence toto as smallint;
> ERROR:  22023: MAXVALUE (2147483647) is too large for sequence data
> type smallint
> LOCATION:  init_params, sequence.c:1407
> 
> +   if ((seqform->seqtypid == INT2OID && seqform->seqmin < PG_INT16_MIN)
> +   || (seqform->seqtypid == INT4OID && seqform->seqmin < PG_INT32_MIN)
> +   || (seqform->seqtypid == INT8OID && seqform->seqmin < PG_INT64_MIN))
> +   {
> +   char bufm[100];
> +
> +   snprintf(bufm, sizeof(bufm), INT64_FORMAT, seqform->seqmin);
> +
> +   ereport(ERROR,
> +   (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
> +errmsg("MINVALUE (%s) is too large for sequence data type %s",
> +   bufm, format_type_be(seqform->seqtypid))));
> +   }
> "large" does not apply to values lower than the minimum, no? The int64
> path is never going to be reached (same for the max value), it doesn't
> hurt to code it I agree.

Yeah, I think that should be rephrased as something like "out of
bounds", which is the term used elsewhere.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
From 533c4dcd04ffab9b8177307039f6c9ebcd6808b9 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut 
Date: Fri, 27 Jan 2017 12:43:14 -0500
Subject: [PATCH] Additional test coverage for sequences

---
 src/test/regress/expected/sequence.out | 231 -
 src/test/regress/sql/sequence.sql  |  95 --
 2 files changed, 286 insertions(+), 40 deletions(-)

diff --git a/src/test/regress/expected/sequence.out b/src/test/regress/expected/sequence.out
index ad03a31a4e..c19b5f5ef6 100644
--- a/src/test/regress/expected/sequence.out
+++ b/src/test/regress/expected/sequence.out
@@ -1,3 +1,33 @@
+--
+-- CREATE SEQUENCE
+--
+-- various error cases
+CREATE UNLOGGED SEQUENCE sequence_testx;
+ERROR:  unlogged sequences are not supported
+CREATE SEQUENCE sequence_testx INCREMENT BY 0;
+ERROR:  INCREMENT must not be zero
+CREATE SEQUENCE sequence_testx INCREMENT BY -1 MINVALUE 20;
+ERROR:  MINVALUE (20) must be less than MAXVALUE (-1)
+CREATE SEQUENCE sequence_testx INCREMENT BY 1 MAXVALUE -20;
+ERROR:  MINVALUE (1) must be less than MAXVALUE (-20)
+CREATE SEQUENCE sequence_testx INCREMENT BY -1 START 10;
+ERROR:  START value (10) cannot be greater than MAXVALUE (-1)
+CREATE SEQUENCE sequence_testx INCREMENT BY 1 START -10;
+ERROR:  START value (-10) cannot be less than MINVALUE (1)
+CREATE SEQUENCE sequence_testx CACHE 0;
+ERROR:  CACHE (0) must be greater than zero
+-- OWNED BY errors
+CREATE SEQUENCE sequence_testx OWNED BY nobody;  -- nonsense word
+ERROR:  invalid OWNED BY option
+HINT:  Specify OWNED BY table.column or OWNED BY NONE.
+CREATE SEQUENCE sequence_testx OWNED BY pg_tables.tablename;  -- not a table
+ERROR:  referenced relation "pg_tables" is not a table or foreign table
+CREATE SEQUENCE sequence_testx OWNED BY pg_class.relname;  -- not same schema
+ERROR:  sequence must be in same schema as table it is linked to
+CREATE TABLE sequence_test_table (a int);
+CREATE SEQUENCE sequence_testx OWNED BY sequence_test_table.b;  -- wrong column
+ERROR:  column "b" of relation "sequence_test_table" does not exist
+DROP TABLE sequence_test_table;
 ---
 --- test creation of SERIAL column
 ---
@@ -242,17 +272,38 @@ DROP SEQUENCE myseq2;
 -- Alter sequence
 --
 ALTER SEQUENCE IF EXISTS 

Re: [HACKERS] Cache Hash Index meta page.

2017-01-27 Thread Robert Haas
On Thu, Jan 26, 2017 at 1:48 PM, Mithun Cy  wrote:
> _v11 API's was self-sustained one but it does not hold pins on the
> metapage buffer. Whereas in _v12 we hold the pin for two consecutive
> reads of metapage. I have taken your advice and producing 2 different
> patches.

Hmm.  I think both of these APIs have some advantages.  On the one
hand, passing metabuf sometimes allows you to avoid pin/unpin cycles -
e.g. in hashbulkdelete it makes a fair amount of sense to keep the
metabuf pinned once we've had to read it, just in case we need it
again.  On the other hand, it's surprising that passing a value for
the metabuf forces the cache to be refreshed.  I wonder if a good API
might be something like this:

HashMetaPage _hash_getcachedmetap(Relation rel, Buffer *metabuf, bool force_refresh);

If the cache is initialized and force_refresh is not true, then this
just returns the cached data, and the metabuf argument isn't used.
Otherwise, if *metabuf == InvalidBuffer, we set *metabuf to point to
the metabuffer, pin and lock it, use it to set the cache, release the
lock, and return with the pin still held.  If *metabuf !=
InvalidBuffer, we assume it's pinned and return with it still pinned.
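The contract described above can be sketched as a standalone program
(hypothetical types and names — the real code would use ReadBuffer/LockBuffer
and keep the cache in per-relation state, not globals): a cache hit with no
forced refresh never touches metabuf, while a refresh through an invalid
metabuf takes a pin that the caller keeps.

```c
#include <assert.h>
#include <stdbool.h>

typedef int Buffer;
#define InvalidBuffer 0
typedef struct { int hashm_version; } HashMetaPageData;

static HashMetaPageData metap_cache;
static bool metap_cache_valid = false;
static int  pins_taken = 0;          /* stand-in for buffer pin tracking */

static HashMetaPageData *
hash_getcachedmetap(Buffer *metabuf, bool force_refresh)
{
    /* cache hit: return without using metabuf at all */
    if (metap_cache_valid && !force_refresh)
        return &metap_cache;

    if (*metabuf == InvalidBuffer)
    {
        *metabuf = 1;                /* stand-in for ReadBuffer(): pin it */
        pins_taken++;                /* caller keeps this pin */
    }
    /* lock page, copy metapage contents into the cache, release lock ... */
    metap_cache.hashm_version = 4;
    metap_cache_valid = true;
    return &metap_cache;             /* return with the pin still held */
}
```

A caller like hashbulkdelete would pass the same metabuf on later calls to
reuse the pin it already holds, matching the "sometimes avoids pin/unpin
cycles" point above.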

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] pg_ls_dir & friends still have a hard-coded superuser check

2017-01-27 Thread Stephen Frost
Greetings,

* Simon Riggs (si...@2ndquadrant.com) wrote:
> On 27 January 2017 at 14:09, Dave Page  wrote:
> > On Fri, Jan 27, 2017 at 1:18 PM, Simon Riggs  wrote:
> >
> >> If the monitoring tool requires superuser then that is a problem, so
> >> it would be helpful if it didn't do that, please. Not much use having
> >> a cool tool if it don't work with the server.
> >
> > Sure, that's what I want - to provide the management and monitoring
> > capabilities without requiring superuser. Limiting the capability of
> > the tools is not an option when you talk to users - however for some
> > of them, having to use full superuser accounts is a problem as well
> > (especially for those who are used to other DBMSs that do offer more
> > fine-grained permissions).

Right, I'm all about providing fine-grained permissions and granting
those out to monitoring users, but we need to have an understanding of
what the monitoring really needs (and doesn't need...) to be able to
ensure that the fine-grained permission system which is built matches
those needs and allows the admin to grant out exactly the rights needed.

> >> The management and monitoring tool could be more specific about what
> >> it actually needs, rather than simply requesting generic read and
> >> write against the filesystem. Then we can put those specific things
> >> into the server and we can all be happy. Again, a detailed list would
> >> help here.
> >
> > Agreed - I do need to do that, and it's on my (extremely long) list.
> > I'm just chiming in on this thread as requested!

That would certainly be really nice to have.  I have some ideas, and
I've been meaning to try and work towards them, but knowing what other
monitoring systems do would be great.

> So I think it would be useful to have two modes in tools, one where
> they know they have superuser and one where they know we don't have
> it. At least we'll know we can't do certain things rather than just
> have them fail.

Having such a flag in monitoring tools where it makes sense sounds
reasonable to me, though there isn't really anything different for the
backend to do to support this (I don't think..?).

Thanks!

Stephen




Re: [HACKERS] pg_ls_dir & friends still have a hard-coded superuser check

2017-01-27 Thread Stephen Frost
Robert,

* Robert Haas (robertmh...@gmail.com) wrote:
> On Fri, Jan 27, 2017 at 11:34 AM, Stephen Frost  wrote:
> > * Robert Haas (robertmh...@gmail.com) wrote:
> >> OK, fair enough.  get_raw_page() is clearly not something that we
> >> really want everybody to have access to by default, but if it were up
> >> to me, I'd change the permissions check inside the function to do a
> >> check for select privileges on the relation, and then I'd change the
> >> SQL script to revoke access from everybody but the superuser.
> >
> > The way to do properly do this would not be to have some conflation of
> > the right to execute get_raw_page() combined with a typically granted
> > right like 'SELECT'.  Instead, it would be to have an explicit RAW_PAGE
> > or similar permission which could be GRANT'd to a user for a particular
> > relation, and then we could change the superuser() check into a check
> > against that right, and call it done.
> 
> That's moving the goalposts a very large distance.
> 
> >> Actually, I think that's Stephen should have done something similar
> >> for pgstattuple in commit fd321a1dfd64d30bf1652ea6b39b654304f68ae4, ...
> >
> > Requiring pgstattuple() to check if a user has any rights on a table
> > means that an existing non-superuser monitoring tool that's calling
> > pgstattuple() for summary information would be required to have SELECT
> > rights across, most likely, all tables, even though the monitoring
> > user's got no need for that level of access.  Now, we could argue that
> > having access to just one column would be enough for that user, but
> > that's still more access than the monitoring user needs, and there's
> > still the issue of new tables (and default privileges don't currently
> > support column-level privileges anyway).
> 
> ...whereas here you don't want to move the goalposts at all.  I can't
> understand this.  When I want the superuser to have the option to hand
> out pg_ls_dir() access for all directories pg_ls_dir() can access, you
> complain that's too broad.  Yet your own previous commit, which allows
> pgstattuple() access to be granted, makes no provision for any
> granularity of access control at all.  And I think there is an
> excellent argument - which I have made - that pgstattuple() is more
> likely to expose sensitive information than pg_ls_dir().

You're completely ignoring the use-cases for which these are being done.

I've outlined the precise use-case for pgstattuple()'s usage across the
entire database for which the admin has granted the EXECUTE access in.
I've not yet seen a use-case for access to pg_ls_dir() across all
directories pg_ls_dir() can access.  My recollection is that you brought
up pg_wal, but that's hardly every directory, and some hypothetical tool
which could intelligently figure out what orphaned files exist.  For the
former, I would recommend a function for exactly that (or perhaps
something better which provides more than just a count of files), for
the latter, that's something that I would be very worried that someone
trying to implement outside of core would get wrong or which could be
version-dependent.  We've already got some code in core to find files
left over from a crash and deal with them and perhaps that could be
expanded to deal with other cases.

> Even if we someday have the capability for people to grant pg_ls_dir()
> access on a directory-by-directory basis, I am sure there will still
> be a way for them to grant access on all the directories to which
> pg_ls_dir() can access today; after all, that's just two directories,
> and their subdirectories.  At most somebody would have to make two
> GRANTs.  Removing the hard-coded superuser() checks allows that use
> case now, and doesn't prohibit you from implementing the other thing
> later when and if and when we reach agreement on it.

With an appropriate use-case for it, I wouldn't be against it.  I linked
to both use-cases and a provided patch for finer-grained access to
pg_ls_dir() and friends three years ago, which got shot down.  I'm not
against it if the community wishes to reconsider that decision and
support having filesystem-like access where there's an appropriate
use-case for it, and with fine-grained access control provided to meet
that use-case.

Further, as it relates to goal-posts, if your goal is to let an admin
grant out the ability to see how many files are in pg_wal, you're
spending a great deal more effort than that would require.  I wouldn't
object (and would actually appreciate) a function which is able to do
exactly that, and I'd go work with Greg S-M right away to make sure that
check_postgres.pl knows about that function and uses it in PG10 and
above.

I do object to someone coming along and saying "let's rip out all the
superuser() checks" and it would seem that I'm not alone.

Thanks!

Stephen




Re: [HACKERS] [PATCH] Generic type subscription

2017-01-27 Thread Dmitry Dolgov
> On 24 January 2017 at 02:07, Tom Lane  wrote:
> I took an extremely quick look over the patch

Thank you for the feedback. It took some time for me to think about all
suggestions and notes.

> 1. As I mentioned previously, it's a seriously bad idea that ArrayRef
> is used for both array subscripting and array assignment.  Let's fix
> that while we're at it, rather than setting that mistake in even more
> stone by embedding it in a datatype extension API.

Sure. But I assume that `SubscriptingRef` and `SubscriptingAssignmentRef` will
be almost identical, since they carry the same information to get a value
and to assign a new value (so probably it will be just an alias with a
different related function).

> 2. I'm not very pleased that the subscripting functions have signature
> "subscripting(internal) returns internal";

Basically, the current implementation of the subscripting operation contains
node-related logic (e.g. to verify that we're not using slice syntax for
jsonb) and data-type-related logic (e.g. to get/assign a value in an array).
And while it's easy enough to use the forms
`array_subscript(anyarray, internal) returns anyelement`
`array_assign(anyarray, internal, anyelement) returns anyarray`
for the latter, the former should accept a node as an argument and return a
node - I'm not sure if it's possible to use something other than `internal`
here. Speaking about the other signature issues, sure, I'll fix them.

> 3. The contents of ArrayRef are designed on the assumption that the same
> typmod and collation values apply to both an array and its elements.

Yes, I missed that. It looks straightforward to me: we can just split
`refcollid` and `reftypmod` into `refcontainercollid`, `refelementcollid` and
`refcontainertypmod`, `refelementtypmod`.

> 4. It looks like your work on the node processing infrastructure has been
> limited to s/ArrayRef/SubscriptingRef/g, but that's not nearly enough.
> SubscriptingRef needs to be regarded as an opportunity to invoke a
> user-defined function, which means that it now acts quite a bit like
> FuncExpr.  For example, the function-to-be-invoked needs to be checked for
> volatility, leakproofness, parallel safety, etc in operations that want to
> know about such things.
> 
> I noted yesterday, you're missing a lot of plan-time manipulations that
> need to happen for a generic function call.

Yes, I totally missed these too. I'm going to improve this situation soon.

> Actually, after thinking about that a bit more: you've really squeezed
> *three* different APIs into one function.  Namely, subscript-reference
> parse analysis, array subscripting execution, and array assignment
> execution.  It would be cleaner, and would reduce runtime overhead a bit,
> if those were embodied as three separate functions.
>
> It might be possible to get away with having only one pg_type column,
> pointing at the parse-analysis function.  That function would generate
> a SubscriptingRef tree node containing the OID of the appropriate
> execution function, which execQual.c could call without ever knowing
> its name explicitly.
>
> This clearly would work for built-in types, since the parse-analysis
> function could rely on fmgroids.h for the OIDs of the execution functions.
> But I'm less sure if it would work in extensions, which won't have
> predetermined OIDs for their functions.  Is there a way for a function
> in an extension to find the OID of one of its sibling functions?
>
> On 24 January 2017 at 07:54, Jim Nasby  wrote:
>
> Obviously there's regprocedure (or its C equivalent), but then you're stuck
> re-computing at runtime. I've messed around with that a bit in an effort to
> have an extension depend on another extension that allows the user to specify
> its schema. If you're already doing metaprogramming it's not an enormous
> problem... if you're not already doing that it sucks. Trying to make that
> work in C would be somewhere between impossible and a nightmare.

The idea of having one function that will generate the `SubscriptingRef` node
sounds good. But I'm afraid of the consequences of using it for extensions
(especially since the request for a general subscripting implementation came
also from their side). Is there a way to make it less troublesome?

To summarize, right now there are three options to handle `SubscriptingRef`
node analysis, subscripting execution, and assignment execution:

* one `pg_type` column with an OID of the corresponding function for each
  purpose (which isn't cool)

* one "controller" function that will directly call another function with
  the required logic (which is the "squeezing", and is the current
  implementation)

* one function that will return a `SubscriptingRef` with an OID of the
  required function (which is potentially troublesome for extensions)

> BTW, a different approach that might be worth considering is to say that
> the nodetree representation of one of these things *is* a FuncExpr, and

Re: [HACKERS] pg_ls_dir & friends still have a hard-coded superuser check

2017-01-27 Thread Simon Riggs
On 27 January 2017 at 14:09, Dave Page  wrote:
> On Fri, Jan 27, 2017 at 1:18 PM, Simon Riggs  wrote:
>
>> If the monitoring tool requires superuser then that is a problem, so
>> it would be helpful if it didn't do that, please. Not much use having
>> a cool tool if it doesn't work with the server.
>
> Sure, that's what I want - to provide the management and monitoring
> capabilities without requiring superuser. Limiting the capability of
> the tools is not an option when you talk to users - however for some
> of them, having to use full superuser accounts is a problem as well
> (especially for those who are used to other DBMSs that do offer more
> fine-grained permissions).
>
>> The management and monitoring tool could be more specific about what
>> it actually needs, rather than simply requesting generic read and
>> write against the filesystem. Then we can put those specific things
>> into the server and we can all be happy. Again, a detailed list would
>> help here.
>
> Agreed - I do need to do that, and it's on my (extremely long) list.
> I'm just chiming in on this thread as requested!

So I think it would be useful to have two modes in tools: one where
they know they have superuser and one where they know they don't. At
least they'll know they can't do certain things rather than just
having them fail.

-- 
Simon Riggshttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Allow interrupts on waiting standby

2017-01-27 Thread Robert Haas
On Fri, Jan 27, 2017 at 8:17 AM, Michael Paquier
 wrote:
> There are no default clauses in the pgstat_get_wait_* routines so my
> compiler is actually complaining...

That's exactly WHY there are no default clauses there.   :-)

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Performance improvement for joins where outer side is unique

2017-01-27 Thread Robert Haas
On Fri, Jan 27, 2017 at 11:44 AM, Tom Lane  wrote:
> I'm afraid though that we may have to do something about the
> irrelevant-joinquals issue in order for this to be of much real-world
> use for inner joins.

Maybe, but it's certainly not the case that all inner joins are highly
selective.  There are plenty of inner joins in real-world applications
where the join product is 10% or 20% or 50% of the size of the larger
input.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] pg_ls_dir & friends still have a hard-coded superuser check

2017-01-27 Thread Robert Haas
On Fri, Jan 27, 2017 at 11:34 AM, Stephen Frost  wrote:
> * Robert Haas (robertmh...@gmail.com) wrote:
>> OK, fair enough.  get_raw_page() is clearly not something that we
>> really want everybody to have access to by default, but if it were up
>> to me, I'd change the permissions check inside the function to do a
>> check for select privileges on the relation, and then I'd change the
>> SQL script to revoke access from everybody but the superuser.
>
> The way to properly do this would not be to have some conflation of
> the right to execute get_raw_page() combined with a typically granted
> right like 'SELECT'.  Instead, it would be to have an explicit RAW_PAGE
> or similar permission which could be GRANT'd to a user for a particular
> relation, and then we could change the superuser() check into a check
> against that right, and call it done.

That's moving the goalposts a very large distance.

>> Actually, I think that Stephen should have done something similar
>> for pgstattuple in commit fd321a1dfd64d30bf1652ea6b39b654304f68ae4, ...
>
> Requiring pgstattuple() to check if a user has any rights on a table
> means that an existing non-superuser monitoring tool that's calling
> pgstattuple() for summary information would be required to have SELECT
> rights across, most likely, all tables, even though the monitoring
> user's got no need for that level of access.  Now, we could argue that
> having access to just one column would be enough for that user, but
> that's still more access than the monitoring user needs, and there's
> still the issue of new tables (and default privileges don't currently
> support column-level privileges anyway).

...whereas here you don't want to move the goalposts at all.  I can't
understand this.  When I want the superuser to have the option to hand
out pg_ls_dir() access for all directories pg_ls_dir() can access, you
complain that's too broad.  Yet your own previous commit, which allows
pgstattuple() access to be granted, makes no provision for any
granularity of access control at all.  And I think there is an
excellent argument - which I have made - that pgstattuple() is more
likely to expose sensitive information than pg_ls_dir().

Even if we someday have the capability for people to grant pg_ls_dir()
access on a directory-by-directory basis, I am sure there will still
be a way for them to grant access on all the directories that
pg_ls_dir() can access today; after all, that's just two directories,
and their subdirectories.  At most somebody would have to make two
GRANTs.  Removing the hard-coded superuser() checks allows that use
case now, and doesn't prohibit you from implementing the other thing
later if and when we reach agreement on it.
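With the hard-coded check removed, that use case reduces to ordinary
GRANTs, sketched below (the role name is hypothetical):

```sql
-- Lock the function down, then hand it out selectively.
REVOKE EXECUTE ON FUNCTION pg_ls_dir(text) FROM PUBLIC;
GRANT EXECUTE ON FUNCTION pg_ls_dir(text) TO monitoring_role;
```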

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] pageinspect: Hash index support

2017-01-27 Thread Robert Haas
On Tue, Jan 24, 2017 at 9:59 PM, Mithun Cy  wrote:
> On Thu, Jan 19, 2017 at 5:08 PM, Ashutosh Sharma  
> wrote:
>
> Thanks, Ashutosh and Jesper. I have tested the patch and I do not have any
> more comments, so I am marking it ready for committer.

I took a look at this patch.  I think hash_page_items() is buggy.
It's a little confused about whether it is building an array of datums
(which can be assembled into a tuple using heap_form_tuple) or an
array of cstrings (which can be assembled into a tuple using
BuildTupleFromCStrings).  It's mostly the former: the array is of type
Datum, and two of the three values put into it are datums.  But the
code that puts the TID in there is stolen from bt_page_items(), which
is constructing an array of cstrings, so it's wrong here.  The
practical consequence of this is, I believe, that the TIDs output by
this function will be totally garbled and wrong, which is possibly why
the example in the documentation looks so wacky:

+  1 | (3407872,12584) | 1053474816
+  2 | (3407872,12584) | 1053474816

The heap tuple is on page 3407872 at line pointer offset 12584?
That's an awfully but not implausibly big page number (about 26GB into
the relation) and an awfully and implausibly large line pointer offset
(how do we fit 12584 index tuples into an 8192-byte page?).  Also, the
fact that this example has two index entries with the same hash code
pointing at the same heap TID seems wrong; wouldn't that result in
index scans returning duplicates?  I think what's actually happening
here is that (3407872,12584) is actually the attempt to interpret some
chunk of memory containing the text representation of a TID as an
actual TID.  When I print those numbers as hex, I get 0x343128, and
those are the digits "4" and "1" and an opening parenthesis "(" as
ASCII, so that might fit this theory.

The code that tries to extract the hashcode from the item also looks
suspicious.  I'm not 100% sure whether it's wrong or just
strangely-written, but it doesn't make much sense to extract the item
contents, then using sprintf() to turn that into a hex string of
bytes, then parse the hex string using strtol() to get an integer
back.  I think what it ought to be doing is getting a pointer to the
index tuple and then calling _hash_get_indextuple_hashkey.

Another minor complaint about hash_page_items is that it doesn't
pfree(uargs->page).  I'm not sure it's necessary to pfree anything
here, but if we're going to do it I think we might as well be
consistent with the btree case.  Anyway it can hardly make sense to
free the 8-byte structure and not the 8192-byte page to which it's
pointing.

In hash_bitmap_info(), we go to the trouble of creating a raw page but
never do anything with it.  I guess the idea here is just to error out
if the supplied page number is not an overflow page, but there are no
comments explaining that.  Anyway, I'm not sure it's a very good idea,
because it means that if you try to write a query to interrogate the
status of all the bitmap pages, it's going to read all of the overflow
pages to which they point, which makes the operation a LOT more
expensive.  I think it would be better to leave this part out; if the
user wants to know which pages are actually overflow pages, they can
use hash_page_type() to figure it out.  Let's make it the job of this
function just to check the available/free status of the page in the
bitmap, without reading the overflow page itself.
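A hypothetical usage sketch of that division of labor, using the function
names proposed in the patch (exact signatures may differ):

```sql
-- Classify the page first, then ask only for its bitmap status;
-- hash_bitmap_info() itself would not need to read the overflow page.
SELECT hash_page_type(get_raw_page('my_hash_idx', 5));
SELECT * FROM hash_bitmap_info('my_hash_idx'::regclass, 5);
```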

In doing that, I think it should either return (bitmapblkno,
bitmapbit, bit) or just bit.  Returning bitmapblkno but not bitmapbit
seems strange.  Also, I think it should return bit as a bool, not
int4.

Another general note: if you use the
BlahGetDatum() function to construct a datum, the SQL type should
match the macro you picked - e.g. if the SQL type is int4, the macro
used should be Int32GetDatum(), not UInt32GetDatum() or anything else.
If the SQL type is bool, you should be using BoolGetDatum().

Apart from the above, I did a little work to improve the reporting
when a page of the wrong type is verified.  Updated patch with those
changes attached.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


pageinspect-hashindex-rmh.patch
Description: Binary data



Re: [HACKERS] potential hardware donation

2017-01-27 Thread Stephen Frost
Dan,

* Dan Langille (d...@langille.org) wrote:
> If someone wanted to donate a SuperServer 6028TR-D72R 
> (http://www.supermicro.com/products/system/2U/6028/SYS-6028TR-D72R.cfm) to 
> the PostgreSQL project, would it be used?

Possibly, but if it's really for PG infrastructure uses, the question
should go to sysadmins@, not -hackers.  If you're offering it to "anyone
who's on -hackers", then carry on.

I will say that pginfra has been contemplating if an additional
project-owned server would be a good thing to have, though we do tend to
have relatively high expectations that the system is current and in good
condition.

Thanks!

Stephen




[HACKERS] potential hardware donation

2017-01-27 Thread Dan Langille
If someone wanted to donate a SuperServer 6028TR-D72R 
(http://www.supermicro.com/products/system/2U/6028/SYS-6028TR-D72R.cfm) to the 
PostgreSQL project, would it be used?

-- 
Dan Langille - BSDCan / PGCon
d...@langille.org




Re: [HACKERS] pg_ls_dir & friends still have a hard-coded superuser check

2017-01-27 Thread Stephen Frost
* Tom Lane (t...@sss.pgh.pa.us) wrote:
> Simon Riggs  writes:
> > [ good general plan ]
> 
> > 3. Make a list of all functions that would cause security problems.
> > One by one, precisely. If we did remove all superuser checks we would
> > need this list documented to advise people of the risks, so it must
> > exist before any commit can be made, assuming we believe in
> > documentation. Notice that I am after risk documentation,
> 
> Yeah, I think documentation is the crux of the issue.  If there is some
> non-obvious reason why letting somebody use pg_ls_dir() is more of a
> security hazard than it appears on the surface, the answer is to document
> that so DBAs can decide for themselves whether to take the risk.
> 
> Count me +1 for removing hard-wired superuser checks, but carefully
> and with an overall plan.

To be clear, I'm also in agreement with (and have been making efforts
to, for quite a few years, with some progress) removing the hard-wired
superuser checks.  I've tried to do so carefully and with a guiding
principle that the existing superuser-only capabilities should be split
up in a way which makes sense for particular use-cases and allows the
administrator to grant only precisely what they wish to for the use-case
in question.  The individual-level grants for things like
pg_start/stop_backup(), along with some others, were really already at
the "right" level for the use-case in question, which meant that what
was needed was a way to revoke EXECUTE from public initially and to
maintain what the administrator wished the rights to be, which is what
was done.

I do like the idea which Dave mentioned up-thread of having default
roles which start out with the necessary privileges, so that the user
doesn't have to grant EXECUTE on pg_start_backup(), and then also grant
EXECUTE on pg_stop_backup().  We've gotten to the point where that might
actually be possible, now that we have a couple of existing default
roles, and a namespace in which to create more, if we choose to.
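Under the current approach, maintaining such rights looks roughly like this
(the role name is hypothetical, and function signatures vary by release):

```sql
-- What a default role could bundle: today each grant is made by hand.
REVOKE EXECUTE ON FUNCTION pg_stop_backup() FROM PUBLIC;
GRANT EXECUTE ON FUNCTION pg_stop_backup() TO backup_role;
```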

That a monitoring tool needs to be able to list the contents of the
pg_wal PG directory is great, but if that's what it needs, then that's
what the admin should be able to provide it with, and not have to
provide it with the ability to list every directory on the system, or
even every other directory in PG's data directory.  Privileges should be
granted out, as much as possible, to match what the requirement is, and
tossing everything out the window because we happen to have an existing
function like pg_ls_dir() that just needs that pesky superuser() check
removed isn't a good answer to the use-case in question.  I'm all for
generalizing things and providing blanket access when that's what the
use-case calls for, to be clear, but that's not where this discussion
has (had?) been going.

Thanks!

Stephen




Re: [HACKERS] Performance improvement for joins where outer side is unique

2017-01-27 Thread Tom Lane
I wrote:
> David Rowley  writes:
>> hmm. I'm having trouble understanding why this is a problem for Unique
>> joins, but not for join removal?

> Ah, you know what, that's just mistaken.  I was thinking that we
> short-circuited the join on the strength of the hash (or merge) quals
> only, but actually we check all the joinquals first.  As long as the
> uniqueness proof uses only joinquals and not conditions that will end up
> as otherquals, it's fine.

Actually, after thinking about that some more, it seems to me that there
is a performance (not correctness) issue here: suppose that we have
something like

select ... from t1 left join t2 on t1.x = t2.x and t1.y < t2.y

If there's a unique index on t2.x, we'll be able to mark the join
inner-unique.  However, short-circuiting would only occur after
finding a row that passes both joinquals.  If the y condition is
true for only a few rows, this would pretty nearly disable the
optimization.  Ideally we would short-circuit after testing the x
condition only, but there's no provision for that.
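A minimal schema to experiment with that case (table and index names are
illustrative only):

```sql
CREATE TABLE t1 (x int, y int);
CREATE TABLE t2 (x int, y int);
CREATE UNIQUE INDEX t2_x_key ON t2 (x);  -- allows marking the join inner-unique

EXPLAIN
SELECT * FROM t1 LEFT JOIN t2 ON t1.x = t2.x AND t1.y < t2.y;
```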

This might not be a huge problem for outer joins.  My sense of typical
SQL style is that the joinquals (ON conditions) are likely to be
exactly what you need to prove inner uniqueness, while random other
conditions will be pushed-down from WHERE and hence will be otherquals.
But I'm afraid it is quite a big deal for inner joins, where we dump
all available conditions into the joinquals.  We might need to rethink
that choice.

At least for merge and hash joins, it's tempting to think about a
short-circuit test being made after testing just the merge/hash quals.
But we'd have to prove uniqueness using only the merge/hash quals,
so the planning cost might be unacceptably high --- particularly for
merge joins which often don't use all available mergeable quals.
In the end I think we probably want to keep the short-circuit in the
same place where it is for SEMI/ANTI cases (which have to have it
exactly there for semantic correctness).

I'm afraid though that we may have to do something about the
irrelevant-joinquals issue in order for this to be of much real-world
use for inner joins.

regards, tom lane




Re: [HACKERS] pg_ls_dir & friends still have a hard-coded superuser check

2017-01-27 Thread Stephen Frost
Robert,

* Robert Haas (robertmh...@gmail.com) wrote:
> OK, fair enough.  get_raw_page() is clearly not something that we
> really want everybody to have access to by default, but if it were up
> to me, I'd change the permissions check inside the function to do a
> check for select privileges on the relation, and then I'd change the
> SQL script to revoke access from everybody but the superuser.

The way to properly do this would not be to have some conflation of
the right to execute get_raw_page() combined with a typically granted
right like 'SELECT'.  Instead, it would be to have an explicit RAW_PAGE
or similar permission which could be GRANT'd to a user for a particular
relation, and then we could change the superuser() check into a check
against that right, and call it done.

The difficulty with the approach you're advocating is that a great many
users will likely have SELECT rights on a great many relations, but that
doesn't mean I really want them to have RAW_PAGE rights on all of them.

Of course, what this is really missing is an actual use-case where it
makes sense to grant out get_raw_page() to anyone.  Without that, I've
got a really hard time agreeing that it makes sense to remove the
existing superuser check or to spend any effort here.

> Actually, I think that Stephen should have done something similar
> for pgstattuple in commit fd321a1dfd64d30bf1652ea6b39b654304f68ae4,
> and I think we should go back and fix it that way before that code
> lands in a released branch.  As I mentioned upthread, a user could run
> that even on sensitive tables like pg_authid.  That doesn't directly
> reveal any data, but it does let you take a lock on a table which you
> otherwise couldn't lock, and it reveals at least some information
> about the contents of the table.  If you ran it just before and after
> an insert or update, you might be able to infer the length of the new
> tuple.  That's probably not enough to give you a whole lotta help
> guessing what password that tuple contains ... but it's more than no
> information.  More fundamentally, I think it's entirely reasonable for
> the superuser to be able to decide who can and can't run the
> pgstattuple functions, but I strongly suspect that very few superusers
> would want to give users rights to run those functions on tables to
> which those users otherwise have no access.

Requiring pgstattuple() to check if a user has any rights on a table
means that an existing non-superuser monitoring tool that's calling
pgstattuple() for summary information would be required to have SELECT
rights across, most likely, all tables, even though the monitoring
user's got no need for that level of access.  Now, we could argue that
having access to just one column would be enough for that user, but
that's still more access than the monitoring user needs, and there's
still the issue of new tables (and default privileges don't currently
support column-level privileges anyway).

A new capability or way to grant "lock access" to all tables might be a
way to approach this, but it would have to be controllable on a
per-lock-mode basis, or a way to say "this user is allowed to lock and
extract high-level counts from every table in the system", though the
latter is essentially what GRANT'ing EXECUTE on the pgstattuple()
functions today is.  Perhaps the docs should be clearer as to what that
access means, but I don't see a need to include a warning with those
functions saying that they could be used to escalate privileges to a
superuser.

Perhaps a more generalized approach would be to structure a generic
function+table GRANT system, in addition to our existing
GRANT-RIGHT+table system, that allowed for per-function-per-table-level
grants.  My thinking is something along the lines of:

GRANT pgstattuple() ON TABLE q;
GRANT pgstattuple() ON ALL TABLES IN SCHEMA x;
GRANT pgstattuple() ON ALL TABLES IN DATABASE db;

Which would specifically tie together the two.  Of course, we'd need
similar support in ALTER DEFAULT PRIVILEGES to allow the capability to
be maintained.

What is unfortunate with any of this is that we don't, currently, have
any way to specify DEFAULT PRIVILEGES on pg_catalog, meaning that new
tables which are added to pg_catalog down the road won't get these
privileges.  For a monitoring tool that's trying to check $something
(bloat, for example) across all tables, that's a bit unfortunate.
Generally speaking, there's only a few catalog tables that tend to have
bloat issues and those are well-defined and don't change, so perhaps
this issue could be ignored, but a more complete solution would find a
way to address that case also.
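For ordinary schemas, the maintenance described above can be sketched with
default privileges (role and schema names hypothetical); the gap is that
pg_catalog cannot currently be covered this way:

```sql
ALTER DEFAULT PRIVILEGES IN SCHEMA monitoring
    GRANT EXECUTE ON FUNCTIONS TO monitoring_role;
```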

Thanks!

Stephen




Re: [HACKERS] One-shot expanded output in psql using \G

2017-01-27 Thread Stephen Frost
* Alvaro Herrera (alvhe...@2ndquadrant.com) wrote:
> Stephen Frost wrote:
> > * Alvaro Herrera (alvhe...@2ndquadrant.com) wrote:
> 
> > > I think the suggestion is that \G replaces \g (which is the same thing
> > > as the semicolon).  So you would do this:
> > > 
> > > SELECT * FROM table WHERE table_status = 1; % get a short list; normal 
> > > output
> > > SELECT * FROM table WHERE table_id = 123 \G % drill down to one ID
> > 
> > Uh, I figured it was more like \g, which just re-runs the last query..
> > As in, you'd do:
> > 
> > table pg_proc; % blargh, I can't read it like this
> > \G % ahh, much nicer
> 
> Sure, that's exactly the same thing.  (You can omit the query in either
> case, which causes the previous query to be re-run.  \crosstabview,
> \gexec etc also work like that).

Right, I agree it's the same thing, but (clearly) not everyone
discussing this realized that and, well, the \G-by-itself is a lot
easier for me, at least.  I have a really hard time not ending things
with a semi-colon. ;)

Thanks!;

Stephen;




Re: [HACKERS] Performance improvement for joins where outer side is unique

2017-01-27 Thread Tom Lane
David Rowley  writes:
> On 27 January 2017 at 12:39, Tom Lane  wrote:
>> 2. In these same cases (unique/semi/anti joins), it is possible to avoid
>> mark/restore overhead in a mergejoin, because we can tweak the executor
>> logic to not require backing up the inner side.

> I've made modifications in the attached to add this optimization, and
> it's quite a significant improvement.

Cool ... validates my gut feeling that that was worth incorporating.

>> ... IOW, would it hurt to drop the SpecialJoinInfo
>> tie-in and just rely on the generic cache?

> I agree that special handling of one join type is not so pretty.
> However, LEFT JOINs still remain a bit special as they're the only
> ones we currently perform join removal on, and the patch modifies that
> code to make use of the new flag for those. This can improve planner
> performance of join removal when a join is removed successfully, as
> the previous code had to recheck uniqueness of each remaining LEFT
> JOIN again, whereas the new code only checks uniqueness of ones not
> previously marked as unique. This too likely could be done with the
> cache, although I'm a bit concerned with populating the cache, then
> performing a bunch of LEFT JOIN removals and leaving relids in the
> cache which no longer exist. Perhaps it's OK. I've just not found
> proofs in my head yet that it is.

TBH, I do not like that tie-in at all.  I don't believe that it improves
any performance, because if analyzejoins.c detects that the join is
unique, it will remove the join; therefore there is nothing to cache.
(This statement depends on the uniqueness test being the last removability
test, but it is.)  And running mark_unique_joins() multiple times is ugly
and adds cycles whenever it cannot prove a join unique, because it'll keep
trying to do so.  So I'm pretty inclined to drop the connection to
analyzejoins.c altogether, along with mark_unique_joins(), and just use
the generic positive/negative cache mechanism you added for all join types.

It's possible that it'd make sense for analyzejoins.c to add a negative
cache entry about any join that it tries and fails to prove unique.
Your point about cache maintenance is valid, but a simple answer there
would just be to flush the cache whenever we remove a rel.  (Although
I'm dubious that we need to: how could a removable rel be part of the
min outer rels for any surviving rel?)

>> * Because of where we apply the short-circuit logic in the executor,
>> it's only safe to consider the inner rel as unique if it is provably
>> unique using only the join clauses that drive the primary join mechanism
>> (ie, the "joinquals" not the "otherquals").  We already do ignore quals
>> that are pushed-down to an outer join, so that's good, but there's an
>> oversight: we will use a qual that is mergejoinable even if it's not
>> hashjoinable. That means we could get the wrong answers in a hash join.

> hmm. I'm having trouble understanding why this is a problem for Unique
> joins, but not for join removal?

Ah, you know what, that's just mistaken.  I was thinking that we
short-circuited the join on the strength of the hash (or merge) quals
only, but actually we check all the joinquals first.  As long as the
uniqueness proof uses only joinquals and not conditions that will end up
as otherquals, it's fine.

> However you mentioning this cause me to notice that this is only true
> in the patch for left joins, and the other join types are not
> consistent with that.

Hm, perhaps, but the example you show isn't proving that, because it's
choosing to put a1 on the inside in the innerjoin case, and a1 certainly
isn't unique for this query.  We can't see whether a2 was detected as
unique.

regards, tom lane




Re: [HACKERS] One-shot expanded output in psql using \G

2017-01-27 Thread Stephen Frost
* David G. Johnston (david.g.johns...@gmail.com) wrote:
> On Fri, Jan 27, 2017 at 8:31 AM, Stephen Frost  wrote:
> 
> > * Alvaro Herrera (alvhe...@2ndquadrant.com) wrote:
> > > D'Arcy Cain wrote:
> > >
> > > > I am a pretty heavy user of psql but I don't think that that would be
> > so
> > > > helpful.  I assume you mean a new option, let's call it "\X" the
> > causes the
> > > > next query to be expanded.  I type "\X" then a query.  I realize that
> > I made
> > > > a mistake and have to redo the query so I have to type "\X" again.  If
> > I
> > > > forget then I have to run the query yet again.
> > >
> > > I think the suggestion is that \G replaces \g (which is the same thing
> > > as the semicolon).  So you would do this:
> > >
> > > SELECT * FROM table WHERE table_status = 1; % get a short list; normal
> > output
> > > SELECT * FROM table WHERE table_id = 123 \G % drill down to one ID
> >
> > Uh, I figured it was more like \g, which just re-runs the last query..
> > As in, you'd do:
> >
> > table pg_proc; % blargh, I can't read it like this
> > \G % ahh, much nicer
> >
> 
> This information surprised me.  It was unexpected that the last
> successfully executed query remains in the query buffer until the next SQL
> (and not meta) command is started.  I was expecting that as soon as result
> was returned to the screen the current query buffer would be cleared in
> preparation for the next query.

Well, I did get the impression that you weren't thinking about that,
which is actually kind of surprising to me.  Lots of things work on "the
current query buffer", which is the last query (successful or not, to be
clear..):

\crosstabview
\e
\g
\gexec
\gset
\p
\w
\watch

It's not entirely clear to me why the docs sometimes say "current query
buffer" and sometimes say "current query input buffer".
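For instance, all of these meta-commands act on whatever query was entered
last (comments describe the expected behavior):

```sql
SELECT count(*) AS n FROM pg_class;
\p      -- print the current query buffer: shows the SELECT above
\g      -- re-run it
\gset   -- re-run it and store column "n" in the psql variable :n
```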

> A sentence or two describing this behavior (or, more generally the query
> buffer itself), probably placed at the end of the "Entering SQL Commands"
> section, would help to make this common knowledge.

Generally speaking, I agree that we should be more consistent in the
docs, use one term where we mean one thing, and define that term
somewhere.

Thanks!

Stephen




Re: [HACKERS] macaddr 64 bit (EUI-64) datatype support

2017-01-27 Thread Haribabu Kommi
On Fri, Jan 27, 2017 at 6:10 PM, Kuntal Ghosh 
wrote:

> On Wed, Jan 25, 2017 at 6:00 PM, Haribabu Kommi
>  wrote:
>
> > Corrected as suggested.
> >
> > Updated patch attached. There is no change in the contrib patch.
> Got a whitespace error warning while applying contrib_macaddr8_1.patch:184.
>

Corrected.


> diff --git a/contrib/btree_gist/btree_gist.control
> b/contrib/btree_gist/btree_gist.control
> index ddbf83d..fdf0e6a 100644
> --- a/contrib/btree_gist/btree_gist.control
> +++ b/contrib/btree_gist/btree_gist.control
> @@ -1,5 +1,5 @@
>  # btree_gist extension
>  comment = 'support for indexing common datatypes in GiST'
> -default_version = '1.3'
> +default_version = '1.4'
> btree_gin.control should be updated as well. Otherwise, make
> check-world is throwing error.
>
> After making the above changes, I'm able to run regression test
> successfully without any error/warning.


Corrected the change in btree_gin.control file also.
Thanks for the review.

Updated patches are attached.


Regards,
Hari Babu
Fujitsu Australia


mac_eui64_support_9.patch
Description: Binary data


contrib_macaddr8_2.patch
Description: Binary data



Re: [HACKERS] macaddr 64 bit (EUI-64) datatype support

2017-01-27 Thread Haribabu Kommi
On Thu, Jan 26, 2017 at 9:42 PM, Vitaly Burovoy 
wrote:

> On 1/25/17, Haribabu Kommi  wrote:
> > On Wed, Jan 25, 2017 at 6:43 PM, Vitaly Burovoy <
> vitaly.buro...@gmail.com>
> > wrote:
> >
> I'm going to do (I hope) a final review tonight.
> Please, remove initialization of the variables "d" and "e" since there
> is really no reason to keep them be zero for a short time.


Thanks for the review. Corrected in the latest patch.

Regards,
Hari Babu
Fujitsu Australia


Re: [HACKERS] One-shot expanded output in psql using \G

2017-01-27 Thread David G. Johnston
On Fri, Jan 27, 2017 at 8:31 AM, Stephen Frost  wrote:

> * Alvaro Herrera (alvhe...@2ndquadrant.com) wrote:
> > D'Arcy Cain wrote:
> >
> > > I am a pretty heavy user of psql but I don't think that that would be
> so
> > > helpful.  I assume you mean a new option, let's call it "\X" that
> causes the
> > > next query to be expanded.  I type "\X" then a query.  I realize that
> I made
> > > a mistake and have to redo the query so I have to type "\X" again.  If
> I
> > > forget then I have to run the query yet again.
> >
> > I think the suggestion is that \G replaces \g (which is the same thing
> > as the semicolon).  So you would do this:
> >
> > SELECT * FROM table WHERE table_status = 1; % get a short list; normal
> output
> > SELECT * FROM table WHERE table_id = 123 \G % drill down to one ID
>
> Uh, I figured it was more like \g, which just re-runs the last query..
> As in, you'd do:
>
> table pg_proc; % blargh, I can't read it like this
> \G % ahh, much nicer
>

This information surprised me.  It was unexpected that the last
successfully executed query remains in the query buffer until the next SQL
(and not meta) command is started.  I was expecting that as soon as a result
was returned to the screen the current query buffer would be cleared in
preparation for the next query.

A sentence or two describing this behavior (or, more generally the query
buffer itself), probably placed at the end of the "Entering SQL Commands"
section, would help to make this common knowledge.

David J.


Re: [HACKERS] One-shot expanded output in psql using \G

2017-01-27 Thread Alvaro Herrera
Stephen Frost wrote:
> * Alvaro Herrera (alvhe...@2ndquadrant.com) wrote:

> > I think the suggestion is that \G replaces \g (which is the same thing
> > as the semicolon).  So you would do this:
> > 
> > SELECT * FROM table WHERE table_status = 1; % get a short list; normal 
> > output
> > SELECT * FROM table WHERE table_id = 123 \G % drill down to one ID
> 
> Uh, I figured it was more like \g, which just re-runs the last query..
> As in, you'd do:
> 
> table pg_proc; % blargh, I can't read it like this
> \G % ahh, much nicer

Sure, that's exactly the same thing.  (You can omit the query in either
case which causes the previous query to be re-run.  \crosstabview,
\gexec etc also work like that).
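To make that concrete, here is a sketch of a session under the proposed behavior (output elided; \g already exists, while \G is the proposed command):

```sql
SELECT relname FROM pg_class LIMIT 3;  -- normal output; the query stays in the buffer
\g  -- re-runs the buffered query, normal output
\G  -- proposed: re-runs the same buffered query, expanded output
```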

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] pg_hba_file_settings view patch

2017-01-27 Thread Haribabu Kommi
On Fri, Jan 27, 2017 at 1:36 AM, Tom Lane  wrote:

> Haribabu Kommi  writes:
> > This patch currently doesn't have the code for reporting the two log
> > messages that can occur in tokenize_file function. To support the same,
> > I am thinking of changing line_nums list to line_info list that can
> > contain both line number and the error message that occurred during the
> > tokenize. This list data is used to identify whether that line is having
> > any error or not before parsing that hba line, and directly report that
> > line as error in the view.
>
> Yeah, perhaps.  tokenize_file() has pushed the return-parallel-lists
> notion to the limit of sanity already.  It would make more sense to
> change it to return a single list containing one struct per line,
> which would include the token list, raw line text, and line number.
>
> It might make sense to proceed by writing a separate patch that just
> refactors the existing code to have an API like that, and then revise
> this patch to add an error message field to the per-line struct.  Or
> maybe that's just extra work, not sure.
>

Here I attached a tokenize_file refactor patch to return a single linked
list that contains a structure, and rebased the pg_hba_rules patch on top
of it by adding an error message to that structure to hold the errors that
occurred during tokenization.

I came up with TokenizedLine as a structure name that works with all
configuration files and member names (I hope). If it needs any better
names please let me know.

Updated patches are attached.

Regards,
Hari Babu
Fujitsu Australia


tokenize_file_refactor_1.patch
Description: Binary data


pg_hba_rules_13.patch
Description: Binary data



Re: [HACKERS] One-shot expanded output in psql using \G

2017-01-27 Thread Stephen Frost
* Alvaro Herrera (alvhe...@2ndquadrant.com) wrote:
> D'Arcy Cain wrote:
> 
> > I am a pretty heavy user of psql but I don't think that that would be so
> > helpful.  I assume you mean a new option, let's call it "\X" that causes the
> > next query to be expanded.  I type "\X" then a query.  I realize that I made
> > a mistake and have to redo the query so I have to type "\X" again.  If I
> > forget then I have to run the query yet again.
> 
> I think the suggestion is that \G replaces \g (which is the same thing
> as the semicolon).  So you would do this:
> 
> SELECT * FROM table WHERE table_status = 1; % get a short list; normal output
> SELECT * FROM table WHERE table_id = 123 \G % drill down to one ID

Uh, I figured it was more like \g, which just re-runs the last query..
As in, you'd do:

table pg_proc; % blargh, I can't read it like this
\G % ahh, much nicer

Thanks!

Stephen




Re: [HACKERS] One-shot expanded output in psql using \G

2017-01-27 Thread Alvaro Herrera
D'Arcy Cain wrote:

> I am a pretty heavy user of psql but I don't think that that would be so
> helpful.  I assume you mean a new option, let's call it "\X" that causes the
> next query to be expanded.  I type "\X" then a query.  I realize that I made
> a mistake and have to redo the query so I have to type "\X" again.  If I
> forget then I have to run the query yet again.

I think the suggestion is that \G replaces \g (which is the same thing
as the semicolon).  So you would do this:

SELECT * FROM table WHERE table_status = 1; % get a short list; normal output
SELECT * FROM table WHERE table_id = 123 \G % drill down to one ID

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] One-shot expanded output in psql using \G

2017-01-27 Thread D'Arcy Cain

On 2017-01-27 10:05 AM, David Fetter wrote:

On Fri, Jan 27, 2017 at 02:27:37PM +0100, Christoph Berg wrote:

I frequently find myself in the situation that I want the "\x"
expanded output mode activated just for one query. There's little
wrong with typing "\x" and re-executing the query in that case, but
then I'm always annoyed that the expanded output is still active for
the next query after that.


+1

Your situation is familiar to me, and likely common among heavy users
of psql.


I am a pretty heavy user of psql but I don't think that that would be so 
helpful.  I assume you mean a new option, let's call it "\X" that causes 
the next query to be expanded.  I type "\X" then a query.  I realize 
that I made a mistake and have to redo the query so I have to type "\X" 
again.  If I forget then I have to run the query yet again.


What would be useful for me is a control that causes every query to be 
expanded if it returns exactly one row.  Now I can do this:


SELECT * FROM table WHERE table_status = 1; % get a short list
SELECT * FROM table WHERE table_id = 123; % drill down to one ID

The first would give me a normal list and the second would give me an 
expanded look at the one I want to see in detail.
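For comparison, the existing \x auto mode (the outcome of the 2008 thread mentioned elsewhere in this discussion) only approximates this: it switches to expanded output when a result is too wide for the terminal, not specifically when it returns exactly one row.

```sql
\x auto
SELECT * FROM table WHERE table_status = 1;  -- narrow result: normal output
SELECT * FROM table WHERE table_id = 123;    -- wide single row: shown expanded
```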


--
D'Arcy J.M. Cain  |  Democracy is three wolves
http://www.druid.net/darcy/|  and a sheep voting on
+1 416 788 2246 (DoD#0082)(eNTP)   |  what's for dinner.
IM: da...@vex.net, VoIP: sip:da...@druid.net




Re: [HACKERS] BUG: pg_stat_statements query normalization issues with combined queries

2017-01-27 Thread Tom Lane
Kyotaro HORIGUCHI  writes:
> By the way the existing comment for the hook,

>> *
>> * We provide a function hook variable that lets loadable plugins get
>> * control when ProcessUtility is called.  Such a plugin would normally
>> * call standard_ProcessUtility().
>> */

> This is quite a matter of course for developers. planner_hook and
> other ones don't have such a comment or have a one-liner at
> most. Since this hook gets a good amount of comment, it seems
> better to just remove the existing one.

Hm?  I see pretty much the same wording for planner_hook:

 * To support loadable plugins that monitor or modify planner behavior,
 * we provide a hook variable that lets a plugin get control before and
 * after the standard planning process.  The plugin would normally call
 * standard_planner().

I think other places have copied-and-pasted this as well.

The real problem with this is it isn't a correct specification of the
contract: in most cases, the right thing to do is more like "call the
previous occupant of the ProcessUtility_hook, or standard_ProcessUtility
if there was none."

Not sure if it's worth trying to be more precise in these comments,
but if we do something about it, we need to hit all the places with
this wording.

regards, tom lane




Re: [HACKERS] One-shot expanded output in psql using \G

2017-01-27 Thread David Fetter
On Fri, Jan 27, 2017 at 02:27:37PM +0100, Christoph Berg wrote:
> I frequently find myself in the situation that I want the "\x"
> expanded output mode activated just for one query. There's little
> wrong with typing "\x" and re-executing the query in that case, but
> then I'm always annoyed that the expanded output is still active for
> the next query after that.

+1

Your situation is familiar to me, and likely common among heavy users
of psql.

Best,
David.
-- 
David Fetter  http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david(dot)fetter(at)gmail(dot)com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate




Re: [HACKERS] BUG: pg_stat_statements query normalization issues with combined queries

2017-01-27 Thread Tom Lane
Craig Ringer  writes:
> On 27 Jan. 2017 14:34, "Tom Lane"  wrote:
>> "The same queryString may be passed to multiple invocations of
>> ProcessUtility when processing a query string containing multiple
>> semicolon-separated statements; one should use pstmt->stmt_location and
>> pstmt->stmt_len to identify the substring containing the current
>> statement.  Keep in mind also that some utility statements (e.g.,
>> CREATE SCHEMA) will recurse to ProcessUtility to process sub-statements,
>> often passing down the same queryString, stmt_location, and stmt_len
>> that were given for the whole statement."

> Much better wording. I like that.

OK, done that way.

regards, tom lane




Re: [HACKERS] One-shot expanded output in psql using \G

2017-01-27 Thread Christoph Berg
Re: To PostgreSQL Hackers 2017-01-27 
<20170127132737.6skslelaf4txs...@msg.credativ.de>
> The same idea was discussed back in 2008. Back then the outcome was
> that "\x auto" was implemented, but I still think that \G is a useful
> feature to have on its own, and several people in the thread seem to
> have agreed back then.

I forgot to add the archive URL here:
https://www.postgresql.org/message-id/758d5e7f0804030023j659d72e6nd66a9d6b93b30886%40mail.gmail.com

Mit freundlichen Grüßen,
Christoph Berg
-- 
Senior Berater, Tel.: +49 2166 9901 187
credativ GmbH, HRB Mönchengladbach 12080, USt-ID-Nummer: DE204566209
Trompeterallee 108, 41189 Mönchengladbach
Geschäftsführung: Dr. Michael Meskes, Jörg Folz, Sascha Heuer
pgp fingerprint: 5C48 FE61 57F4 9179 5970  87C6 4C5A 6BAB 12D2 A7AE




Re: [HACKERS] pg_ls_dir & friends still have a hard-coded superuser check

2017-01-27 Thread Robert Haas
On Fri, Jan 27, 2017 at 9:42 AM, Tom Lane  wrote:
> Robert Haas  writes:
>> The problem is if the interpretation functions aren't completely
>> bulletproof, they might do things like crash the server if you use
>> them to read a corrupt page.  That is not any more appealing if you
>> happen to be running as superuser() than otherwise.
>
> I'm not aware that they're likely to crash the server, and if they
> are, so would any regular access to the page in question.  The
> things we were worried about were more along the lines of unexpected
> information disclosure.
>
> This is not to say that I'm against making those functions more
> bulletproof.  I'm just saying that I find little point in reducing
> their superuser checks if we can't get rid of the one in get_raw_page.

OK, fair enough.  get_raw_page() is clearly not something that we
really want everybody to have access to by default, but if it were up
to me, I'd change the permissions check inside the function to do a
check for select privileges on the relation, and then I'd change the
SQL script to revoke access from everybody but the superuser.  That
way, by default, only superusers could use the function, but if they
wanted to grant access to others, they could do so.  However, even
with an access grant, people could only read raw pages from relations
from which they can read data in general.
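A sketch of that scheme follows (hypothetical: get_raw_page() currently hard-codes a superuser check instead, and monitoring_role is an assumed role name):

```sql
-- In the extension script: locked down by default, but grantable.
REVOKE EXECUTE ON FUNCTION get_raw_page(text, int4) FROM PUBLIC;

-- A superuser can later opt specific roles in:
GRANT EXECUTE ON FUNCTION get_raw_page(text, int4) TO monitoring_role;

-- Inside the function, an internal check along the lines of
--   has_table_privilege(current_user, relid, 'SELECT')
-- would still limit which relations the role can read pages from.
```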

Actually, I think that Stephen should have done something similar
for pgstattuple in commit fd321a1dfd64d30bf1652ea6b39b654304f68ae4,
and I think we should go back and fix it that way before that code
lands in a released branch.  As I mentioned upthread, a user could run
that even on sensitive tables like pg_authid.  That doesn't directly
reveal any data, but it does let you take a lock on a table which you
otherwise couldn't lock, and it reveals at least some information
about the contents of the table.  If you ran it just before and after
an insert or update, you might be able to infer the length of the new
tuple.  That's probably not enough to give you a whole lotta help
guessing what password that tuple contains ... but it's more than no
information.  More fundamentally, I think it's entirely reasonable for
the superuser to be able to decide who can and can't run the
pgstattuple functions, but I strongly suspect that very few superusers
would want to give users rights to run those functions on tables to
which those users otherwise have no access.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] GSoC 2017

2017-01-27 Thread Thomas Kellerer
Greg Stark wrote
> I don't think this even needs to be tied to currencies. I've often
> thought this would be generally useful for any value with units. This
> would prevent you from accidentally adding miles to kilometers or
> hours to parsecs which is just as valid as preventing you from adding
> CAD to USD.

There is already such a concept - not tied to currencies or units in
general. The SQL standard calls it DISTINCT types. And it can prevent
comparing apples to oranges. 

I don't have the exact syntax at hand, but it's something like this:

create distinct type customer_id_type as integer;
create distinct type order_id_type as integer;

create table customers (id customer_id_type primary key);
create table orders (id order_id_type primary key, customer_id
customer_id_type not null);

And because those columns are defined with different types, the database
will refuse to compare customers.id with orders.id (just like it would
refuse to compare an integer with a date). 

So an accidental join like this:

  select *
  from orders o
join customers c using (id);

would throw an error because the data types of the IDs can not be compared.




Re: [HACKERS] pg_ls_dir & friends still have a hard-coded superuser check

2017-01-27 Thread Tom Lane
Robert Haas  writes:
> The problem is if the interpretation functions aren't completely
> bulletproof, they might do things like crash the server if you use
> them to read a corrupt page.  That is not any more appealing if you
> happen to be running as superuser() than otherwise.

I'm not aware that they're likely to crash the server, and if they
are, so would any regular access to the page in question.  The
things we were worried about were more along the lines of unexpected
information disclosure.

This is not to say that I'm against making those functions more
bulletproof.  I'm just saying that I find little point in reducing
their superuser checks if we can't get rid of the one in get_raw_page.

regards, tom lane




Re: [HACKERS] WIP: About CMake v2

2017-01-27 Thread Alvaro Herrera
Peter Eisentraut wrote:

> Right now, however, the patch isn't moving at all, and I don't see it
> going into PG10, so I'm fine with returning with feedback.

There are a bunch of side patches that we should apply separately, such
as the pgcrypto fix.  I don't understand why they are part of this patch
series, really, given that they seem largely independent prerequisites.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] pg_background contrib module proposal

2017-01-27 Thread Robert Haas
On Fri, Jan 27, 2017 at 9:14 AM, Peter Eisentraut
 wrote:
> On 1/19/17 12:47 PM, Andrey Borodin wrote:
>> 4. There is some controversy on where implemented feature shall be: in 
>> separate extension (as in this patch), in db_link, in some PL API, in FDW or 
>> somewhere else. I think that new extension is an appropriate place for the 
>> feature. But I’m not certain.
>
> I suppose we should decide first whether we want pg_background as a
> separate extension or rather pursue extending dblink as proposed elsewhere.
>
> I don't know if pg_background allows any use case that dblink can't
> handle (yet).

For the record, I have no big problem with extending dblink to allow
this instead of adding pg_background.  But I think we should try to
get one or the other done in time for this release.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] pg_ls_dir & friends still have a hard-coded superuser check

2017-01-27 Thread Robert Haas
On Fri, Jan 27, 2017 at 9:35 AM, Tom Lane  wrote:
> Robert Haas  writes:
>> - contrib/pageinspect has lots of superuser checks, basically because
>> they have known input-validation weaknesses.  See
>> 3e1338475ffc2eac25de60a9de9ce689b763aced for the rationale in detail.
>
> FWIW, I think that's a bit of an oversimplification.  There are two
> components to contrib/pageinspect: get_raw_page() and a pile of page
> interpretation functions.  The input-validation issue applies primarily
> to the interpretation functions.
>
> In principle, if we could make the interpretation functions completely
> bulletproof, we could remove all security checks for them.  However,
> they're largely useless without get_raw_page(), so it's not apparent
> what's the point of doing a lot of work (and taking the risk of missing
> something) unless we first get rid of get_raw_page()'s superuser check.
> And that seems fraught with questions.  Maybe it'd be all right to
> allow it for tables you have full read access to (can select all columns,
> no RLS) but I'm not convinced.  For example, full read access doesn't
> allow you to see values that were in the table in the past, before you
> were granted read access ... but get_raw_page() might.

The problem is if the interpretation functions aren't completely
bulletproof, they might do things like crash the server if you use
them to read a corrupt page.  That is not any more appealing if you
happen to be running as superuser() than otherwise.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] pg_ls_dir & friends still have a hard-coded superuser check

2017-01-27 Thread Robert Haas
On Fri, Jan 27, 2017 at 8:18 AM, Simon Riggs  wrote:
> This is still just the Adminpack argument. This has been going on for
> about a decade? Longer.

Right.

> If the monitoring tool requires superuser then that is a problem, so
> it would be helpful if it didn't do that, please. Not much use having
> a cool tool if it don't work with the server.

The argument keeps going on because users aren't willing to give up
some of the monitoring capabilities that currently require
functionality which is arbitrarily restricted to superusers, and not
everyone here is willing to recognize those monitoring needs as
legitimate.  I believe we should take it as a given that any check
somebody's bothered building into a popular monitoring tool like
check_postgres.pl is probably a good check written by a smart
developer, and the question we should ask ourselves here is "what can
we do in the core project to let those checks run without needing
superuser privileges?".  Instead, we say things like...

> The management and monitoring tool could be more specific about what
> it actually needs, rather than simply requesting generic read and
> write against the filesystem.

...which basically means we think the current checks are written in a
way that is wrong, and should be changed to use some not-yet-invented
alternative mechanism.  But that position is a bit arrogant and a
bit counterfactual.  It's counterfactual because the functions in
question do NOT give generic read and write access against the
filesystem; they are limited to the data directory and the log
directory.  IOW, significant efforts have already been made to impose
reasonable security precautions.  More could be done, but particularly
with regards to basically-harmless things like pg_ls_dir() and
pg_stat_file(), there's no real security benefit to doing so.  (I
agree that there could be some benefit for pg_read/write_file, but the
question in my mind is how much complexity it's worth adding for a
corner case; I'd be fine with a new mechanism if it's relatively
simple.)

And it's arrogant because it supposes that we know better than the
tool authors what the right way to do everything is.  If Greg Sabino
Mullane has had a check that uses pg_ls_dir() in check_postgres.pl for
ten years, it's probably a great way of checking the thing that he's
using it to check, because Greg Sabino Mullane is a smart,
experienced, long-time community member.  Any argument to the contrary
-- especially one offered only in the context of trying to make that
check work without requiring superuser privileges -- seems to me to be
Monday-morning-quarterbacking (apologies for the Americanism; I don't
know what the equivalent British expression would be).

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] pg_ls_dir & friends still have a hard-coded superuser check

2017-01-27 Thread Tom Lane
Robert Haas  writes:
> - contrib/pageinspect has lots of superuser checks, basically because
> they have known input-validation weaknesses.  See
> 3e1338475ffc2eac25de60a9de9ce689b763aced for the rationale in detail.

FWIW, I think that's a bit of an oversimplification.  There are two
components to contrib/pageinspect: get_raw_page() and a pile of page
interpretation functions.  The input-validation issue applies primarily
to the interpretation functions.

In principle, if we could make the interpretation functions completely
bulletproof, we could remove all security checks for them.  However,
they're largely useless without get_raw_page(), so it's not apparent
what's the point of doing a lot of work (and taking the risk of missing
something) unless we first get rid of get_raw_page()'s superuser check.
And that seems fraught with questions.  Maybe it'd be all right to
allow it for tables you have full read access to (can select all columns,
no RLS) but I'm not convinced.  For example, full read access doesn't
allow you to see values that were in the table in the past, before you
were granted read access ... but get_raw_page() might.

regards, tom lane




Re: [HACKERS] GSoC 2017

2017-01-27 Thread Brad DeJong
On January 27, 2017 07:08, Tom Lane wrote:
> ... The things I think are unique to the currency situation are: ...

Add the potential for regulatory requirements to change at any time - sort of 
like timezone information. So no hard-coded behavior:
 - rounding method/accuracy
 - storage precision different than display precision
 - conversion method (multiply, divide, triangulate, other)
 - use of spot rates (multiple rate sources) rather than/in addition to 
   time-varying rates

responding to the overall idea of a currency type

Numeric values with units so that you get a warning/error when you mix 
different units in calculations? Ability to specify rounding methods and 
intermediate precisions for calculations?
+1 Good ideas with lots of potential applications.

Built-in currency type? 
-1 I suspect this is one of those things that seems like a good idea but really 
isn't.




Re: [HACKERS] pg_background contrib module proposal

2017-01-27 Thread Peter Eisentraut
On 1/19/17 12:47 PM, Andrey Borodin wrote:
> 4. There is some controversy on where implemented feature shall be: in 
> separate extension (as in this patch), in db_link, in some PL API, in FDW or 
> somewhere else. I think that new extension is an appropriate place for the 
> feature. But I’m not certain.

I suppose we should decide first whether we want pg_background as a
separate extension or rather pursue extending dblink as proposed elsewhere.

I don't know if pg_background allows any use case that dblink can't
handle (yet).

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] pg_ls_dir & friends still have a hard-coded superuser check

2017-01-27 Thread Robert Haas
On Thu, Jan 26, 2017 at 5:36 PM, Stephen Frost  wrote:
>> The first is that restricting the ability to GRANT access
>> to a function, even a function that allows escalation to superuser
>> privileges, doesn't improve security, because the superuser can still
>> grant those privileges with more work.
>
> Having various inconsistent and unclear ways to grant access to a
> privileged user is making security worse and that is very much part of
> my concern. I don't agree, regardless of how many times you claim it,
> that because the superuser could still grant superuser to someone (or
> could grant the individual privileges with more work) keeps things
> status-quo regarding security, or somehow that not making the changes
> you are proposing reduces existing security.

I think it certainly increases security, especially when the functions
don't themselves confer superuser access, because then people won't
use superuser access when something less will do.  I agree that
pg_read_file() will often allow an escalation to superuser, but not
always, and not necessarily easily.  If somebody breaks into the
account I'm using for monitoring my system, I'd way, way rather have
them get pg_read_file() than superuser.  And for pg_ls_dir(), about a
hundred times moreso.

> Why aren't you including the other functions mentioned?  Not including
> them, even if they're in contrib, would still end up with things in an
> inconssitent state, and it hardly seems like it'd be much effort to go
> through the rest of them with equal care.

The contrib stuff is harder to change because of
backward-compatibility with previous extensions.  I could go through
and analyze it if we had some consensus on the basic principle here,
but we don't seem to agree on anything.  Your current policy proposal,
as I understand it, is that things should have builtin superuser
privileges if (a) there is any chance they can be used to escalate to
superuser or (b) they involve any access to the filesystem, even
access that can't be used to escalate to superuser.  Even if I were to
agree to (a), (b) looks like a random wart to justify the status quo,
so I'm not real hopeful about further analysis finding any common
ground.  But that having been said, the broad picture here is that
there are two contrib modules which have hard-coded superuser checks
at the top of SQL-callable functions:

- contrib/adminpack has some very dangerous functions (pg_file_write,
pg_file_rename, pg_file_unlink) that have a hard-coded superuser
check.  It also has an apparently-not-that-dangerous function
(pg_logdir_ls) with such a check.

- contrib/pageinspect has lots of superuser checks, basically because
they have known input-validation weaknesses.  See
3e1338475ffc2eac25de60a9de9ce689b763aced for the rationale in detail.
The purist in me is inclined to think that the best answer here would
be to fix the functions so that they're safe, but that might not be
terribly easy to do.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] WIP: About CMake v2

2017-01-27 Thread Peter Eisentraut
On 1/24/17 8:37 AM, Tom Lane wrote:
> Craig Ringer  writes:
>> Personally I think we should aim to have this in as a non default build
>> mode in pg10 if it can be made ready, and aim to make it default in pg11 at
>> least for Windows.
> 
> AFAIK we haven't committed to accepting this at all, let alone trying
> to do so on a tight schedule.  And I believe there was general agreement
> that we would not accept it as something to maintain in parallel with
> the existing makefiles.  If we have to maintain two build systems, we
> have that already.

My preferred scenario would be to replace the Windows build system by
this first, then refine it, then get rid of Autoconf.

The ideal timeline would be to have a ready patch to commit early in a
development cycle, then get rid of the Windows build system by the end
of it.  Naturally, this would need buy-in from Windows developers.

I don't foresee replacing the Autoconf build system by this immediately.

Right now, however, the patch isn't moving at all, and I don't see it
going into PG10, so I'm fine with returning with feedback.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] pg_ls_dir & friends still have a hard-coded superuser check

2017-01-27 Thread Dave Page
On Fri, Jan 27, 2017 at 1:18 PM, Simon Riggs  wrote:
> On 27 January 2017 at 12:56, Dave Page  wrote:
>
>> Probably the most common complaint I get from users
>> regarding the management & monitoring tools I work on is that they
>> have to use superuser accounts to get the full benefits, unlike other
>> DBMSs where you can create a role with just the required privileges
>> (or indeed, other DBMSs that ship with such roles pre-defined for
>> convenience).
>
> This is still just the Adminpack argument. This has been going on for
> about a decade? Longer.

Not entirely. Some simple examples:

- Controlling recovery. This is fixed in 9.6+, but prior to that,
granting execute privs on pg_xlog_replay_pause() and/or
pg_xlog_replay_resume() would be accepted but wouldn't work.

- Access to certain GUCs. For example, it could be argued that "SHOW
log_directory" is quite reasonable for a monitoring tool to run, for
the purposes of auditing/alerting admins to any changes made by a
rogue superuser.

- ALTER SYSTEM - clearly there is a use case for allowing certain users
to configure the database server, but not necessarily have the full
rights of a superuser that would allow them access to all the data
(yeah, I know that separation is far more complex than that alone, but
I hope you get the point).

> If the monitoring tool requires superuser then that is a problem, so
> it would be helpful if it didn't do that, please. Not much use having
> a cool tool if it doesn't work with the server.

Sure, that's what I want - to provide the management and monitoring
capabilities without requiring superuser. Limiting the capability of
the tools is not an option when you talk to users - however for some
of them, having to use full superuser accounts is a problem as well
(especially for those who are used to other DBMSs that do offer more
fine-grained permissions).

> The management and monitoring tool could be more specific about what
> it actually needs, rather than simply requesting generic read and
> write against the filesystem. Then we can put those specific things
> into the server and we can all be happy. Again, a detailed list would
> help here.

Agreed - I do need to do that, and it's on my (extremely long) list.
I'm just chiming in on this thread as requested!

> Does the latest version of pgadmin provide access to log files? I
> can't see much that really needs Adminpack anymore, though I've not
> done a thorough analysis at all.

Not yet.

-- 
Dave Page
Blog: http://pgsnake.blogspot.com
Twitter: @pgsnake

EnterpriseDB UK: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] One-shot expanded output in psql using \G

2017-01-27 Thread Stephen Frost
Christoph,

* Christoph Berg (christoph.b...@credativ.de) wrote:
> The same idea was discussed back in 2008. Back then the outcome was
> that "\x auto" was implemented, but I still think that \G is a useful
> feature to have on its own, and several people in the thread seem to
> have agreed back then.

+1 for my part.  I often run a query and then realize that I really
wanted \x output.

Thanks!

Stephen




Re: [HACKERS] nodes.h - comments comment

2017-01-27 Thread Tom Lane
Erik Rijkers  writes:
> Orthography fix in nodes.h comment block.

Pushed with some further adjustments.

regards, tom lane




[HACKERS] One-shot expanded output in psql using \G

2017-01-27 Thread Christoph Berg
I frequently find myself in the situation that I want the "\x"
expanded output mode activated just for one query. There's little
wrong with typing "\x" and re-executing the query in that case, but
then I'm always annoyed that the expanded output is still active for
the next query after that.

"\x auto" is not a fix for the problem; I have set up the pager to use
"less -S" (non-wrapping) by default so I can scroll right/left through
the query result instead of having it spread over several terminal
lines.

A workaround is to submit queries using "\x\g\x", but that's ugly,
clutters the output with toggle messages, and will forget that "\x
auto" was set.

MySQL's CLI client uses \G for this purpose, and adding the very
same functionality to psql fits nicely into the set of existing
backslash commands: \g sends the query buffer; \G does exactly the
same as \g (including parameters), but forces expanded output just for
this query.

The same idea was discussed back in 2008. Back then the outcome was
that "\x auto" was implemented, but I still think that \G is a useful
feature to have on its own, and several people in the thread seem to
have agreed back then.

Patch attached, I'll add it to the next commit fest.

With kind regards,
Christoph Berg
-- 
Senior Berater, Tel.: +49 2166 9901 187
credativ GmbH, HRB Mönchengladbach 12080, USt-ID-Nummer: DE204566209
Trompeterallee 108, 41189 Mönchengladbach
Geschäftsführung: Dr. Michael Meskes, Jörg Folz, Sascha Heuer
pgp fingerprint: 5C48 FE61 57F4 9179 5970  87C6 4C5A 6BAB 12D2 A7AE
diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml
new file mode 100644
index 640fe12..af85888
*** a/doc/src/sgml/ref/psql-ref.sgml
--- b/doc/src/sgml/ref/psql-ref.sgml
*** Tue Oct 26 21:40:57 CEST 1999
*** 1891,1896 
--- 1891,1908 
  
  

+ \G [ filename ]
+ \G [ |command ]
+ 
+ 
+ \G is equivalent to \g, but
+ forces expanded output mode for this query.
+ 
+ 
+   
+ 
+ 
+   
  \gexec
  
  
diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c
new file mode 100644
index 0c164a3..912f672
*** a/src/bin/psql/command.c
--- b/src/bin/psql/command.c
*** exec_command(const char *cmd,
*** 904,911 
  		free(fname);
  	}
  
! 	/* \g [filename] -- send query, optionally with output to file/pipe */
! 	else if (strcmp(cmd, "g") == 0)
  	{
  		char	   *fname = psql_scan_slash_option(scan_state,
     OT_FILEPIPE, NULL, false);
--- 904,914 
  		free(fname);
  	}
  
! 	/*
! 	 * \g [filename] -- send query, optionally with output to file/pipe
! 	 * \G [filename] -- same as \g, with expanded mode forced
! 	 */
! 	else if (strcasecmp(cmd, "g") == 0)
  	{
  		char	   *fname = psql_scan_slash_option(scan_state,
     OT_FILEPIPE, NULL, false);
*** exec_command(const char *cmd,
*** 918,923 
--- 921,928 
  			pset.gfname = pg_strdup(fname);
  		}
  		free(fname);
+ 		if (strcmp(cmd, "G") == 0)
+ 			pset.g_expanded = true;
  		status = PSQL_CMD_SEND;
  	}
  
diff --git a/src/bin/psql/common.c b/src/bin/psql/common.c
new file mode 100644
index e1b04de..4a75470
*** a/src/bin/psql/common.c
--- b/src/bin/psql/common.c
*** PrintQueryTuples(const PGresult *results
*** 770,775 
--- 770,779 
  {
  	printQueryOpt my_popt = pset.popt;
  
+ 	/* one-shot expanded output requested via \G */
+ 	if (pset.g_expanded)
+ 		my_popt.topt.expanded = true;
+ 
  	/* write output to \g argument, if any */
  	if (pset.gfname)
  	{
*** sendquery_cleanup:
*** 1411,1416 
--- 1415,1423 
  		pset.gfname = NULL;
  	}
  
+ 	/* reset \G's expanded-mode flag */
+ 	pset.g_expanded = false;
+ 
  	/* reset \gset trigger */
  	if (pset.gset_prefix)
  	{
diff --git a/src/bin/psql/help.c b/src/bin/psql/help.c
new file mode 100644
index 5365629..5e9c249
*** a/src/bin/psql/help.c
--- b/src/bin/psql/help.c
*** slashUsage(unsigned short int pager)
*** 174,179 
--- 174,180 
  	fprintf(output, _("  \\copyright show PostgreSQL usage and distribution terms\n"));
  	fprintf(output, _("  \\errverboseshow most recent error message at maximum verbosity\n"));
  	fprintf(output, _("  \\g [FILE] or ; execute query (and send results to file or |pipe)\n"));
+ 	fprintf(output, _("  \\G [FILE]  as \\g, but force expanded output\n"));
  	fprintf(output, _("  \\gexec execute query, then execute each value in its result\n"));
  	fprintf(output, _("  \\gset [PREFIX] execute query and store results in psql variables\n"));
  	fprintf(output, _("  \\q quit psql\n"));
diff --git a/src/bin/psql/settings.h b/src/bin/psql/settings.h
new file mode 100644
index 4c7c3b1..a13da33
*** a/src/bin/psql/settings.h
--- b/src/bin/psql/settings.h
*** typedef struct _psqlSettings
*** 91,96 
--- 91,97 
  	

Re: [HACKERS] Performance improvement for joins where outer side is unique

2017-01-27 Thread David Rowley
On 27 January 2017 at 12:39, Tom Lane  wrote:
> 2. In these same cases (unique/semi/anti joins), it is possible to avoid
> mark/restore overhead in a mergejoin, because we can tweak the executor
> logic to not require backing up the inner side.  This goes further than
> just tweaking the executor logic, though, because if we know we don't
> need mark/restore then that actually makes some plan shapes legal that
> weren't before: we don't need to interpose a Material node to protect
> join inputs that can't mark/restore.

I've made modifications in the attached to add this optimization, and
it's quite a significant improvement.

Setup:

create table t1 (id text primary key);
insert into t1 select x from generate_series(1,100) x(x);
create table t2 (id text primary key);
insert into t2 select x from generate_series(1,100) x(x);
vacuum freeze;

Query:
select count(*) from t1 inner join t2 on t1.id=t2.id;

Unpatched

Time: 369.618 ms
Time: 358.208 ms
Time: 357.094 ms
Time: 355.642 ms
Time: 358.193 ms
Time: 371.272 ms
Time: 354.386 ms
Time: 364.277 ms
Time: 346.091 ms
Time: 358.757 ms

Patched

Time: 273.154 ms
Time: 258.520 ms
Time: 269.456 ms
Time: 252.861 ms
Time: 271.015 ms
Time: 252.567 ms
Time: 271.132 ms
Time: 267.505 ms
Time: 265.295 ms
Time: 257.068 ms

About 26.5% improvement for this case.

> * The patch applies point #1 to only INNER and LEFT join modes, but
> I don't really see why the idea wouldn't work for RIGHT and FULL modes,
> ie the optimization seems potentially interesting for all executable join
> types.  Once you've got a match, you can immediately go to the next outer
> tuple instead of continuing to scan inner.  (Am I missing something?)

No, I believe I started development with LEFT JOINs, moved to INNER,
then didn't progress to think of RIGHT or FULL. I've lifted this
restriction in the patch. It seems no other work is required to have
that just work.

> * Particularly in view of the preceding point, I'm not that happy with
> the way that management/caching of the "is it unique" knowledge is
> done completely differently for INNER and LEFT joins.  I wonder if
> there's actually a good argument for that or is it mostly a development
> sequence artifact.  IOW, would it hurt to drop the SpecialJoinInfo
> tie-in and just rely on the generic cache?

I agree that special handling of one join type is not so pretty.
However, LEFT JOINs still remain a bit special as they're the only
ones we currently perform join removal on, and the patch modifies that
code to make use of the new flag for those. This can improve planner
performance of join removal when a join is removed successfully, as
the previous code had to recheck uniqueness of each remaining LEFT
JOIN again, whereas the new code only checks uniqueness of ones not
previously marked as unique. This too likely could be done with the
cache, although I'm a bit concerned with populating the cache, then
performing a bunch of LEFT JOIN removals and leaving relids in the
cache which no longer exist. Perhaps it's OK. I've just not found
proofs in my head yet that it is.

> * Because of where we apply the short-circuit logic in the executor,
> it's only safe to consider the inner rel as unique if it is provably
> unique using only the join clauses that drive the primary join mechanism
> (ie, the "joinquals" not the "otherquals").  We already do ignore quals
> that are pushed-down to an outer join, so that's good, but there's an
> oversight: we will use a qual that is mergejoinable even if it's not
> hashjoinable. That means we could get the wrong answers in a hash join.
> I think probably the appropriate fix for the moment is just to consider
> only clauses that are both mergeable and hashable while trying to prove
> uniqueness.  We do have some equality operators that support only one
> or the other, but they're corner cases, and I'm dubious that it's worth
> having to make separate proofs for merge and hash joins in order to
> cater to those cases.

Hmm. I'm having trouble understanding why this is a problem for unique
joins, but not for join removal?

However, your mentioning this caused me to notice that this is only true
in the patch for left joins; the other join types are not
consistent with that.

create table a1 (a int, b int, primary key(a,b));
create table a2 (a int, b int, primary key(a,b));

explain (verbose, costs off) select * from a1 left join a2 on a1.a =
a2.a and a2.b=1 ;
QUERY PLAN
---------------------------------------------------
 Merge Left Join
   Output: a1.a, a1.b, a2.a, a2.b
   Inner Unique: Yes
   Merge Cond: (a1.a = a2.a)
   ->  Index Only Scan using a1_pkey on public.a1
 Output: a1.a, a1.b
   ->  Sort
 Output: a2.a, a2.b
 Sort Key: a2.a
 ->  Bitmap Heap Scan on public.a2
   Output: a2.a, a2.b
   Recheck Cond: (a2.b = 1)
   ->  Bitmap Index Scan on a2_pkey
 Index Cond: (a2.b = 1)



Re: [HACKERS] pg_ls_dir & friends still have a hard-coded superuser check

2017-01-27 Thread Simon Riggs
On 27 January 2017 at 12:56, Dave Page  wrote:

> Probably the most common complaint I get from users
> regarding the management & monitoring tools I work on is that they
> have to use superuser accounts to get the full benefits, unlike other
> DBMSs where you can create a role with just the required privileges
> (or indeed, other DBMSs that ship with such roles pre-defined for
> convenience).

This is still just the Adminpack argument. This has been going on for
about a decade? Longer.

If the monitoring tool requires superuser then that is a problem, so
it would be helpful if it didn't do that, please. Not much use having
a cool tool if it doesn't work with the server.

The management and monitoring tool could be more specific about what
it actually needs, rather than simply requesting generic read and
write against the filesystem. Then we can put those specific things
into the server and we can all be happy. Again, a detailed list would
help here.

Does the latest version of pgadmin provide access to log files? I
can't see much that really needs Adminpack anymore, though I've not
done a thorough analysis at all.

-- 
Simon Riggs   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] Allow interrupts on waiting standby

2017-01-27 Thread Michael Paquier
On Fri, Jan 27, 2017 at 10:35 AM, Michael Paquier
 wrote:
> On Fri, Jan 27, 2017 at 4:36 AM, Simon Riggs  wrote:
>> On 26 January 2017 at 19:20, Andres Freund  wrote:
>>> On 2017-01-26 12:24:44 -0500, Robert Haas wrote:
 On Thu, Jan 26, 2017 at 7:18 AM, Simon Riggs  wrote:
 > Currently a waiting standby doesn't allow interrupts.
 >
 > Patch implements that.
 >
 > Barring objection, patching today with backpatches.

 "today" is a little quick, but the patch looks fine.  I doubt anyone's
 going to screech too loud about adding a CHECK_FOR_INTERRUPTS() call.
>>>
>>> I don't quite get asking for agreement, and then not waiting as
>>> suggested.  I'm personally fine with going with a CHECK_FOR_INTERRUPTS
>>> for now, but I think it'd better to replace it with a latch.
>>
>> I have waited, so not sure what you mean. Tomorrow is too late.
>
> This gives really little time for any feedback :(
>
>> Replacing with a latch wouldn't be backpatchable, IMHO.
>> I've no problem if you want to work on a deeper fix for future versions.
>
> A deeper fix for HEAD proves to not be that complicated. Please see
> the attached. The other two calls of pg_usleep() in standby.c are
> waiting with 5ms and 10ms, it is not worth switching them to a latch.

Two things I forgot in this patch:
- documentation for the new wait event
- the string for the wait event or this would show up as "???" in
pg_stat_activity.
There are no default clauses in the pgstat_get_wait_* routines so my
compiler is actually complaining...
-- 
Michael




Re: [HACKERS] Performance improvement for joins where outer side is unique

2017-01-27 Thread Antonin Houska
I thought about the patch from the perspective of "grouped relations"
(especially [1]). When looking for the appropriate context within the thread,
I picked this message.

David Rowley  wrote:

> On 12 March 2016 at 11:43, Tom Lane  wrote:
> >
> > It seems like the major intellectual complexity here is to figure out
> > how to detect inner-side-unique at reasonable cost.  I see that for
> > LEFT joins you're caching that in the SpecialJoinInfos, which is probably
> > fine.  But for INNER joins it looks like you're just doing it over again
> > for every candidate join, and that seems mighty expensive.

> ... I'll look into that.
> 
> The other thing I thought of was to add a dedicated list for unique
> indexes in RelOptInfo, this would also allow
> rel_supports_distinctness() to do something a bit smarter than just
> return false if there are no indexes. That might not buy us much though,
> but at least relations tend to have very few unique indexes, even
> when they have lots of indexes.

I'm thinking of a concept of "unique keys", similar to path keys that the
planner already uses. Besides the current evaluation of uniqueness of the
inner side of a join, the planner would (kind of) union the unique keys of the
joined rels, i.e. compute a list of expressions which generates a unique row
throughout the new join result. (The requirement is that each key must be
usable in a join expression, as opposed to a filter.)

To figure out whether at most one inner row exists per outer row, each unique
key of the inner relation which references the outer relation needs to match
a unique key of the outer relation (but it's probably wrong if multiple
unique keys of the inner rel reference the same key of the outer rel).

Like path key, the unique key would also point to an equivalence class. Thus
mere equality of the EC pointers could perhaps be used to evaluate the match
of the inner and outer keys.

Given that rel_is_distinct_for() currently does not accept joins, this change
would make the patch more generic. (BTW, with this approach, unique_rels and
non_unique_rels caches would have to be stored per-relation (RelOptInfo), as
opposed to PlannerInfo.)

The reason I'd like the unique keys is that - from the "grouped relation"
point of view - the relation uniqueness also needs to be checked against the
GROUP BY clause. Thus the "unique keys" concept seems to me like a useful
abstraction.

Does this proposal seem to have a serious flaw?

[1]
https://www.postgresql.org/message-id/CAKJS1f_h1CLff92B%3D%2BbdrMK2Nf3EfGWaJu2WbzQUYcSBUi02ag%40mail.gmail.com

-- 
Antonin Houska
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt
Web: http://www.postgresql-support.de, http://www.cybertec.at



