Re: [HACKERS] [PATCH] postgres_fdw extension support

2015-07-21 Thread Paul Ramsey
On Tue, Jul 21, 2015 at 7:45 AM, Andres Freund and...@anarazel.de wrote:

 +
 + /* We need this relation to scan */
 + depRel = heap_open(DependRelationId, RowExclusiveLock);
 +
 + /* Scan the system dependency table for all entries this operator */
 + /* depends on, then iterate through and see if one of them */
 + /* is a registered extension */
 + ScanKeyInit(&key[0],
 +             Anum_pg_depend_objid,
 +             BTEqualStrategyNumber, F_OIDEQ,
 +             ObjectIdGetDatum(procnumber));
 +
 + scan = systable_beginscan(depRel, DependDependerIndexId, true,
 +                           GetCatalogSnapshot(depRel->rd_id), nkeys, key);
 +
 + while (HeapTupleIsValid(tup = systable_getnext(scan)))
 + {
 +     Form_pg_depend foundDep = (Form_pg_depend) GETSTRUCT(tup);
 +
 +     if (foundDep->deptype == DEPENDENCY_EXTENSION)
 +     {
 +         List *extlist = fpinfo->extensions;
 +         ListCell *ext;
 +
 +         foreach(ext, extlist)
 +         {
 +             Oid extension_oid = (Oid) lfirst(ext);
 +             if (foundDep->refobjid == extension_oid)
 +             {
 +                 nresults++;
 +             }
 +         }
 +     }
 +     if (nresults > 0) break;
 + }
 +
 + systable_endscan(scan);
 + relation_close(depRel, RowExclusiveLock);
 +
 + return nresults > 0;
 +}

 Phew. That's mighty expensive to do at this frequency.

 I guess it'll be more practical to expand this list once and then do a
 binary search on the result for the individual functions

So, right after reading the options in postgresGetForeignRelSize,
expand the extension list into a list of all ops/functions, in a
sorted list, and let that carry through to the deparsing instead? That
would happen once per query, right? But the deparsing also happens
once per query too, yes? Is the difference going to be that big? (I
speak not knowing the overhead of things like systable_beginscan, etc)

Thanks!

P




Re: [HACKERS] [PATCH] postgres_fdw extension support

2015-07-21 Thread Andres Freund
On 2015-07-21 07:55:17 -0700, Paul Ramsey wrote:
 On Tue, Jul 21, 2015 at 7:45 AM, Andres Freund and...@anarazel.de wrote:
 So, right after reading the options in postgresGetForeignRelSize,
 expand the extension list into a list of all ops/functions, in a
 sorted list, and let that carry through to the deparsing instead?

I'd actually try to make it longer lived, i.e. permanently. And just
deallocate when a catcache callback says it needs to be invalidated;
IIRC there is a relevant cache.

 That
 would happen once per query, right? But the deparsing also happens
 once per query too, yes? Is the difference going to be that big? (I
 speak not knowing the overhead of things like systable_beginscan, etc)

What you have here is an O(nmembers * nexpressions) approach. And each
of those scans is going to do non-negligible work (an index scan of
pg_depend); that's fairly expensive.

- Andres




Re: [HACKERS] [PATCH] postgres_fdw extension support

2015-07-21 Thread Andres Freund
On 2015-07-21 17:00:51 +0200, Andres Freund wrote:
 On 2015-07-21 07:55:17 -0700, Paul Ramsey wrote:
  On Tue, Jul 21, 2015 at 7:45 AM, Andres Freund and...@anarazel.de wrote:
  So, right after reading the options in postgresGetForeignRelSize,
  expand the extension list into a list of all ops/functions, in a
  sorted list, and let that carry through to the deparsing instead?
 
 I'd actually try to make it longer lived, i.e. permanently. And just
 deallocate when a catcache callback says it needs to be invalidated;
 IIRC there is a relevant cache.

On second thought I'd not use a binary search but a hash table. If you
choose the right key a single table is enough for the lookup.

If you need references for invalidations you might want to look for
CacheRegisterSyscacheCallback callers. E.g. attoptcache.c
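
As a minimal sketch of that approach (modeled loosely on attoptcache.c; the
type and function names below are illustrative, not from the patch): keep a
backend-local hash table of "is this function shippable?" verdicts keyed by
function OID, and throw the whole table away from a syscache invalidation
callback whenever the foreign server's options may have changed.

    #include "postgres.h"

    #include "utils/hsearch.h"
    #include "utils/inval.h"
    #include "utils/syscache.h"

    typedef struct ShippableCacheEntry
    {
        Oid         procid;     /* hash key: function/operator OID */
        bool        shippable;  /* cached verdict for this function */
    } ShippableCacheEntry;

    static HTAB *ShippableCacheHash = NULL;

    /* Flush all cached verdicts; the cache is rebuilt lazily on demand. */
    static void
    InvalidateShippableCacheCallback(Datum arg, int cacheid, uint32 hashvalue)
    {
        hash_destroy(ShippableCacheHash);
        ShippableCacheHash = NULL;
    }

    /* Called whenever a lookup finds the cache missing. */
    static void
    InitializeShippableCache(void)
    {
        HASHCTL     ctl;
        static bool callback_registered = false;

        MemSet(&ctl, 0, sizeof(ctl));
        ctl.keysize = sizeof(Oid);
        ctl.entrysize = sizeof(ShippableCacheEntry);
        ShippableCacheHash =
            hash_create("Shippable cache", 256, &ctl, HASH_ELEM | HASH_BLOBS);

        /* register the invalidation callback only once per backend */
        if (!callback_registered)
        {
            /* pg_foreign_server changes when server options are altered */
            CacheRegisterSyscacheCallback(FOREIGNSERVEROID,
                                          InvalidateShippableCacheCallback,
                                          (Datum) 0);
            callback_registered = true;
        }
    }

With something like that in place, is_in_extension() would consult the hash
first and fall back to the pg_depend scan only on a cache miss.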




Re: [HACKERS] Selectivity estimation for intarray with @@

2015-07-21 Thread Heikki Linnakangas

On 07/21/2015 03:44 PM, Alexander Korotkov wrote:

While Uriy is on vacation, I've revised this patch a bit.


I whacked this around quite a bit, and I think it's in a committable 
state now. But if you could run whatever tests you were using before on 
this, to make sure it still produces the same estimates, that would be 
great. I didn't change the estimates it should produce, only the code 
structure.


One thing that bothers me slightly with this patch is the way it peeks 
into the Most-Common-Elements arrays, which are produced by the built-in
type analyze function. If we ever change what statistics are collected 
for arrays, or the way they are stored, this might break. In matchsel, 
why don't we just call the built-in estimator function for each element 
that we need to probe, and not look into the statistics ourselves at 
all? I actually experimented with that, and it did slash much of the 
code, and it would be more future-proof. However, it was also a lot 
slower for queries that contain multiple values. That's understandable: 
the built-in estimator will fetch the statistics tuple, parse the 
arrays, etc. separately for each value in the query_int, while this 
patch will do it only once for the whole query, and perform a simple 
binary search for each value. So overall, I think this is OK as it is. 
But if we find that we need to use the MCE list in this fashion in more 
places in the future, it might be worthwhile to add some support code 
for this in the backend to allow extracting the stats once, and doing 
multiple lightweight estimations using the extracted stats.
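
As a rough illustration of why the cached form is cheap (assuming the arrays
are laid out the way the built-in array analyze function produces them:
"values" sorted ascending with "freqs" parallel to it; the function name and
shape are mine, not the patch's):

    #include "postgres.h"
    #include "nodes/nodes.h"        /* for Selectivity */

    /* Look up one probe element among the Most-Common-Elements stats. */
    static Selectivity
    mce_elem_selec(Datum *values, float4 *freqs, int nmncelems,
                   int32 elem, float4 minfreq)
    {
        int     lo = 0,
                hi = nmncelems - 1;

        while (lo <= hi)
        {
            int     mid = (lo + hi) / 2;
            int32   midval = DatumGetInt32(values[mid]);

            if (midval == elem)
                return freqs[mid];      /* found: use observed frequency */
            else if (midval < elem)
                lo = mid + 1;
            else
                hi = mid - 1;
        }

        /* not a most-common element: assume it is rarer than any MCE */
        return minfreq / 2;
    }

Parsing the statistics tuple once and then running something like this per
query_int value is what makes the multi-value case fast.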


Some things I fixed/changed:

* I didn't like that transformOperator() function, which looked up the 
function's name. I replaced it with separate wrapper functions for each 
operator, so that the built-in operator's OID can be hardcoded into each.


* I refactored the matchsel function heavily. I think it's more readable 
now.


* I got rid of the Int4Freq array. It didn't seem significantly easier 
to work with than the separate values/numbers arrays, so I just used 
those directly.


* Also use the matchsel estimator for ~~ (the commutator of @@)

* Also use the estimators for the obsolete @ and ~ operators. Not that I 
care much about those as they are obsolete, but seems strange not to, as 
it's a trivial matter of setting the right estimator function.


* I added an ANALYZE in the regression test. It still won't 
systematically test all the cost estimation code, and there's nothing to 
check that the estimates make sense, but at least more of the code will 
now run.


- Heikki

diff --git a/contrib/intarray/Makefile b/contrib/intarray/Makefile
index 920c5b1..5ea7f2a 100644
--- a/contrib/intarray/Makefile
+++ b/contrib/intarray/Makefile
@@ -2,10 +2,10 @@
 
 MODULE_big = _int
 OBJS = _int_bool.o _int_gist.o _int_op.o _int_tool.o \
-	_intbig_gist.o _int_gin.o $(WIN32RES)
+	_intbig_gist.o _int_gin.o _int_selfuncs.o $(WIN32RES)
 
 EXTENSION = intarray
-DATA = intarray--1.0.sql intarray--unpackaged--1.0.sql
+DATA = intarray--1.1.sql intarray--1.0--1.1.sql intarray--unpackaged--1.0.sql
 PGFILEDESC = intarray - functions and operators for arrays of integers
 
 REGRESS = _int
diff --git a/contrib/intarray/_int_selfuncs.c b/contrib/intarray/_int_selfuncs.c
new file mode 100644
index 000..4896461
--- /dev/null
+++ b/contrib/intarray/_int_selfuncs.c
@@ -0,0 +1,334 @@
+/*-
+ *
+ * _int_selfuncs.c
+ *	  Functions for selectivity estimation of intarray operators
+ *
+ * Portions Copyright (c) 1996-2015, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ *
+ * IDENTIFICATION
+ *	  contrib/intarray/_int_selfuncs.c
+ *
+ *-
+ */
+#include "postgres.h"
+#include "_int.h"
+
+#include "access/htup_details.h"
+#include "catalog/pg_operator.h"
+#include "catalog/pg_statistic.h"
+#include "catalog/pg_type.h"
+#include "utils/selfuncs.h"
+#include "utils/syscache.h"
+#include "utils/lsyscache.h"
+#include "miscadmin.h"
+
+PG_FUNCTION_INFO_V1(_int_overlap_sel);
+PG_FUNCTION_INFO_V1(_int_contains_sel);
+PG_FUNCTION_INFO_V1(_int_contained_sel);
+PG_FUNCTION_INFO_V1(_int_overlap_joinsel);
+PG_FUNCTION_INFO_V1(_int_contains_joinsel);
+PG_FUNCTION_INFO_V1(_int_contained_joinsel);
+PG_FUNCTION_INFO_V1(_int_matchsel);
+
+Datum		_int_overlap_sel(PG_FUNCTION_ARGS);
+Datum		_int_contains_sel(PG_FUNCTION_ARGS);
+Datum		_int_contained_sel(PG_FUNCTION_ARGS);
+Datum		_int_overlap_joinsel(PG_FUNCTION_ARGS);
+Datum		_int_contains_joinsel(PG_FUNCTION_ARGS);
+Datum		_int_contained_joinsel(PG_FUNCTION_ARGS);
+Datum		_int_matchsel(PG_FUNCTION_ARGS);
+
+
+static Selectivity int_query_opr_selec(ITEM *item, Datum *values, float4 *freqs,
+	int nmncelems, float4 minfreq);
+static int	compare_val_int4(const void *a, const void *b);
+
+/*
+ * Wrappers around the default array 

[HACKERS] ALTER TABLE .. ADD PRIMARY KEY .. USING INDEX has dump-restore hazard

2015-07-21 Thread Robert Haas
Consider:

rhaas=# create table foo (a text);
CREATE TABLE
rhaas=# create unique index on foo (a collate "C");
CREATE INDEX
rhaas=# alter table foo add primary key using index foo_a_idx;
ALTER TABLE
rhaas=# \d foo
    Table "public.foo"
 Column | Type | Modifiers
--------+------+-----------
 a      | text | not null
Indexes:
    "foo_a_idx" PRIMARY KEY, btree (a COLLATE "C")

Now dump and restore this database.  Then:

rhaas=# \d foo
    Table "public.foo"
 Column | Type | Modifiers
--------+------+-----------
 a      | text | not null
Indexes:
    "foo_a_idx" PRIMARY KEY, btree (a)

Notice that the collation specifier is gone.  Oops.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Selectivity estimation for intarray with @@

2015-07-21 Thread Alexander Korotkov
On Tue, Jul 21, 2015 at 6:49 PM, Heikki Linnakangas hlinn...@iki.fi wrote:

 On 07/21/2015 03:44 PM, Alexander Korotkov wrote:

 While Uriy is on vacation, I've revised this patch a bit.


 I whacked this around quite a bit, and I think it's in a committable state
 now. But if you could run whatever tests you were using before on this, to
 make sure it still produces the same estimates, that would be great. I
 didn't change the estimates it should produce, only the code structure.

 One thing that bothers me slightly with this patch is the way it peeks
 into the Most-Common-Elements arrays, which are produced by the built-in
 type analyze function. If we ever change what statistics are collected for
 arrays, or the way they are stored, this might break. In matchsel, why
 don't we just call the built-in estimator function for each element that we
 need to probe, and not look into the statistics ourselves at all? I
 actually experimented with that, and it did slash much of the code, and it
 would be more future-proof. However, it was also a lot slower for queries
 that contain multiple values. That's understandable: the built-in estimator
 will fetch the statistics tuple, parse the arrays, etc. separately for each
 value in the query_int, while this patch will do it only once for the whole
 query, and perform a simple binary search for each value. So overall, I
 think this is OK as it is. But if we find that we need to use the MCE list
 in this fashion in more places in the future, it might be worthwhile to add
 some support code for this in the backend to allow extracting the stats
 once, and doing multiple lightweight estimations using the extracted
 stats.


Yeah, I see. We could end up with something like this. But probably we
would need something more general for extensions which wants to play with
statistics.
For instance, pg_trgm could estimate selectivity for text % text
operator. But in order to provide that it needs trigram statistics. Now it
could be implemented by defining separate datatype, but it's a kluge.
Probably, we would end up with custom additional statistics for datatypes.


 Some things I fixed/changed:

 * I didn't like that transformOperator() function, which looked up the
 function's name. I replaced it with separate wrapper functions for each
 operator, so that the built-in operator's OID can be hardcoded into each.

 * I refactored the matchsel function heavily. I think it's more readable
 now.

 * I got rid of the Int4Freq array. It didn't seem significantly easier to
 work with than the separate values/numbers arrays, so I just used those
 directly.

 * Also use the matchsel estimator for ~~ (the commutator of @@)


In this version of the patch it's not checked whether the variable is actually
an int[] and not a query_int. See the following test case.

# create table test2 as (select '1'::query_int val from
generate_series(1,100));
# analyze test2;

# explain select * from test2 where '{1}'::int[] @@ val;
ERROR:  unrecognized int query item type: 0

I've added this check.

* Also use the estimators for the obsolete @ and ~ operators. Not that I
 care much about those as they are obsolete, but seems strange not to, as
 it's a trivial matter of setting the right estimator function.

 * I added an ANALYZE in the regression test. It still won't systematically
 test all the cost estimation code, and there's nothing to check that the
 estimates make sense, but at least more of the code will now run.


You also forgot to include intarray--1.0--1.1.sql in the patch. I've also
added it.

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


intarray-sel-4.patch
Description: Binary data



Re: [HACKERS] Fillfactor for GIN indexes

2015-07-21 Thread Heikki Linnakangas

On 07/21/2015 04:14 PM, Alexander Korotkov wrote:

On Tue, Jul 21, 2015 at 3:52 PM, Heikki Linnakangas hlinn...@iki.fi wrote:


On 07/21/2015 02:56 PM, Alexander Korotkov wrote:


Probably, but currently we are in quite an illogical situation. We have a
default fillfactor = 90 for GiST, where it has no use cases at all and
effectively is just a waste of space.



Why is it useless for GiST?



It's because all GiST pages are produced by page splits. So, just after
CREATE INDEX, GiST pages aren't tightly packed on average. Actually, they
could be tightly packed by incredible coincidence, but for large indexes
it's a quite safe assumption that they are not. With GiST we don't get a
storm of page splits after index creation even with fillfactor = 100. So why
should we reserve additional space with fillfactor = 90?


Aha, I see. Yeah, that's pretty useless. Ideally, we would make the GiST 
build algorithm smarter so that it would pack the pages more tightly. I 
have no idea how to do that, however.


Anyway, the fact that fillfactor is useless for GiST is more of an 
argument for removing it from GiST, than for adding it to GIN.


- Heikki





[HACKERS] What is HeapScanDescData.rs_initblock good for?

2015-07-21 Thread Tom Lane
The BRIN patch added a HeapScanDescData field rs_initblock, but so far as
I can see it's utterly without use, and it's quite confusing (people might
mistake it for rs_startblock, for example).  Any objection to taking it
out again?

regards, tom lane




Re: [HACKERS] PL/pgSQL, RAISE and error context

2015-07-21 Thread Merlin Moncure
On Tue, Jul 21, 2015 at 2:53 AM, Heikki Linnakangas hlinn...@iki.fi wrote:
 On 07/21/2015 10:38 AM, Pavel Stehule wrote:

 Where are we with this patch? Can I do something for it?


 I still feel this approach is misguided, and we should be tweaking psql
 and/or libpq instead. I don't feel strongly though, and if some other
 committer wants to pick this up in its current form, I won't object. So this
 patch has reached an impasse, and if no-one else wants to pick this up, I'm
 going to mark this as Returned with Feedback and move on.

That's unfortunate.  Maybe I'm missing something:

What does a client-side implementation offer that a server-side
implementation does not offer?

merlin




Re: [HACKERS] TABLESAMPLE patch is really in pretty sad shape

2015-07-21 Thread Petr Jelinek

On 2015-07-20 17:23, Tom Lane wrote:


Doesn't look like it to me: heap_beginscan_sampling always passes
allow_sync = false to heap_beginscan_internal.



Oh, right.


More to the point, the existing coding of all these methods is such
that they would fail to be reproducible if syncscan were enabled,
because the scan start point wouldn't be the same.  That's fixable,
but it'll take more work than just dealing with the above oversight.
In any case, given that Simon has stated he wants support for sample
methods that don't try to be reproducible, we need the method
to be able to say whether syncscan is allowed.



I am not completely clear on how we do this, to be honest; do you mean we use
the info about repeatability for this?


Another thing that's not clear to me after playing with this is how we want
to decide whether to do a pagemode scan or not. I understand that it's a neat
optimization to say pagemode = true in bernoulli when the percentage is high
and false when it's low, but then this would have to come from the
BeginSampleScan() in the proposed API, and BeginSampleScan would not have
access to the HeapScanDesc as it would not be initialized yet when it's
called. The info that BeginSampleScan() needs can be obtained directly from
ss_currentRelation, I guess, but it's somewhat strange to pass a
semi-initialized SampleScanState to BeginSampleScan().


--
 Petr Jelinek  http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] TABLESAMPLE patch is really in pretty sad shape

2015-07-21 Thread Tom Lane
Petr Jelinek p...@2ndquadrant.com writes:
 Another thing that's not clear to me after playing with this is how we
 want to decide whether to do a pagemode scan or not. I understand that it's
 a neat optimization to say pagemode = true in bernoulli when the percentage
 is high and false when it's low, but then this would have to come from the
 BeginSampleScan() in the proposed API, and BeginSampleScan would not have
 access to the HeapScanDesc as it would not be initialized yet when it's
 called.

Right.

 The info that BeginSampleScan() needs can be obtained directly
 from ss_currentRelation, I guess, but it's somewhat strange to pass a
 semi-initialized SampleScanState to BeginSampleScan().

Doesn't seem like a big problem from here.  The HeapScanDesc pointer
will be null at that point, so it's not like attempts to access it
would escape notice.

We could alternatively provide two scan-initialization callbacks,
one that analyzes the parameters before we do heap_beginscan,
and another that can do additional setup afterwards.  Actually,
that second call would really not be meaningfully different from
the ReScan call, so another solution would be to just automatically
invoke ReScan after we've created the HeapScanDesc.  TSMs could work
around not having this by complicating their NextBlock function a bit
(so that it would do some initialization on first call) but perhaps
it would be easier to have the additional init call.
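
For illustration, the NextBlock-side workaround could look roughly like this
(a sketch only: the type and field names are placeholders, since the API was
still being settled here; rs_nblocks and InvalidBlockNumber are existing
names):

	#include "postgres.h"
	#include "access/relscan.h"		/* HeapScanDesc, rs_nblocks */

	typedef struct MySamplerState
	{
		bool		inited;		/* first-call setup done? */
		BlockNumber	nblocks;	/* relation size, known once the scan exists */
		BlockNumber	cur;		/* last block returned */
	} MySamplerState;

	static BlockNumber
	my_nextblock(MySamplerState *sampler, HeapScanDesc scan)
	{
		if (!sampler->inited)
		{
			/* the HeapScanDesc exists by the time we're first called */
			sampler->nblocks = scan->rs_nblocks;
			sampler->cur = InvalidBlockNumber;
			sampler->inited = true;
		}

		/* trivial sequential policy, purely for illustration */
		sampler->cur = (sampler->cur == InvalidBlockNumber) ? 0 : sampler->cur + 1;
		return (sampler->cur < sampler->nblocks) ? sampler->cur : InvalidBlockNumber;
	}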

regards, tom lane




Re: [HACKERS] pgbench stats per script other stuff

2015-07-21 Thread Robert Haas
On Tue, Jul 21, 2015 at 10:42 AM, Fabien COELHO coe...@cri.ensmp.fr wrote:
   sh ./pgbench -T 3 -B -N -w 2 -S -w 7 --per-script-stats

 That is a truly horrifying abuse of command-line arguments.  -1 from
 me, or minus more than one if I've got that many chits to burn.

 Are you against the -w, or against saying that pgbench executes scripts,
 whether internal or from files?

I'm against the idea that we accept multiple arguments for scripts,
and that a subsequent -w modifies the meaning of the
script-specifying argument already read.  That strikes me as a very
unintuitive interface.  I'm not sure exactly what would be better at
the moment, but I think we need something better.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] pgbench stats per script other stuff

2015-07-21 Thread Fabien COELHO


That is a truly horrifying abuse of command-line arguments.  -1 from
me, or minus more than one if I've got that many chits to burn.


Are you against the -w, or against saying that pgbench executes scripts,
whether internal or from files?


I'm against the idea that we accept multiple arguments for scripts,


Pgbench *currently* already accepts multiple -f ... options, and this is
a good thing for testing realistic loads which may intermix several kinds of
transactions, say a lot of read-only ones, some updates or inserts, and very
rare deletes...


Now if you do not need it you do not use it, and all is fine. Once you 
have several scripts, being able to weight them becomes useful for 
realism.


and that a subsequent -w modifies the meaning of the script-specifying
argument already read. That strikes me as a very unintuitive interface.


OK, I understand this "afterward modification" objection.

What if -w were required *before*, supplying a weight for the (first, or
maybe all) script(s) specified *afterwards*, so that it does not modify
something already provided? I think it would be more intuitive, or at
least less surprising.


I'm not sure exactly what would be better at the moment, but I think we 
need something better.


Maybe "-f file.sql:weight" (yuk from my point of view, but it can be
done easily).
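
For illustration only, assuming that colon syntax were adopted (and with
hypothetical script files), a 90%/10% read/update mix would then be:

  sh ./pgbench -T 60 --per-script-stats -f readonly.sql:9 -f update.sql:1

where each weight is the script's relative probability of being picked for a
given transaction.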


--
Fabien.




Re: [HACKERS] [PATCH] postgres_fdw extension support

2015-07-21 Thread Andres Freund
Hi,

On 2015-07-21 07:28:22 -0700, Paul Ramsey wrote:
  /*
 @@ -229,6 +236,9 @@ foreign_expr_walker(Node *node,
   Oid collation;
   FDWCollateState state;

 + /* Access extension metadata from fpinfo on baserel */
 + PgFdwRelationInfo *fpinfo = (PgFdwRelationInfo *) (glob_cxt->foreignrel->fdw_private);
 +
   /* Need do nothing for empty subexpressions */
   if (node == NULL)
   return true;
 @@ -361,7 +371,7 @@ foreign_expr_walker(Node *node,
    * can't be sent to remote because it might have incompatible
    * semantics on remote side.
    */
 - if (!is_builtin(fe->funcid))
 + if (!is_builtin(fe->funcid) && !is_in_extension(fe->funcid, fpinfo))
   return false;

...

  /*
 + * Returns true if given operator/function is part of an extension declared in the
 + * server options.
 + */
 +static bool
 +is_in_extension(Oid procnumber, PgFdwRelationInfo *fpinfo)
 +{
 + static int nkeys = 1;
 + ScanKeyData key[nkeys];
 + HeapTuple tup;
 + Relation depRel;
 + SysScanDesc scan;
 + int nresults = 0;
 +
 + /* Always return false if we don't have any declared extensions */
 + if (!fpinfo->extensions)
 +     return false;
 +
 + /* We need this relation to scan */
 + depRel = heap_open(DependRelationId, RowExclusiveLock);
 +
 + /* Scan the system dependency table for all entries this operator */
 + /* depends on, then iterate through and see if one of them */
 + /* is a registered extension */
 + ScanKeyInit(&key[0],
 +             Anum_pg_depend_objid,
 +             BTEqualStrategyNumber, F_OIDEQ,
 +             ObjectIdGetDatum(procnumber));
 +
 + scan = systable_beginscan(depRel, DependDependerIndexId, true,
 +                           GetCatalogSnapshot(depRel->rd_id), nkeys, key);
 +
 + while (HeapTupleIsValid(tup = systable_getnext(scan)))
 + {
 +     Form_pg_depend foundDep = (Form_pg_depend) GETSTRUCT(tup);
 +
 +     if (foundDep->deptype == DEPENDENCY_EXTENSION)
 +     {
 +         List *extlist = fpinfo->extensions;
 +         ListCell *ext;
 +
 +         foreach(ext, extlist)
 +         {
 +             Oid extension_oid = (Oid) lfirst(ext);
 +             if (foundDep->refobjid == extension_oid)
 +             {
 +                 nresults++;
 +             }
 +         }
 +     }
 +     if (nresults > 0) break;
 + }
 +
 + systable_endscan(scan);
 + relation_close(depRel, RowExclusiveLock);
 +
 + return nresults > 0;
 +}

Phew. That's mighty expensive to do at this frequency.

I guess it'll be more practical to expand this list once and then do a
binary search on the result for the individual functions

Greetings,

Andres Freund




Re: [HACKERS] Solaris testers wanted for strxfrm() behavior

2015-07-21 Thread Bjorn Munch
On 27/06 12.51, Noah Misch wrote:
 
 PostgreSQL 9.5 adds a strxfrm() call in bttext_abbrev_convert(), which does
 not account for the Solaris bug.  I wish to determine whether that bug is
 still relevant today.  If you have access to Solaris with the is_IS.ISO8859-1
 locale installed (or root access to install it), please compile and run the
 attached test program on each Solaris version you have available.  Reply here
 with the program's output.  I especially need a report from Solaris 10, but
 reports from older and newer versions are valuable.  Thanks.

I ran this program on Solaris 9 U5 (September 2006) on Sparc and got:

---
SunOS tor10-z1 5.9 Generic_Virtual sun4v sparc SUNW,SPARC-Enterprise-T5120
locale is_IS.ISO8859-1: strxfrm returned 162; last modified byte at 106 (0xffd2)
locale is_IS.ISO8859-1: strxfrm returned 162; last modified byte at 106 (0xffd2)
locale : strxfrm returned 26; last modified byte at 58 (0x0)
---

NB this runs in a Container on a Solaris 10 host. On our oldest
Solaris 10 sparc box, running Solaris 10 U6 (October 2008) I get:

---
SunOS tor07 5.10 Generic_137137-09 sun4v sparc SUNW,SPARC-Enterprise-T5120
locale is_IS.ISO8859-1: strxfrm returned 216; last modified byte at 58 (0x0)
locale is_IS.ISO8859-1: strxfrm returned 216; last modified byte at 58 (0x0)
locale : strxfrm returned 26; last modified byte at 27 (0x0)
---

On x86: Solaris 10 U5 (May 2008) I get:

---
SunOS loki08 5.10 Generic_127128-11 i86pc i386 i86pc
locale is_IS.ISO8859-1: strxfrm returned 216; last modified byte at 58 (0x0)
locale is_IS.ISO8859-1: strxfrm returned 216; last modified byte at 58 (0x0)
locale : strxfrm returned 26; last modified byte at 27 (0x0)
---

- Bjorn




Re: [HACKERS] ALTER TABLE .. ADD PRIMARY KEY .. USING INDEX has dump-restore hazard

2015-07-21 Thread Michael Paquier
On Wed, Jul 22, 2015 at 1:23 AM, Robert Haas robertmh...@gmail.com wrote:
 Notice that the collation specifier is gone.  Oops.

As it is not possible to specify a collation directly in a PRIMARY
KEY constraint, what about switching dumpConstraint to have it first
issue a CREATE INDEX query with the collation and then use ALTER TABLE
to attach the constraint to it? I am noticing that we already fetch
the index definition in indxinfo via pg_get_indexdef. Thoughts?
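
For instance, with Robert's example table, the dumped SQL would then take
roughly this shape (a sketch of the intent, not the patch's exact output):

  CREATE UNIQUE INDEX foo_a_idx ON foo USING btree (a COLLATE "C");
  ALTER TABLE ONLY foo ADD CONSTRAINT foo_a_idx PRIMARY KEY USING INDEX foo_a_idx;

The collation survives because the index is created from its full
pg_get_indexdef definition before the constraint is attached to it.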
-- 
Michael




Re: [HACKERS] [PATCH] Reload SSL certificates on SIGHUP

2015-07-21 Thread Andreas Karlsson

On 07/02/2015 06:13 PM, Peter Eisentraut wrote:

I think this would be a useful feature, and the implementation looks
sound.  But I don't like how the reload is organized.  Reinitializing
the context in the sighup handler, aside from questions about how much
work one should do in a signal handler, would cause SSL reinitialization
for unrelated reloads.  We have the GUC assign hook mechanism for
handling this sort of thing.  The trick would be that when multiple
SSL-related settings change, you only want to do one reinitialization.
You could either have the different assign hooks communicate with each
other somehow, or have them set a "need SSL init" flag that is checked
somewhere else.


It is not enough to just add a hook to the GUCs, since I would guess most
users would expect the certificate to be reloaded if just the file has been
replaced and no GUC was changed. To support this we would need to also check
the mtimes of the SSL files; would that complexity really be worth it?
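
A minimal sketch of that mtime check (illustrative only, not from the patch):
remember the last observed modification time per SSL file and reload only
when it actually changes.

    #include <sys/stat.h>
    #include <stdbool.h>
    #include <time.h>

    static bool
    ssl_file_changed(const char *path, time_t *last_mtime)
    {
        struct stat st;

        if (stat(path, &st) != 0)
            return false;       /* let the normal error paths complain */

        if (st.st_mtime != *last_mtime)
        {
            *last_mtime = st.st_mtime;
            return true;        /* caller should rebuild the SSL context */
        }
        return false;
    }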


Andreas





Re: [HACKERS] Alpha2/Beta1

2015-07-21 Thread Noah Misch
On Tue, Jul 21, 2015 at 02:01:57PM -0700, Josh Berkus wrote:
 At the Developer's Meeting, we said we'd be releasing an alpha or beta
 each month until final 9.5 release.  As such, we should be releasing a
 9.5 alpha/beta in the first week of August.
 
 My question for Hackers is: should this be Alpha2 or Beta 1?

I vote for alpha2.  Comparing the "Open Issues" and "resolved after 9.5alpha1"
sections of https://wiki.postgresql.org/wiki/PostgreSQL_9.5_Open_Items, we've
not transitioned to a qualitatively different level of API stability.

Thanks,
nm




Re: [HACKERS] Linux kernel performance regression tests and Postgres

2015-07-21 Thread Michael Paquier
On Wed, Jul 22, 2015 at 1:27 PM, Peter Geoghegan p...@heroku.com wrote:
 Mel Gorman wrote a tool to perform Linux kernel performance testing.
 Detecting performance regressions in new kernels appears to be the
 major goal.

 I note that there are Postgres specific tests listed here:

 http://www.csn.ul.ie/~mel/results/home/marvin/dashboard-openSUSE-13.1-smart.html

That's really cool. Thanks for sharing.
-- 
Michael




Re: [HACKERS] Selectivity estimation for intarray with @@

2015-07-21 Thread Alexander Korotkov
On Tue, Jul 21, 2015 at 5:14 PM, Teodor Sigaev teo...@sigaev.ru wrote:

 Some notices about code:

 1 Near function transformOperator() there is wrong operator name @


Fixed.


 2 int_query (and intquery) should be replaced with query_int to be
 consistent with the actual type name. At least where it's used as a
 separate lexeme.


Fixed.


 3 For historical reasons @@ doesn't commute with itself; it has a
 commutator ~~. The patch assumes that @@ is its own commutator, but ~~ will
 use 'contsel' as a restriction estimate. I suppose we need to declare ~~
 deprecated and introduce a self-commutating @@ operator.


I think deprecating ~~ is a subject for a separate patch. I fixed the patch's
behavior in a different way: @@ and ~~ are now handled by the same
function. The function handles "var @@ const" and "const ~~ var", but
doesn't handle "const @@ var" and "var ~~ const". It determines the case
by the type of the variable: it should be int[].

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


intarray_sel-3.patch
Description: Binary data



Re: [HACKERS] Fillfactor for GIN indexes

2015-07-21 Thread Alexander Korotkov
On Tue, Jul 21, 2015 at 7:20 PM, Heikki Linnakangas hlinn...@iki.fi wrote:

 On 07/21/2015 04:14 PM, Alexander Korotkov wrote:

 On Tue, Jul 21, 2015 at 3:52 PM, Heikki Linnakangas hlinn...@iki.fi
 wrote:

  On 07/21/2015 02:56 PM, Alexander Korotkov wrote:

 Probably, but currently we are in quite an illogical situation. We have a
 default fillfactor = 90 for GiST, where it has no use cases at all and
 effectively is just a waste of space.


 Why is it useless for GiST?



 It's because all GiST pages are produced by page splits. So, just after
 CREATE INDEX, GiST pages aren't tightly packed on average. Actually, they
 could be tightly packed by incredible coincidence, but for large indexes
 it's a quite safe assumption that they are not. With GiST we don't get a
 storm of page splits after index creation even with fillfactor = 100. So
 why should we reserve additional space with fillfactor = 90?


 Aha, I see. Yeah, that's pretty useless. Ideally, we would make the GiST
 build algorithm smarter so that it would pack the pages more tightly. I
 have no idea how to do that, however.

 Anyway, the fact that fillfactor is useless for GiST is more of an
 argument for removing it from GiST, than for adding it to GIN.


Yes it is. I just think that fillfactor is more useful for GIN than for
GiST now. However, it probably isn't worth it for either of them.
One of our customers complains that freshly built GIN indexes bloat
very fast. I've asked them for a test case. Using that test case I would try
to see if fillfactor for GIN could help in practice. For now we can mark it as
"Returned with feedback" where the feedback is "Real-life use cases needed".
Removing fillfactor from GiST is not easy now. Once it's exposed to users,
removing it completely would break compatibility. I would propose to change
the default fillfactor for GiST from 90 to 100.

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Re: [HACKERS] What is HeapScanDescData.rs_initblock good for?

2015-07-21 Thread Tom Lane
Alvaro Herrera alvhe...@2ndquadrant.com writes:
 Tom Lane wrote:
 The BRIN patch added a HeapScanDescData field rs_initblock, but so far as
 I can see it's utterly without use, and it's quite confusing (people might
 mistake it for rs_startblock, for example).  Any objection to taking it
 out again?

 Ouch, you're right, my mistake.  Feel free to remove it, yeah.

... While I'm looking at it, it sure looks like the BRIN patch broke
syncscan for those index build methods that were using it, which was
most.  You've got IndexBuildHeapRangeScan unconditionally calling
heap_setscanlimits and thereby trashing the result of ss_get_location().

I'm inclined to let it call heap_setscanlimits only if not allow_sync.
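
In code form, the idea is roughly (a sketch against IndexBuildHeapRangeScan's
existing parameters, not the final commit):

	/* don't clobber the syncscan start point chosen via ss_get_location() */
	if (!allow_sync)
		heap_setscanlimits(scan, start_blockno, numblocks);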

regards, tom lane




Re: [HACKERS] [PATCH] postgres_fdw extension support

2015-07-21 Thread Paul Ramsey
On July 21, 2015 at 11:07:36 AM, Tom Lane (t...@sss.pgh.pa.us) wrote:
I'm inclined to think that it's not really necessary to worry about
invalidating a per-connection cache of "is this function safe to ship"
determinations.
So: yes to a local cache of all forwardable functions/ops, populated in full 
the first time through (does that speak maybe to using a binary search on a 
sorted list instead of a hash, since I only pay the sort price once and am not 
doing any insertions?). And then we just hold it until the connection goes 
away. 

Yes?

P.

Re: [HACKERS] What is HeapScanDescData.rs_initblock good for?

2015-07-21 Thread Alvaro Herrera
Tom Lane wrote:
 The BRIN patch added a HeapScanDescData field rs_initblock, but so far as
 I can see it's utterly without use, and it's quite confusing (people might
 mistake it for rs_startblock, for example).  Any objection to taking it
 out again?

Ouch, you're right, my mistake.  Feel free to remove it, yeah.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] [PROPOSAL] VACUUM Progress Checker.

2015-07-21 Thread Jim Nasby

On 7/20/15 4:32 AM, Thakur, Sameer wrote:

Hello,

Does this actually handle multiple indexes? It doesn't appear so, which I'd 
think is a significant problem... :/

Please find v2 attached which does this.


I think it'd be better to combine both numbers into one report:

elog(WARNING, "Current/Overall index percentage completion %f/%f",
     current_index_progress * 100, all_index_progress);


It'd also be good to standardize on where the * 100 is happening.

Also, AFAIK:

(itemptr->ip_blkid.bi_hi != vacrelstats->last_scanned_page.bi_hi) ||
(itemptr->ip_blkid.bi_lo != vacrelstats->last_scanned_page.bi_lo)


can be replaced by

(itemptr->ip_blkid != vacrelstats->last_scanned_page)

and

vacrelstats->current_index_scanned_page_count =
    vacrelstats->current_index_scanned_page_count + 1;


can simply be

vacrelstats->current_index_scanned_page_count++;
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Data in Trouble? Get it in Treble! http://BlueTreble.com




Re: [HACKERS] [PATCH] postgres_fdw extension support

2015-07-21 Thread Paul Ramsey
 
On July 21, 2015 at 8:26:31 AM, Andres Freund (and...@anarazel.de) wrote:
 On 2015-07-21 17:00:51 +0200, Andres Freund wrote:
  On 2015-07-21 07:55:17 -0700, Paul Ramsey wrote:
   On Tue, Jul 21, 2015 at 7:45 AM, Andres Freund wrote:
   So, right after reading the options in postgresGetForeignRelSize,
   expand the extension list into a list of all ops/functions, in a
   sorted list, and let that carry through to the deparsing instead?
 
  I'd actually try to make it longer lived, i.e. permanently. And just
  deallocate when a catcache callback says it needs to be invalidated;
  IIRC there is a relevant cache.
  
 On second thought I'd not use a binary search but a hash table. If you
 choose the right key a single table is enough for the lookup.
  
 If you need references for invalidations you might want to look for
 CacheRegisterSyscacheCallback callers. E.g. attoptcache.c

 On 2015-07-21 08:32:34 -0700, Paul Ramsey wrote: 
  Thanks! Reading some of the syscache callback stuff, one thing I’m not 
  sure of is which SysCacheIdentifier(s) I should be registering 
  callbacks against to ensure my list of funcs/procs that reside in 
  particular extensions is kept fresh. I don’t see anything tracking the 
  dependencies there 

 FOREIGNSERVEROID, and a new syscache on pg_extension.oid should suffice 
 I think. pg_foreign_server will be changed upon option changes and 
 pg_extension.oid on extension upgrades. 

 Since dependencies won't change independently of extension versions I 
 don't think we need to care otherwise. There's ALTER EXTENSION ... ADD 
 but I'm rather prepared to ignore that; if that's not ok it's trivial to 
 make it emit an invalidation. 

This hole just keeps getting deeper :) So,

- a HASH in my own code to hold all the functions that I consider “safe to 
forward” (which I’ll derive by reading the contents of the extensions the users 
declare)
- callbacks in my code registered using CacheRegisterSyscacheCallback on 
FOREIGNSERVEROID and __see_below__ that refresh my cache when called
- since there is no syscache for extensions right now, a new syscache entry for 
pg_extension.oid (and I ape things in syscache.h and syscache.c and Magic 
Occurs?)
- which means I can also CacheRegisterSyscacheCallback on the new EXTENSIONOID 
syscache
- and finally change my is_in_extension() to efficiently check my HASH instead 
of querying the system catalog

Folks are going to be OK w/ me dropping in new syscache entries to support my
niche little feature?

ATB,

P








Re: [HACKERS] [PATCH] postgres_fdw extension support

2015-07-21 Thread Tom Lane
Paul Ramsey pram...@cleverelephant.ca writes:
 Folks are going to be OK w/ me dropping in new syscache entries to support my
 niche little feature?

No, mainly because it adds overhead without fixing your problem.  It's not
correct to suppose that a syscache on pg_extension would reliably report
anything; consider ALTER EXTENSION ADD/DROP, which does not touch the
pg_extension row.

I'm inclined to think that it's not really necessary to worry about
invalidating a per-connection cache of "is this function safe to ship"
determinations.  Neither CREATE EXTENSION nor DROP EXTENSION pose any
hazard, nor would ALTER EXTENSION UPGRADE for typical scenarios (which
would only include adding new functions that weren't there before, so
they weren't in your cache anyway).

Anybody who's screwing around with extension membership on-the-fly is
unlikely to expect the system to redetermine ship-ability for active FDW
connections anyway.  If you could do that fully correctly for not a lot of
additional cost, sure; but really anything like this is only going to
take you from 99% to 99.01% coverage of real cases.  Doesn't seem worth
the trouble.

regards, tom lane




Re: [HACKERS] First Aggregate Funtion?

2015-07-21 Thread Jim Nasby

On 7/20/15 11:07 AM, Marko Tiikkaja wrote:

On 7/20/15 6:02 PM, Corey Huinker wrote:

By using only(a.name_of_the_thing) we'd have a bit more clarity that the
author expected all of those values to be the same across the aggregate
window, and discovering otherwise was reason enough to fail the query.

*IF* we're considering adding these to core, I think that only() would be
just a slight modification of the last() implementation, and could be
done
at the same time.

[1] I don't care what it gets named. I just want the functionality.


A big +1 from me.  In fact, I wrote a patch implementing this for 9.5
but never got around to finishing it.


A big +1 here too; I've wanted this many times in the past.
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Data in Trouble? Get it in Treble! http://BlueTreble.com




Re: [HACKERS] Selectivity estimation for intarray with @@

2015-07-21 Thread Heikki Linnakangas

On 07/21/2015 07:28 PM, Alexander Korotkov wrote:

On Tue, Jul 21, 2015 at 6:49 PM, Heikki Linnakangas hlinn...@iki.fi wrote:

In this version of the patch it's not checked whether the variable is actually
an int[] and not a query_int. See the following test case.

# create table test2 as (select '1'::query_int val from
generate_series(1,100));
# analyze test2;

# explain select * from test2 where '{1}'::int[] @@ val;
ERROR:  unrecognized int query item type: 0

I've added this check.

* Also use the estimators for the obsolete @ and ~ operators. Not that I

care much about those as they are obsolete, but seems strange not to, as
it's a trivial matter of setting the right estimator function.

* I added an ANALYZE in the regression test. It still won't systematically
test all the cost estimation code, and there's nothing to check that the
estimates make sense, but at least more of the code will now run.


You also forgot to include intarray--1.0--1.1.sql in the patch. I've also
added it.


Thanks, committed!

- Heikki





Re: [HACKERS] make check changes have caused buildfarm deterioration.

2015-07-21 Thread Peter Eisentraut
On 7/21/15 10:00 AM, Tom Lane wrote:
 I agree; this change may have seemed like a good idea at the time, but
 it was not.  Failures during make check's install step are rare enough
 that you don't really need all that output in your face to help with the
 rare situation where it fails.  And for the buildfarm's purposes, it is
 surely desirable to segregate that output from the actual check step.

It wasn't really an idea; it was just not necessary anymore.  We can put
it [directing the make install output into a file] back if that's what
people prefer.

 A possible alternative is to run the make install sub-step with -s,
 but that could be objected to on the grounds that if it did fail, you'd
 have a hard time telling exactly which step failed.

I usually run the whole make check with -s.





Re: [HACKERS] pg_basebackup and replication slots

2015-07-21 Thread Peter Eisentraut
On 7/2/15 3:37 AM, Michael Paquier wrote:
 Regarding patch 3, I have more comments:
 1) I think that documentation should clearly mention that if -R and -S
 are used together, a primary_slot_name entry is added in the
 recovery.conf generated with the slot name defined.

Updated proposal attached.

 2)
  sub psql
  {
 	my ($dbname, $sql) = @_;
 -	run [ 'psql', '-X', '-q', '-d', $dbname, '-f', '-' ], '<', \$sql or die;
 +	my ($stdout, $stderr);
 +	run [ 'psql', '-X', '-A', '-t', '-q', '-d', $dbname, '-f', '-' ],
 +		'<', \$sql, '>', \$stdout, '2>', \$stderr or die;
 +	chomp $stdout;
 +	return $stdout;
  }
 RewindTest.pm has a routine called check_query defined. It would be
 great to extend psql a bit more than what I thought previously by
 returning ($result, $stdout, $stderr) from it and making check_query
 use it directly. This way we have a unique interface to run psql and
 capture output. I don't mind writing this refactoring patch on top of
 your set if that's necessary.

Let's do that as a separate patch.  Adding multiple return values makes
calling awkward, so I'd like to sort out the actual use cases before we
do that.

 3)
 +command_ok([ 'pg_basebackup', '-D', "$tempdir/backupxs_sl_R", '-X',
 'stream', '-S', 'slot1', '-R' ],
 +	'pg_basebackup with replication slot and -R runs');
 +$recovery_conf = slurp_file "$tempdir/backupR/recovery.conf";
 +like(slurp_file("$tempdir/backupxs_sl_R/recovery.conf"),
 slurp_file is slurping an incorrect file and $recovery_conf is used
 nowhere afterwards, so you could simply remove this line.

Yeah, that was some leftover garbage.

From 7ecb1794c2aaeea9753af07e2f54f6e483af255f Mon Sep 17 00:00:00 2001
From: Peter Eisentraut pete...@gmx.net
Date: Tue, 21 Jul 2015 21:06:45 -0400
Subject: [PATCH 3/3] pg_basebackup: Add --slot option

This option specifies a replication slot for WAL streaming (-X stream),
so that there can be continuous replication slot use between WAL
streaming during the base backup and the start of regular streaming
replication.
---
 doc/src/sgml/ref/pg_basebackup.sgml  | 27 ---
 src/bin/pg_basebackup/pg_basebackup.c| 24 +++-
 src/bin/pg_basebackup/t/010_pg_basebackup.pl | 22 +-
 src/test/perl/TestLib.pm |  5 -
 4 files changed, 72 insertions(+), 6 deletions(-)

diff --git a/doc/src/sgml/ref/pg_basebackup.sgml b/doc/src/sgml/ref/pg_basebackup.sgml
index cb8b8a3..5f8b9b7 100644
--- a/doc/src/sgml/ref/pg_basebackup.sgml
+++ b/doc/src/sgml/ref/pg_basebackup.sgml
@@ -215,15 +215,36 @@ <title>Options</title>
   <listitem>
 
    <para>
-    Write a minimal <filename>recovery.conf</filename> in the output directory (or into
-    the base archive file when using tar format) to ease setting
-    up a standby server.
+    Write a minimal <filename>recovery.conf</filename> in the output
+    directory (or into the base archive file when using tar format) to
+    ease setting up a standby server.
+    The <filename>recovery.conf</filename> file will record the connection
+    settings and, if specified, the replication slot
+    that <application>pg_basebackup</application> is using, so that the
+    streaming replication will use the same settings later on.
    </para>
 
   </listitem>
  </varlistentry>
 
  <varlistentry>
+  <term><option>-S <replaceable>slotname</replaceable></option></term>
+  <term><option>--slot=<replaceable class="parameter">slotname</replaceable></option></term>
+  <listitem>
+   <para>
+    This option can only be used together with <literal>-X
+    stream</literal>.  It causes the WAL streaming to use the specified
+    replication slot.  If the base backup is intended to be used as a
+    streaming replication standby using replication slots, it should then
+    use the same replication slot name
+    in <filename>recovery.conf</filename>.  That way, it is ensured that
+    the server does not remove any necessary WAL data in the time between
+    the end of the base backup and the start of streaming replication.
+   </para>
+  </listitem>
+ </varlistentry>
+
+ <varlistentry>
   <term><option>-T <replaceable class="parameter">olddir</replaceable>=<replaceable class="parameter">newdir</replaceable></option></term>
   <term><option>--tablespace-mapping=<replaceable class="parameter">olddir</replaceable>=<replaceable class="parameter">newdir</replaceable></option></term>
   <listitem>
diff --git a/src/bin/pg_basebackup/pg_basebackup.c b/src/bin/pg_basebackup/pg_basebackup.c
index 5363680..80de882 100644
--- a/src/bin/pg_basebackup/pg_basebackup.c
+++ b/src/bin/pg_basebackup/pg_basebackup.c
@@ -239,6 +239,7 @@ usage(void)
 	   "(in kB/s, or use suffix \"k\" or \"M\")\n"));
 	printf(_("  -R, --write-recovery-conf\n"
 			 "                         write recovery.conf after backup\n"));
+	printf(_("  -S, --slot=SLOTNAME    replication slot to use\n"));
 	printf(_("  -T, 

Re: [HACKERS] creating extension including dependencies

2015-07-21 Thread Michael Paquier
On Tue, Jul 21, 2015 at 11:20 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Petr Jelinek p...@2ndquadrant.com writes:
 ... My main question is if we are
 ok with SCHEMA having different behavior with CASCADE vs without
 CASCADE. I went originally with "no" and added the DEFAULT flag to
 SCHEMA. If the answer is "yes" then we don't need the flag (in that case
 CASCADE acts as the flag).

 Yeah, I was coming around to that position as well.  Insisting that
 SCHEMA throw an error if the extension isn't relocatable makes sense
 as long as only one extension is being considered, but once you say
 CASCADE it seems like mostly a usability fail.  I think it's probably
 OK if with CASCADE, SCHEMA is just "use if needed, else ignore".

OK, I'm fine with that, aka with CASCADE and a SCHEMA specified we use
it if needed or ignore it otherwise (if I am following correctly).

CREATE EXTENSION foo SCHEMA bar will fail if the extension is not
relocatable *and* does not have a schema specified in its control
file. A non-relocatable extension can be initially created anywhere.
It just cannot be moved afterwards from its original schema.
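
For example (relocatable hstore plus a hypothetical extension "foo" and
schema "util"; the CASCADE line shows the proposed semantics, not existing
syntax):

  CREATE EXTENSION hstore SCHEMA util;       -- relocatable: installed into util
  CREATE EXTENSION foo SCHEMA util;          -- error if the rule above forbids it
  CREATE EXTENSION foo SCHEMA util CASCADE;  -- proposed: use util if needed, else ignore it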

 Obviously we've gotta document all this properly.

Sure. That's a sine-qua-non condition for this patch.
Regards,
-- 
Michael




Re: [HACKERS] Use pg_rewind when target timeline was switched

2015-07-21 Thread Michael Paquier
On Mon, Jul 20, 2015 at 9:18 PM, Alexander Korotkov
a.korot...@postgrespro.ru wrote:
 the attached patch allows pg_rewind to work when the target timeline was
 switched. Actually, this patch fixes a TODO from the pg_rewind comments.

   /*
* Trace the history backwards, until we hit the target timeline.
*
* TODO: This assumes that there are no timeline switches on the target
* cluster after the fork.
*/

 This patch allows pg_rewind to handle data directory synchronization in a
 much more general way. For instance, a user can return a promoted standby to
 the old master.

Yes. That would be useful.

 In this patch the target timeline history is exposed as a global variable.
 An index into the target timeline history is used in function interfaces
 instead of specifying the TLI directly. Thus, SimpleXLogPageRead() can easily
 start reading XLOG from the next timeline when the current timeline ends.
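
A rough sketch of that structure (reusing the existing TimeLineHistoryEntry
from access/timeline.h; the helper below is illustrative, not the patch's
actual code):

	#include "postgres.h"
	#include "access/timeline.h"	/* TimeLineHistoryEntry */
	#include "access/xlogdefs.h"	/* XLogRecPtr, InvalidXLogRecPtr */

	/* global target timeline history, oldest entry first */
	static TimeLineHistoryEntry *targetHistory;
	static int	targetNentries;

	/*
	 * Return the history index whose timeline covers the given record
	 * pointer; the page reader switches to the next entry once it reads
	 * past the current timeline's switchpoint.
	 */
	static int
	tliIndexForLSN(XLogRecPtr ptr)
	{
		int			i;

		for (i = 0; i < targetNentries; i++)
		{
			if (targetHistory[i].end == InvalidXLogRecPtr ||
				ptr < targetHistory[i].end)
				return i;
		}
		return targetNentries - 1;	/* last timeline extends to infinity */
	}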

The implementation looks rather neat on a first look, but
the attached patch fails to compile:
pg_rewind.c:488:4: error: use of undeclared identifier 'targetEntry'
targetEntry = targetHistory[i];
^
Nice to see as well that this has been added to the CF of September.
Regards,
-- 
Michael




Re: [HACKERS] make check changes have caused buildfarm deterioration.

2015-07-21 Thread Michael Paquier
On Tue, Jul 21, 2015 at 10:34 PM, Andrew Dunstan wrote:
 OK, looks sane enough, but please do address the other issue.

Okidoki.
-- 
Michael




Re: [HACKERS] ALTER TABLE .. ADD PRIMARY KEY .. USING INDEX has dump-restore hazard

2015-07-21 Thread Michael Paquier
On Wed, Jul 22, 2015 at 9:34 AM, Michael Paquier
michael.paqu...@gmail.com wrote:
 On Wed, Jul 22, 2015 at 1:23 AM, Robert Haas robertmh...@gmail.com wrote:
 Notice that the collation specifier is gone.  Oops.

 As it is not possible to specify a collation directly in a PRIMARY
 KEY constraint, what about switching dumpConstraint to have it first
 issue a CREATE INDEX query with the collation and then use ALTER TABLE
 to attach the constraint to it? I am noticing that we already fetch
 the index definition in indxinfo via pg_get_indexdef. Thoughts?

And poking at that, I have finished with the attached, which adds a
CREATE INDEX query before the ALTER TABLE ADD CONSTRAINT, to which a USING
INDEX clause is appended. Storage options as well as building the column list
become unnecessary because indexdef already provides everything that is
needed, so this patch makes pg_dump rely more on what is on the
backend side.
-- 
Michael
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 0e036b8..7a1e6db 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -14515,7 +14515,6 @@ dumpConstraint(Archive *fout, DumpOptions *dopt, ConstraintInfo *coninfo)
 	{
 		/* Index-related constraint */
 		IndxInfo   *indxinfo;
-		int			k;
 
 		indxinfo = (IndxInfo *) findObjectByDumpId(coninfo->conindex);
 
@@ -14527,6 +14526,10 @@ dumpConstraint(Archive *fout, DumpOptions *dopt, ConstraintInfo *coninfo)
 			binary_upgrade_set_pg_class_oids(fout, q,
 							 indxinfo->dobj.catId.oid, true);
 
+		if (coninfo->contype == 'p' ||
+			coninfo->contype == 'u')
+			appendPQExpBuffer(q, "%s;\n", indxinfo->indexdef);
+
 		appendPQExpBuffer(q, "ALTER TABLE ONLY %s\n",
 						  fmtId(tbinfo->dobj.name));
 		appendPQExpBuffer(q, "ADD CONSTRAINT %s ",
@@ -14534,31 +14537,21 @@ dumpConstraint(Archive *fout, DumpOptions *dopt, ConstraintInfo *coninfo)
 
 		if (coninfo->condef)
 		{
+			/*
+			 * getIndexes sets a constraint definition only for exclusion
+			 * constraints.
+			 */
+			Assert(coninfo->contype == 'x');
+
 			/* pg_get_constraintdef should have provided everything */
 			appendPQExpBuffer(q, "%s;\n", coninfo->condef);
 		}
 		else
 		{
-			appendPQExpBuffer(q, "%s (",
+			appendPQExpBuffer(q, "%s ",
						 coninfo->contype == 'p' ? "PRIMARY KEY" : "UNIQUE");
-			for (k = 0; k < indxinfo->indnkeys; k++)
-			{
-				int			indkey = (int) indxinfo->indkeys[k];
-				const char *attname;
-
-				if (indkey == InvalidAttrNumber)
-					break;
-				attname = getAttrName(indkey, tbinfo);
-
-				appendPQExpBuffer(q, "%s%s",
-								  (k == 0) ? "" : ", ",
-								  fmtId(attname));
-			}
-
-			appendPQExpBufferChar(q, ')');
 
-			if (indxinfo->options && strlen(indxinfo->options) > 0)
-				appendPQExpBuffer(q, " WITH (%s)", indxinfo->options);
+			appendPQExpBuffer(q, "USING INDEX %s", indxinfo->dobj.name);
 
 			if (coninfo->condeferrable)
 			{



Re: [HACKERS] Alpha2/Beta1

2015-07-21 Thread Michael Paquier
On Wed, Jul 22, 2015 at 12:40 PM, Noah Misch n...@leadboat.com wrote:
 On Tue, Jul 21, 2015 at 02:01:57PM -0700, Josh Berkus wrote:
 At the Developer's Meeting, we said we'd be releasing an alpha or beta
 each month until final 9.5 release.  As such, we should be releasing a
 9.5 alpha/beta in the first week of August.

 My question for Hackers is: should this be Alpha2 or Beta 1?

 I vote for alpha2.  Comparing the "Open Issues" and "resolved after 9.5alpha1"
 sections of https://wiki.postgresql.org/wiki/PostgreSQL_9.5_Open_Items, we've
 not transitioned to a qualitatively different level of API stability.

Mine goes for alpha2. There are some issues remaining yet that require
low-level attention.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pgbench stats per script other stuff

2015-07-21 Thread Fabien COELHO


Hello Robert,


Pgbench *currently* already accepts multiple -f ... options, and this is a
good thing for testing realistic loads which may intermix several kinds of
transactions, say a lot of read-only, some updates or inserts, and very rare
deletes...


Hmm, I didn't realize that.  The code looks a bit inconsistent right
now - e.g. we do support multiple files, but pgbench's options-parsing
loop sets ttype to a value that depends only on the last of -f, -N,
and -S encountered.


Indeed. However, as in current pgbench the nothing/-N/-S options and -f are 
mutually exclusive, it is OK to have ttype set as it is.


With the patch, pgbench just executes scripts and the options are not
mutually exclusive: some scripts are internal and others are not, but they
are treated the same beyond initialization, which helps remove some code,
including the ttype variable you mention. The name of the script is kept
in an SQLScript struct along with its commands, weight and stats.
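
(Presumably something of this shape - the field names below are guesses for
illustration, not the patch's actual definition:)

typedef struct SQLScript
{
	const char *name;		/* file name, or name of a builtin script */
	int			weight;		/* relative probability of drawing this script */
	Command   **commands;	/* parsed commands of the script */
	StatsData	stats;		/* per-script latency and transaction counts */
} SQLScript;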


--
Fabien.


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [BUGS] [HACKERS] object_classes array is broken, again

2015-07-21 Thread Alvaro Herrera
Jaimin Pan wrote:
 Hi all,
 
 How about this patch? I believe it will never miss an entry, and any
 omission will be detected at compile time.

Hmm, yeah this looks like something that's worth considering going
forward.  I can't think of any reason not to do this.  Perhaps we can
write getObjectClass using this, too.

The new file has a wrong comment above the list, copy-pasted from
rmgrlist.h.

I've always wondered about unifying OCLASS with OBJECT, but that's
probably a completely independent topic.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WAL logging problem in 9.4.3?

2015-07-21 Thread Todd A. Cook

Hi,

This thread seemed to trail off without a resolution.  Was anything done?
(See more below.)

On 07/09/15 19:06, Tom Lane wrote:

Andres Freund and...@anarazel.de writes:

On 2015-07-06 11:49:54 -0400, Tom Lane wrote:

Rather than reverting cab9a0656c36739f, which would re-introduce a
different performance problem, perhaps we could have COPY create a new
relfilenode when it does this.  That should be safe if the table was
previously empty.



I'm not convinced that cab9a0656c36739f needs to survive in that
form. To me only allowing one COPY to benefit from the wal_level =
minimal optimization has a significantly higher cost than
cab9a0656c36739f.


What evidence have you got to base that value judgement on?

cab9a0656c36739f was based on an actual user complaint, so we have good
evidence that there are people out there who care about the cost of
truncating a table many times in one transaction.


I'm the complainer mentioned in the cab9a0656c36739f commit message. :)

FWIW, we use a temp table to split a join across 4 largish tables
(10^8 rows or more each) and 2 small tables (10^6 rows each).  We
write the results of joining the 2 largest tables into the temp
table, and then join that to the other 4.  This gave significant
performance benefits because the planner would know the exact row
count of the 2-way join heading into the 4-way join.  After commit
cab9a0656c36739f, we got another noticeable performance improvement
(I did timings before and after, but I can't seem to put my hands
on the numbers right now).

We do millions of these queries every day in batches.  Each batch
reuses a single temp table (truncating it before each pair of joins)
so as to reduce the churn in the system catalogs.  In case it matters,
the temp table is created with ON COMMIT DROP.

This was (and still is) done on 9.2.x.
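
A minimal sketch of that pattern, with invented table and column names:

BEGIN;
CREATE TEMP TABLE pair_join ON COMMIT DROP AS
  SELECT t1.id, t2.payload
    FROM big1 t1 JOIN big2 t2 USING (id);
-- each subsequent batch in the same transaction:
TRUNCATE pair_join;        -- the TRUNCATE that cab9a0656c36739f made cheap
INSERT INTO pair_join
  SELECT t1.id, t2.payload
    FROM big1 t1 JOIN big2 t2 USING (id);
SELECT count(*)
  FROM pair_join
  JOIN big3 USING (id)
  JOIN big4 USING (id);
COMMIT;                    -- ON COMMIT DROP removes the temp table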

HTH.

-- todd cook
-- tc...@blackducksoftware.com


 On the other hand,

I know of no evidence that anyone's depending on multiple sequential
COPYs, nor intermixed COPY and INSERT, to be fast.  The original argument
for having this COPY optimization at all was to make restoring pg_dump
scripts in a single transaction fast; and that use-case doesn't care
about anything but a single COPY into a virgin table.

I think you're worrying about exactly the wrong case.


My tentative guess is that the best course is to
a) Make heap_truncate_one_rel() create a new relfilenode. That fixes the
truncation replay issue.
b) Force new pages to be used when using the heap_sync mode in
COPY. That avoids the INIT danger you found. It seems rather
reasonable to avoid using pages that have already been the target of
WAL logging here in general.


And what reason is there to think that this would fix all the problems?
We know of those two, but we've not exactly looked hard for other cases.
Again, the only known field usage for the COPY optimization is the pg_dump
scenario; were that not so, we'd have noticed the problem long since.
So I don't have any faith that this is a well-tested area.

regards, tom lane






--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] postgres_fdw extension support

2015-07-21 Thread Paul Ramsey
On July 21, 2015 at 11:22:12 AM, Tom Lane (t...@sss.pgh.pa.us) wrote:
 So: yes to a local cache of all forwardable functions/ops, populated in full 
 the first time through (does that speak maybe to using a binary search on a 
 sorted list instead of a hash, since I only pay the sort price once and am 
 not doing any insertions?). And then we just hold it until the connection 
 goes away.   

No, *not* populated first-time-through, because that won't handle any of  
the CREATE, DROP, or UPGRADE cases. It's also doing a lot of work you  
might never need. I was thinking of populate on demand, that is, first  
time you need to know whether function X is shippable, you find that out  
and then cache the answer (whether it be positive or negative).  

Roger that. Off to the races..

P

Re: [HACKERS] pgbench stats per script other stuff

2015-07-21 Thread Fabien COELHO


[...] and that a subsequent -w modifies the meaning of the 
script-specifying argument already read. That strikes me as a very 
unintuitive interface.


Ok, I understand this afterward modification objection.

What if -w were required *before*, supplying a weight for the script(s) 
specified *afterwards* (the first one, or maybe all of them), so that it does 
not modify something already provided? I think that would be more intuitive, 
or at least less surprising.


Here is a v3 which does that. If there is a better idea, do not hesitate!

 sh ./pgbench -w 9 -f one.sql -f now.sql -T 2 -P 1 --per-script-stats
 starting vacuum...end.
 progress: 1.0 s, 24536.0 tps, lat 0.039 ms stddev 0.024
 progress: 2.0 s, 25963.8 tps, lat 0.038 ms stddev 0.015
 transaction type: multiple scripts
 scaling factor: 1
 query mode: simple
 number of clients: 1
 number of threads: 1
 duration: 2 s
 number of transactions actually processed: 50501
 latency average = 0.039 ms
 latency stddev = 0.020 ms
 tps = 25249.464772 (including connections establishing)
 tps = 25339.454154 (excluding connections establishing)
 SQL script 0, weight 9: one.sql
  - 45366 transactions (89.8% of total, tps = 22682.070035)
  - latency average = 0.038 ms
  - latency stddev = 0.016 ms
 SQL script 1, weight 1: now.sql
  - 5135 transactions (10.2% of total, tps = 2567.394737)
  - latency average = 0.044 ms
  - latency stddev = 0.041 ms

--
Fabien.diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml
index 2517a3a..6bac511 100644
--- a/doc/src/sgml/ref/pgbench.sgml
+++ b/doc/src/sgml/ref/pgbench.sgml
@@ -261,6 +261,18 @@ pgbench <optional> <replaceable>options</> </optional> <replaceable>dbname</>
 benchmarking arguments:
 
 <variablelist>
+ <varlistentry>
+  <term><option>-B</option></term>
+  <term><option>--tpc-b</option></term>
+  <listitem>
+   <para>
+    Built-in TPC-B like test which updates three tables and performs
+    an insert on a fourth. See below for details.
+    This is the default if no bench is explicitly specified.
+   </para>
+  </listitem>
+ </varlistentry>
+
 
  <varlistentry>
   <term><option>-c</option> <replaceable>clients</></term>
@@ -313,8 +325,7 @@ pgbench <optional> <replaceable>options</> </optional> <replaceable>dbname</>
    <para>
     Read transaction script from <replaceable>filename</>.
     See below for details.
-    <option>-N</option>, <option>-S</option>, and <option>-f</option>
-    are mutually exclusive.
+    Multiple <option>-f</> options are allowed.
    </para>
   </listitem>
  </varlistentry>
@@ -404,10 +415,10 @@ pgbench <optional> <replaceable>options</> </optional> <replaceable>dbname</>
   <term><option>--skip-some-updates</option></term>
   <listitem>
    <para>
-    Do not update <structname>pgbench_tellers</> and
-    <structname>pgbench_branches</>.
-    This will avoid update contention on these tables, but
-    it makes the test case even less like TPC-B.
+    Built-in test which updates only one table compared to <option>-B</>.
+    This avoids update contention on <structname>pgbench_tellers</>
+    and <structname>pgbench_branches</>, but it makes the test case
+    even less like TPC-B.
    </para>
   </listitem>
  </varlistentry>
@@ -499,9 +510,9 @@ pgbench <optional> <replaceable>options</> </optional> <replaceable>dbname</>
     Report the specified scale factor in <application>pgbench</>'s
     output.  With the built-in tests, this is not necessary; the
     correct scale factor will be detected by counting the number of
-    rows in the <structname>pgbench_branches</> table.  However, when testing
-    custom benchmarks (<option>-f</> option), the scale factor
-    will be reported as 1 unless this option is used.
+    rows in the <structname>pgbench_branches</> table.
+    However, when testing only custom benchmarks (<option>-f</> option),
+    the scale factor will be reported as 1 unless this option is used.
    </para>
   </listitem>
  </varlistentry>
@@ -511,7 +522,7 @@ pgbench <optional> <replaceable>options</> </optional> <replaceable>dbname</>
   <term><option>--select-only</option></term>
   <listitem>
    <para>
-    Perform select-only transactions instead of TPC-B-like test.
+    Built-in test with select-only transactions.
    </para>
   </listitem>
  </varlistentry>
@@ -552,6 +563,20 @@ pgbench <optional> <replaceable>options</> </optional> <replaceable>dbname</>
  </varlistentry>
 
  <varlistentry>
+  <term><option>-w</option></term>
+  <term><option>--weight</option></term>
+  <listitem>
+   <para>
+    Set the integer weight for the next test script
+    (<option>-B</>, <option>-N</>, <option>-S</> or <option>-f</>).
+    When several test scripts are used, the relative probability of
+    drawing each test is proportional to these weights.
+    The default weight is <literal>1</>.
+   </para>
+  </listitem>
+ </varlistentry>
+
+ <varlistentry>
   <term><option>--aggregate-interval=<replaceable>seconds</></option></term>
   <listitem>
    <para>
@@ 

Re: [HACKERS] WAL logging problem in 9.4.3?

2015-07-21 Thread Martijn van Oosterhout
On Tue, Jul 21, 2015 at 02:24:47PM -0400, Todd A. Cook wrote:
 Hi,
 
 This thread seemed to trail off without a resolution.  Was anything done?

Not that I can tell. I was the original poster of this thread. We've
worked around the issue by placing a CHECKPOINT command at the end of
the migration script.  For us it's not a performance issue, more a
correctness one: tables were empty when they shouldn't have been.

I'm hoping a fix will appear in the 9.5 release, since we're intending
to release with that version.  A forced checkpoint every now and then
probably won't be a serious problem though.

Have a nice day,
-- 
Martijn van Oosterhout   klep...@svana.org   http://svana.org/kleptog/
 He who writes carelessly confesses thereby at the very outset that he does
 not attach much importance to his own thoughts.
   -- Arthur Schopenhauer




[HACKERS] brin index vacuum versus transaction snapshots

2015-07-21 Thread Kevin Grittner
If you run `make installcheck` against a cluster with
default_transaction_isolation = 'repeatable read' you get one
failure:

*** /home/kgrittn/pg/master/src/test/regress/expected/brin.out 2015-07-21 
10:21:58.798619111 -0500
--- /home/kgrittn/pg/master/src/test/regress/results/brin.out 2015-07-21 
14:00:25.169320059 -0500
***
*** 405,409 
--- 405,410 
box(point(odd, even), point(thousand, twothousand))
  FROM tenk1 ORDER BY unique2 LIMIT 5 OFFSET 5;
  VACUUM brintest; -- force a summarization cycle in brinidx
+ ERROR: brin_summarize_new_values() cannot run in a transaction that has 
already obtained a snapshot
  UPDATE brintest SET int8col = int8col * int4col;
  UPDATE brintest SET textcol = '' WHERE textcol IS NOT NULL;


The stacktrace is:

#0 brinsummarize (index=0x7f1fe0eca6b8, heapRel=0x7f1fe0ed64f8, 
numSummarized=0x30e1da8, numExisting=0x30e1da8) at brin.c:1080
#1 0x004683d3 in brinvacuumcleanup (fcinfo=0x7ffd41b95c98) at brin.c:731
#2 0x00a69b28 in FunctionCall2Coll (flinfo=0x7ffd41b96078, collation=0, 
arg1=140725706121624, arg2=0) at fmgr.c:1323
#3 0x004f7b60 in index_vacuum_cleanup (info=0x7ffd41b96198, stats=0x0) 
at indexam.c:717
#4 0x006e1004 in lazy_cleanup_index (indrel=0x7f1fe0eca6b8, stats=0x0, 
vacrelstats=0x30e0bb0) at vacuumlazy.c:1397
#5 0x006df637 in lazy_scan_heap (onerel=0x7f1fe0ed64f8, 
vacrelstats=0x30e0bb0, Irel=0x30e0ca8, nindexes=1, scan_all=0 '\000') at 
vacuumlazy.c:1108
#6 0x006dda3b in lazy_vacuum_rel (onerel=0x7f1fe0ed64f8, options=1, 
params=0x7ffd41b96798, bstrategy=0x30d9a38) at vacuumlazy.c:244
#7 0x006dc18d in vacuum_rel (relid=30220, relation=0x2f8d1a8, 
options=1, params=0x7ffd41b96798) at vacuum.c:1372
#8 0x006db711 in vacuum (options=1, relation=0x2f8d1a8, relid=0, 
params=0x7ffd41b96798, va_cols=0x0, bstrategy=0x30d9a38, isTopLevel=1 '\001') 
at vacuum.c:293
#9 0x006db31d in ExecVacuum (vacstmt=0x2f8d200, isTopLevel=1 '\001') at 
vacuum.c:121
#10 0x008bef36 in standard_ProcessUtility (parsetree=0x2f8d200, 
queryString=0x2f8c788 "VACUUM brintest;", context=PROCESS_UTILITY_TOPLEVEL, 
params=0x0, dest=0x2f8d588, completionTag=0x7ffd41b96d50 "") at utility.c:654
#11 0x008be69e in ProcessUtility (parsetree=0x2f8d200, 
queryString=0x2f8c788 "VACUUM brintest;", context=PROCESS_UTILITY_TOPLEVEL, 
params=0x0, dest=0x2f8d588, completionTag=0x7ffd41b96d50 "") at utility.c:335
#12 0x008be1a7 in PortalRunUtility (portal=0x2f36b18, 
utilityStmt=0x2f8d200, isTopLevel=1 '\001', dest=0x2f8d588, 
completionTag=0x7ffd41b96d50 "") at pquery.c:1187
#13 0x008bd1bd in PortalRunMulti (portal=0x2f36b18, isTopLevel=1 
'\001', dest=0x2f8d588, altdest=0x2f8d588, completionTag=0x7ffd41b96d50 "") at 
pquery.c:1318
#14 0x008bc80d in PortalRun (portal=0x2f36b18, 
count=9223372036854775807, isTopLevel=1 '\001', dest=0x2f8d588, 
altdest=0x2f8d588, completionTag=0x7ffd41b96d50 "") at pquery.c:816
#15 0x008b7edf in exec_simple_query (query_string=0x2f8c788 "VACUUM 
brintest;") at postgres.c:1104
#16 0x008b720c in PostgresMain (argc=1, argv=0x2f1d450, 
dbname=0x2f1d2b0 "regression", username=0x2f1d290 "kgrittn") at postgres.c:4025
#17 0x0081ab99 in BackendRun (port=0x2f3d610) at postmaster.c:4183
#18 0x0081a17a in BackendStartup (port=0x2f3d610) at postmaster.c:3859
#19 0x00816753 in ServerLoop () at postmaster.c:1618
#20 0x00813d4a in PostmasterMain (argc=3, argv=0x2f1c460) at 
postmaster.c:1263
#21 0x0074ec36 in main (argc=3, argv=0x2f1c460) at main.c:223


Note that the function mentioned in the error message is not
anywhere in the stack trace, and that there was not any explicit
transaction started -- just a VACUUM command for the table, without
any BEGIN/COMMIT.


This is using source at commit 9faa6ae14f6098e4b55f0131f7ec2694a381fb87.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] postgres_fdw extension support

2015-07-21 Thread Tom Lane
Paul Ramsey pram...@cleverelephant.ca writes:
 On July 21, 2015 at 11:07:36 AM, Tom Lane (t...@sss.pgh.pa.us) wrote:
 I'm inclined to think that it's not really necessary to worry about 
 invalidating a per-connection cache of "is this function safe to ship" 
 determinations.

 So: yes to a local cache of all forwardable functions/ops, populated in full 
 the first time through (does that speak maybe to using a binary search on a 
 sorted list instead of a hash, since I only pay the sort price once and am 
 not doing any insertions?). And then we just hold it until the connection 
 goes away. 

No, *not* populated first-time-through, because that won't handle any of
the CREATE, DROP, or UPGRADE cases.  It's also doing a lot of work you
might never need.  I was thinking of populate on demand, that is, first
time you need to know whether function X is shippable, you find that out
and then cache the answer (whether it be positive or negative).
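
For concreteness, a minimal sketch of that shape using a backend-local
dynahash (the entry layout and the lookup_shippable() helper are illustrative
assumptions, not the eventual patch):

typedef struct ShippableCacheEntry
{
	Oid			objid;		/* function or operator OID (hash key) */
	bool		shippable;	/* cached verdict */
} ShippableCacheEntry;

static HTAB *ShippableCacheHash = NULL;

static bool
is_shippable(Oid objid, PgFdwRelationInfo *fpinfo)
{
	ShippableCacheEntry *entry;
	bool		found;

	if (ShippableCacheHash == NULL)
	{
		HASHCTL		ctl;

		MemSet(&ctl, 0, sizeof(ctl));
		ctl.keysize = sizeof(Oid);
		ctl.entrysize = sizeof(ShippableCacheEntry);
		ShippableCacheHash = hash_create("Shippability cache", 256, &ctl,
										 HASH_ELEM | HASH_BLOBS);
	}

	entry = (ShippableCacheEntry *)
		hash_search(ShippableCacheHash, &objid, HASH_ENTER, &found);

	/* First sighting of this OID: compute the verdict and remember it. */
	if (!found)
		entry->shippable = lookup_shippable(objid, fpinfo);

	return entry->shippable;
}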

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] postgres_fdw extension support

2015-07-21 Thread Andres Freund
On 2015-07-21 14:07:24 -0400, Tom Lane wrote:
 Paul Ramsey pram...@cleverelephant.ca writes:
  Folks are going to be OK w/ me dropping in new syscache entries to support 
  my niche little feature?
 
 No, mainly because it adds overhead without fixing your problem.

Meh. pg_extension updates are exceedingly rare, and there's a bunch of
code in extension.c that could very well have used a syscache instead of
doing manual scans over the table.

 It's not correct to suppose that a syscache on pg_extension would
 reliably report anything; consider ALTER EXTENSION ADD/DROP, which
 does not touch the pg_extension row.

I'd have just brute-force solved that by forcing a cache inval in that
case.


But I'm not going to complain too loudly if we don't do invalidation.

Andres


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pgbench stats per script other stuff

2015-07-21 Thread Robert Haas
On Tue, Jul 21, 2015 at 12:29 PM, Fabien COELHO coe...@cri.ensmp.fr wrote:
 Pgbench *currently* already accepts multiple -f ... options, and this is a
 good thing for testing realistic loads which may intermix several kinds of
 transactions, say a lot of read-only, some updates or inserts, and very rare
 deletes...

Hmm, I didn't realize that.  The code looks a bit inconsistent right
now - e.g. we do support multiple files, but pgbench's options-parsing
loop sets ttype to a value that depends only on the last of -f, -N,
and -S encountered.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] RFC: Allow materialized views to use unlogged data store

2015-07-21 Thread Joshua D. Drake


-hackers,

What do we think about $subject? In short, the underlying table of an MV 
would have the option of being unlogged.


JD
--
Command Prompt, Inc. - http://www.commandprompt.com/  503-667-4564
PostgreSQL Centered full stack support, consulting and development.
Announcing I'm offended is basically telling the world you can't
control your own emotions, so everyone else should do it for you.


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] RFC: Allow materialized views to use unlogged data store

2015-07-21 Thread Andres Freund
On 2015-07-21 14:38:24 -0700, Joshua D. Drake wrote:
 What do we think about $subject? In short, the underlying table of an MV
 would have the option of being unlogged.

There was a rather long thread about the difficulties of doing so
correctly back when matviews went in. It's not just a line of code or
10.

Andres


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] brin index vacuum versus transaction snapshots

2015-07-21 Thread Kevin Grittner
Kevin Grittner kgri...@ymail.com wrote:

 If you run `make installcheck` against a cluster with
 default_transaction_isolation = 'repeatable read' you get one
 failure:

 + ERROR: brin_summarize_new_values() cannot run in a transaction that has 
 already obtained a snapshot

 This is using source at commit 9faa6ae14f6098e4b55f0131f7ec2694a381fb87.

Ah, the very next commit fixed this while I was running the tests.

Nothing to see here, folks; move along

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PROPOSAL] VACUUM Progress Checker.

2015-07-21 Thread Robert Haas
On Tue, Jun 30, 2015 at 4:32 AM, Simon Riggs si...@2ndquadrant.com wrote:
 Yes, I suggest just a single column on pg_stat_activity called pct_complete

 trace_completion_interval = 5s (default)

 Every interval, we report the current % complete for any operation that
 supports it. We just show NULL if the current operation has not reported
 anything or never will.

I am deeply skeptical about the usefulness of a progress-reporting
system that can only report one number.  I think that, in many cases,
it won't be possible to compute an accurate completion percentage, and
even if it is, people may want more data than that for various reasons
anyway.

For example, in the case of VACUUM, suppose there is a table with
1,000,000 heap pages and 200,000 index pages (so it's probably
over-indexed, but whatever).  After reading 500,000 heap pages, we
have found 0 dead tuples.  What percentage of the work have we
finished?

It's hard to say.  If we don't find any dead tuples, we've read half
the pages we will eventually read and are therefore half done.  But if
we find even 1 dead tuple, then we've got to scan all 200,000 index
pages, so we've read only 41.7% of the pages we'll eventually touch.
If we find so many dead tuples that we have to scan the indexes
multiple times for lack of maintenance_work_mem, we'll eventually read
1,000,000 + 200,000k pages, where k is the number of index scans; if
say k = 5 then we are only 25% done.  All of these scenarios are
plausible because, in all likelihood, the dirty pages in the table are
concentrated near the end.
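
Spelling out the arithmetic for those three cases (500,000 heap pages read so far):

  no dead tuples:        500,000 / 1,000,000                 = 50.0% done
  at least 1 dead tuple: 500,000 / 1,200,000                 = 41.7% done
  k index scans needed:  500,000 / (1,000,000 + 200,000 * k) = 25.0% done at k = 5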

Now we could come up with ways of making good guesses about what is
likely to happen.  We could look at the data from pg_stat_all_tables,
historical results of vacuuming this table, the state of the
visibility map, and so on.  And that all might help.  But it's going
to be fairly hard to produce a percentage of completion that is
monotonically increasing and always accurately reflects the time
remaining.  Even if we can do it, it doesn't seem like a stretch to
suppose that sometimes people will want to look at the detail data.
Instead of getting told we're X% done (according to some arcane
formula), it's quite reasonable to think that people will want to get
a bunch of values, e.g.:

1. For what percentage of heap pages have we completed phase one (HOT
prune + mark all visible if appropriate + freeze + remember dead
tuples)?
2. For what percentage of heap pages have we completed phase two (mark
line pointers unused)?
3. What percentage of maintenance_work_mem are we currently using to
remember tuples?

For a query, the information we want back is likely to be even more
complicated; e.g. EXPLAIN output with row counts and perhaps timings
to date for each plan node.  We can insist that all status reporting
get boiled down to one number, but I suspect we would be better off
asking ourselves how we could let commands return a richer set of
data.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] BRIN index and aborted transaction

2015-07-21 Thread Robert Haas
On Sat, Jul 18, 2015 at 5:11 AM, Alvaro Herrera
alvhe...@2ndquadrant.com wrote:
 Yeah, that's a bit of an open problem: we don't have any mechanism to
 mark a block range as needing resummarization, yet.  I don't have any
 great ideas there, TBH.  Some options that were discussed but never led
 anywhere:

 1. whenever a heap tuple is deleted that's minimum or maximum for a
 column, mark the index tuple as needing resummarization.  On a future
 vacuuming pass the index would be updated.  (I think this works for
 minmax, but I don't see how to apply it to inclusion).

 2. have block ranges be resummarized randomly during vacuum.

 3. Have index tuples last for only X number of transactions, marking the
 as needing summarization when that expires.

 4. Have a user-invoked function that re-runs summarization.  That way
 the user can implement any of the above policies, or others.

Maybe I'm confused here, but it seems like the only time
re-summarization can be needed is when tuples are pruned.  The mere
act of deleting a tuple, even if the delete goes on to commit, doesn't
create a scenario where re-summarization can work out to a win,
because there may still be snapshots that can see it.  At the point
where we prune the tuple, though, there might well be a benefit in
re-summarizing, because now a newly-computed summary value won't need
to cover a value that previously had to be there.

But it seems obviously impractical to re-summarize when we HOT-prune,
so it seems like the obvious thing to do is make vacuum do it.  We
know during phase one of vacuum whether we saw any dead tuples in page
range X-Y; if yes, re-summarize.  The only reason not to do this is if
it causes us to do a lot of resummarization that frequently fails to
produce a smaller range. Do you have any experimental data suggesting
that this is or is not a problem?
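
(For reference, 9.5 already exposes on-demand summarization of
not-yet-summarized ranges, so a vacuum-driven or user-invoked
re-summarization in the spirit of option 4 would presumably be exercised the
same way:)

  SELECT brin_summarize_new_values('brinidx'::regclass);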

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Alpha2/Beta1

2015-07-21 Thread Josh Berkus
All:

At the Developer's Meeting, we said we'd be releasing an alpha or beta
each month until final 9.5 release.  As such, we should be releasing a
9.5 alpha/beta in the first week of August.

My question for Hackers is: should this be Alpha2 or Beta 1?

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] postgres_fdw extension support

2015-07-21 Thread Paul Ramsey
On Tue, Jul 21, 2015 at 11:29 AM, Paul Ramsey pram...@cleverelephant.ca wrote:
 On July 21, 2015 at 11:22:12 AM, Tom Lane (t...@sss.pgh.pa.us) wrote:

 No, *not* populated first-time-through, because that won't handle any of
 the CREATE, DROP, or UPGRADE cases. It's also doing a lot of work you
 might never need. I was thinking of populate on demand, that is, first
 time you need to know whether function X is shippable, you find that out
 and then cache the answer (whether it be positive or negative).


 Roger that. Off to the races..

Attached, reworked with a local cache. I felt a little dirty sticking
the cache entry right in postgres_fdw.c, so I broke out all my
nonsense into shippable.c.

Thanks!

P
diff --git a/contrib/postgres_fdw/Makefile b/contrib/postgres_fdw/Makefile
index d2b98e1..3543312 100644
--- a/contrib/postgres_fdw/Makefile
+++ b/contrib/postgres_fdw/Makefile
@@ -1,7 +1,7 @@
 # contrib/postgres_fdw/Makefile
 
 MODULE_big = postgres_fdw
-OBJS = postgres_fdw.o option.o deparse.o connection.o $(WIN32RES)
+OBJS = postgres_fdw.o option.o deparse.o connection.o shippable.o $(WIN32RES)
 PGFILEDESC = "postgres_fdw - foreign data wrapper for PostgreSQL"
 
 PG_CPPFLAGS = -I$(libpq_srcdir)
diff --git a/contrib/postgres_fdw/deparse.c b/contrib/postgres_fdw/deparse.c
index 81cb2b4..d6fff30 100644
--- a/contrib/postgres_fdw/deparse.c
+++ b/contrib/postgres_fdw/deparse.c
@@ -229,6 +229,9 @@ foreign_expr_walker(Node *node,
 	Oid collation;
 	FDWCollateState state;
 
+	/* Access extension metadata from fpinfo on baserel */
+	PgFdwRelationInfo *fpinfo = (PgFdwRelationInfo *) (glob_cxt->foreignrel->fdw_private);
+
 	/* Need do nothing for empty subexpressions */
 	if (node == NULL)
 		return true;
@@ -361,7 +364,7 @@ foreign_expr_walker(Node *node,
 				 * can't be sent to remote because it might have incompatible
 				 * semantics on remote side.
 				 */
-				if (!is_builtin(fe->funcid))
+				if (!is_builtin(fe->funcid) && !is_shippable(fe->funcid, fpinfo))
 					return false;
 
 				/*
@@ -407,7 +410,7 @@ foreign_expr_walker(Node *node,
 				 * (If the operator is, surely its underlying function is
 				 * too.)
 				 */
-				if (!is_builtin(oe->opno))
+				if (!is_builtin(oe->opno) && !is_shippable(oe->opno, fpinfo))
 					return false;
 
 				/*
@@ -445,7 +448,7 @@ foreign_expr_walker(Node *node,
 				/*
 				 * Again, only built-in operators can be sent to remote.
 				 */
-				if (!is_builtin(oe->opno))
+				if (!is_builtin(oe->opno) && !is_shippable(oe->opno, fpinfo))
 					return false;
 
 				/*
@@ -591,7 +594,7 @@ foreign_expr_walker(Node *node,
 	 * If result type of given expression is not built-in, it can't be sent to
 	 * remote because it might have incompatible semantics on remote side.
 	 */
-	if (check_type && !is_builtin(exprType(node)))
+	if (check_type && !is_builtin(exprType(node)) && !is_shippable(exprType(node), fpinfo))
 		return false;
 
 	/*
@@ -1404,8 +1407,7 @@ deparseConst(Const *node, deparse_expr_cxt *context)
 	}
 	if (needlabel)
 		appendStringInfo(buf, "::%s",
-						 format_type_with_typemod(node->consttype,
-												  node->consttypmod));
+						 format_type_be_qualified(node->consttype));
 }
 
 /*
diff --git a/contrib/postgres_fdw/option.c b/contrib/postgres_fdw/option.c
index 7547ec2..9aeaf1a 100644
--- a/contrib/postgres_fdw/option.c
+++ b/contrib/postgres_fdw/option.c
@@ -15,6 +15,7 @@
 #include "postgres_fdw.h"
 
 #include "access/reloptions.h"
+#include "catalog/pg_foreign_data_wrapper.h"
 #include "catalog/pg_foreign_server.h"
 #include "catalog/pg_foreign_table.h"
 #include "catalog/pg_user_mapping.h"
@@ -124,6 +125,10 @@ postgres_fdw_validator(PG_FUNCTION_ARGS)
 					 errmsg("%s requires a non-negative numeric value",
 							def->defname)));
 		}
+		else if (strcmp(def->defname, "extensions") == 0)
+		{
+			extractExtensionList(defGetString(def), NULL);
+		}
 	}
 
 	PG_RETURN_VOID();
@@ -153,6 +158,9 @@ InitPgFdwOptions(void)
 	/* updatable is 

[HACKERS] Eliminating CREATE INDEX comparator TID tie-breaker overhead

2015-07-21 Thread Peter Geoghegan
I have one more idea for accelerating sorting-related operations that
is both fairly effective and relatively easy to implement. This just
about clears my backlog of those, though.

There is an open item on the TODO list entitled Consider whether
duplicate keys should be sorted by block/offset [1].

Currently, comparetup_index_btree() and comparetup_index_hash() do a
tie-breaker on ItemPointer (heap TID). This started as insurance
against bad qsort() implementations that do badly with many equal
keys, but it was also thought that there is some value in having equal
keys be in physical (heap) order. Clearly the first concern has been
obsolete since 2006, when a high quality qsort() was added to
Postgres, but the second concern is probably still valid.

Overhead
--------

Tom said in 2008 that the CREATE INDEX overhead of doing this may be
about 7% [2]. It seems even worse now, presumably due to SortSupport
reducing costs elsewhere. Several years ago, I suggested ripping out
the tie-breaker too, but didn't get very far with that idea due to the
potential downsides for index scans. I'm not about to revisit the
discussion from 2011 about whether or not it should be torn out. I'd
rather just (mostly) fix the real problem without changing the
behavior of comparetup_index_btree() (or comparetup_index_hash()).

The real problem is that we do pointer chasing to get to the
IndexTuple (tuple proper), where we might get away with only
examining the SortTuple, as would happen in comparetup_heap() in the
event of an equivalent heap tuplesort, for example. The added memory
latency hurts lower cardinality attributes (when only one attribute
appears in the Index definition and the IndexTuple isn't cache
resident), and wastes memory bandwidth on the system, which is
something that there is virtually never enough of.

Patch
=====

Attached patch adds a new tie-breaker which is used only when the
tuples being sorted still fit in memory. Currently, the field
SortTuple.tupindex has a different purpose depending on whether we're
building initial runs or merging runs. However, SortTuple.tupindex
currently has no purpose when a sort can complete entirely in memory,
so that case is available to support a third meaning:
SortTuple.tupindex can be used to hold a simple ordinal number. When
our status is TSS_INITIAL (when we qsort()), this is used as a proxy
for what the TID tie-breaker would ordinarily return.

This reliably preserves the existing behavior because index tuplesort
clients invariably scan heap tuples in physical order, and so nothing
is lost.
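
A sketch of the comparator tail this implies (simplified; the real change
lives in tuplesort.c, and the external-sort branch below just stands in for
the existing ItemPointer comparison via a hypothetical helper):

/* Tie-break after all key comparisons have returned equality. */
static int
index_tiebreak(const SortTuple *a, const SortTuple *b, Tuplesortstate *state)
{
	if (state->status == TSS_INITIAL)
	{
		/*
		 * In-memory sort: tupindex is otherwise unused here, so it holds
		 * the ordinal position in which each tuple was fed to tuplesort.
		 * Index builds scan the heap in physical order, so comparing
		 * ordinals matches the old heap-TID order without dereferencing
		 * the IndexTuple.
		 */
		if (a->tupindex != b->tupindex)
			return (a->tupindex < b->tupindex) ? -1 : 1;
		return 0;
	}

	/* Spilled to disk: fall back to comparing the stored heap TIDs. */
	return compare_heap_tids(a, b);		/* hypothetical helper */
}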

Performance
-----------

The patch can make CREATE INDEX with an unlogged B-Tree index with a
single int4 attribute as much as 15% faster (although I think under
10% is much more likely). That seems like an improvement that makes
the patch worthwhile.

Design considerations and consequences


I don't think that there is much point in getting rid of
SortTuple.tupindex for in-memory sorts, which this patch closes off
the possibility of. I tested the performance of a SortTuple struct
with only a datum1 and tuple proper for in-memory sorting at one
time, and it was disappointing, and in any case unworkably invasive.

I also don't think it's worth worrying about the fact that tupindex is
assigned an ordinal number even in cases where it won't be used inside
the comparator. The overhead has to be indistinguishable from zero
anyway, and the right place to put that assignment happens to be where
there is generic TSS_INITIAL copying within puttuple_common().

I'm not concerned about synchronized scans breaking my assumption of a
physical ordering of heaptuples being fed to tuplesort.c. I think that
it is unlikely to ever be worth seriously considering this case.

I have a hard time imagining anything (beyond synchronous scans)
breaking my assumption that index tuplesorts receive tuples in heap
physical order. If anything was to break that in the future (e.g.
parallelizing the heap scan for index builds), then IMV the onus
should be on that new case to take appropriate precautions against
breaking my assumption.

[1] https://wiki.postgresql.org/wiki/Todo#Sorting
[2] http://www.postgresql.org/message-id/23321.1205726...@sss.pgh.pa.us
-- 
Peter Geoghegan
From 22e3aa14436d3c4ca8ccde2cc55777c6207fd9f0 Mon Sep 17 00:00:00 2001
From: Peter Geoghegan peter.geoghega...@gmail.com
Date: Thu, 16 Jul 2015 22:26:01 -0700
Subject: [PATCH] Cache-aware index sort comparator tie-breakers

For the B-Tree and hash index build sort cases only, tuplesort
previously performed a tie-breaker on ItemPointer (heap physical order).
This may later provide some performance benefits for index scans.
However, it implied significant overhead during index builds, because
additional pointer chasing (to the IndexTuple tuple proper) was
required in the comparators, which adds significant memory latency where
the tuple proper does not otherwise need to be 

Re: [HACKERS] Arguable RLS security bug, EvalPlanQual() paranoia

2015-07-21 Thread Stephen Frost
Robert,

As I mentioned up thread, I'm out until the 27th. I have posted a patch
which I will push to fix the copy.c issue, and I have already stated that
I'll address the statistics issue. Further, Joe has also been working on
issues but he was out of pocket last week attending a conference.

I'm happy to work up a documentation patch for this when I get back.

Thanks!

Stephen

On Tuesday, July 21, 2015, Robert Haas robertmh...@gmail.com wrote:

 On Sun, Jul 19, 2015 at 8:56 PM, Peter Geoghegan p...@heroku.com wrote:
  On Mon, Jun 1, 2015 at 12:29 AM, Peter Geoghegan p...@heroku.com wrote:
  If you're using another well known MVCC database system that has RLS,
  I imagine when this happens the attacker similarly waits on the
  conflicting (privileged) xact to finish (in my example in the patch,
  Bob's xact). However, unlike with the Postgres READ COMMITTED mode,
  Mallory would then have her malicious UPDATE statement entirely rolled
  back, and her statement would acquire an entirely new MVCC snapshot,
  to be used by the USING security barrier qual (and everything else)
  from scratch. This other system would then re-run her UPDATE with the
  new MVCC snapshot. This would repeat until Mallory's UPDATE statement
  completes without encountering any concurrent UPDATEs/DELETEs to her
  would-be affected rows.
 
  In general, with this other database system, an UPDATE must run to
  completion without violating MVCC, even in READ COMMITTED mode. For
  that reason, I think we can take no comfort from the presumption that
  this flexibility in USING security barrier quals (allowing subqueries,
  etc) works securely in this other system. (I actually didn't check
  this out, but I imagine it's true).
 
  Where are we on this?
 
  Discussion during pgCon with Heikki and Andres led me to believe that
  the issue is acceptable. The issue can be documented to help ensure
  that user expectation is in line with actual user-visible behavior.
  Unfortunately, I think that that will be a clunky documentation patch.

 Perhaps I'm missing something, but it looks to me like Stephen has
 done absolutely nothing about the many issues reported with the RLS
 patch.  I organized the open items list by topic on June 26th; almost
 a month later, four more issues have been added to the section on RLS,
 and none have been removed.

 I think it is right that we should be concerned about this.

 --
 Robert Haas
 EnterpriseDB: http://www.enterprisedb.com
 The Enterprise PostgreSQL Company




Re: [HACKERS] Alpha2/Beta1

2015-07-21 Thread Gavin Flower

On 22/07/15 09:01, Josh Berkus wrote:
[...]


My question for Hackers is: should this be Alpha2 or Beta 1?



Yes!  :-)

(The decision, as to which, I'll leave to the REAL Developers.)

More seriously, I expect that the alpha probably generated a lot of 
useful feedback.  Was this actually the case?



Cheers,
Gavin



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Arguable RLS security bug, EvalPlanQual() paranoia

2015-07-21 Thread Robert Haas
On Sun, Jul 19, 2015 at 8:56 PM, Peter Geoghegan p...@heroku.com wrote:
 On Mon, Jun 1, 2015 at 12:29 AM, Peter Geoghegan p...@heroku.com wrote:
 If you're using another well known MVCC database system that has RLS,
 I imagine when this happens the attacker similarly waits on the
 conflicting (privileged) xact to finish (in my example in the patch,
 Bob's xact). However, unlike with the Postgres READ COMMITTED mode,
 Mallory would then have her malicious UPDATE statement entirely rolled
 back, and her statement would acquire an entirely new MVCC snapshot,
 to be used by the USING security barrier qual (and everything else)
 from scratch. This other system would then re-run her UPDATE with the
 new MVCC snapshot. This would repeat until Mallory's UPDATE statement
 completes without encountering any concurrent UPDATEs/DELETEs to her
 would-be affected rows.

 In general, with this other database system, an UPDATE must run to
 completion without violating MVCC, even in READ COMMITTED mode. For
 that reason, I think we can take no comfort from the presumption that
 this flexibility in USING security barrier quals (allowing subqueries,
 etc) works securely in this other system. (I actually didn't check
 this out, but I imagine it's true).

 Where are we on this?

 Discussion during pgCon with Heikki and Andres led me to believe that
 the issue is acceptable. The issue can be documented to help ensure
 that user expectation is in line with actual user-visible behavior.
 Unfortunately, I think that that will be a clunky documentation patch.

Perhaps I'm missing something, but it looks to me like Stephen has
done absolutely nothing about the many issues reported with the RLS
patch.  I organized the open items list by topic on June 26th; almost
a month later, four more issues have been added to the section on RLS,
and none have been removed.

I think it is right that we should be concerned about this.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] make check changes have caused buildfarm deterioration.

2015-07-21 Thread Michael Paquier
On Wed, Jul 22, 2015 at 10:04 AM, Peter Eisentraut pete...@gmx.net wrote:
 On 7/21/15 10:00 AM, Tom Lane wrote:
 I agree; this change may have seemed like a good idea at the time, but
 it was not.  Failures during make check's install step are rare enough
 that you don't really need all that output in your face to help with the
 rare situation where it fails.  And for the buildfarm's purposes, it is
 surely desirable to segregate that output from the actual check step.

 It wasn't really an idea; it was just not necessary anymore.  We can put
 it [directing the make install output into a file] back if that's what
 people prefer.

OK... Attached are two patches (please merge them into a single
commit, I am just separating them as they are separate issues):
- 0001 adds a missing entry in test_ddl_deparse's .gitignore. I
mentioned that upthread.
- 0002 redirects the installation logs into
abs_top_builddir/tmp_install/log/install.log. We could redirect it
only to abs_top_builddir/log/ but tmp_install is not removed after a
run of a regression make target.
-- 
Michael
From 0614062b1d41bd72ed4a0ba000d639607156066d Mon Sep 17 00:00:00 2001
From: Michael Paquier michael@otacoo.com
Date: Wed, 22 Jul 2015 10:15:20 +0900
Subject: [PATCH 1/2] Add missing entry log/ in .gitignore for test_ddl_deparse

Entry forgotten by 9faa6ae1.
---
 src/test/modules/test_ddl_deparse/.gitignore | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/test/modules/test_ddl_deparse/.gitignore b/src/test/modules/test_ddl_deparse/.gitignore
index 6628455..3337b3d 100644
--- a/src/test/modules/test_ddl_deparse/.gitignore
+++ b/src/test/modules/test_ddl_deparse/.gitignore
@@ -1 +1,2 @@
+/log/
 /results/
-- 
2.4.6

From 9df069d70f8c9aecf7254ebfdec85460faf5 Mon Sep 17 00:00:00 2001
From: Michael Paquier michael@otacoo.com
Date: Wed, 22 Jul 2015 10:24:17 +0900
Subject: [PATCH 2/2] Redirect installation output of make check into a log
 file

dbf2ec1a changed make check so that the installation logs get directed
to stdout and stderr; however, some recent objections have shown that this
was a bad move. Hence put it back into a log file under tmp_install/log,
which is created once per invocation of any make target doing regression
tests.
---
 src/Makefile.global.in | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/src/Makefile.global.in b/src/Makefile.global.in
index 8d1250d..e2f7211 100644
--- a/src/Makefile.global.in
+++ b/src/Makefile.global.in
@@ -307,9 +307,10 @@ temp-install:
 ifndef NO_TEMP_INSTALL
 ifeq ($(MAKELEVEL),0)
 	rm -rf '$(abs_top_builddir)'/tmp_install
-	$(MAKE) -C '$(top_builddir)' DESTDIR='$(abs_top_builddir)'/tmp_install install
+	$(MKDIR_P) '$(abs_top_builddir)'/tmp_install/log
+	$(MAKE) -C '$(top_builddir)' DESTDIR='$(abs_top_builddir)'/tmp_install install >'$(abs_top_builddir)'/tmp_install/log/install.log 2>&1
 endif
-	$(if $(EXTRA_INSTALL),for extra in $(EXTRA_INSTALL); do $(MAKE) -C '$(top_builddir)'/$$extra DESTDIR='$(abs_top_builddir)'/tmp_install install || exit; done)
+	$(if $(EXTRA_INSTALL),for extra in $(EXTRA_INSTALL); do $(MAKE) -C '$(top_builddir)'/$$extra DESTDIR='$(abs_top_builddir)'/tmp_install install >>'$(abs_top_builddir)'/tmp_install/log/install.log || exit; done)
 endif
 
 PROVE = @PROVE@
-- 
2.4.6


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Alpha2/Beta1

2015-07-21 Thread Peter Geoghegan
On Tue, Jul 21, 2015 at 8:40 PM, Noah Misch n...@leadboat.com wrote:
 I vote for alpha2.  Comparing the "Open Issues" and "resolved after 9.5alpha1"
 sections of https://wiki.postgresql.org/wiki/PostgreSQL_9.5_Open_Items, we've
 not transitioned to a qualitatively different level of API stability.

I agree. I'd hoped that my bugfixes would get committed in time for
what we were going to call Beta 1. Everyone qualified seems to be very
busy with other things that are more pressing.

-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] brin index vacuum versus transaction snapshots

2015-07-21 Thread Jeff Janes
On Tue, Jul 21, 2015 at 3:10 PM, Kevin Grittner kgri...@ymail.com wrote:

 Kevin Grittner kgri...@ymail.com wrote:

  If you run `make installcheck` against a cluster with
  default_transaction_isolation = 'repeatable read' you get one
  failure:

  + ERROR: brin_summarize_new_values() cannot run in a transaction that
 has already obtained a snapshot

  This is using source at commit 9faa6ae14f6098e4b55f0131f7ec2694a381fb87.

 Ah, the very next commit fixed this while I was running the tests.

 Nothing to see here, folks; move along



Still looks broken from here.


Re: [HACKERS] pgbench stats per script other stuff

2015-07-21 Thread Josh Berkus
On 07/21/2015 09:29 AM, Fabien COELHO wrote:
 Maybe -f file.sql:weight (yuk from my point of view, but it can be done
 easily).

Maybe it's past time for pgbench to have a config file?

Given that we want to define some per-workload options, the config file
would probably need to be YAML or JSON, e.g.:

pgbench --config=workload1.pgb

workload1.pgb
-

database: bench
port: 5432
host: localhost
user: josh
clients : 16
threads : 4
response-times : on
stats-level: script
script1:
    file: script1.bench
    weight: 3
script2:
    file: script2.bench
    weight: 1

the above would execute a pgbench with 16 clients, 4 threads, script1
three times as often as script2, and report stats at the script (rather
than SQL statement) level.




-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Planner debug views

2015-07-21 Thread Qingqing Zhou
Here is a proposal introducing some debugging infrastructure into the
core. The basic idea is to allow us to query the planner search space.
To do so, we can dump related information to csv files and use foreign
table to query them. So here I propose two foreign tables:

  create foreign table pg_planner_rels(rel text, content text) ...
  create foreign table pg_planner_paths(rel text, path text,
replacedby text, totalcost float, cheapest text, content text) ...

Where
1. pg_planner_rels consists of RelOptInfo information.
2. pg_planner_paths consists of the paths belonging to each rel.
3. Field rel is a RelOptInfo* pointer so these two tables can join each other.
4. We can further add a subqueries view, or easily expand the
pg_planner_paths table by adding startup_cost etc.

And an example output is attached. The nice thing about it is that it
reuses the database's own query power to diagnose database query problems
:-). From the tables, we can find out all paths considered and why
they were discarded.
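
Since rel is a join key, the two views compose; e.g. a hypothetical query
listing the surviving (never-replaced) paths per relation, assuming discarded
paths have replacedby set:

  SELECT r.content AS rel, p.totalcost, p.content AS path
  FROM pg_planner_paths p
  JOIN pg_planner_rels r ON r.rel = p.rel
  WHERE p.replacedby IS NULL
  ORDER BY r.rel, p.totalcost;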

I have a draft implementation of the above idea - the main changes are:
1. Keep removed paths in a separate list, along with the reason for removal
(see the replacedby field);
2. Change the debug_print_rel() family to use StringInfo instead of printf().

I would also suggest we change DEBUG_OPTIMIZER into a GUC, because we
need this ability in release builds as well. There is almost no
overhead when the GUC is off.

Before I get into more details, I'd like to see if we see this is a
good way to go.

Thoughts?

Regards,
Qingqing


-- query to test
postgres=# explain select a.i, count(a.i) from a join b on a.i=b.i
group by a.i order by a.i limit 1;
  QUERY PLAN

--
 Limit  (cost=14.47..15.64 rows=1 width=4)
   ->  GroupAggregate  (cost=14.47..120.67 rows=91 width=4)
         Group Key: a.i
         ->  Merge Join  (cost=14.47..97.51 rows=4450 width=4)
               Merge Cond: (a.i = b.i)
               ->  Index Only Scan using ai on a  (cost=0.28..45.07 rows=600 width=4)
               ->  Sort  (cost=14.20..14.72 rows=210 width=4)
                     Sort Key: b.i
                     ->  Seq Scan on b  (cost=0.00..6.10 rows=210 width=4)
(9 rows)

-- after query is done we can see rels
postgres=# select * from pg_planner_rels;
   rel   |   content
-+-
 2f49b70 | RELOPTINFO (1): rows=600 width=4
 2f49d08 | RELOPTINFO (2): rows=210 width=4
 2f4a590 | RELOPTINFO (1 2): rows=4450 width=4
 1543340 | RELOPTINFO (1): rows=1 width=64
(4 rows)

-- and paths
postgres=# select rel, path, replacedby, totalcost, substr(content, 1,
30) from pg_planner_paths ;
   rel   |  path   | replacedby | totalcost | substr
-+-++---+
 2f49b70 | 2f4abb8 || 1 | IdxScan(1) rows=600 cost=0.28.
 2f49b70 | 2f4a1f0 ||10 | SeqScan(1) rows=600 cost=0.00.
 2f49b70 | 2f49e10 ||45 | IdxScan(1) rows=600 cost=0.28.
 2f49b70 | 2f4d6b8 | 2f4d6b8| 5 | BitmapHeapScan(1) rows=600 cos
 2f49d08 | 2f4e3e0 || 6 | SeqScan(2) rows=210 cost=0.00.
 2f4a590 | 2f4f938 ||77 | HashJoin(1 2) rows=4450 cost=1
 2f4a590 | 2f4f678 ||88 | HashJoin(1 2) rows=4450 cost=8
 2f4a590 | 2f4f330 ||98 | MergeJoin(1 2) rows=4450 cost=
 2f4a590 | 2f4f850 ||   140 | NestLoop(1 2) rows=4450 cost=0
 2f4a590 | 2f4f188 ||  1907 | NestLoop(1 2) rows=4450 cost=0
 2f4a590 | 2f4f278 ||  1942 | NestLoop(1 2) rows=4450 cost=0
 2f4a590 | 2f4f8a8 | 2f4f8a8|  1908 | NestLoop(1 2) rows=4450 cost=0
 2f4a590 | 2f4f700 | 2f4f700|   105 | MergeJoin(1 2) rows=4450 cost=
 2f4a590 | 2f4eea0 | 2f4f330|   120 | MergeJoin(1 2) rows=4450 cost=
 2f4a590 | 2f4f220 | 2f4f278|  5280 | NestLoop(1 2) rows=4450 cost=0
 2f4a590 | 2f4efb8 | 2f4f188|  5245 | NestLoop(1 2) rows=4450 cost=0
 1543340 | 1543d08 || 1 | ForeignScan(1) rows=1 cost=0.0
 1543cf8 | 2f49a20 || 3 | ForeignScan(1) rows=17 cost=0.
(18 rows)


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] security labels on databases are bad for dump restore

2015-07-21 Thread Alvaro Herrera
Noah Misch wrote:
 On Mon, Jul 20, 2015 at 07:01:14PM -0400, Adam Brightwell wrote:

  I think that makes sense, but what about other DATABASE level info
  such as COMMENT?  Should that also be ignored by pg_dump as well?  I'm
  specifically thinking of the discussion from the following thread:
  
  http://www.postgresql.org/message-id/20150317172459.gm3...@alvh.no-ip.org
  
  If COMMENT is included then why not SECURITY LABEL or others?
 
 In any given situation, we should indeed restore both pg_database comments and
 pg_database security labels, or we should restore neither.

Agreed.

   In a greenfield, I would make pg_dump --create reproduce pertinent entries
   from datacl, pg_db_role_setting, pg_shseclabel and pg_shdescription.  I would
   make non-creating pg_dump reproduce none of those.
  
  I think the bigger question is Where is the line drawn between
  pg_dump and pg_dumpall?.  At what point does one tool become the
  other?
 
 That question may be too big for me.

I don't think there's any line near pg_dumpall.  That tool seems to
have grown out of desperation without much actual design.  I think it
makes more sense to plan around what's the best pg_dump behavior for the
various use cases.

I like Noah's proposal of having pg_dump --create reproduce all
database-level state.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PL/pgSQL, RAISE and error context

2015-07-21 Thread Heikki Linnakangas

On 07/21/2015 10:38 AM, Pavel Stehule wrote:

where we are with this patch? Can I do some for it?


I still feel this approach is misguided, and we should be tweaking psql 
and/or libpq instead. I don't feel strongly though, and if some other 
committer wants to pick this up in its current form, I won't object. So 
this patch has reached an impasse, and if no-one else wants to pick this 
up, I'm going to mark this as Returned with Feedback and move on.

- Heikki



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PL/pgSQL, RAISE and error context

2015-07-21 Thread Pavel Stehule
2015-07-21 9:53 GMT+02:00 Heikki Linnakangas hlinn...@iki.fi:

 On 07/21/2015 10:38 AM, Pavel Stehule wrote:

 where we are with this patch? Can I do some for it?


 I still feel this approach is misguided, and we should be tweaking psql
 and/or libpq instead. I don't feel strongly though, and if some other
 committer wants to pick this up in its current form, I won't object. So
 this patch has reached an impasse, and if no-one else wants to pick this
 up, I'm going to mark this as Returned with Feedback and move on.


Can we define where we have agreement and where we do not? The missing context
for the RAISE EXCEPTION statement is an important issue and I would like to solve it.

last patch has two parts:

1. remove the plpgsql fix that removes the context for the plpgsql RAISE
statement - it works well enough for levels up to NOTICE, but badly for
EXCEPTION and higher levels.

2. enforce filtering of CONTEXT field on both sides (client/log)

For me, @1 is important and a good solution (because there is a strange
inconsistency between PLpgSQL and every other PL); @2 allows more ways - but
probably log_min_context (WARNING) is a good idea too.

The advantage of context filtering on the server side (client_min_message) is
that it can be controlled by plpgsql - so I can decide dynamically whether
some NOTICE will have context or not.

The complexity will be roughly the same for psql/libpq as for server-side filtering.

Regards

Pavel


 - Heikki




Re: [HACKERS] Support for N synchronous standby servers - take 2

2015-07-21 Thread Michael Paquier
On Mon, Jul 20, 2015 at 9:59 PM, Beena Emerson memissemer...@gmail.com wrote:
 Simon Riggs wrote:

 The choice between formats is not
 solely predicated on whether we have multi-line support.

 I still think writing down some actual use cases would help bring the
 discussion to a conclusion. Inventing a general facility is hard without
 some clear goals about what we need to support.

 We need to at least support the following:
 - Grouping: Specify a set of standbys along with the minimum number of commits
 required from the group.
 - Group Type: Groups can either be priority or quorum groups.

As far as I understood, at the lowest level a group is just an alias
for a list of nodes; quorum or priority are properties that can be
applied to a group of nodes when this group is used in the expression
defining what synchronous commit means.

 - Group names: to simplify status reporting
 - Nesting: At least 2 levels of nesting

If I am following correctly, at the first level there is the
definition of the top level objects, like groups and sync expression.

 Using JSON, sync rep parameter to replicate in 2 different clusters could be
 written as:

{"remotes":
 {"quorum": 2,
  "servers": [{"london":
 {"priority": 2,
  "servers": ["lndn1", "lndn2", "lndn3"]
 }}
 ,
   {"nyc":
 {"priority": 1,
  "servers": ["ny1", "ny2"]
 }}
   ]
 }
 }
 The same parameter in the new language (as suggested above) could be written
 as:
  'remotes: 2(london: 1[lndn1, lndn2, lndn3], nyc: 1[ny1, ny2])'

OK, there is a typo. That's actually 2(london: 2[lndn1, lndn2, lndn3],
nyc: 1[ny1, ny2]) in your grammar. Honestly, if we want group aliases,
I think that JSON makes the most sense. One of the advantages of a
group is that you can use it in several places in the blob and set
different properties on it, hence we should be able to define a
group outside of the sync expression.

Hence I would think that something like that makes more sense:
{
"sync_standby_names":
{
"quorum":2,
"nodes":
[
{"priority":1,"group":"cluster1"},
{"quorum":2,"nodes":["node1","node2","node3"]}
]
},
"groups":
{
"cluster1":["node11","node12","node13"],
"cluster2":["node21","node22","node23"]
}
}

 Also, I was thinking the name of the main group could be optional.
 Internally, it can be given the name 'default group' or 'main group' for
 status reporting.

 The above could also be written as:
  '2(london: 2[lndn1, lndn2, lndn3], nyc: 1[ny1, ny2])'

 backward compatible:
 In JSON, while validating we may have to check if it starts with '{' to go

Something worth noticing: application_name can begin with '{'.

 for JSON parsing else proceed with the current method.

 A,B,C = 1[A,B,C]. This can be added in the new parser code.

This makes sense. We could do the same for the JSON-based format by
reusing the in-memory structure used to deparse the blob when the former
grammar is used.
-- 
Michael




Re: [HACKERS] make check changes have caused buildfarm deterioration.

2015-07-21 Thread Andres Freund
On 2015-07-21 14:39:42 +0900, Michael Paquier wrote:
 Regarding initdb.log and postmaster.log, this is definitely a bug.
 Those have been moved by dcae5fa from log/ to tmp_check/log/,
 tmp_check/ getting removed at the end of pg_regress if there are no
 failures counted.

FWIW, I think that's bad TM. Especially when comparing good/bad
buildfarm runs the ability to get logs from good runs is essential.

Andres




Re: [HACKERS] [COMMITTERS] pgsql: Improve BRIN documentation somewhat

2015-07-21 Thread Emre Hasegeli
 Emre Hasegeli told me on IM he's going to submit a patch to add
 something similar for the inclusion opclass framework.

It is attached.  Thank you for pushing forward to improve the documentation.


0001-Improve-BRIN-documentation-for-Inclusion-opclass.patch
Description: Binary data



Re: [HACKERS] make check changes have caused buildfarm deterioration.

2015-07-21 Thread Alvaro Herrera
Michael Paquier wrote:
 On Tue, Jul 21, 2015 at 2:39 PM, Michael Paquier
 michael.paqu...@gmail.com wrote:

  Regarding initdb.log and postmaster.log, this is definitely a bug.
  Those have been moved by dcae5fa from log/ to tmp_check/log/,
  tmp_check/ getting removed at the end of pg_regress if there are no
  failures counted. Both files will be saved in log/ at the location
  pg_regress is called using outputdir, whose default is '.'. This way
  behavior is similar to ~9.4. Attached is a patch to fix this for 9.5
  and master.
 
 Something I just noticed: an entry for log/ in test_ddl_deparse's
 gitignore is missing.

Well, three things: 1) the entry is not missing right now, but it will
be missing if we apply your patch; 2) the file is inconsistent with the
other test modules anyway so we might as well apply your patch to make
them all alike; 3) we shouldn't really do anything about that until the
other patch's fate is decided.

So, regarding the other patch, I don't know if it's that useful to keep
the log files when the check succeeds; and if it fails, the only problem
we have, I think, is that the path is wrong in the buildfarm code and
that seems easily fixed, but do we want to make the master branch
different from the others?  Maybe the BF code can look up the new path
first, and if it can't find the file then look in the old path.

Unless I'm misunderstanding the whole thing.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] RFC: replace pg_stat_activity.waiting with something more descriptive

2015-07-21 Thread Ildus Kurbangaliev

Hello.
I did some refactoring of the previous patch. Improvements:

1) The wait is determined by class and event without affecting the atomic
usage of it. They are still stored in one variable. This improvement gives
an opportunity to make more detailed views later (waits can be grouped by
class).
2) Only the active wait of each backend is visible. The pg_report_wait_end()
function is called at the end of a wait and clears it.
3) Wait name determination was optimized (the last version looped for each
of them, and was very heavy). I added a lazy `group` field to LWLock, which
is used as an index into the lwlock names array.
4) New waits can be added in a simpler way. For example, an individual
lwlock requires only specifying its name in the LWLock names array.
5) Added new types of waits: Storage, Network, Latch.

This patch is more informative and it'll be easier to extend.

Sample:

b1=# select pid, wait_event from pg_stat_activity;

pid  |  wait_event
---+--
17099 | LWLocks: BufferCleanupLock
17100 | Locks: Transaction
17101 | LWLocks: BufferPartitionLock
17102 |
17103 | Network: READ
17086 |
(6 rows)
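
Since the wait_event text has the form Class: Event, waits can already be
grouped by class with plain SQL (a sketch against the sample above; it uses
nothing beyond the column added by the patch):

b1=# select split_part(wait_event, ':', 1) as wait_class, count(*)
     from pg_stat_activity
     where wait_event is not null
     group by 1;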

--
Ildus Kurbangaliev
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 1dd31b3..91dca5c 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -4661,8 +4661,10 @@ XLOGShmemInit(void)
 	LWLockRegisterTranche(XLogCtl->Insert.WALInsertLockTrancheId, &XLogCtl->Insert.WALInsertLockTranche);
 	for (i = 0; i < NUM_XLOGINSERT_LOCKS; i++)
 	{
-		LWLockInitialize(&WALInsertLocks[i].l.lock,
+		LWLock *lock = &WALInsertLocks[i].l.lock;
+		LWLockInitialize(lock,
 		 XLogCtl->Insert.WALInsertLockTrancheId);
+		lock->group = LWLOCK_WAL_INSERT;
 		WALInsertLocks[i].l.insertingAt = InvalidXLogRecPtr;
 	}
 
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index e82a53a..fd04258 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -621,6 +621,7 @@ CREATE VIEW pg_stat_activity AS
 S.query_start,
 S.state_change,
 S.waiting,
+S.wait_event,
 S.state,
 S.backend_xid,
 s.backend_xmin,
diff --git a/src/backend/libpq/be-secure.c b/src/backend/libpq/be-secure.c
index 4a650cc..e5c023f 100644
--- a/src/backend/libpq/be-secure.c
+++ b/src/backend/libpq/be-secure.c
@@ -36,6 +36,7 @@
 #include "tcop/tcopprot.h"
 #include "utils/memutils.h"
 #include "storage/proc.h"
+#include "pgstat.h"
 
 
 char	   *ssl_cert_file;
@@ -129,6 +130,8 @@ secure_read(Port *port, void *ptr, size_t len)
 	ssize_t		n;
 	int			waitfor;
 
+	pgstat_report_wait_start(WAIT_NETWORK, WAIT_NETWORK_READ);
+
 retry:
 #ifdef USE_SSL
 	waitfor = 0;
@@ -175,6 +178,7 @@ retry:
 	 * interrupts from being processed.
 	 */
 	ProcessClientReadInterrupt(false);
+	pgstat_report_wait_end();
 
 	return n;
 }
@@ -209,6 +213,8 @@ secure_write(Port *port, void *ptr, size_t len)
 	ssize_t		n;
 	int			waitfor;
 
+	pgstat_report_wait_start(WAIT_NETWORK, WAIT_NETWORK_WRITE);
+
 retry:
 	waitfor = 0;
 #ifdef USE_SSL
@@ -254,6 +260,7 @@ retry:
 	 * interrupts from being processed.
 	 */
 	ProcessClientWriteInterrupt(false);
+	pgstat_report_wait_end();
 
 	return n;
 }
diff --git a/src/backend/port/unix_latch.c b/src/backend/port/unix_latch.c
index 90ec4f8..dcebfe4 100644
--- a/src/backend/port/unix_latch.c
+++ b/src/backend/port/unix_latch.c
@@ -55,6 +55,7 @@
 #include "storage/latch.h"
 #include "storage/pmsignal.h"
 #include "storage/shmem.h"
+#include "pgstat.h"
 
 /* Are we currently in WaitLatch? The signal handler would like to know. */
 static volatile sig_atomic_t waiting = false;
@@ -262,6 +263,8 @@ WaitLatchOrSocket(volatile Latch *latch, int wakeEvents, pgsocket sock,
 #endif
 	}
 
+	pgstat_report_wait_start(WAIT_LATCH, 0);
+
 	waiting = true;
 	do
 	{
@@ -500,6 +503,7 @@ WaitLatchOrSocket(volatile Latch *latch, int wakeEvents, pgsocket sock,
 		}
 	} while (result == 0);
 	waiting = false;
+	pgstat_report_wait_end();
 
 	return result;
 }
diff --git a/src/backend/port/win32_latch.c b/src/backend/port/win32_latch.c
index 0e3aaee..92bb5a0 100644
--- a/src/backend/port/win32_latch.c
+++ b/src/backend/port/win32_latch.c
@@ -177,6 +177,8 @@ WaitLatchOrSocket(volatile Latch *latch, int wakeEvents, pgsocket sock,
 	/* Ensure that signals are serviced even if latch is already set */
 	pgwin32_dispatch_queued_signals();
 
+	pgstat_report_wait_start(WAIT_LATCH, 0);
+
 	do
 	{
 		/*
@@ -278,6 +280,8 @@ WaitLatchOrSocket(volatile Latch *latch, int wakeEvents, pgsocket sock,
 		}
 	} while (result == 0);
 
+	pgstat_report_wait_end();
+
 	/* Clean up the event object we created for the socket */
 	if (sockevent != WSA_INVALID_EVENT)
 	{
diff --git a/src/backend/postmaster/pgstat.c b/src/backend/postmaster/pgstat.c
index e9fbc38..b379f7f 100644
--- a/src/backend/postmaster/pgstat.c
+++ 

Re: [HACKERS] RFC: replace pg_stat_activity.waiting with something more descriptive

2015-07-21 Thread Andres Freund
On 2015-07-21 13:11:36 +0300, Ildus Kurbangaliev wrote:
  
  /*
   * Top-level transactions are identified by VirtualTransactionIDs comprising
 diff --git a/src/include/storage/lwlock.h b/src/include/storage/lwlock.h
 index cff3b99..55b0687 100644
 --- a/src/include/storage/lwlock.h
 +++ b/src/include/storage/lwlock.h
 @@ -58,6 +58,9 @@ typedef struct LWLock
  #ifdef LOCK_DEBUG
   struct PGPROC *owner;   /* last exlusive owner of the lock */
  #endif
 +
 + /* LWLock group, initialized as -1, calculated in first acquire */
 + int group;
  } LWLock;

I'd very much like to avoid increasing the size of struct LWLock. We
have a lot of those and I'd still like to inline them with the buffer
descriptors. Why do we need a separate group and can't reuse the
tranche?  That might require creating a few more tranches, but ...?

Andres




Re: [HACKERS] PL/pgSQL, RAISE and error context

2015-07-21 Thread Pavel Stehule
2015-07-09 23:16 GMT+02:00 Pavel Stehule pavel.steh...@gmail.com:



 2015-07-09 22:57 GMT+02:00 Merlin Moncure mmonc...@gmail.com:

 On Thu, Jul 9, 2015 at 3:31 PM, Pavel Stehule pavel.steh...@gmail.com
 wrote:
 
 
  2015-07-09 20:08 GMT+02:00 Merlin Moncure mmonc...@gmail.com:
 
  On Thu, Jul 9, 2015 at 10:48 AM, Pavel Stehule 
 pavel.steh...@gmail.com
  wrote:
  
   2015-07-09 15:17 GMT+02:00 Merlin Moncure mmonc...@gmail.com:
  
   On Wed, Jul 8, 2015 at 11:28 PM, Pavel Stehule
   pavel.steh...@gmail.com

   wrote:
Hi
   
second version of this patch
   
make check-world passed
  
   quickly scanning the patch, the implementation is trivial (minus
   regression test adjustments), and is, IMSNSHO, the right solution.
  
  
   yes, it is right way - the behave of RAISE statement will be much
 more
   cleaner
  
  
   Several of the source level comments need some minor wordsmithing
 and
   the GUCs are missing documentation.  If we've got consensus on the
   approach, I'll pitch in on that.
  
   thank you
 
  revised patch attached. added GUC docs and cleaned up pg_settings
  language.  Also tested patch and it works beautifully.
 
  Note, Pavel's patch does adjust default behavior to what we think is
  the right settings.
 
 
  Thank you for documentation.
 
  There is a small error - the default for client_min_context is error, not
  notice. With this level the diff from the regression tests is minimal. The
  default for log_min_context should be warning.

 whoop!  thanks.   Also, I was playing a bit with the idea of making
 client_min_context a superuser-only setting.  The idea being this
 could be used to prevent leakage of stored procedure code in cases
 where the admins don't want it to be exposed.  I figured it was a bad
 idea though; it would frustrate debugging in reasonable cases.


 This is not designed for security usage. Probably there can be some rule
 in the future - the possibility to see or not see an error context - OFF, ON.
 For that use case, setting some minimum level is not a good way.


Hi

Where are we with this patch? Can I do something for it?

Regards

Pavel



 Regards

 Pavel





 merlin





Re: [HACKERS] make check changes have caused buildfarm deterioration.

2015-07-21 Thread Alvaro Herrera
Andres Freund wrote:
 On 2015-07-21 14:39:42 +0900, Michael Paquier wrote:
  Regarding initdb.log and postmaster.log, this is definitely a bug.
  Those have been moved by dcae5fa from log/ to tmp_check/log/,
  tmp_check/ getting removed at the end of pg_regress if there are no
  failures counted.
 
 FWIW, I think that's bad TM. Especially when comparing good/bad
 buildfarm runs the ability to get logs from good runs is essential.

Hm, okay, so let's put it back where it was.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] Fillfactor for GIN indexes

2015-07-21 Thread Alexander Korotkov
On Tue, Jul 21, 2015 at 12:40 PM, Heikki Linnakangas hlinn...@iki.fi
wrote:

 Has anyone done any performance testing of this?

 The purpose of a fillfactor is to avoid the storm of page splits right
 after building the index, when there are some random updates to the table.
 It causes the index to bloat, as every full page is split to two half-full
 pages, and also adds latency to all the updates that have to split pages.

 As the code stands, do we actually have such a problem with GIN? Let's
 investigate. Let's create a little test table:

 create table foo (id int4, t text);
 insert into foo select g, 'foo' from generate_series(1, 1000000) g;

 And some indexes on it:

 -- B-tree index on id, 100%
 create index btree_100 on foo (id) with (fillfactor=100);
 -- B-tree index on id, 90% (which is the default)
 create index btree_90 on foo (id) with (fillfactor=90);
 -- GIN index on id. Id is different on each row, so this index has no
 posting trees, just the entry tree.
 create index gin_id on foo using gin (id) with (fastupdate=off);
 -- GIN index on t. T is the same on each row, so this index consists of a
 single posting tree.
 create index gin_t on foo using gin (t) with (fastupdate=off);

 Immediately after creation, the index sizes are:

 postgres=# \di+
   List of relations
  Schema |   Name| Type  | Owner  | Table |  Size   | Description
 +---+---++---+-+-
  public | btree_100 | index | heikki | foo   | 19 MB   |
  public | btree_90  | index | heikki | foo   | 21 MB   |
  public | gin_id| index | heikki | foo   | 53 MB   |
  public | gin_t | index | heikki | foo   | 1072 kB |
 (4 rows)


 Now let's update 1% of the table, spread evenly across the table, and see
 what happens to the index sizes:

 postgres=# update foo set id = id where id % 100 = 0;
 UPDATE 10000
 postgres=# \di+
   List of relations
  Schema |   Name| Type  | Owner  | Table |  Size   | Description
 +---+---++---+-+-
  public | btree_100 | index | heikki | foo   | 39 MB   |
  public | btree_90  | index | heikki | foo   | 21 MB   |
  public | gin_id| index | heikki | foo   | 53 MB   |
  public | gin_t | index | heikki | foo   | 1080 kB |
 (4 rows)

 As you can see, btree_100 index doubled in size. That's the effect we're
 trying to avoid with the fillfactor, and indeed in the btree_90 index it
 was avoided. However, the GIN indexes are not bloated either. Why is that?

 The entry tree (demonstrated by the gin_id index) is not packed tightly
 when it's built. If anything, we could pack it more tightly to make it
 smaller to begin with. For the use cases where GIN works best, though, the
 entry tree should be fairly small compared to the posting lists, so it
 shouldn't make a big difference. For the posting tree, I think that's
 because the way the items are packed in the segments. Even though we pack
 the segments as tightly as possible, there is always some free space left
 over on a page that's not enough to add another segment to it at index
 build, but can be used by the updates to add items to the existing
 segments. So in practice you always end up with some free space on the
 posting tree pages, even without a fillfactor, so the effective
 fillfactor even today is not actually 100%.


There are two reasons for this behavior. You mentioned the first one: the
effective fillfactor for posting lists is not 100%. But the second one is
the structure of posting lists. In your example UPDATE allocates new tuples
at the end of the heap. So, from the point of view of the posting lists,
these updates are appends. The situation is different when you reuse space
in the heap. See an example.

drop table foo;
create table foo (id integer, val int[]) with (fillfactor = 90);
insert into foo (select g, array[g%2]::int[] from
generate_series(1,1000000) g);
create index foo_val_idx_100 on foo using gin(val) with (fastupdate=off,
fillfactor=100);
create index foo_val_idx_90 on foo using gin(val) with (fastupdate=off,
fillfactor=90);

# \di+
 List of relations
 Schema |  Name   | Type  | Owner  | Table |  Size   | Description
+-+---++---+-+-
 public | foo_val_idx_100 | index | smagen | foo   | 1088 kB |
 public | foo_val_idx_90  | index | smagen | foo   | 1200 kB |
(2 rows)

update foo set val = array[(id+1)%2]::int[] where id % 50 = 0;

# \di+
 List of relations
 Schema |  Name   | Type  | Owner  | Table |  Size   | Description
+-+---++---+-+-
 public | foo_val_idx_100 | index | smagen | foo   | 1608 kB |
 public | foo_val_idx_90  | index | smagen | foo   | 1200 kB |
(2 rows)

Based on this, I think we should just drop this patch. It's not useful in
 practice.


I wouldn't say it's not useful at all. It's for sure 

Re: [BUGS] [HACKERS] object_classes array is broken, again

2015-07-21 Thread Alvaro Herrera
Tom Lane wrote:
 I wrote:
  +1 to this patch, in fact I think we could remove MAX_OCLASS altogether
  which would be very nice for switch purposes.
 
 Oh, wait, I forgot that the patch itself introduces another reference to
 MAX_OCLASS.  I wonder though if we should get rid of that as an enum value
 in favor of #define'ing a LAST_OCLASS macro referencing the last enum
 item, or some other trick like that.  It's certainly inconvenient in
 event_trigger.c to have a phony member of the enum.

Yeah, that works well enough.  Pushed that way.

 Are there any other arrays that need such tests?

I looked around with this:

git grep 'const.*\[.*[^][0-9].*\].*=.*{'

and found one suspicious-looking use of MAX_ACL_KIND which we could
perhaps define in a way equivalent to what we've done here.  We also
have RM_MAX_ID in a couple of arrays but that looks safe because both
the values and the arrays are auto-generated.

We also have MAX_TIME_PRECISION, MAX_TIMESTAMP_PRECISION,
MAX_INTERVAL_PRECISION, MAXDATEFIELDS, KeyWord_INDEX_SIZE, but these
don't look likely to actually cause any trouble.

(There are many arrays sized to numerical constants, but there are about
5000 of those and I'm not in a hurry to verify how they are used.)

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] RFC: replace pg_stat_activity.waiting with something more descriptive

2015-07-21 Thread Ildus Kurbangaliev

On 07/21/2015 01:18 PM, Andres Freund wrote:

On 2015-07-21 13:11:36 +0300, Ildus Kurbangaliev wrote:
  
  /*

   * Top-level transactions are identified by VirtualTransactionIDs comprising
diff --git a/src/include/storage/lwlock.h b/src/include/storage/lwlock.h
index cff3b99..55b0687 100644
--- a/src/include/storage/lwlock.h
+++ b/src/include/storage/lwlock.h
@@ -58,6 +58,9 @@ typedef struct LWLock
  #ifdef LOCK_DEBUG
struct PGPROC *owner;   /* last exlusive owner of the lock */
  #endif
+
+   /* LWLock group, initialized as -1, calculated in first acquire */
+   int group;
  } LWLock;

I'd very much like to avoid increasing the size of struct LWLock. We
have a lot of those and I'd still like to inline them with the buffer
descriptors. Why do we need a separate group and can't reuse the
tranche?  That might require creating a few more tranches, but ...?

Andres
Do you mean moving LWLocks defined by offsets and with dynamic sizes to
separate tranches?
It sounds like a good option, but it will require refactoring of the
current tranches. In the current implementation I tried not to change
existing things.

A simple solution would be using uint8 for the tranche (because we have
only 3 of them now, and it will take a long time to get to 255) and uint8
for the group. In this case the size will not change.


--
Ildus Kurbangaliev
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company





Re: [HACKERS] Fillfactor for GIN indexes

2015-07-21 Thread Alexander Korotkov
On Tue, Jul 21, 2015 at 2:49 PM, Heikki Linnakangas hlinn...@iki.fi wrote:

 Hmm. That's slightly different from the test case I used, in that the
 update is changing the indexed value, which means that the new value goes
 to different posting tree than the old one. I'm not sure if that makes any
 difference. You're also updating more rows, 1/50 vs. 1/100.

 This case is closer to my earlier one:

 postgres=# create table foo (id int4, i int4, t text) with (fillfactor=90);
 CREATE TABLE
 postgres=# insert into foo select g, 1 from  generate_series(1, 1000000) g;
 INSERT 0 1000000
 postgres=# create index btree_foo_id on foo (id); -- to force non-HOT
 updates
 CREATE INDEX
 postgres=# create index gin_i on foo using gin (i) with (fastupdate=off);
 CREATE INDEX
 postgres=# \di+
List of relations
  Schema | Name | Type  | Owner  | Table |  Size   | Description
 +--+---++---+-+-
  public | btree_foo_id | index | heikki | foo   | 21 MB   |
  public | gin_i| index | heikki | foo   | 1072 kB |
 (2 rows)

 postgres=# update foo set id=-id where id % 100 = 0;
 UPDATE 10000
 postgres=# \di+
List of relations
  Schema | Name | Type  | Owner  | Table |  Size   | Description
 +--+---++---+-+-
  public | btree_foo_id | index | heikki | foo   | 22 MB   |
  public | gin_i| index | heikki | foo   | 1080 kB |
 (2 rows)

 I'm not sure what's making the difference to your test case. Could be
 simply that your case happens to leave less free space on each page,
 because of the different data.


Yeah, it's likely because of different data.

 Based on this, I think we should just drop this patch. It's not useful in
 practice.


 I wouldn't say it's not useful at all. It's for sure not as useful as
 btree
 fillfactor, but it still could be useful in some cases...
 Probably, we could leave 100 as default fillfactor, but provide an option.


 Doesn't seem worth the trouble to me...


Probably, but currently we are in quite an illogical situation. We have
a default fillfactor = 90 for GiST, where it has no use cases at all and
is effectively just a waste of space. On the other hand we're rejecting
fillfactor for GIN, where it could have at least some use cases... Should we
drop fillfactor for both GiST and GIN?

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Re: [HACKERS] Fillfactor for GIN indexes

2015-07-21 Thread Heikki Linnakangas

On 07/21/2015 02:00 PM, Alexander Korotkov wrote:

On Tue, Jul 21, 2015 at 12:40 PM, Heikki Linnakangas hlinn...@iki.fi
wrote:


Has anyone done any performance testing of this?

The purpose of a fillfactor is to avoid the storm of page splits right
after building the index, when there are some random updates to the table.
It causes the index to bloat, as every full page is split to two half-full
pages, and also adds latency to all the updates that have to split pages.

As the code stands, do we actually have such a problem with GIN? Let's
investigate. Let's create a little test table:

create table foo (id int4, t text);
insert into foo select g, 'foo' from generate_series(1, 1000000) g;

And some indexes on it:

-- B-tree index on id, 100%
create index btree_100 on foo (id) with (fillfactor=100);
-- B-tree index on id, 90% (which is the default)
create index btree_90 on foo (id) with (fillfactor=90);
-- GIN index on id. Id is different on each row, so this index has no
posting trees, just the entry tree.
create index gin_id on foo using gin (id) with (fastupdate=off);
-- GIN index on t. T is the same on each row, so this index consists of a
single posting tree.
create index gin_t on foo using gin (t) with (fastupdate=off);

Immediately after creation, the index sizes are:

postgres=# \di+
   List of relations
  Schema |   Name| Type  | Owner  | Table |  Size   | Description
+---+---++---+-+-
  public | btree_100 | index | heikki | foo   | 19 MB   |
  public | btree_90  | index | heikki | foo   | 21 MB   |
  public | gin_id| index | heikki | foo   | 53 MB   |
  public | gin_t | index | heikki | foo   | 1072 kB |
(4 rows)


Now let's update 1% of the table, spread evenly across the table, and see
what happens to the index sizes:

postgres=# update foo set id = id where id % 100 = 0;
UPDATE 10000
postgres=# \di+
   List of relations
  Schema |   Name| Type  | Owner  | Table |  Size   | Description
+---+---++---+-+-
  public | btree_100 | index | heikki | foo   | 39 MB   |
  public | btree_90  | index | heikki | foo   | 21 MB   |
  public | gin_id| index | heikki | foo   | 53 MB   |
  public | gin_t | index | heikki | foo   | 1080 kB |
(4 rows)

As you can see, btree_100 index doubled in size. That's the effect we're
trying to avoid with the fillfactor, and indeed in the btree_90 index it
was avoided. However, the GIN indexes are not bloated either. Why is that?

The entry tree (demonstrated by the gin_id index) is not packed tightly
when it's built. If anything, we could pack it more tightly to make it
smaller to begin with. For the use cases where GIN works best, though, the
entry tree should be fairly small compared to the posting lists, so it
shouldn't make a big difference. For the posting tree, I think that's
because the way the items are packed in the segments. Even though we pack
the segments as tightly as possible, there is always some free space left
over on a page that's not enough to add another segment to it at index
build, but can be used by the updates to add items to the existing
segments. So in practice you always end up with some free space on the
posting tree pages, even without a fillfactor, so the effective
fillfactor even today is not actually 100%.



There are two reasons for this behavior. You mentioned the first one: the
effective fillfactor for posting lists is not 100%. But the second one is
the structure of posting lists. In your example UPDATE allocates new tuples
at the end of the heap. So, from the point of view of the posting lists,
these updates are appends.


Ah, good point.


The situation is different when you reuse space in the heap. See an
example.

drop table foo;
create table foo (id integer, val int[]) with (fillfactor = 90);
insert into foo (select g, array[g%2]::int[] from
generate_series(1,1000000) g);
create index foo_val_idx_100 on foo using gin(val) with (fastupdate=off,
fillfactor=100);
create index foo_val_idx_90 on foo using gin(val) with (fastupdate=off,
fillfactor=90);

# \di+
  List of relations
  Schema |  Name   | Type  | Owner  | Table |  Size   | Description
+-+---++---+-+-
  public | foo_val_idx_100 | index | smagen | foo   | 1088 kB |
  public | foo_val_idx_90  | index | smagen | foo   | 1200 kB |
(2 rows)

update foo set val = array[(id+1)%2]::int[] where id % 50 = 0;

# \di+
  List of relations
  Schema |  Name   | Type  | Owner  | Table |  Size   | Description
+-+---++---+-+-
  public | foo_val_idx_100 | index | smagen | foo   | 1608 kB |
  public | foo_val_idx_90  | index | smagen | foo   | 1200 kB |
(2 rows)


Hmm. That's slightly different from the test case I used, in that the 
update is changing the indexed 

Re: [HACKERS] make check changes have caused buildfarm deterioration.

2015-07-21 Thread Michael Paquier
On Tue, Jul 21, 2015 at 7:01 PM, Alvaro Herrera
alvhe...@2ndquadrant.com wrote:
 Michael Paquier wrote:
 On Tue, Jul 21, 2015 at 2:39 PM, Michael Paquier
 michael.paqu...@gmail.com wrote:

  Regarding initdb.log and postmaster.log, this is definitely a bug.
  Those have been moved by dcae5fa from log/ to tmp_check/log/,
  tmp_check/ getting removed at the end of pg_regress if there are no
  failures counted. Both files will be saved in log/ at the location
  pg_regress is called using outputdir, whose default is '.'. This way
  behavior is similar to ~9.4. Attached is a patch to fix this for 9.5
  and master.

 Something I just noticed: an entry for log/ in test_ddl_deparse's
 gitignore is missing.

 Well, three things: 1) the entry is not missing right now, but it will
 be missing if we apply your patch; 2) the file is inconsistent with the
 other test modules anyway so we might as well apply your patch to make
 them all alike; 3) we shouldn't really do anything about that until the
 other patch's fate is decided.

Well, just to be clear, I meant with the previous patch I have attached.
-- 
Michael




[HACKERS] pg_hba_lookup function to get all matching pg_hba.conf entries

2015-07-21 Thread Haribabu Kommi
Hi Hackers,

This patch adds a new function called pg_hba_lookup to get all matching
entries from pg_hba.conf for the provided input data. The rows are
displayed in the order the hba conf entries are matched.

This is an updated version of a previous failed attempt to create a
catalog view for pg_hba.conf [1]. The view was not able to handle SQL
queries properly because of the keywords that are present in the database
and user columns.


Currently the following two variants are added:

pg_hba_lookup(database, user)
pg_hba_lookup(database, user, address, hostname)


How it works:

With the provided input data, it tries to match the entries of
pg_hba.conf and populates the result set with all matching entries.

With Tom Lane's recent commit
1e24cf645d24aab3ea39a9d259897fd0cae4e4b6, Don't leave pg_hba and
pg_ident data lying around in running backends [2], the parsed hba
conf entries are no longer available on the backend side. I have
temporarily reverted that patch for proof-of-concept purposes. Once we
agree on the approach, I will try to find a solution for this.


How is it useful:

With the output of this view, an administrator can easily identify the
lines that match the given criteria without going through the file.


Record format:

Column name | datatype
------------+---------
line_number | int4
type        | text
database    | jsonb
user        | jsonb
address     | inet
hostname    | text
method      | text
options     | jsonb

Please suggest any column additions or data type changes that are
required.


Example output:

postgres=# select pg_hba_lookup('postgres','all');
 pg_hba_lookup
---
 (84,local,[all],[all],,,trust,{})
 (86,host,[all],[all],127.0.0.1,,trust,{})
 (88,host,[all],[all],::1,,trust,{})
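
The four-argument variant would be invoked the same way (a hypothetical
call; the address and hostname values here are only illustrative):

postgres=# select pg_hba_lookup('postgres', 'all', '127.0.0.1', 'localhost');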

Attached is a proof-of-concept patch for the same.

Any suggestions/comments on this proposed approach?

[1] 
http://www.postgresql.org/message-id/f40b0968db0a904da78a924e633be78645f...@sydexchtmp2.au.fjanz.com

[2] 
http://www.postgresql.org/message-id/e1zaquy-00072j...@gemulon.postgresql.org

Regards,
Hari Babu
Fujitsu Australia


pg_hba_lookup_poc.patch
Description: Binary data


revert_hba_context_release_in_backend.patch
Description: Binary data



Re: [HACKERS] Fillfactor for GIN indexes

2015-07-21 Thread Heikki Linnakangas

Has anyone done any performance testing of this?

The purpose of a fillfactor is to avoid the storm of page splits right 
after building the index, when there are some random updates to the 
table. It causes the index to bloat, as every full page is split to two 
half-full pages, and also adds latency to all the updates that have to 
split pages.


As the code stands, do we actually have such a problem with GIN? Let's 
investigate. Let's create a little test table:


create table foo (id int4, t text);
insert into foo select g, 'foo' from generate_series(1, 1000000) g;

And some indexes on it:

-- B-tree index on id, 100%
create index btree_100 on foo (id) with (fillfactor=100);
-- B-tree index on id, 90% (which is the default)
create index btree_90 on foo (id) with (fillfactor=90);
-- GIN index on id. Id is different on each row, so this index has no 
posting trees, just the entry tree.

create index gin_id on foo using gin (id) with (fastupdate=off);
-- GIN index on t. T is the same on each row, so this index consists of 
a single posting tree.

create index gin_t on foo using gin (t) with (fastupdate=off);

Immediately after creation, the index sizes are:

postgres=# \di+
  List of relations
 Schema |   Name| Type  | Owner  | Table |  Size   | Description
+---+---++---+-+-
 public | btree_100 | index | heikki | foo   | 19 MB   |
 public | btree_90  | index | heikki | foo   | 21 MB   |
 public | gin_id| index | heikki | foo   | 53 MB   |
 public | gin_t | index | heikki | foo   | 1072 kB |
(4 rows)


Now let's update 1% of the table, spread evenly across the table, and 
see what happens to the index sizes:


postgres=# update foo set id = id where id % 100 = 0;
UPDATE 10000
postgres=# \di+
  List of relations
 Schema |   Name| Type  | Owner  | Table |  Size   | Description
+---+---++---+-+-
 public | btree_100 | index | heikki | foo   | 39 MB   |
 public | btree_90  | index | heikki | foo   | 21 MB   |
 public | gin_id| index | heikki | foo   | 53 MB   |
 public | gin_t | index | heikki | foo   | 1080 kB |
(4 rows)

As you can see, btree_100 index doubled in size. That's the effect we're 
trying to avoid with the fillfactor, and indeed in the btree_90 index it 
was avoided. However, the GIN indexes are not bloated either. Why is that?


The entry tree (demonstrated by the gin_id index) is not packed tightly 
when it's built. If anything, we could pack it more tightly to make it 
smaller to begin with. For the use cases where GIN works best, though, 
the entry tree should be fairly small compared to the posting lists, so 
it shouldn't make a big difference. For the posting tree, I think that's 
because the way the items are packed in the segments. Even though we 
pack the segments as tightly as possible, there is always some free 
space left over on a page that's not enough to add another segment to it 
at index build, but can be used by the updates to add items to the 
existing segments. So in practice you always end up with some free space 
on the posting tree pages, even without a fillfactor, so the effective 
fillfactor even today is not actually 100%.


Based on this, I think we should just drop this patch. It's not useful 
in practice.


- Heikki




Re: [HACKERS] Grouping Sets: Fix unrecognized node type bug

2015-07-21 Thread Kyotaro HORIGUCHI
Hello,

At Mon, 20 Jul 2015 15:45:21 +0530, Jeevan Chalke 
jeevan.cha...@enterprisedb.com wrote in 
CAM2+6=X9QWgbjJrR-dcLXh-RvvpGy=9enhuoghzrxhcj2kv...@mail.gmail.com
 On Sat, Jul 18, 2015 at 12:27 AM, Andrew Gierth and...@tao11.riddles.org.uk
  wrote:
 
   Kyotaro == Kyotaro HORIGUCHI horiguchi.kyot...@lab.ntt.co.jp
  writes:
 
   Kyotaro Hello, this looks to be a kind of thinko. The attached patch
   Kyotaro fixes it.
 
  No, that's still wrong. Just knowing that there is a List is not enough
  to tell whether to concat it or append it.

Thank you. I've missed the non-grouping-set cases.

  Jeevan's original patch tries to get around this by making the RowExpr
  case wrap another List around its result (which is then removed by the
  concat), but this is the wrong approach too because it breaks nested
  RowExprs (which isn't valid syntax in the spec, because the spec allows
  only column references in GROUP BY, not arbitrary expressions, but which
  we have no reason not to support).
 
  Attached is the current version of my fix (with Jeevan's regression
  tests plus one of mine).
 
 
 Looks good to me.

It also looks to me like it works as expected and is in good shape.

The two foreach loops for T_GroupingSet and T_List have come to look
very similar, but it doesn't seem they can be merged into a reasonable
shape.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: [HACKERS] creating extension including dependencies

2015-07-21 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Mon, Jul 20, 2015 at 10:29 PM, Michael Paquier
 michael.paqu...@gmail.com wrote:
 In short I would give up on the DEFAULT SCHEMA business, and
 add a new flag in the control file to decide if a given extension
 passes down the schema name of its child when created in cascade,
 default being true for the potential issues with search_path not
 pointing to public.

 Well, so far, it seems like this decision is something where different
 DBAs might have different policies.  If you put the flag in the
 control file, you're saying it is the extension developer's decision,
 which may not be best.

I have doubts about that too.  But really, why have a flag at all
anywhere?  If we are doing a CASCADE, and the referenced extension needs a
schema, the alternatives are either (1) let it have one, or (2) fail.
I am not seeing that (2) is a superior alternative in any circumstances.

We will need to document that the behavior of CASCADE is "install all
needed extensions into the schema you specify", but what's wrong with
that?  If the user wants to put them in different schemas, he cannot
use CASCADE in any case.

regards, tom lane




Re: [HACKERS] make check changes have caused buildfarm deterioration.

2015-07-21 Thread Tom Lane
Andrew Dunstan andrew.duns...@pgexperts.com writes:
 On 07/21/2015 01:39 AM, Michael Paquier wrote:
 Regarding install.log, the use of stdout/stderr instead of a log file
 has been changed in dbf2ec1a after that:
 http://www.postgresql.org/message-id/553fe7fc.2040...@gmx.net
 Since 9.5 as the location of the temporary installation is global, we
 could basically revert some parts of dcae5fac if that helps so as
 install.log is saved in $ROOT/tmp_install/log/install.log... But I am
 not sure what we win with that, and the argument to remove install.log
 is that now the temporary installation is a make target. Both ways
 have advantages and disadvantages.

 I'm quite unhappy about how we now pollute the output of make check 
 with the install log. See the above links for why. And when I run it by 
 hand I don't want all this scrolling by me on the screen. The previous 
 output of make check was small and clean, and I want it to go back to 
 that, with the other logs available as necessary. The reason the 
 buildfarm client cd's into the regress directory to run make check is 
 to avoid extraneous output.

I agree; this change may have seemed like a good idea at the time, but
it was not.  Failures during make check's install step are rare enough
that you don't really need all that output in your face to help with the
rare situation where it fails.  And for the buildfarm's purposes, it is
surely desirable to segregate that output from the actual check step.

A possible alternative is to run the make install sub-step with -s,
but that could be objected to on the grounds that if it did fail, you'd
have a hard time telling exactly which step failed.

regards, tom lane




Re: [HACKERS] security labels on databases are bad for dump restore

2015-07-21 Thread Noah Misch
On Mon, Jul 20, 2015 at 07:01:14PM -0400, Adam Brightwell wrote:
  Consistency with existing practice would indeed have pg_dump ignore
  pg_shseclabel and have pg_dumpall reproduce its entries.
 
 I think that makes sense, but what about other DATABASE level info
 such as COMMENT?  Should that also be ignored by pg_dump as well?  I'm
 specifically thinking of the discussion from the following thread:
 
 http://www.postgresql.org/message-id/20150317172459.gm3...@alvh.no-ip.org
 
 If COMMENT is included then why not SECURITY LABEL or others?

In any given situation, we should indeed restore both pg_database comments and
pg_database security labels, or we should restore neither.  Restoring neither
is most consistent with history, but several people like the idea of restoring
both.  I won't mind either conclusion.

  In a greenfield, I would make pg_dump --create reproduce pertinent entries
  from datacl, pg_db_role_setting, pg_shseclabel and pg_shdescription.  I 
  would
  make non-creating pg_dump reproduce none of those.
 
 I think the bigger question is Where is the line drawn between
 pg_dump and pg_dumpall?.  At what point does one tool become the
 other?

That question may be too big for me.




Re: [HACKERS] make check changes have caused buildfarm deterioration.

2015-07-21 Thread Andrew Dunstan

On 07/21/2015 01:39 AM, Michael Paquier wrote:

On Tue, Jul 21, 2015 at 6:17 AM, Tom Lane t...@sss.pgh.pa.us wrote:

Andrew Dunstan andrew.duns...@pgexperts.com writes:

Somewhere along the way some changes to the way we do make check have
caused a significant deterioration in the buildfarm's logging. Compare
these two from animal crake, which happens to be my test instance:
http://www.pgbuildfarm.org/cgi-bin/show_stage_log.pl?nm=crakedt=2015-07-20%2013%3A09%3A02stg=check
and
http://www.pgbuildfarm.org/cgi-bin/show_stage_log.pl?nm=crakedt=2015-07-20%2017%3A23%3A08stg=check

Yeah, I've been bitching about the poor logging for awhile, but I had
not realized that it's still working as-expected in the back branches.

Comparing different branches, it looks like somewhere along the way
means between 9.4 and 9.5.  I suspect that commit dcae5faccab64776, or
perhaps the followon dbf2ec1a1c053379, is to blame.

Regarding install.log, the use of stdout/stderr instead of a log file
has been changed in dbf2ec1a after that:
http://www.postgresql.org/message-id/553fe7fc.2040...@gmx.net
Since 9.5 as the location of the temporary installation is global, we
could basically revert some parts of dcae5fac if that helps so as
install.log is saved in $ROOT/tmp_install/log/install.log... But I am
not sure what we win with that, and the argument to remove install.log
is that now the temporary installation is a make target. Both ways
have advantages and disadvantages.



I'm quite unhappy about how we now pollute the output of make check 
with the install log. See the above links for why. And when I run it by 
hand I don't want all this scrolling by me on the screen. The previous 
output of make check was small and clean, and I want it to go back to 
that, with the other logs available as necessary. The reason the 
buildfarm client cd's into the regress directory to run make check is 
to avoid extraneous output.




Regarding initdb.log and postmaster.log, this is definitely a bug.
Those have been moved by dcae5fa from log/ to tmp_check/log/,
tmp_check/ getting removed at the end of pg_regress if there are no
failures counted. Both files will be saved in log/ at the location
pg_regress is called using outputdir, whose default is '.'. This way
behavior is similar to ~9.4. Attached is a patch to fix this for 9.5
and master.




OK, looks sane enough, but please do address the other issue.

cheers

andrew




Re: [HACKERS] Fillfactor for GIN indexes

2015-07-21 Thread Heikki Linnakangas

On 07/21/2015 02:56 PM, Alexander Korotkov wrote:

Probably, but currently we are in quite an illogical situation. We have
a default fillfactor = 90 for GiST, where it has no use cases at all and
is effectively just a waste of space.


Why is it useless for GiST?
- Heikki





Re: [HACKERS] creating extension including dependencies

2015-07-21 Thread Robert Haas
On Mon, Jul 20, 2015 at 10:29 PM, Michael Paquier
michael.paqu...@gmail.com wrote:
 In short I would give up on the DEFAULT SCHEMA business, and
 add a new flag in the control file to decide if a given extension
 passes down the schema name of its child when created in cascade,
 default being true for the potential issues with search_path not
 pointing to public.

Well, so far, it seems like this decision is something where different
DBAs might have different policies.  If you put the flag in the
control file, you're saying it is the extension developer's decision,
which may not be best.

Maybe I'm confused.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Fillfactor for GIN indexes

2015-07-21 Thread Alexander Korotkov
On Tue, Jul 21, 2015 at 3:52 PM, Heikki Linnakangas hlinn...@iki.fi wrote:

 On 07/21/2015 02:56 PM, Alexander Korotkov wrote:

 Probably, but currently we are in quite unlogical situation. We have
 default fillfactor = 90 for GiST where it has no use cases at all and
 effectively is just a waste of space.


 Why is it useless for GiST?


It's because all GiST pages are produced by page splits. So, just after
CREATE INDEX, GiST pages aren't tightly packed on average. Actually, they
could be tightly packed by incredible coincidence, but for large indexes
it's quite a safe assumption that they are not. With GiST we don't get a
storm of page splits after index creation with fillfactor = 100. So, why
should we reserve additional space with fillfactor = 90?
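
One way to check this is to build the same GiST index at both settings and
compare sizes (a sketch; the table and data are only illustrative):

create table pts as select point(random(), random()) as p
  from generate_series(1,1000000);
create index pts_gist_90 on pts using gist (p) with (fillfactor = 90);
create index pts_gist_100 on pts using gist (p) with (fillfactor = 100);
-- \di+ then shows how much extra space fillfactor = 90 reserves; per the
-- argument above, that space is not paid back by fewer splits later.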

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Re: [HACKERS] Selectivity estimation for intarray with @@

2015-07-21 Thread Alexander Korotkov
Hi!

While Uriy is on vacation, I've revised this patch a bit.

On Tue, Jul 14, 2015 at 8:55 PM, Jeff Janes jeff.ja...@gmail.com wrote:

 Hi Uriy,

 This patch looks pretty good.

 The first line of intarray--1.1.sql mis-identifies itself as /*
 contrib/intarray/intarray--1.0.sql */


Fixed.


 The real file intarray--1.0.sql file probably should not be included in
 the final patch, but I like having it for testing.


I've removed intarray--1.0.sql since it shouldn't be in final commit.


 It applies and builds cleanly over the alter operator patch (and now the
 commit as well), passes make check of the contrib module both with and
 without cassert.

 I could succesfully upgrade from version 1.0 to 1.1 without having to drop
 the gin__int_ops indexes in the process.

 I could do pg_upgrade from 9.2 and 9.4 to 9.6devel with large indexes in
 place, and then upgrade the extension to 1.1, and it worked without having
 to rebuild the index.

 It does what it says, and I think we want this.


Good.


 There were some cases where the estimates were not very good, but that
 seems to be limitation of what pg_stats makes available, not of this
 patch.  Specifically if the elements listed in the query text are not part
 of most_common_elems (or worse yet, most_common_elems is null) then it is
 left to guess with no real data, and those guesses can be pretty bad.  It
 is not this patches job to fix that, however.



 It assumes all the stats are independent and so doesn't account for
 correlation between members.  This is also how the core selectivity
 estimates work between columns, so I can't really complain about that.  It
 is easy to trick it with things like @@ '(!300  300)'::query_int, but I
 don't think that is necessary to fix that.


Analyzing all the dependencies inside a query is an NP-complete task. We
could try to work out simple cases, but the postgres optimizer currently
doesn't care about it.

# explain select * from test where val = 'val' and val != 'val';
 QUERY PLAN
-
 Seq Scan on test  (cost=0.00..39831.00 rows=45 width=8)
   Filter: ((val <> 'val'::text) AND (val = 'val'::text))
(2 rows)

I think we could do the same inside intquery until we figure out some
better solution for the postgres optimizer in general.


 I have not been creative enough to come up with queries for which this
 improvement in selectivity estimate causes the execution plan to change in
 important ways.  I'm sure the serious users of this module would have such
 cases, however.

 I haven't tested gist__int_ops as I can't get those indexes to build in a
 feasible amount of time.  But the selectivity estimates are independent of
 any actual index, so testing those doesn't seem to be critical.

 There is no documentation change, which makes sense as this internal stuff
 which isn't documented to start with.


Yes. For instance, tsquery makes a very similar selectivity estimation to
intquery, but it's assumed to be internal and isn't mentioned in the
documentation.

There are no regression test changes.  Not sure about that, it does seem
 like regression tests would be desirable.


It would be nice to cover selectivity estimation with regression tests, but
AFAICS we haven't done that for any selectivity estimation functions yet.
The problem is that selectivity estimation is quite a complex process and
its result depends on random sampling during analyze, floating point
operations and so on. We could make a test like "with very high level of
confidence, the estimated number of rows here should be in [10;100]". But
it's hard to fit such assumptions into our current regression test
infrastructure.
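
If we tried anyway, the check would have to be range-based rather than
exact, something like this sketch (the helper function, the test table and
the [10;100] bounds are all illustrative):

create or replace function estimated_rows(q text) returns bigint as $f$
declare
  plan json;
begin
  -- grab the planner's row estimate from EXPLAIN's JSON output
  execute 'explain (format json) ' || q into plan;
  return (plan->0->'Plan'->>'Plan Rows')::bigint;
end
$f$ language plpgsql;

select estimated_rows('select * from test where val @@ ''300''::query_int')
       between 10 and 100 as estimate_ok;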

I haven't gone through the C code.


I also did some coding style and comments modifications.

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


intarray_sel-2.patch
Description: Binary data



Re: [HACKERS] creating extension including dependencies

2015-07-21 Thread Petr Jelinek

On 2015-07-21 15:48, Tom Lane wrote:

Robert Haas robertmh...@gmail.com writes:

On Mon, Jul 20, 2015 at 10:29 PM, Michael Paquier
michael.paqu...@gmail.com wrote:

In short I would give up on the DEFAULT SCHEMA business, and
add a new flag in the control file to decide if a given extension
passes down the schema name of its child when created in cascade,
default being true for the potential issues with search_path not
pointing to public.



Well, so far, it seems like this decision is something where different
DBAs might have different policies.  If you put the flag in the
control file, you're saying it is the extension developer's decision,
which may not be best.


I have doubts about that too.  But really, why have a flag at all
anywhere?  If we are doing a CASCADE, and the referenced extension needs a
schema, the alternatives are either (1) let it have one, or (2) fail.
I am not seeing that (2) is a superior alternative in any circumstances.

We will need to document that the behavior of CASCADE is "install all
needed extensions into the schema you specify", but what's wrong with
that?  If the user wants to put them in different schemas, he cannot
use CASCADE in any case.



Yes, this is the behavior I want as well. My main question is whether we are
OK with SCHEMA having different behavior with CASCADE vs. without
CASCADE. I went originally with "no" and added the DEFAULT flag to
SCHEMA. If the answer is "yes" then we don't need the flag (in that case
CASCADE acts as the flag).


Of course "yes" would then mean CREATE EXTENSION foo SCHEMA bar will
fail if foo is not relocatable, but CREATE EXTENSION foo SCHEMA bar
CASCADE will succeed and install foo into schema foo instead of
bar, and only relocatable dependencies will go to bar. OTOH
non-relocatable dependencies will go to their own schema no matter what
the user specifies in the command, so I guess it's OK to just document that
this is the behavior of CASCADE. As you say, if somebody wants control
over each individual extension they can't use CASCADE anyway.
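
To make that concrete, a sketch of the two cases being discussed (extension
and schema names are illustrative):

CREATE EXTENSION foo SCHEMA bar;          -- fails if foo is not relocatable
CREATE EXTENSION foo SCHEMA bar CASCADE;  -- succeeds: foo goes to its own
                                          -- schema, relocatable dependencies
                                          -- go to bar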


--
 Petr Jelinek  http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Selectivity estimation for intarray with @@

2015-07-21 Thread Teodor Sigaev

Some notes about the code:

1. Near the function transformOperator() there is a wrong operator name @.
2. int_query (and intquery) should be replaced with query_int to be consistent with
the actual type name, at least where it's used as a separate lexeme.
3. For historical reasons @@ doesn't commutate with itself; it has the commutator
~~. The patch assumes that @@ is its own commutator, but ~~ will use 'contsel' as its
restriction estimator. I suppose we need to declare ~~ deprecated and introduce
a commutating @@ operator.
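
For point 3, the declaration could look roughly like this (a sketch only:
rboolop is intarray's existing query-on-the-left support function, while the
RESTRICT estimator name is just a placeholder for the function this patch
adds):

CREATE OPERATOR @@ (
    LEFTARG = query_int,
    RIGHTARG = _int4,
    PROCEDURE = rboolop,
    COMMUTATOR = @@,
    RESTRICT = _int_matchsel  -- hypothetical name
);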


--
Teodor Sigaev   E-mail: teo...@sigaev.ru
   WWW: http://www.sigaev.ru/




Re: [HACKERS] creating extension including dependencies

2015-07-21 Thread Tom Lane
Petr Jelinek p...@2ndquadrant.com writes:
 ... My main question is whether we are
 OK with SCHEMA having different behavior with CASCADE vs. without
 CASCADE. I went originally with "no" and added the DEFAULT flag to
 SCHEMA. If the answer is "yes" then we don't need the flag (in that case
 CASCADE acts as the flag).

Yeah, I was coming around to that position as well.  Insisting that
SCHEMA throw an error if the extension isn't relocatable makes sense
as long as only one extension is being considered, but once you say
CASCADE it seems like mostly a usability fail.  I think it's probably
OK if with CASCADE, SCHEMA is just "use if needed, else ignore".

Obviously we've gotta document all this properly.

regards, tom lane




Re: [HACKERS] pgbench stats per script other stuff

2015-07-21 Thread Robert Haas
On Fri, Jul 17, 2015 at 9:50 AM, Fabien coe...@cri.ensmp.fr wrote:
   sh ./pgbench -T 3 -B -N -w 2 -S -w 7 --per-script-stats

That is a truly horrifying abuse of command-line arguments.  -1 from
me, or minus more than one if I've got that many chits to burn.

I have been thinking that the way to do this is to push more into the
script file itself, e.g. allow:

\if random() < 0.1
stuff
\else
other stuff
\endif

Maybe that's overkill and there's some way of specifying multiple
scripts on the command line, but IMO what you've got here is not it.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] postgres_fdw extension support

2015-07-21 Thread Paul Ramsey
Here's a third revision that allows the 'extensions' option on the
wrapper as well, so that supported extensions can be declared once in
one place.

Since the CREATE FOREIGN DATA WRAPPER statement is actually run
inside the CREATE EXTENSION script for postgres_fdw, the way to set
this option is to alter the wrapper:

  ALTER FOREIGN DATA WRAPPER postgres_fdw OPTIONS ( extensions 'seg' );

Right now, declaring extensions at different levels is additive; I
didn't add the option to revoke an extension for a particular server
definition (a 'cube,-postgis' entry, for example). If that's a
deal-breaker, I can add it too, but it felt like something that could
wait for a user to positively declare "I must have that feature!"
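
To illustrate the additive behavior, a server that needs more than the 
wrapper-level list can declare its own option and the two lists combine 
(sketch only; the server name is a placeholder):

  -- the wrapper above already declares 'seg'; for this server the
  -- effective list becomes seg plus cube:
  ALTER SERVER my_pg_server OPTIONS ( ADD extensions 'cube' );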

P.


On Fri, Jul 17, 2015 at 5:58 AM, Paul Ramsey pram...@cleverelephant.ca wrote:

 On July 17, 2015 at 5:57:42 AM, Simon Riggs (si...@2ndquadrant.com) wrote:

 Options already exist on CREATE FOREIGN DATA WRAPPER, so it should be easy 
 to support that.

 I'd rather add it once on the wrapper than be forced to list all the options 
 on every foreign server, unless required to do so.

 Gotcha, that does make sense.

 P.
diff --git a/contrib/postgres_fdw/deparse.c b/contrib/postgres_fdw/deparse.c
index 81cb2b4..9c5136e 100644
--- a/contrib/postgres_fdw/deparse.c
+++ b/contrib/postgres_fdw/deparse.c
@@ -34,11 +34,15 @@
 
 #include "postgres_fdw.h"
 
+#include "access/genam.h"
 #include "access/heapam.h"
 #include "access/htup_details.h"
 #include "access/sysattr.h"
 #include "access/transam.h"
+#include "catalog/dependency.h"
+#include "catalog/indexing.h"
 #include "catalog/pg_collation.h"
+#include "catalog/pg_depend.h"
 #include "catalog/pg_namespace.h"
 #include "catalog/pg_operator.h"
 #include "catalog/pg_proc.h"
@@ -49,8 +53,10 @@
 #include "optimizer/var.h"
 #include "parser/parsetree.h"
 #include "utils/builtins.h"
+#include "utils/fmgroids.h"
 #include "utils/lsyscache.h"
 #include "utils/rel.h"
+#include "utils/snapmgr.h"
 #include "utils/syscache.h"
 
 
@@ -136,6 +142,7 @@ static void printRemoteParam(int paramindex, Oid paramtype, int32 paramtypmod,
              deparse_expr_cxt *context);
 static void printRemotePlaceholder(Oid paramtype, int32 paramtypmod,
                    deparse_expr_cxt *context);
+static bool is_in_extension(Oid procid, PgFdwRelationInfo *fpinfo);
 
 
 /*
@@ -229,6 +236,9 @@ foreign_expr_walker(Node *node,
     Oid         collation;
     FDWCollateState state;
 
+    /* Access extension metadata from fpinfo on baserel */
+    PgFdwRelationInfo *fpinfo = (PgFdwRelationInfo *) (glob_cxt->foreignrel->fdw_private);
+
     /* Need do nothing for empty subexpressions */
     if (node == NULL)
         return true;
@@ -361,7 +371,7 @@ foreign_expr_walker(Node *node,
                  * can't be sent to remote because it might have incompatible
                  * semantics on remote side.
                  */
-                if (!is_builtin(fe->funcid))
+                if (!is_builtin(fe->funcid) && !is_in_extension(fe->funcid, fpinfo))
                     return false;
 
                 /*
@@ -407,7 +417,7 @@ foreign_expr_walker(Node *node,
                  * (If the operator is, surely its underlying function is
                  * too.)
                  */
-                if (!is_builtin(oe->opno))
+                if (!is_builtin(oe->opno) && !is_in_extension(oe->opno, fpinfo))
                     return false;
 
                 /*
@@ -445,7 +455,7 @@ foreign_expr_walker(Node *node,
                 /*
                  * Again, only built-in operators can be sent to remote.
                  */
-                if (!is_builtin(oe->opno))
+                if (!is_builtin(oe->opno) && !is_in_extension(oe->opno, fpinfo))
                     return false;
 
                 /*
@@ -591,7 +601,7 @@ foreign_expr_walker(Node *node,
      * If result type of given expression is not built-in, it can't be sent to
      * remote because it might have incompatible semantics on remote side.
      */
-    if (check_type && !is_builtin(exprType(node)))
+    if (check_type && !is_builtin(exprType(node)) && !is_in_extension(exprType(node), fpinfo))
         return false;
 
     /*
@@ -669,6 +679,65 @@ is_builtin(Oid oid)
 
 
 /*
+ * Returns true if given operator/function is part of an extension declared
+ * in the server options.
+ */
+static bool
+is_in_extension(Oid procnumber, PgFdwRelationInfo *fpinfo)
+{
+    static int nkeys = 1;
+    ScanKeyData key[nkeys];
+    HeapTuple tup;
+    Relation depRel;
+    SysScanDesc scan;
+
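
For anyone sanity-checking the logic, the catalog test that 
is_in_extension() performs boils down to roughly this SQL (an 
illustration only; 12345 and 'seg' are placeholders):

  SELECT count(*) > 0 AS is_in_extension
  FROM pg_depend d
  WHERE d.objid = 12345    -- OID of the function/operator being deparsed
    AND d.deptype = 'e'    -- DEPENDENCY_EXTENSION
    AND d.refobjid IN (SELECT oid FROM pg_extension WHERE extname = 'seg');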

Re: [HACKERS] pgbench stats per script & other stuff

2015-07-21 Thread Fabien COELHO


Hello Robert,


On Fri, Jul 17, 2015 at 9:50 AM, Fabien coe...@cri.ensmp.fr wrote:

  sh> ./pgbench -T 3 -B -N -w 2 -S -w 7 --per-script-stats


That is a truly horrifying abuse of command-line arguments.  -1 from
me, or minus more than one if I've got that many chits to burn.


Are you against the -w, or against saying that pgbench executes scripts, 
whether internal or from files?


The former is obviously a matter of taste, and I can remove -w if nobody 
wants it; too bad, because the feature seems useful to me from a testing 
point of view. This is a choice between aesthetics and features. Note that 
you do not have to use it if you do not like it.


The latter really homogenises the code internally and allows things to be 
factored out, keeping features orthogonal (internal scripts are treated the 
same way as external files, which takes fewer lines of code because it is 
simpler), and it does not harm anyone IMO, so it would be sad to let it go.



I have been thinking that the way to do this is to push more into the
script file itself, e.g. allow:

\if random() < 0.1
stuff
\else
other stuff
\endif

Maybe that's overkill and there's some way of specifying multiple
scripts on the command line, but IMO what you've got here is not it.


I think that is overkill, and moreover it is not useful: the point is to 
collect statistics *per script*; with a random \if you would not know 
which part was executed, so you would lose the whole point of having 
per-script stats.


Do you have another suggestion for how to provide weights that relies 
neither on \ifs nor on options? Maybe a special comment in the script (yuk 
from my point of view, because the script would then carry its own weight, 
whereas I think this should be orthogonal to the script contents, but it 
would be better than nothing...).
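
To make the special-comment idea concrete, something like this could be 
parsed from the top of a script file (the annotation syntax below is 
entirely invented, just to show the shape of it):

  -- pgbench-weight: 7
  SELECT abalance FROM pgbench_accounts WHERE aid = 1;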


--
Fabien.


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers