Re: [HACKERS] Some questions about the array.

2015-11-30 Thread YUriy Zhuravlev
On Monday 30 November 2015 08:58:49 you wrote:
> +1  IMO this line of thinking is a dead end.  Better handled via
> functions, not syntax

Maybe then add array_pyslice(start, end), with a zero-based start and support for
negative indexes? Only for 1D arrays.
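
For example, something like this (assuming Python-style semantics, i.e.
zero-based start, exclusive end, negative indexes counted from the end;
the name and behavior are only a sketch):

SELECT array_pyslice(ARRAY[10,20,30,40,50], 1, 3);   -- {20,30}
SELECT array_pyslice(ARRAY[10,20,30,40,50], 0, -1);  -- {10,20,30,40}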
What do you think?

-- 
YUriy Zhuravlev
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company




Re: [HACKERS] On-demand running query plans using auto_explain and signals

2015-11-30 Thread Julien Rouhaud
Hello,

On 15/10/2015 16:04, Pavel Stehule wrote:
> 
> 
> 2015-10-15 15:42 GMT+02:00 Tom Lane:
> 
> "Shulgin, Oleksandr" writes:
> > I was thinking about this and what seems to be the biggest problem is when
> > to actually turn the feature on.  It seems unlikely that someone will want
> > to enable it unconditionally.  Enabling per-backend also doesn't seem to be
> > a good approach because you don't know if the next query you'd like to look
> > at is going to run in this exact backend.
> 
> Check.
> 
> > What might be actually usable is poking pg_stat_statements for queryid to
> > decide if we need to do explain (and possibly analyze).
> 
> Hm, interesting thought.
> 
> > Does this make sense to you?  Does this make a good argument for merging
> > pg_stat_statements and auto_explain into core?
> 
> I'd say more that it's a good argument for moving this feature out to
> one of those extensions, or perhaps building a third extension that
> depends on both of those.  TBH, none of this stuff smells to me like
> something that ought to be in core.
> 
> 
> There are a few features that I would like to see in core:
> 
> 1. possibility to get full SQL string
> 2. possibility to get state string
> 
> We can talk about how to do it well.
> 

I registered as a reviewer on this, but after reading the whole thread for
the second time, it's still not clear to me whether the last two submitted
patches (0001-Add-auto_explain.publish_plans.patch and
0002-Add-SHM-table-of-contents-to-the-explain-DSM.patch) still need review,
since multiple refactoring ideas and objections have been raised in the
meantime.


Regards.

-- 
Julien Rouhaud
http://dalibo.com - http://dalibo.org




Re: [HACKERS] Freeze avoidance of very large table.

2015-11-30 Thread Jeff Janes
On Mon, Nov 30, 2015 at 9:18 AM, Masahiko Sawada  wrote:
> On Sun, Nov 29, 2015 at 2:21 AM, Jeff Janes  wrote:
>> On Tue, Nov 24, 2015 at 3:13 PM, Masahiko Sawada  
>> wrote:
>>>
>>> Yeah, we need to consider to compute checksum if enabled.
>>> I've changed the patch, and attached.
>>> Please review it.
>>
>> Thanks for the update.  This now conflicts with the updates done to
>> fix the pg_upgrade out-of-space issue on Windows. I've fixed (I think) the
>> conflict in order to do some testing, but I'd like to get an updated
>> patch from the author in case I did it wrong.  I don't want to find
>> bugs that I just introduced myself.
>>
>
> Thank you for having a look.
>
> Attached updated v28 patch.
> Please review it.
>
> Regards,

After running pg_upgrade, if I manually vacuum a table I start getting warnings:

WARNING:  page is not marked all-visible (and all-frozen) but
visibility map bit(s) is set in relation "foo" page 32756
WARNING:  page is not marked all-visible (and all-frozen) but
visibility map bit(s) is set in relation "foo" page 32756
WARNING:  page is not marked all-visible (and all-frozen) but
visibility map bit(s) is set in relation "foo" page 32757
WARNING:  page is not marked all-visible (and all-frozen) but
visibility map bit(s) is set in relation "foo" page 32757

The warnings are right where the blocks would start using the 2nd page
of the _vm, so I think the problem is there.  And looking at the code,
I think that "cur += SizeOfPageHeaderData;" in the inner loop cannot
be correct.  We can't skip a header in the current (old) block each
time we reach the end of the new block.  The thing we are skipping in
the current block is half the time not a header, but the data at the
halfway point through the block.

Cheers,

Jeff




Re: [HACKERS] Remaining 9.5 open items

2015-11-30 Thread Stephen Frost
* Alvaro Herrera (alvhe...@2ndquadrant.com) wrote:
> Stephen Frost wrote:
> 
> > The non-documentation question is around DROP OWNED.  We need to either
> > have policies dropped by DROP OWNED (well, roles removed, unless it's
> > the last one, in which case the policy should be dropped), or update the
> > documentation to reflect that they don't.  I had been thinking we'd
> > fix DROP OWNED to deal with the policies, but if folks feel it's too
> > late for that kind of a change, then we can simply document it.  I don't
> > believe that's unreasonable for a new feature and we can work to get it
> > addressed in 9.6.
> 
> DROP OWNED is documented as a mechanism to help you drop the role, so
> it should do whatever is needed for that.  I don't think documenting the
> fact that it leaves the user as part of policies is good enough.

We already can't take care of everything with DROP OWNED, though, since
we can't do cross-database queries, and the overall process almost
certainly requires additional effort (to reassign objects, etc.), so
while I'd be happier if policies were handled by it, I don't think it's
as serious an issue.
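
To make the gap concrete, a minimal sketch (names are made up, and this
assumes the current 9.5 behavior described above, where DROP OWNED does
not touch policies):

CREATE ROLE alice;
CREATE TABLE t1 (id int);                      -- owned by the current user, not alice
ALTER TABLE t1 ENABLE ROW LEVEL SECURITY;
CREATE POLICY p1 ON t1 TO alice USING (true);

DROP OWNED BY alice;   -- does not remove alice from policy p1
DROP ROLE alice;       -- still fails: alice remains referenced by p1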

Still, I'll get a patch worked up for it and then we can discuss the
merits of that patch going in to 9.5 now versus just into HEAD.

Thanks!

Stephen




Re: [HACKERS] Memory prefetching while sequentially fetching from SortTuple array, tuplestore

2015-11-30 Thread Peter Geoghegan
On Sun, Nov 29, 2015 at 10:14 PM, Peter Geoghegan  wrote:
> I'm currently running some benchmarks on my external sorting patch on
> the POWER7 machine that Robert Haas and a few other people have been
> using for some time now [1]. So far, the benchmarks look very good
> across a variety of scales.
>
> I'll run a round of tests without the prefetching enabled (which the
> patch series makes further use of -- they're also used when writing
> tuples out). If there is no significant impact, I'll completely
> abandon this patch, and we can move on.

I took a look at this. It turns out prefetching significantly helps on
the POWER7 system, when sorting gensort tables of 50 million, 100
million, 250 million, and 500 million tuples (3 CREATE INDEX tests for
each case, 1GB maintenance_work_mem):

[pg@hydra gensort]$ cat test_output_patch_1gb.txt | grep "sort ended"
LOG:  external sort ended, 171063 disk blocks used: CPU 4.33s/71.28u
sec elapsed 75.75 sec
LOG:  external sort ended, 171063 disk blocks used: CPU 4.30s/71.32u
sec elapsed 75.91 sec
LOG:  external sort ended, 171063 disk blocks used: CPU 4.29s/71.34u
sec elapsed 75.69 sec
LOG:  external sort ended, 342124 disk blocks used: CPU 8.10s/165.56u
sec elapsed 174.35 sec
LOG:  external sort ended, 342124 disk blocks used: CPU 8.07s/165.15u
sec elapsed 173.70 sec
LOG:  external sort ended, 342124 disk blocks used: CPU 8.01s/164.73u
sec elapsed 174.84 sec
LOG:  external sort ended, 855306 disk blocks used: CPU 23.65s/491.37u
sec elapsed 522.44 sec
LOG:  external sort ended, 855306 disk blocks used: CPU 21.13s/508.02u
sec elapsed 530.48 sec
LOG:  external sort ended, 855306 disk blocks used: CPU 22.63s/475.33u
sec elapsed 499.09 sec
LOG:  external sort ended, 1710613 disk blocks used: CPU
47.99s/1016.78u sec elapsed 1074.55 sec
LOG:  external sort ended, 1710613 disk blocks used: CPU
46.52s/1015.25u sec elapsed 1078.23 sec
LOG:  external sort ended, 1710613 disk blocks used: CPU
44.34s/1013.26u sec elapsed 1067.16 sec
[pg@hydra gensort]$ cat test_output_patch_noprefetch_1gb.txt | grep "sort ended"
LOG:  external sort ended, 171063 disk blocks used: CPU 4.79s/78.14u
sec elapsed 83.03 sec
LOG:  external sort ended, 171063 disk blocks used: CPU 3.85s/77.71u
sec elapsed 81.64 sec
LOG:  external sort ended, 171063 disk blocks used: CPU 3.94s/77.71u
sec elapsed 81.71 sec
LOG:  external sort ended, 342124 disk blocks used: CPU 8.88s/180.15u
sec elapsed 189.69 sec
LOG:  external sort ended, 342124 disk blocks used: CPU 8.30s/179.07u
sec elapsed 187.92 sec
LOG:  external sort ended, 342124 disk blocks used: CPU 8.29s/179.06u
sec elapsed 188.02 sec
LOG:  external sort ended, 855306 disk blocks used: CPU 22.16s/516.86u
sec elapsed 541.35 sec
LOG:  external sort ended, 855306 disk blocks used: CPU 21.66s/513.59u
sec elapsed 538.00 sec
LOG:  external sort ended, 855306 disk blocks used: CPU 22.56s/499.63u
sec elapsed 525.53 sec
LOG:  external sort ended, 1710613 disk blocks used: CPU
45.00s/1062.26u sec elapsed 1118.52 sec
LOG:  external sort ended, 1710613 disk blocks used: CPU
44.42s/1061.33u sec elapsed 1117.27 sec
LOG:  external sort ended, 1710613 disk blocks used: CPU
44.47s/1064.93u sec elapsed 1118.79 sec

For example, the 50 million tuple test has over 8% of its runtime
shaved off. This seems to be a consistent pattern.
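
For reference, each of the results above comes from a run along these
lines (the table and column names are only illustrative of my setup):

SET maintenance_work_mem = '1GB';
SET trace_sort = on;   -- emits the "external sort ended" lines shown above
CREATE INDEX ON gensort_50m (key);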

Note that only the writing of tuples uses prefetching here, because
that happens to be the only affected codepath for prefetching (note
also that this is the slightly different, external-specific version of
the patch). I hesitate to give that up, although it is noticeable that
it matters less at higher scales, where we're bottlenecked on
quicksorting itself, more so than writing. Those costs grow at
different rates, of course.

Perhaps we can consider more selectively applying prefetching in the
context of writing out tuples. After all, the amount of useful work
that we can do pending fetching from memory ought to be more
consistent and manageable, which could make it a consistent win. I
will need to think about this some more.

-- 
Peter Geoghegan




[HACKERS] gincostestimate and hypothetical indexes

2015-11-30 Thread Julien Rouhaud
Hello,

I figured out that it's not possible to use a hypothetical GIN index, as
the gincostestimate function tries to retrieve some statistical data from
the index meta page.

The attached patch fixes this. I believe this should be back-patched, as
a2095f7fb5a57ea1794f25d029756d9a140fd429 was.

Regards.
-- 
Julien Rouhaud
http://dalibo.com - http://dalibo.org
diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c
index 37fad86..24ffa3a 100644
--- a/src/backend/utils/adt/selfuncs.c
+++ b/src/backend/utils/adt/selfuncs.c
@@ -101,6 +101,7 @@
 #include 
 
 #include "access/gin.h"
+#include "access/gin_private.h"
 #include "access/htup_details.h"
 #include "access/sysattr.h"
 #include "catalog/index.h"
@@ -7260,11 +7261,25 @@ gincostestimate(PG_FUNCTION_ARGS)
 	qinfos = deconstruct_indexquals(path);
 
 	/*
-	 * Obtain statistic information from the meta page
+	 * Obtain statistic information from the meta page if the index is not
+	 * hypothetical. Otherwise set all the counters to 0, as it would be for an
+	 * index that never got VACUUMed.
 	 */
-	indexRel = index_open(index->indexoid, AccessShareLock);
-	ginGetStats(indexRel, &ginStats);
-	index_close(indexRel, AccessShareLock);
+	if (!index->hypothetical)
+	{
+		indexRel = index_open(index->indexoid, AccessShareLock);
+		ginGetStats(indexRel, &ginStats);
+		index_close(indexRel, AccessShareLock);
+	}
+	else
+	{
+		ginStats.nPendingPages = 0;
+		ginStats.nTotalPages = 0;
+		ginStats.nEntryPages = 0;
+		ginStats.nDataPages = 0;
+		ginStats.nEntries = 0;
+		ginStats.ginVersion = GIN_CURRENT_VERSION;
+	}
 
 	numEntryPages = ginStats.nEntryPages;
 	numDataPages = ginStats.nDataPages;



Re: [HACKERS] Remaining 9.5 open items

2015-11-30 Thread Stephen Frost
* Tom Lane (t...@sss.pgh.pa.us) wrote:
> Well, it's December nearly, and we don't seem to be making much progress
> towards pushing out 9.5.0.  I see the following items on
> https://wiki.postgresql.org/wiki/PostgreSQL_9.5_Open_Items
> 
> * Open Row-Level Security Issues
> 
> Seems like what's left here is only documentation fixes, but they still
> need to get done.

These are mainly just documentation improvements which I'm working on,
though the docs were recently updated and I need to incorporate Peter's
changes which I wasn't exactly anticipating.

The non-documentation question is around DROP OWNED.  We need to either
have policies dropped by DROP OWNED (well, roles removed, unless it's
the last one, in which case the policy should be dropped), or update the
documentation to reflect that they don't.  I had been thinking we'd
fix DROP OWNED to deal with the policies, but if folks feel it's too
late for that kind of a change, then we can simply document it.  I don't
believe that's unreasonable for a new feature and we can work to get it
addressed in 9.6.

I'm starting to think that just documenting it makes sense for 9.5.  I 
doubt it's going to be a serious issue during 9.5's lifetime.

Thanks!

Stephen




Re: [HACKERS] Remaining 9.5 open items

2015-11-30 Thread Alvaro Herrera
Stephen Frost wrote:

> The non-documentation question is around DROP OWNED.  We need to either
> have policies dropped by DROP OWNED (well, roles removed, unless it's
> the last one, in which case the policy should be dropped), or update the
> documentation to reflect that they don't.  I had been thinking we'd
> fix DROP OWNED to deal with the policies, but if folks feel it's too
> late for that kind of a change, then we can simply document it.  I don't
> believe that's unreasonable for a new feature and we can work to get it
> addressed in 9.6.

DROP OWNED is documented as a mechanism to help you drop the role, so
it should do whatever is needed for that.  I don't think documenting the
fact that it leaves the user as part of policies is good enough.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] custom function for converting human readable sizes to bytes

2015-11-30 Thread Pavel Stehule
Hi


> 2. using independent implementation - there is some redundant code, but we
> can support double instead of int, and we can support some additional units.
>

The new patch is based on its own implementation - it uses numeric/bigint
calculations - plus regression tests and documentation.
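
Example of the intended usage (the expected results follow from the binary
multipliers used in the patch, 1kB = 1024 bytes):

SELECT pg_size_bytes('1536 MB');   -- expected: 1610612736
SELECT pg_size_bytes('10GB');      -- expected: 10737418240
SELECT pg_size_bytes('100');       -- expected: 100 (no unit means bytes)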

Regards

Pavel
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
new file mode 100644
index 60b9a09..c94743e
*** a/doc/src/sgml/func.sgml
--- b/doc/src/sgml/func.sgml
*** postgres=# SELECT * FROM pg_xlogfile_nam
*** 17607,17612 
--- 17607,17615 
  pg_relation_size
 
 
+ pg_size_bytes
+
+
  pg_size_pretty
 
 
*** postgres=# SELECT * FROM pg_xlogfile_nam
*** 17677,17682 
--- 17680,17695 
 


+
+ pg_size_bytes(text)
+ 
+bigint
+
+  Converts a size in human-readable format with size units
+  to bytes
+
+   
+   
 
  pg_size_pretty(bigint)
  
diff --git a/src/backend/utils/adt/dbsize.c b/src/backend/utils/adt/dbsize.c
new file mode 100644
index 5ee59d0..a87c6a7
*** a/src/backend/utils/adt/dbsize.c
--- b/src/backend/utils/adt/dbsize.c
***
*** 25,36 
--- 25,59 
  #include "storage/fd.h"
  #include "utils/acl.h"
  #include "utils/builtins.h"
+ #include "utils/guc.h"
  #include "utils/numeric.h"
  #include "utils/rel.h"
  #include "utils/relfilenodemap.h"
  #include "utils/relmapper.h"
  #include "utils/syscache.h"
  
+ #define MAX_UNIT_LEN		3
+ #define MAX_DIGITS		20
+ 
+ typedef struct
+ {
+ 	char		unit[MAX_UNIT_LEN + 1];
+ 	long int	multiplier;
+ } unit_multiplier;
+ 
+ static const unit_multiplier unit_multiplier_table[] =
+ {
+ 	{"PB", 1024L * 1024 * 1024 * 1024 * 1024},
+ 	{"TB", 1024L * 1024 * 1024 * 1024},
+ 	{"GB", 1024L * 1024 * 1024},
+ 	{"MB", 1024L * 1024},
+ 	{"kB", 1024L},
+ 	{"B", 1L},
+ 
+ 	{""}/* end of table marker */
+ };
+ 
+ 
  /* Divide by two and round towards positive infinity. */
  #define half_rounded(x)   (((x) + ((x) < 0 ? 0 : 1)) / 2)
  
*** pg_size_pretty_numeric(PG_FUNCTION_ARGS)
*** 700,705 
--- 723,830 
  }
  
  /*
+  * Convert human readable size to long int.
+  *
+  * cannot to use parse_int due too low limits there
+  */
+ Datum
+ pg_size_bytes(PG_FUNCTION_ARGS)
+ {
+ 	text	   *arg = PG_GETARG_TEXT_PP(0);
+ 	const char	   *cp = text_to_cstring(arg);
+ 	char		digits[MAX_DIGITS + 1];
+ 	int		ndigits = 0;
+ 	Numeric			num;
+ 	int64 result;
+ 
+ 	/* Skip leading spaces */
+ 	while (isspace((unsigned char) *cp))
+ 		cp++;
+ 
+ 	switch (*cp)
+ 	{
+ 		case '+':
+ 			cp++;
+ 			break;
+ 		case '-':
+ 			ereport(ERROR,
+ 	(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ 	 errmsg("size cannot be negative")));
+ 	}
+ 
+ 	while (*cp)
+ 	{
+ 		if ((isdigit(*cp) || *cp == '.') && ndigits < MAX_DIGITS)
+ 			digits[ndigits++] = *cp++;
+ 		else
+ 			break;
+ 	}
+ 
+ 	if (ndigits == 0)
+ 		ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ 	 errmsg("string is empty")));
+ 
+ 	digits[ndigits] = '\0';
+ 	num = DatumGetNumeric(DirectFunctionCall3(numeric_in,
+ 			CStringGetDatum(digits), 0, -1));
+ 
+ 	/* allow whitespace between integer and unit */
+ 	while (isspace((unsigned char) *cp))
+ 		cp++;
+ 
+ 	/* Handle possible unit */
+ 	if (*cp != '\0')
+ 	{
+ 		char		unit[MAX_UNIT_LEN + 1];
+ 		int		unitlen = 0;
+ 		int		i;
+ 		long int	multiplier;
+ 		bool			found = false;
+ 		Numeric			mul_num;
+ 
+ 
+ 		while (*cp && !isspace(*cp) && unitlen < MAX_UNIT_LEN)
+ 			unit[unitlen++] = *cp++;
+ 
+ 		unit[unitlen] = '\0';
+ 
+ 		/* allow whitespace after unit */
+ 		while (isspace((unsigned char) *cp))
+ 			cp++;
+ 
+ 		if (*cp != '\0')
+ 			ereport(ERROR,
+ 	(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ 	 errmsg("invalid format")));
+ 
+ 		for (i = 0; *unit_multiplier_table[i].unit; i++)
+ 		{
+ 			if (strcmp(unit, unit_multiplier_table[i].unit) == 0)
+ 			{
+ multiplier = unit_multiplier_table[i].multiplier;
+ found = true;
+ break;
+ 			}
+ 		}
+ 
+ 		if (!found)
+ 			ereport(ERROR,
+ 	(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ 	 errmsg("unknown unit \"%s\"", unit)));
+ 
+ 		mul_num = DatumGetNumeric(DirectFunctionCall1(int8_numeric, Int64GetDatum(multiplier)));
+ 		num = DatumGetNumeric(DirectFunctionCall2(numeric_mul,
+ 			NumericGetDatum(mul_num),
+ 			NumericGetDatum(num)));
+ 	}
+ 
+ 	result = DatumGetInt64(DirectFunctionCall1(numeric_int8, NumericGetDatum(num)));
+ 
+ 	PG_RETURN_INT64(result);
+ }
+ 
+ /*
   * Get the filenode of a relation
   *
   * This is expected to be used in queries like
diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h
new file mode 100644
index d8640db..b68c8fa
*** a/src/include/catalog/pg_proc.h
--- b/src/include/catalog/pg_proc.h
*** DATA(insert OID = 2286 ( pg_total_relati
*** 3662,3667 
--- 3662,3669 
  DESCR("total disk space usage for the specified 

Re: [HACKERS] parallel joins, and better parallel explain

2015-11-30 Thread Greg Stark
On Mon, Nov 30, 2015 at 4:52 PM, Robert Haas  wrote:
> Not only does this build only one copy of the hash table instead of N
> copies, but we can parallelize the hash table construction itself by
> having all workers insert in parallel, which is pretty cool.

Hm. The case where you don't want parallel building of the hash table
might be substantially simpler. You could build a hash table and then
copy it into shared memory as a single contiguous read-only data
structure optimized for lookups. I have an inkling that there are even
ways of marking the memory as being read-only and not needing cache
synchronization.



-- 
greg




Re: [HACKERS] parallel joins, and better parallel explain

2015-11-30 Thread Robert Haas
On Thu, Nov 26, 2015 at 3:45 AM, Simon Riggs  wrote:
> Sounds like good progress.

Thanks.

> This gives us multiple copies of the hash table, which means we must either
> use N * work_mem, or we must limit the hash table to work_mem / N per
> partial plan.

We use N * work_mem in this patch.  The other option would be a lot
more expensive in terms of planning time, because we'd have to
generate one set of hash join paths (or at least hash join costs) for
use in serial plans and another set for use in parallel plans.  As it
is, the parallel stuff just piggybacks on the plans we're generating
anyway.  We might have to change that at some point, but I think we'll
do well to put that point off as long as possible.

> How can the partial paths share a hash table?

I'm glad you asked that question.  For that, we need an allocator that
can work with dynamic shared memory segments, and a hash table built
on top of that allocator.  It's relatively complicated because the
same DSM segments can be mapped at different addresses in different
processes, so you can't use native pointers.  However, I'm pleased to
report that my colleague Thomas Munro is working on this problem, and
that he will submit this work to pgsql-hackers when it's ready, which
may be soon enough that we can consider including this in 9.6 if the
design meets with general approval.

As you may recall, I previously proposed an allocator framework,
somewhat of a work in progress at that time, and the reaction here was
a bit lukewarm, which is why I shifted from parallel sort to parallel
seq scan as a first project.  I now think that was a good decision,
and as a result of Peter Geoghegan's work on sorting and my own
experience further developing the parallelism code, I no longer think
that allocator is the right solution for parallel sort anyway. But I
still think it might be the right solution for a parallel hash join
with a shared hash table.

My idea is that you'd end up with a plan like this:

Gather
-> Hash Join
  -> Parallel Seq Scan
  -> Parallel Hash
-> Parallel Seq Scan

Not only does this build only one copy of the hash table instead of N
copies, but we can parallelize the hash table construction itself by
having all workers insert in parallel, which is pretty cool.  However,
I don't expect this to be better than an unshared hash table in all
cases.  We have a fair amount of evidence that accessing
backend-private data structures can sometimes be much faster than
accessing shared data structures - cf. not only the syscaches and
relcache, but also the use of Materialize nodes by the planner in
certain cases even when there are no filtering quals underneath.  So
there's probably going to be a need to consider both types of plans
and decide between them based on memory usage and expected runtime.
The details are not all clear to me yet, and most likely we'll have to
postpone some of the decisions until the code is written and can be
performance-tested to see in which cases one strategy performs better
or worse than the other.  What I can confirm at this point is that
I've thought about the problem you're asking about here, and that
EnterpriseDB intends to contribute code to address it.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Freeze avoidance of very large table.

2015-11-30 Thread Masahiko Sawada
On Sun, Nov 29, 2015 at 2:21 AM, Jeff Janes  wrote:
> On Tue, Nov 24, 2015 at 3:13 PM, Masahiko Sawada  
> wrote:
>>
>> Yeah, we need to consider to compute checksum if enabled.
>> I've changed the patch, and attached.
>> Please review it.
>
> Thanks for the update.  This now conflicts with the updates done to
> fix the pg_upgrade out-of-space issue on Windows. I've fixed (I think) the
> conflict in order to do some testing, but I'd like to get an updated
> patch from the author in case I did it wrong.  I don't want to find
> bugs that I just introduced myself.
>

Thank you for having a look.

Attached updated v28 patch.
Please review it.

Regards,

--
Masahiko Sawada


000_add_frozen_bit_into_visibilitymap_v28.patch
Description: Binary data



Re: [HACKERS] parallel joins, and better parallel explain

2015-11-30 Thread Robert Haas
On Mon, Nov 30, 2015 at 12:01 PM, Greg Stark  wrote:
> On Mon, Nov 30, 2015 at 4:52 PM, Robert Haas  wrote:
>> Not only does this build only one copy of the hash table instead of N
>> copies, but we can parallelize the hash table construction itself by
>> having all workers insert in parallel, which is pretty cool.
>
> Hm. The case where you don't want parallel building of the hash table
> might be substantially simpler. You could build a hash table and then
> copy it into shared memory as a single contiguous read-only data
> structure optimized for lookups. I have an inkling that there are even
> ways of marking the memory as being read-only and not needing cache
> synchronization.

Yes, that's another approach that we could consider.  I suspect it's
not really a lot better than the parallel-build case.  If the inner
table is small, then it's probably best to have every worker build its
own unshared copy of the table rather than having one worker build the
table and everybody else wait, which might lead to stalls during the
build phase and additional traffic on the memory bus during the probe
phase (though, as you say, giving the kernel a hint could help in some
cases).  If the inner table is big, then having everybody wait for a
single process to perform the build probably sucks.

But it's not impossible that there could be cases when it trumps every
other strategy.  For example, if you're going to be doing a huge
number of probes, you could try building the hash table with several
different hash functions until you find one that is collision-free or
nearly so, and then use that one.  The extra effort spent during the
build phase might speed up the probe phase enough to win.  You can't
do that sort of thing so easily in a parallel build.  Even apart from
that, if you build the hash table locally first and then copy it into
shared memory afterwards, you can free up any extra memory and use
only the minimal amount that you really need, which could be
beneficial in some cases.  I'm just not sure that's appealing enough
to justify carrying a third system for building hash tables for hash
joins.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] [patch] Proposal for \rotate in psql

2015-11-30 Thread Daniel Verite
Pavel Stehule wrote:

> [ \rotate being a wrong name ]

Here's an updated patch.

First it renames the command to \crosstabview, which hopefully may
be more consensual; at least it's semantically much closer to crosstab.

> The important question is sorting output. The vertical header is
> sorted by first appearance in result. The horizontal header is
> sorted in ascending or descending order. This is unfriendly for
> often use case - month names. This can be solved by third parameter
> - sort function.

I've thought that sorting with an external function would be too
complicated for this command, but sorting ascending by default
was not the right choice either.
So I've changed to sorting by first appearance in result (like the vertical
header), and sorting ascending or descending only when specified
(with +colH or -colH syntax).

So the synopsis becomes: \crosstabview [ colV [+ | -]colH ]

Example with a time series (daily mean temperatures in Paris,2014),
month names across, day numbers down :

select
  to_char(w_date,'DD') as day ,
  to_char(w_date,'Mon') as month,
  w_temp from weather 
  where w_date between '2014-01-01' and '2014-12-31'
  order by w_date
\crosstabview

 day | Jan | Feb | Mar | Apr | May | Jun | ...[cut]
-+-+-+-+-+-+-+-
 01  |   8 |   8 |   6 |  16 |  12 |  15 |
 02  |  10 |   6 |   6 |  15 |  12 |  16 |
 03  |  11 |   5 |   7 |  14 |  11 |  17 |
 04  |  10 |   6 |   8 |  12 |  12 |  14 |
 05  |   6 |   7 |   8 |  14 |  16 |  14 |
 06  |  10 |   9 |   9 |  16 |  17 |  20 |
 07  |  11 |  10 |  10 |  18 |  14 |  24 |
 08  |  11 |   8 |  12 |  10 |  13 |  22 |
 09  |  10 |   6 |  14 |  12 |  16 |  22 |
 10  |   6 |   7 |  14 |  14 |  14 |  19 |
 11  |   7 |   6 |  12 |  14 |  12 |  21 |
...cut..
 28  |   4 |   7 |  10 |  12 |  14 |  16 |
 29  |   4 | |  14 |  10 |  15 |  16 |
 30  |   5 | |  14 |  14 |  17 |  18 |
 31  |   5 | |  14 | |  16 | |

The month names come out in the expected order here,
contrary to what happened with the previous iteration of
the patch which forced a sort in all cases.
Here it plays out well because the single "ORDER BY w_date" is
simultaneously OK for the vertical and horizontal headers,
a common case for time series.

For more complicated cases, when the horizontal and vertical
headers should be ordered independently, and
in addition the horizontal header should not be sorted
by its values, I've toyed with the idea of sorting by another
column which would supposedly be added in the query
just for sorting, but it loses much in simplicity. For the more
complex stuff, users can always turn to the server-side methods
if needed.

Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite
diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml
index 5899bb4..3836fbd 100644
--- a/doc/src/sgml/ref/psql-ref.sgml
+++ b/doc/src/sgml/ref/psql-ref.sgml
@@ -2449,6 +2449,95 @@ lo_import 152801
 
   
 
+  
+\crosstabview [ colV  [-|+]colH ] 
+
+
+Execute the current query buffer (like \g) and shows the results
+inside a crosstab grid. The output column colV
+becomes a vertical header and the output column
+colH becomes a horizontal header.
+The results for the other output columns are projected inside the grid.
+
+
+
+colV
+and colH can indicate a
+column position (starting at 1), or a column name. Normal case folding
+and quoting rules apply on column names. By default,
+colV is column 1
+and colH is column 2.
+A query having only one output column cannot be viewed in crosstab, and
+colH must differ from
+colV.
+
+
+
+The vertical header, displayed as the leftmost column,
+contains the set of all distinct values found in
+column colV, in the order
+of their first appearance in the query results.
+
+
+The horizontal header, displayed as the first row,
+contains the set of all distinct non-null values found in
+column colH.  They come
+by default in their order of appearance in the query results, or in ascending
+order if a plus (+) sign precedes colH,
+or in descending order if it's a minus (-) sign.
+
+
+
+The query results being tuples of N columns
+(including colV and
+colH),
+for each distinct value x of
+colH
+and each distinct value y of
+colV,
+a cell located at the intersection (x,y) in the grid
+has contents determined by these rules:
+
+
+
+ if there is no corresponding row in the results such that the value
+ for colH
+ is x and the value
+ for colV
+ is y, the cell is empty.
+
+
+
+
+

[HACKERS] Logical replication and multimaster

2015-11-30 Thread Konstantin Knizhnik

Hello all,

We have implemented ACID multimaster based on logical replication and 
our DTM (distributed transaction manager) plugin.

The good news is that it works and no inconsistency is detected.
But unfortunately it is very, very slow...

On standalone PostgreSQL I am able to achieve about 30k TPS with 10
clients performing simple debit-credit transactions.
With a multimaster consisting of three nodes spawned on the same
system I got about 100 (one hundred) TPS.

There are two main reasons for such awful performance:

1. Logical replication serializes all transactions: there is a single
connection between the wal-sender and the receiver BGW.

2. 2PC synchronizes transaction commit at all nodes.

Neither of these two reasons is a showstopper by itself.
If we remove the DTM and do asynchronous logical replication, the
performance of the multimaster increases to 6000 TPS
(please note that in this test all multimaster nodes are spawned on the
same system, sharing its resources, so 6k is not a bad result compared
with 30k on a standalone system).
And according to 2ndQuadrant's results, BDR performance is very close to
hot standby.


On the other hand, our previous experiments with DTM showed only about a
2x slowdown compared with vanilla PostgreSQL.

But the result of combining DTM and logical replication is frustrating.

I wonder whether this is a fundamental limitation of the logical
replication approach, which is efficient only for asynchronous
replication, or whether it can somehow be tuned/extended to efficiently
support synchronous replication?


We have also considered alternative approaches:
1. Statement based replication.
2. Trigger-based replication.
3. Replication using custom nodes.

With statement-based replication it is hard to guarantee identical data
at different nodes.
Approaches 2 and 3 are much harder to implement and require "reinventing"
a substantial part of logical replication.
They also require some kind of connection pool which can be used to send
replicated transactions to the peer nodes (to avoid serializing parallel
transactions, as happens with logical replication).


But it looks like there is not much sense in having multiple network
connections between one pair of nodes.
It seems better to have one connection between nodes, but provide
parallel execution of the received transactions on the destination side.
That also seems nontrivial, though. We now have some infrastructure for
background workers in PostgreSQL, but there is still no abstraction of a
worker pool and job queue which could provide a simple way to organize
parallel execution of jobs. I wonder if somebody is working on this now,
or should we try to propose our solution?


Best regards,
Konstantin






Re: [HACKERS] Some questions about the array.

2015-11-30 Thread Merlin Moncure
On Thu, Nov 26, 2015 at 6:08 AM, Teodor Sigaev  wrote:
> YUriy Zhuravlev wrote:
>>
>> On Friday 06 November 2015 12:55:44 you wrote:
>>>
>>> Omitted bounds are common in other languages and would be handy. I
>>> don't think they'd cause any issues with multi-dimensional arrays or
>>> variable start-pos arrays.
>>
>>
>> And yet, what about my patch?
>
> My vote: let us do it, I mean omitting bounds. It simplifies the syntax of
> rather popular queries.

+1  useful and intuitive
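
e.g., my reading of the proposed syntax (not committed behavior):

SELECT (ARRAY[1,2,3,4,5])[2:];   -- upper bound omitted: {2,3,4,5}
SELECT (ARRAY[1,2,3,4,5])[:3];   -- lower bound omitted: {1,2,3}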

>> Discussions about ~ and {:} - it seems optional.
>
> ~ is allowed as a unary operator and therefore such syntax would introduce
> incompatibility/ambiguity.

+1  IMO this line of thinking is a dead end.  Better handled via
functions, not syntax

merlin




Re: [HACKERS] Using quicksort for every external sort run

2015-11-30 Thread Jeff Janes
On Sat, Nov 28, 2015 at 4:05 PM, Peter Geoghegan  wrote:
> On Sat, Nov 28, 2015 at 2:04 PM, Jeff Janes  wrote:
...
>>
>> The final merging is intermixed with whatever other work goes on to
>> build the actual index files out of the sorted data, so I don't know
>> exactly what the timing of just the merge part was.  But it was
>> certainly a minority of the time, even if you assume the actual index
>> build were free.  For the patched code, the majority of the time goes
>> to the quick sorting stages.
>
> I'm not sure what you mean here.

I had no point to make here, I was just trying to answer one of your
questions about how much time was spent merging. I don't know, because
it is interleaved with and not separately instrumented from the index
build.

>
> I would generally expect that the merge phase takes significantly less
> than sorting runs, regardless of how we sort runs, unless parallelism
> is involved, where merging could dominate. The master branch has a
> faster merge step, at least proportionally, because it has larger
> runs.
>
>> When I test each version of the code at its own most efficient
>> maintenance_work_mem, I get
>> 3007.2 seconds at 1GB for patched and 3836.46 seconds at 64MB for unpatched.
>
> As I said, it seems a little bit unfair to hand-tune work_mem or
> maintenance_work_mem like that. Who can afford to do that? I think you
> agree that it's untenable to have DBAs allocate work_mem differently
> for cases where an internal sort or external sort is expected;
> workloads are just far too complicated and changeable.

Right, I agree with all that.  But I think it is important to know
where the benefits come from.  It looks like about half comes from
being more robust to overly-large memory usage, and half from absolute
improvements which you get at each implementation's own best setting.
Also, if someone had previously restricted work_mem (or more likely
maintenance_work_mem) simply to avoid the large memory penalty, they
need to know to revisit that decision. Although they still don't get
any actual benefit from using too much memory, just a reduced penalty.

I'm kind of curious as to why the optimal for the patched code appears
at 1GB and not lower.  If I get a chance to rebuild the test, I will
look into that more.


>
>> I'm attaching the trace_sort output from the client log for all 4 of
>> those scenarios.  "sort_0005" means all 5 of your patches were
>> applied, "origin" means none of them were.
>
> Thanks for looking at this. This is very helpful. It looks like the
> server you used here had fairly decent disks, and that we tended to be
> CPU bound more often than not. That's a useful testing ground.

It has a Perc H710 RAID controller with 15,000 RPM drives, but it is
also a virtualized system that has other stuff going on.  The disks
are definitely better than your average household computer, but I
don't think they are anything special as far as real database hardware
goes.  It is hard to saturate the disks for sequential reads.  It will
be interesting to see what parallel builds can do.


What would be next in reviewing the patches?  Digging into the C-level
implementation?

Cheers,

Jeff




Re: [HACKERS] Freeze avoidance of very large table.

2015-11-30 Thread Bruce Momjian
On Mon, Nov 30, 2015 at 10:48:04PM +0530, Masahiko Sawada wrote:
> On Sun, Nov 29, 2015 at 2:21 AM, Jeff Janes  wrote:
> > On Tue, Nov 24, 2015 at 3:13 PM, Masahiko Sawada  
> > wrote:
> >>
> >> Yeah, we need to consider to compute checksum if enabled.
> >> I've changed the patch, and attached.
> >> Please review it.
> >
> > Thanks for the update.  This now conflicts with the updates doesn to
> > fix pg_upgrade out-of-space issue on Windows. I've fixed (I think) the
> > conflict in order to do some testing, but I'd like to get an updated
> > patch from the author in case I did it wrong.  I don't want to find
> > bugs that I just introduced myself.
> >
> 
> Thank you for having a look.

I would not bother mentioning this detail in the pg_upgrade manual page:

+   Since the format of visibility map has been changed in version 9.6,
+   pg_upgrade creates and rewrite new '_vm'
+   file even if upgrading from 9.5 or before to 9.6 or later with link mode 
(-k).

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

+ As you are, so once was I. As I am, so you will be. +
+ Roman grave inscription +




Re: [HACKERS] Minor comment edits in nodeGather.c

2015-11-30 Thread Robert Haas
On Tue, Nov 24, 2015 at 9:43 PM, Amit Langote
 wrote:
> On 2015/11/25 11:31, Robert Haas wrote:
>> On Tue, Nov 24, 2015 at 1:06 AM, Amit Langote
>>  wrote:
>>> While going through nodeGather.c, I noticed portions of the file header
>>> comment that may have been obsoleted by recent revisions of the relevant
>>> parallelism code. For example, there is a reference to PartialSeqScan node
>>> which did not make it into the tree. Attached fixes it. Also, wondering if
>>> the semantics of Gather node is that of Scan or more generic Plan? That is
>>> to ask whether the following edit makes sense:
>>>
>>>   * nodeGather.c
>>> - *   Support routines for scanning a plan via multiple workers.
>>> + *   Support routines for getting the result from a plan via multiple
>>> + *   workers.
>>>   *
>>
>> Well I think "scanning a plan" is clear enough even if it's
>> technically a Scan.
>
> Okay, ripped that out in the attached.

Committed, thanks.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Using quicksort for every external sort run

2015-11-30 Thread Jeff Janes
On Sun, Nov 29, 2015 at 8:02 PM, David Fetter  wrote:
>>
>> For me very large sorts (100,000,000 ints) with work_mem below 4MB do
>> better with unpatched than with your patch series, by about 5%.  Not a
>> big deal, but also if it is easy to keep the old behavior then I think
>> we should.  Yes, it is dumb to do large sorts with work_mem below 4MB,
>> but if you have canned apps which do a mixture of workloads it is not
>> so easy to micromanage their work_mem.  Especially as there are no
>> easy tools that let me as the DBA say "if you connect from this IP
>> address, you get this work_mem".
>
> That's certainly doable with pgbouncer, for example.

I had not considered that.  How would you do it with pgbouncer?  The
thing I can think of would be to put it in server_reset_query, which
doesn't seem correct.


>  What would you
> have in mind for the more general capability?  It seems to me that
> bloating up pg_hba.conf would be undesirable, but maybe I'm picturing
> this as bigger than it actually needs to be.

I would envision something like "ALTER ROLE set ..." only for
application_name and IP address instead of ROLE.  I have no idea how I
would implement that, it is just how I would like to use it as the end
user.
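
For reference, the closest existing mechanisms are per-role and per-database
settings (names below are made up); what I'm describing is the same idea
keyed on application_name or client IP address, which doesn't exist today:

ALTER ROLE batch_jobs SET work_mem = '64MB';
ALTER DATABASE reporting SET maintenance_work_mem = '1GB';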

Cheers,

Jeff




Re: [HACKERS] Additional role attributes && superuser review

2015-11-30 Thread Stephen Frost
* Robert Haas (robertmh...@gmail.com) wrote:
> On Fri, Nov 20, 2015 at 12:29 PM, Stephen Frost  wrote:
> > * Michael Paquier (michael.paqu...@gmail.com) wrote:
> >> On Thu, Nov 19, 2015 at 7:10 AM, Stephen Frost wrote:
> >> > * Michael Paquier (michael.paqu...@gmail.com) wrote:
> >> >> It seems weird to not have a dedicated role for pg_switch_xlog.
> >> >
> >> > I didn't add a pg_switch_xlog default role in this patch series, but
> >> > would be happy to do so if that's the consensus.  It's quite easy to do.
> >>
> >> Agreed. I am not actually getting why that's part of the backup
> >> actually. That would be more related to archiving, both being
> >> unrelated concepts. But at this point I guess that's mainly a
> >> philosophical split.
> >
> > As David notes, they're actually quite related.  Note that in our
> > documentation pg_switch_xlog() is listed in the "Backup Control
> > Functions" table.
> >
> > I can think of a use-case for a user who can call pg_switch_xlog, but
> > not pg_start_backup()/pg_stop_backup(), but I have to admit that it
> > seems rather limited and I'm on the fence about it being a worthwhile
> > distinction.
> 
> Sounds too narrow to me.  Are we going to have a separate predefined
> role for every security-restricted function to which someone might
> want to grant access?  That seems over the top to me.

I certainly don't want to go down to that level and was, as seen above,
unsure about having pg_switch_xlog() as a differentiated privilege.
Michael, do you still see that as a useful independent capability?

> I don't think we should make it our goal to completely eliminate the
> use of SECURITY DEFINER functions for privilege delegation.  Of
> course, being able to grant privileges directly is nicer, because then
> the client code doesn't have to know about it.  But I think it's OK,
> even good, if the predefined roles cater to the common cases, and the
> less common cases aren't handled quite as elegantly.

Agreed.

Thanks!

Stephen




Re: [HACKERS] [patch] Proposal for \rotate in psql

2015-11-30 Thread Pavel Stehule
2015-11-30 16:34 GMT+01:00 Daniel Verite :

> Pavel Stehule wrote:
>
> > [ \rotate being a wrong name ]
>
> Here's an updated patch.
>
> First it renames the command to \crosstabview, which hopefully may
> be more consensual; at least it's semantically much closer to crosstab.
>

thank you very much :)

>
> > The important question is sorting output. The vertical header is
> > sorted by first appearance in result. The horizontal header is
> > sorted in ascending or descending order. This is unfriendly for
> > often use case - month names. This can be solved by third parameter
> > - sort function.
>
> I've thought that sorting with an external function would be too
> complicated for this command, but sorting ascending by default
> was not the right choice either.
> So I've changed to sorting by first appearance in result (like the vertical
> header), and sorting ascending or descending only when specified
> (with +colH or -colH syntax).
>
> So the synopsis becomes: \crosstabview [ colV [+ | -]colH ]
>
> Example with a time series (daily mean temperatures in Paris,2014),
> month names across, day numbers down :
>
> select
>   to_char(w_date,'DD') as day ,
>   to_char(w_date,'Mon') as month,
>   w_temp from weather
>   where w_date between '2014-01-01' and '2014-12-31'
>   order by w_date
> \crosstabview
>
>  day | Jan | Feb | Mar | Apr | May | Jun | ...[cut]
> -+-+-+-+-+-+-+-
>  01  |   8 |   8 |   6 |  16 |  12 |  15 |
>  02  |  10 |   6 |   6 |  15 |  12 |  16 |
>  03  |  11 |   5 |   7 |  14 |  11 |  17 |
>  04  |  10 |   6 |   8 |  12 |  12 |  14 |
>  05  |   6 |   7 |   8 |  14 |  16 |  14 |
>  06  |  10 |   9 |   9 |  16 |  17 |  20 |
>  07  |  11 |  10 |  10 |  18 |  14 |  24 |
>  08  |  11 |   8 |  12 |  10 |  13 |  22 |
>  09  |  10 |   6 |  14 |  12 |  16 |  22 |
>  10  |   6 |   7 |  14 |  14 |  14 |  19 |
>  11  |   7 |   6 |  12 |  14 |  12 |  21 |
> ...cut..
>  28  |   4 |   7 |  10 |  12 |  14 |  16 |
>  29  |   4 | |  14 |  10 |  15 |  16 |
>  30  |   5 | |  14 |  14 |  17 |  18 |
>  31  |   5 | |  14 | |  16 | |
>
> The month names come out in the expected order here,
> contrary to what happened with the previous iteration of
> the patch which forced a sort in all cases.
> Here it plays out well because the single "ORDER BY w_date" is
> simultaneously OK for the vertical and horizontal headers,
> a common case for time series.
>
> For more complicated cases, when the horizontal and vertical
> headers should be ordered independently, and
> in addition the horizontal header should not be sorted
> by its values, I've toyed with the idea of sorting by another
> column which would supposedly be added in the query
> just for sorting, but it loses much in simplicity. For the more
> complex stuff, users can always turn to the server-side methods
> if needed.
>
>
it is looking well

I'll do review tomorrow

Regards

Pavel


> Best regards,
> --
> Daniel Vérité
> PostgreSQL-powered mailer: http://www.manitou-mail.org
> Twitter: @DanielVerite
>


Re: [HACKERS] Additional role attributes && superuser review

2015-11-30 Thread Robert Haas
On Fri, Nov 20, 2015 at 12:29 PM, Stephen Frost  wrote:
> * Michael Paquier (michael.paqu...@gmail.com) wrote:
>> On Thu, Nov 19, 2015 at 7:10 AM, Stephen Frost wrote:
>> > * Michael Paquier (michael.paqu...@gmail.com) wrote:
>> >> It seems weird to not have a dedicated role for pg_switch_xlog.
>> >
>> > I didn't add a pg_switch_xlog default role in this patch series, but
>> > would be happy to do so if that's the consensus.  It's quite easy to do.
>>
>> Agreed. I am not actually getting why that's part of the backup
>> actually. That would be more related to archiving, both being
>> unrelated concepts. But at this point I guess that's mainly a
>> philosophical split.
>
> As David notes, they're actually quite related.  Note that in our
> documentation pg_switch_xlog() is listed in the "Backup Control
> Functions" table.
>
> I can think of a use-case for a user who can call pg_switch_xlog, but
> not pg_start_backup()/pg_stop_backup(), but I have to admit that it
> seems rather limited and I'm on the fence about it being a worthwhile
> distinction.

Sounds too narrow to me.  Are we going to have a separate predefined
role for every security-restricted function to which someone might
want to grant access?  That seems over the top to me.

I don't think we should make it our goal to completely eliminate the
use of SECURITY DEFINER functions for privilege delegation.  Of
course, being able to grant privileges directly is nicer, because then
the client code doesn't have to know about it.  But I think it's OK,
even good, if the predefined roles cater to the common cases, and the
less common cases aren't handled quite as elegantly.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Freeze avoidance of very large table.

2015-11-30 Thread Andres Freund
On 2015-11-30 12:58:43 -0500, Bruce Momjian wrote:
> I would not bother mentioning this detail in the pg_upgrade manual page:
> 
> +   Since the format of visibility map has been changed in version 9.6,
> +   pg_upgrade creates and rewrite new 
> '_vm'
> +   file even if upgrading from 9.5 or before to 9.6 or later with link mode 
> (-k).

Might be worthwhile to keep as that influences the runtime for link mode
when migrating <9.6 -> 9.6.




Re: [HACKERS] Freeze avoidance of very large table.

2015-11-30 Thread Bruce Momjian
On Mon, Nov 30, 2015 at 07:05:21PM +0100, Andres Freund wrote:
> On 2015-11-30 12:58:43 -0500, Bruce Momjian wrote:
> > I would not bother mentioning this detail in the pg_upgrade manual page:
> > 
> > +   Since the format of visibility map has been changed in version 9.6,
> > +   pg_upgrade creates and rewrite new 
> > '_vm'
> > +   file even if upgrading from 9.5 or before to 9.6 or later with link 
> > mode (-k).
> 
> Might be worthwhile to keep as that influences the runtime for link mode
> when migrating <9.6 -> 9.6.

It is hard to see that it would have a measurable duration.  The
pg_upgrade docs are already very long and this detail doesn't seem
significant.  Can someone test the overhead?

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

+ As you are, so once was I. As I am, so you will be. +
+ Roman grave inscription +




Re: [HACKERS] Remaining 9.5 open items

2015-11-30 Thread Andres Freund
On 2015-11-30 14:43:59 -0500, Tom Lane wrote:
> * pg_rewind exiting with error code 1 when source and target are on the same 
> timeline
> 
> Is this a new-in-9.5 bug, or a pre-existing problem?  If the latter,
> I'm not sure it's a release blocker.

pg_rewind was only introduced in 9.5, no?


> * Finish multixact truncation rework
> 
> We're not seriously going to push something this large into 9.5 at this
> point, are we?

To my knowledge this is mostly comment changes, plus some actually
independent improvements. It's more blocked on procedural disagreements -
with Noah wanting to revert the existing commits, fix up some stuff, then
reapply them - and me just wanting to do the improvements independently.



Greetings,

Andres Freund




Re: [HACKERS] Remaining 9.5 open items

2015-11-30 Thread Alvaro Herrera
Tom Lane wrote:

> * DDL deparsing testing module should have detected that transforms were not 
> supported, but it failed to notice that
> 
> Is this really a release blocker?  As a testing matter, it seems like any
> fix would go into HEAD only.

Not a blocker as far as I'm concerned.

> * another strange behavior with track_commit_timestamp
> 
> Where are we on this?

I will test the patch I proposed more thoroughly, then push.  I intend
to add a few test cases on top of the recovery testing patch Michael
submitted (but that's branch master only), to make sure I don't
re-introduce bugs already fixed.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




[HACKERS] Remaining 9.5 open items

2015-11-30 Thread Tom Lane
Well, it's December nearly, and we don't seem to be making much progress
towards pushing out 9.5.0.  I see the following items on
https://wiki.postgresql.org/wiki/PostgreSQL_9.5_Open_Items

* Open Row-Level Security Issues

Seems like what's left here is only documentation fixes, but they still
need to get done.

* DDL deparsing testing module should have detected that transforms were not 
supported, but it failed to notice that

Is this really a release blocker?  As a testing matter, it seems like any
fix would go into HEAD only.

* Foreign join pushdown vs EvalPlanQual

Is this fixed by 5fc4c26db?  If not, what remains to do?

* pg_rewind exiting with error code 1 when source and target are on the same 
timeline

Is this a new-in-9.5 bug, or a pre-existing problem?  If the latter,
I'm not sure it's a release blocker.

* psql extended wrapped format off by one error in line wrapping

There's a submitted patch, so I'll take a look at whether it's pushable.

* Finish multixact truncation rework

We're not seriously going to push something this large into 9.5 at this
point, are we?

* another strange behavior with track_commit_timestamp

Where are we on this?

* Relation files of unlogged relation for btree and spgist indexes not 
initialized after promotion

Again, is this a release blocker?  It's evidently a very old bug.

regards, tom lane




Re: [HACKERS] Additional role attributes && superuser review

2015-11-30 Thread Alvaro Herrera
Stephen Frost wrote:
> * Robert Haas (robertmh...@gmail.com) wrote:

> > > I can think of a use-case for a user who can call pg_switch_xlog, but
> > > not pg_start_backup()/pg_stop_backup(), but I have to admit that it
> > > seems rather limited and I'm on the fence about it being a worthwhile
> > > distinction.
> > 
> > Sounds too narrow to me.  Are we going to have a separate predefined
> > role for every security-restricted function to which someone might
> > want to grant access?  That seems over the top to me.
> 
> I certainly don't want to go down to that level and was, as seen above,
> unsure about having pg_switch_xlog() as a differentiated privilege.
> Michael, do you still see that as a useful independent capability?

Hmm, Robert's argument seems reasonable -- we can continue to offer
access to individual elements by granting execute on a security-definer
function owned by predefined user pg_backup.
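For instance, something along these lines -- the wrapper function name and
the grantee role are purely illustrative, and pg_backup is the proposed
predefined role:

-- Wrapper executes with the privileges of its owner (pg_backup):
CREATE FUNCTION switch_xlog_wrapper() RETURNS pg_lsn
    LANGUAGE sql SECURITY DEFINER
    AS 'SELECT pg_switch_xlog()';
ALTER FUNCTION switch_xlog_wrapper() OWNER TO pg_backup;
-- Then hand out access to exactly the users who need it:
REVOKE EXECUTE ON FUNCTION switch_xlog_wrapper() FROM PUBLIC;
GRANT EXECUTE ON FUNCTION switch_xlog_wrapper() TO wal_switcher;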

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Using quicksort for every external sort run

2015-11-30 Thread Peter Geoghegan
On Mon, Nov 30, 2015 at 9:51 AM, Jeff Janes  wrote:
>> As I said, it seems a little bit unfair to hand-tune work_mem or
>> maintenance_work_mem like that. Who can afford to do that? I think you
>> agree that it's untenable to have DBAs allocate work_mem differently
>> for cases where an internal sort or external sort is expected;
>> workloads are just far too complicated and changeable.
>
> Right, I agree with all that.  But I think it is important to know
> where the benefits come from.  It looks like about half comes from
> being more robust to overly-large memory usage, and half from absolute
> improvements which you get at each implementations own best setting.
> Also, if someone had previously restricted work_mem (or more likely
> maintenance_work_mem) simply to avoid the large memory penalty, they
> need to know to revisit that decision. Although they still don't get
> any actual benefit from using too much memory, just a reduced penalty.

Well, to be clear, they do get a benefit with much larger memory
sizes. It's just that the benefit does not continue indefinitely. I
agree with this assessment, though.

> I'm kind of curious as to why the optimal for the patched code appears
> at 1GB and not lower.  If I get a chance to rebuild the test, I will
> look into that more.

I think that the availability of abbreviated keys (or something that
allows most comparisons made by quicksort/the heap to be resolved at
the SortTuple level) could make a big difference for things like this.
Bear in mind that the merge phase has better cache characteristics
when many attributes must be compared, and not mostly just leading
attributes. Alphasort [1] merges in-memory runs (built with quicksort)
to create on-disk runs for this reason. (I tried that, and it didn't
help -- maybe I get that benefit from merging on-disk runs, since
modern machines have so much more memory than in 1994).

> It has a Perc H710 RAID controller with 15,000 RPM drives, but it is
> also a virtualized system that has other stuff going on.  The disks
> are definitely better than your average household computer, but I
> don't think they are anything special as far as real database hardware
> goes.

What I meant was that it's better than my laptop. :-)

> What would be next in reviewing the patches?  Digging into the C-level
> implementation?

Yes, certainly, but let me post a revised version first. I have
improved the comments, and performed some consolidation of commits.

Also, I am going to get a bunch of test results from the POWER7
system. I think I might see more benefits with higher
maintenance_work_mem settings than you saw, primarily because my case
can mostly just use abbreviated keys during the quicksort operations.
Also, I find it very very useful that while (for example) your 3GB
test case was slower than your 1GB test case, it was only 5% slower. I
have a lot of hope that we can have a cost model for sizing an
effective maintenance_work_mem for this reason -- the consequences of
being wrong are really not that severe. It's unfortunate that we
currently waste so much memory by blindly adhering to
work_mem/maintenance_work_mem. This matters a lot more when we have
parallel sort.

[1] http://www.cs.berkeley.edu/~rxin/db-papers/alphasort.pdf
-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Using quicksort for every external sort run

2015-11-30 Thread Peter Geoghegan
On Mon, Nov 30, 2015 at 5:12 PM, Greg Stark  wrote:
> I think the take-away is that this is outside the domain where any 
> interesting break points occur.

I think that these are representative of what people want to do with
external sorts. We have already had Jeff look for a regression. He
found one only with less than 4MB of work_mem (the default), with over
100 million tuples.

What exactly are we looking for?

> And can you calculate an estimate where the domain would be where multiple 
> passes would be needed for this table at these work_mem sizes? Is it feasible 
> to test around there?

Well, you said that 1GB of work_mem was enough to avoid that within
about 4TB - 8TB of data. So, I believe the answer is "no":

[pg@hydra ~]$ df -h
Filesystem Size  Used Avail Use% Mounted on
rootfs  20G   19G  519M  98% /
devtmpfs31G  128K   31G   1% /dev
tmpfs   31G  384K   31G   1% /dev/shm
/dev/mapper/vg_hydra-root   20G   19G  519M  98% /
tmpfs   31G  127M   31G   1% /run
tmpfs   31G 0   31G   0% /sys/fs/cgroup
tmpfs   31G 0   31G   0% /media
/dev/md0   497M  145M  328M  31% /boot
/dev/mapper/vg_hydra-data 1023G  322G  651G  34% /data

-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Add EXTRA_CFLAGS to configure

2015-11-30 Thread Bruce Momjian
On Wed, Oct 28, 2015 at 01:53:31PM +, Nathan Wagner wrote:
> On Wed, Oct 28, 2015 at 02:42:19PM +0100, Robert Haas wrote:
> > On Wed, Oct 28, 2015 at 2:17 PM, Andres Freund  wrote:
> > >> I use COPT for this purpose.
> > >
> > > Unless I miss something you can't just pass that to configure though,
> > > right? I.e. it has to be passed to each make invocation?
> > 
> > What I do is:
> > 
> > echo COPT=-Wall -Werror > src/Makefile.custom
> 
> Make will pick up variables from the environment.  So unless the
> makefile goes out of its way to circumvent it, you can just do
> 
> COPT=-Werror
> export COPT
> 
> and then run your usual configure/compile cycle.  There's no
> specific need to modify the makefiles or pass extra arguments
> into make.

FYI, I use CUSTOM_COPT in src/Makefile.custom.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

+ As you are, so once was I. As I am, so you will be. +
+ Roman grave inscription +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Error with index on unlogged table

2015-11-30 Thread Kyotaro HORIGUCHI
Hello, I studied your latest patch.

At Fri, 27 Nov 2015 16:59:20 +0900, Michael Paquier  
wrote in 

Re: [HACKERS] Error with index on unlogged table

2015-11-30 Thread Michael Paquier
On Tue, Dec 1, 2015 at 11:11 AM, Kyotaro HORIGUCHI
 wrote:
> Hello, I studied your latest patch.

Thanks!

> I feel quite uncomfortable that it solves the problem from a kind
> of nature of unlogged object by arbitrary flagging which is not
> fully corresponds to the nature. If we can deduce the necessity
> of fsync from some nature, it would be preferable.

INIT_FORKNUM is not something related only to unlogged relations;
indexes use it as well. If you look for example at BRIN indexes, which do
not sync their INIT_FORKNUM immediately when the index is created, I think
that we are still going to need a new flag to control the sync at WAL
replay, because the startup process cannot know a relation's persistence,
something we do know when the XLOG_FPI record is created. For BRIN
indexes, we particularly want not to sync the INIT_FORKNUM when the
relation is not an unlogged one.

> In the current patch, is_sync for log_newpage is generally true
> for and only for INIT_FORKNUM pages. Exceptions as far as I can
> see are,

Yep, that's not always the case.

> copy_relation_data: called with arbitrary forknum but it doesn't
>set is_fsync even for coying INIT_FORKNUM. (Is this not a
>problem?)

Oh. And actually, it seems that use_wal is broken there as well. I
think that we would still want to issue a XLOG_FPI record for an
unlogged relation should it be moved to a new tablespace to copy its
INIT_FORKNUM correctly to its new home.
After moving an unlogged relation to a new tablespace, and after
promoting a standby the standby fails to start because of this error:
FATAL:  could not create file
"pg_tblspc/16388/PG_9.6_201511071/16384/16389": File exists
This could be fixed separately though.

> spgbuildempty, ginbuildempty: these emits two or three newpage
>   logs at once so only the last one is set is_fsync for
>   performance reason.

It doesn't matter to fsync just at the last one. Each one of them
would be replayed either way, the last one triggering the sync, no?

> In short, it seems to me that the reason to choose using
> XLOG_FPI_FOR_SYNC here is only performance of processing
> successive FPIs for INIT_FORKNUM.

Yeah, there is a one-way link between this WAL record and INIT_FORKNUM.
However, please note that having an INIT_FORKNUM does not always imply
that a sync is wanted. copy_relation_data is an example of that.

> INIT_FORKNUM is generated only for unlogged tables and their
> belongings. I suppose such successive fsyncs doesn't cause
> observable performance drop assuming that the number of unlogged
> tables and belongings is not so high, especially with smarter
> storages. All we should do is that just fsync only for
> INIT_FORKNUM's FPIs for the case. If the performance does matter
> even so, we still can fsync the last md-file when any wal record
> other than FPI for INIT_FORK comes. (But this would be a bit
> complex..)

Hm. If a system uses a bunch of temporary relations with brin index or
other included I would not say so. For back branches we may have to do
it unconditionally using INIT_FORKNUM, but having a control flag to
have it only done for unlogged relations would leverage that.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: PL/Pythonu - function ereport

2015-11-30 Thread Pavel Stehule
Hi

>
>> > Do you have some ideas about the name of this class?
>>
>> I think plpy.Error is fine.
>>
>>
> here is updated patch - work with 2.x Python.
>
> I have 3.x Python broken on my fedora, so I should not do tests on 3.x.
>

here is complete patch - regress tests for all supported Python branches


>
> Regards
>
> Pavel
>
diff --git a/doc/src/sgml/plpython.sgml b/doc/src/sgml/plpython.sgml
new file mode 100644
index 015bbad..ee6a42a
*** a/doc/src/sgml/plpython.sgml
--- b/doc/src/sgml/plpython.sgml
*** $$ LANGUAGE plpythonu;
*** 1205,1210 
--- 1205,1228 
  approximately the same functionality
 

+ 
+   
+Raising Errors
+ 
+
+ A plpy.Error can be raised from PL/Python, the constructor accepts
+ keyword parameters:
+ plpy.Error([ message [, detail [, hint [, sqlstate  [, schema  [, table  [, column  [, datatype  [, constraint ]).
+
+
+ An example of raising custom exception could be written as:
+ 
+ DO $$
+   raise plpy.Error('custom message', hint = 'It is test only');
+ $$ LANGUAGE plpythonu;
+ 
+
+   
   
  
   
diff --git a/src/pl/plpython/expected/plpython_error.out b/src/pl/plpython/expected/plpython_error.out
new file mode 100644
index 1f52af7..cb792eb
*** a/src/pl/plpython/expected/plpython_error.out
--- b/src/pl/plpython/expected/plpython_error.out
*** EXCEPTION WHEN SQLSTATE 'SILLY' THEN
*** 422,424 
--- 422,486 
  	-- NOOP
  END
  $$ LANGUAGE plpgsql;
+ /* the possibility to set fields of custom exception
+  */
+ DO $$
+ raise plpy.Error('This is message text.',
+ detail = 'This is detail text',
+ hint = 'This is hint text.')
+ $$ LANGUAGE plpythonu;
+ ERROR:  plpy.Error: This is message text.
+ DETAIL:  This is detail text
+ HINT:  This is hint text.
+ CONTEXT:  Traceback (most recent call last):
+   PL/Python anonymous code block, line 4, in 
+ hint = 'This is hint text.')
+ PL/Python anonymous code block
+ \set VERBOSITY verbose
+ DO $$
+ raise plpy.Error('This is message text.',
+ detail = 'This is detail text',
+ hint = 'This is hint text.',
+ sqlstate = 'SILLY',
+ schema = 'any info about schema',
+ table = 'any info about table',
+ column = 'any info about column',
+ datatype = 'any info about datatype',
+ constraint = 'any info about constraint')
+ $$ LANGUAGE plpythonu;
+ ERROR:  SILLY: plpy.Error: This is message text.
+ DETAIL:  This is detail text
+ HINT:  This is hint text.
+ CONTEXT:  Traceback (most recent call last):
+   PL/Python anonymous code block, line 10, in 
+ constraint = 'any info about constraint')
+ PL/Python anonymous code block
+ SCHEMA NAME:  any info about schema
+ TABLE NAME:  any info about table
+ COLUMN NAME:  any info about column
+ DATATYPE NAME:  any info about datatype
+ CONSTRAINT NAME:  any info about constraint
+ LOCATION:  PLy_elog, plpy_elog.c:122
+ \set VERBOSITY default
+ DO $$
+ raise plpy.Error(detail = 'This is detail text')
+ $$ LANGUAGE plpythonu;
+ ERROR:  plpy.Error: 
+ DETAIL:  This is detail text
+ CONTEXT:  Traceback (most recent call last):
+   PL/Python anonymous code block, line 2, in 
+ raise plpy.Error(detail = 'This is detail text')
+ PL/Python anonymous code block
+ DO $$
+ raise plpy.Error();
+ $$ LANGUAGE plpythonu;
+ ERROR:  plpy.Error: 
+ CONTEXT:  Traceback (most recent call last):
+   PL/Python anonymous code block, line 2, in 
+ raise plpy.Error();
+ PL/Python anonymous code block
+ DO $$
+ raise plpy.Error(sqlstate = 'wrong sql state');
+ $$ LANGUAGE plpythonu;
+ ERROR:  could not create Error object (invalid SQLSTATE code)
+ CONTEXT:  PL/Python anonymous code block
diff --git a/src/pl/plpython/expected/plpython_error_0.out b/src/pl/plpython/expected/plpython_error_0.out
new file mode 100644
index 5323906..3d83a53
*** a/src/pl/plpython/expected/plpython_error_0.out
--- b/src/pl/plpython/expected/plpython_error_0.out
*** EXCEPTION WHEN SQLSTATE 'SILLY' THEN
*** 422,424 
--- 422,486 
  	-- NOOP
  END
  $$ LANGUAGE plpgsql;
+ /* the possibility to set fields of custom exception
+  */
+ DO $$
+ raise plpy.Error('This is message text.',
+ detail = 'This is detail text',
+ hint = 'This is hint text.')
+ $$ LANGUAGE plpythonu;
+ ERROR:  plpy.Error: This is message text.
+ DETAIL:  This is detail text
+ HINT:  This is hint text.
+ CONTEXT:  Traceback (most recent call last):
+   PL/Python anonymous code block, line 4, in 
+ hint = 'This is hint text.')
+ PL/Python anonymous code block
+ \set VERBOSITY verbose
+ DO $$
+ raise plpy.Error('This is message text.',
+ detail = 'This is detail text',
+ hint = 'This is hint text.',
+ sqlstate = 'SILLY',
+ schema = 'any info about schema',
+ 

Re: [HACKERS] proposal: multiple psql option -c

2015-11-30 Thread Pavel Stehule
2015-11-30 15:17 GMT+01:00 Michael Paquier :

> On Thu, Nov 26, 2015 at 4:21 AM, Pavel Stehule wrote:
> > Attached patch per Tom Lane proposal.
> >
> > * multiple -c -f options are supported, the order of options is respected
> > * the statements for one -c options are executed in transactions
> > * Iacob's doc patch merged
>
>  enum _actions
>  {
> ACT_NOTHING = 0,
> -   ACT_SINGLE_SLASH,
> ACT_LIST_DB,
> -   ACT_SINGLE_QUERY,
> -   ACT_FILE
> +   ACT_FILE_STDIN
>  };
>
> Removing some items from the list of potential actions and creating a
> new sublist listing action types is a bit weird. Why not grouping them
> together and allow for example -l as well in the list of things that
> is considered as a repeatable action? It seems to me that we could
> simplify the code this way, and instead of ACT_NOTHING we could check
> if the list of actions is empty or not.
>

fixed

Regards

Pavel


> --
> Michael
>
diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml
new file mode 100644
index 5899bb4..2928c92
*** a/doc/src/sgml/ref/psql-ref.sgml
--- b/doc/src/sgml/ref/psql-ref.sgml
*** PostgreSQL documentation
*** 38,46 
   PostgreSQL. It enables you to type in
   queries interactively, issue them to
   PostgreSQL, and see the query results.
!  Alternatively, input can be from a file. In addition, it provides a
!  number of meta-commands and various shell-like features to
!  facilitate writing scripts and automating a wide variety of tasks.
  
   
  
--- 38,47 
   PostgreSQL. It enables you to type in
   queries interactively, issue them to
   PostgreSQL, and see the query results.
!  Alternatively, input can be from a file or from command line
!  arguments. In addition, it provides a number of meta-commands and various
!  shell-like features to facilitate writing scripts and automating a wide
!  variety of tasks.
  
   
  
*** PostgreSQL documentation
*** 89,126 
--command=command


!   Specifies that psql is to execute one
!   command string, command,
!   and then exit. This is useful in shell scripts. Start-up files
!   (psqlrc and ~/.psqlrc) are
!   ignored with this option.


!   command must be either
!   a command string that is completely parsable by the server (i.e.,
!   it contains no psql-specific features),
!   or a single backslash command. Thus you cannot mix
!   SQL and psql
!   meta-commands with this option. To achieve that, you could
!   pipe the string into psql, for example:
!   echo '\x \\ SELECT * FROM foo;' | psql.
(\\ is the separator meta-command.)


!If the command string contains multiple SQL commands, they are
!processed in a single transaction, unless there are explicit
!BEGIN/COMMIT commands included in the
!string to divide it into multiple transactions.  This is
!different from the behavior when the same string is fed to
!psql's standard input.  Also, only
!the result of the last SQL command is returned.


!Because of these legacy behaviors, putting more than one command in
!the -c string often has unexpected results.  It's
!better to feed multiple commands to psql's
!standard input, either using echo as
!illustrated above, or via a shell here-document, for example:
  
  psql EOF
  \x
--- 90,134 
--command=command


!   Specifies that psql is to execute the given
!   command string, command.
!   This option can be repeated and combined in any order with
!   the -f option.


!   command must be either a
!   command string that is completely parsable by the server (i.e., it
!   contains no psql-specific features), or a
!   single backslash command. Thus you cannot mix
!   SQL and psql meta-commands
!   with this option. To achieve that, you could use
!   repeated -c options or pipe the string
!   into psql, for example:
!   psql -c '\x' -c 'SELECT * FROM foo;' or
!   echo '\x \\ SELECT * FROM foo;' | psql
(\\ is the separator meta-command.)


!Each command string passed to -c is sent to the server
!as a single query. Because of this, the server executes it as a single
!transaction, even if a command string contains
!multiple SQL commands, unless there are
!explicit BEGIN/COMMIT commands included in the
!string to divide it into multiple transactions.  Also, the server only
!returns the result of the last SQL command to the
!client.  This is different from the behavior when the same string with
!multiple SQL commands is fed
!to psql's standard input because
!then psql sends each 

Re: [HACKERS] [PROPOSAL] VACUUM Progress Checker.

2015-11-30 Thread Vinayak
Thanks for the v7.
Please check the comment below.
-Table name in the vacuum progress

+ snprintf(progress_message[0], PROGRESS_MESSAGE_LENGTH, "%s.%s",
schemaname,relname);

In the vacuum progress, column table_name is showing first 30 characters of
table name.
postgres=# create table test_vacuum_progress_in_postgresql(c1 int,c2 text);
postgres=# select * from pg_stat_vacuum_progress ;
-[ RECORD 1 ]---+--
pid | 12284
table_name  | public.test_vacuum_progress_i
phase   | Scanning Heap
total_heap_pages| 41667
scanned_heap_pages  | 25185
percent_complete| 60
total_index_pages   | 0
scanned_index_pages | 0
index_scan_count| 0




-
Regards,
Vinayak,

--
View this message in context: 
http://postgresql.nabble.com/PROPOSAL-VACUUM-Progress-Checker-tp5855849p5875614.html
Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 9.5Beta1 psql wrapped format expanded output

2015-11-30 Thread Tom Lane
Jeff Janes  writes:
> On Fri, Oct 23, 2015 at 5:11 PM, Jeff Janes  wrote:
>> Why swidth for border 2 is three greater than it is with border 1, I
>> don't really know.

> Now I see why.  Border 2 doesn't just add a '|' on either end of the line,
> but also adds a space to the left end, so that the "column" name is not
> hard up against the preceding '|'

I looked this over and concluded that the real problem was that the logic
that added space for newline/wrap marker columns was many bricks shy of a
load.  For example it had

if ((opt_border < 2) &&
((hmultiline &&
  (format == _asciiformat_old)) ||
 (dmultiline &&
  (format != _asciiformat_old
iwidth++;/* for newline indicators */

which aside from being nearly unreadable conflated the header wrap column
with the data wrap column; and even if those had identical conditions for
being added, which they don't, you'd need to count two more columns here
not just one.

I rewrote it to correspond more accurately to what the printing logic
actually does, and pushed it as 0e0776bc9.  Let me know if you still see
any problems here.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] gincostestimate and hypothetical indexes

2015-11-30 Thread Tom Lane
Julien Rouhaud  writes:
> I figured out that it's not possible to use a hypothetical gin index, as
> the gincostestimate function try to retrieve some statistical data from
> the index meta page.

Good point.

> Attached patch fixes this. I believe this should be back-patched as was
> a2095f7fb5a57ea1794f25d029756d9a140fd429.

I don't much care for this patch though.  The core problem is that just
returning all zeroes seems quite useless: it will probably result in silly
cost estimates.  The comment in the patch claiming that this would be the
situation in a never-vacuumed index is wrong, because ginbuild() updates
those stats too.  But I'm not sure exactly what to do instead :-(.

Ideally we'd put it on the head of the hypothetical-index plugin to invent
some numbers, but I dunno if we want to create such an API or not ... and
we certainly couldn't back-patch such a change.

Maybe we could do something along the lines of pretending that 90% of the
index size given by the plugin is entry pages?  Don't know what a good
ratio would be exactly, but we could probably come up with one with a bit
of testing.
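To make the shape of that fallback concrete, here is a toy, standalone
sketch -- the 0.90 ratio is nothing more than the placeholder above, and
the field names only loosely echo what gincostestimate reads from the
index metapage:

#include <stdio.h>

/* toy stand-in for the stats normally read from the GIN metapage */
typedef struct GuessedGinStats
{
    double  nEntryPages;
    double  nDataPages;
    double  nEntries;
} GuessedGinStats;

/* invent pseudo-statistics from what the hypothetical-index plugin reports */
static GuessedGinStats
guess_gin_stats(double numPages, double numTuples)
{
    GuessedGinStats st;

    st.nEntryPages = numPages * 0.90;           /* assumed entry-page share */
    st.nDataPages = numPages - st.nEntryPages;  /* remainder as data pages */
    st.nEntries = numTuples;                    /* bogus, but no other info */
    return st;
}

int
main(void)
{
    GuessedGinStats st = guess_gin_stats(1000.0, 500000.0);

    printf("entryPages=%.0f dataPages=%.0f entries=%.0f\n",
           st.nEntryPages, st.nDataPages, st.nEntries);
    return 0;
}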

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] gincostestimate and hypothetical indexes

2015-11-30 Thread Julien Rouhaud
On 01/12/2015 00:37, Tom Lane wrote:
> Julien Rouhaud  writes:
>> I figured out that it's not possible to use a hypothetical gin index, as
>> the gincostestimate function try to retrieve some statistical data from
>> the index meta page.
> 
> Good point.
> 
>> Attached patch fixes this. I believe this should be back-patched as was
>> a2095f7fb5a57ea1794f25d029756d9a140fd429.
> 
> I don't much care for this patch though.  The core problem is that just
> returning all zeroes seems quite useless: it will probably result in silly
> cost estimates.  The comment in the patch claiming that this would be the
> situation in a never-vacuumed index is wrong, because ginbuild() updates
> those stats too.  But I'm not sure exactly what to do instead :-(.
> 

Oops, it looks like this is only true for pre-9.1 indexes (per the comment
shown below).

> Ideally we'd put it on the head of the hypothetical-index plugin to invent
> some numbers, but I dunno if we want to create such an API or not ... and
> we certainly couldn't back-patch such a change.
> 
> Maybe we could do something along the lines of pretending that 90% of the
> index size given by the plugin is entry pages?  Don't know what a good
> ratio would be exactly, but we could probably come up with one with a bit
> of testing.
> 

I used zero values because gincostestimate already handles empty
statistics, and pretends that 100% of the pages are entry pages:


/*
* nPendingPages can be trusted, but the other fields are as of the last
* VACUUM. Scale them by the ratio numPages / nTotalPages to account for
* growth since then. If the fields are zero (implying no VACUUM at all,
* and an index created pre-9.1), assume all pages are entry pages.
*/
if (ginStats.nTotalPages == 0 || ginStats.nEntryPages == 0)
{
numEntryPages = numPages;
numDataPages = 0;
numEntries = numTuples; /* bogus, but no other info available */
}


But I don't have any clue of what would be a better ratio either.

>   regards, tom lane
> 


-- 
Julien Rouhaud
http://dalibo.com - http://dalibo.org


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Using quicksort for every external sort run

2015-11-30 Thread Greg Stark
Hm. Here is a log-log chart of those results (sorry for html mail). I'm not
really sure if log-log is the right tool to use for an O(n log n) curve
though.

I think the take-away is that this is outside the domain where any
interesting break points occur. Maybe run more tests on the low end to find
where the tapesort can generate a single tape and avoid the merge and see
where the discontinuity is with quicksort for the various work_mem sizes.

And can you calculate an estimate where the domain would be where multiple
passes would be needed for this table at these work_mem sizes? Is it
feasible to test around there?
[image: Inline image 1]




-- 
greg


Re: [HACKERS] Remaining 9.5 open items

2015-11-30 Thread Michael Paquier
On Tue, Dec 1, 2015 at 4:43 AM, Tom Lane  wrote:
> * pg_rewind exiting with error code 1 when source and target are on the same 
> timeline
>
> Is this a new-in-9.5 bug, or a pre-existing problem?  If the latter,
> I'm not sure it's a release blocker.

pg_rewind has been introduced in 9.5. It would be good to get
something consistent before GA.

> * Relation files of unlogged relation for btree and spgist indexes not 
> initialized after promotion
>
> Again, is this a release blocker?  It's evidently a very old bug.

Actually for 9.5 it could be said so. The approach proposed by Andres
and that I have patched requires a bump of the WAL format. It would be
nice to get that into 9.5 tree. Or 9.5 will have to use a solution
similar to the back branches, which would be something like syncing
unconditionally INIT_FORKNUM at replay after replaying its FPW.
Thoughts welcome on the dedicated thread.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 9.5Beta1 psql wrapped format expanded output

2015-11-30 Thread Jeff Janes
On Mon, Nov 30, 2015 at 2:59 PM, Tom Lane  wrote:
> Jeff Janes  writes:
>> On Fri, Oct 23, 2015 at 5:11 PM, Jeff Janes  wrote:
>>> Why swidth for border 2 is three greater than it is with border 1, I
>>> don't really know.
>
>> Now I see why.  Border 2 doesn't just add a '|' on either end of the line,
>> but also adds a space to the left end, so that the "column" name is not
>> hard up against the preceding '|'
>
> I looked this over and concluded that the real problem was that the logic
> that added space for newline/wrap marker columns was many bricks shy of a
> load.  For example it had
>
> if ((opt_border < 2) &&
> ((hmultiline &&
>   (format == _asciiformat_old)) ||
>  (dmultiline &&
>   (format != _asciiformat_old
> iwidth++;/* for newline indicators */
>
> which aside from being nearly unreadable conflated the header wrap column
> with the data wrap column; and even if those had identical conditions for
> being added, which they don't, you'd need to count two more columns here
> not just one.
>
> I rewrote it to correspond more accurately to what the printing logic
> actually does, and pushed it as 0e0776bc9.  Let me know if you still see
> any problems here.
>
> regards, tom lane

The wrapping behavior looks good now, and the code is much more understandable.

Thanks,

Jeff


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Remaining 9.5 open items

2015-11-30 Thread Alvaro Herrera
Stephen Frost wrote:
> * Alvaro Herrera (alvhe...@2ndquadrant.com) wrote:
> > Stephen Frost wrote:
> > 
> > > The non-documentation question is around DROP OWNED.  We need to either
> > > have policies dropped by DROP OWNED (well, roles removed, unless it's
> > > the last one, in which case the policy should be dropped), or update the
> > > documentation to reflect that they don't.  I had been thinking we'd
> > > fix DROP OWNED to deal with the policies, but if folks feel it's too
> > > late for that kind of a change, then we can simply document it.  I don't
> > > believe that's unreasonable for a new feature and we can work to get it
> > > addressed in 9.6.
> > 
> > DROP OWNED is documented as a mechanism to help you drop the role, so
> > it should do whatever is needed for that.  I don't think documenting the
> > fact that it leaves the user as part of policies is good enough.
> 
> We already can't take care of everything with DROP OWNED though, since
> we can't do cross-database queries, and the overall process almost
> certainly requires additional effort (to reassign objects, etc...), so
> while I'd be happier if policies were handled by it, I don't think it's
> as serious of an issue.

Yes, the documentation says to apply a combination of REASSIGN OWNED
plus DROP OWNED to each database.  Sure, it's not a single command, but
if you additionally put the burden that the policies must be taken care
of separately, then the whole process is made a little worse.
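For reference, the documented per-database sequence amounts to something
like the following (the role names here are of course only placeholders):

-- In each database of the cluster:
REASSIGN OWNED BY doomed_role TO successor_role;
DROP OWNED BY doomed_role;
-- Finally, once the above has been done everywhere:
DROP ROLE doomed_role;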

> Still, I'll get a patch worked up for it and then we can discuss the
> merits of that patch going in to 9.5 now versus just into HEAD.

Cool.

In the past, we've made a bunch of changes to DROP OWNED in order to
deal with object types that caused errors, even in minor releases.  I
think this is just another case of the same problem.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] On-demand running query plans using auto_explain and signals

2015-11-30 Thread Simon Riggs
On 30 November 2015 at 22:27, Julien Rouhaud 
wrote:


> I registered as reviewer on this, but after reading the whole thread for
> the second time, it's still not clear to me if the last two submitted
> patches (0001-Add-auto_explain.publish_plans.patch and
> 0002-Add-SHM-table-of-contents-to-the-explain-DSM.patch) are still
> needed reviews, since multiple refactoring ideas and objections have
> been raised since.
>

I looked into it more deeply and decided partial EXPLAINs aren't very
useful yet.

I've got some thoughts around that which I'll publish when I have more
time. I suggest this is a good idea, just needs some serious
committer-love, so we should bounce this for now.

-- 
Simon Riggshttp://www.2ndQuadrant.com/

PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: [HACKERS] Additional role attributes && superuser review

2015-11-30 Thread Michael Paquier
On Tue, Dec 1, 2015 at 3:32 AM, Stephen Frost  wrote:
> * Robert Haas (robertmh...@gmail.com) wrote:
>> On Fri, Nov 20, 2015 at 12:29 PM, Stephen Frost  wrote:
>> > * Michael Paquier (michael.paqu...@gmail.com) wrote:
>> >> On Thu, Nov 19, 2015 at 7:10 AM, Stephen Frost wrote:
>> >> > * Michael Paquier (michael.paqu...@gmail.com) wrote:
>> >> >> It seems weird to not have a dedicated role for pg_switch_xlog.
>> >> >
>> >> > I didn't add a pg_switch_xlog default role in this patch series, but
>> >> > would be happy to do so if that's the consensus.  It's quite easy to do.
>> >>
>> >> Agreed. I am not actually getting why that's part of the backup
>> >> actually. That would be more related to archiving, both being
>> >> unrelated concepts. But at this point I guess that's mainly a
>> >> philosophical split.
>> >
>> > As David notes, they're actually quite related.  Note that in our
>> > documentation pg_switch_xlog() is listed in the "Backup Control
>> > Functions" table.
>> >
>> > I can think of a use-case for a user who can call pg_switch_xlog, but
>> > not pg_start_backup()/pg_stop_backup(), but I have to admit that it
>> > seems rather limited and I'm on the fence about it being a worthwhile
>> > distinction.
>>
>> Sounds too narrow to me.  Are we going to have a separate predefined
>> role for every security-restricted function to which someone might
>> want to grant access?  That seems over the top to me.
>
> I certainly don't want to go down to that level and was, as seen above,
> unsure about having pg_switch_xlog() as a differentiated privilege.
> Michael, do you still see that as a useful independent capability?

OK, let's do so then by having this one fall under pg_backup. Let's
not let my grunting concerns be an obstacle for this patch; we can
still change it afterwards during this release's beta cycle anyway,
based on user feedback.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Using quicksort for every external sort run

2015-11-30 Thread Peter Geoghegan
On Mon, Nov 30, 2015 at 12:29 PM, Peter Geoghegan  wrote:
>> I'm kind of curious as to why the optimal for the patched code appears
>> at 1GB and not lower.  If I get a chance to rebuild the test, I will
>> look into that more.
>
> I think that the availability of abbreviated keys (or something that
> allows most comparisons made by quicksort/the heap to be resolved at
> the SortTuple level) could make a big difference for things like this.

Using the Hydra POWER7 server [1] + the gensort benchmark [2], which
uses the C collation, and has abbreviated keys that have lots of
entropy, I see benefits with higher and higher maintenance_work_mem
settings.

I will present a variety of cases, which seemed like something Greg
Stark is particularly interested in. On the whole, I am quite pleased
with how things are shown to be improved in a variety of different
scenarios.

Looking at CREATE INDEX build times on an (unlogged) gensort table
with 50 million, 100 million, 250 million, and 500 million tuples,
with maintenance_work_mem settings of 512MB, 1GB, 10GB, and 15GB,
there are sustained improvements as more memory is made available. I'm
not saying that that would be the case with low cardinality leading
attribute tuples -- probably not -- but it seems pretty nice that this
case can sustain improvements as more memory is made available. The
server used here has reasonably good disks (Robert goes into this in
his blogpost), but nothing spectacular.

This is what a 500 million tuple gensort table looks like:

postgres=# \dt+
List of relations
 Schema |   Name| Type  | Owner | Size  | Description
+---+---+---+---+-
 public | sort_test | table | pg| 32 GB |
(1 row)

Results:

50 million tuple table (best of 3):
--

512MB: (8-way final merge) external sort ended, 171058 disk blocks
used: CPU 4.11s/79.30u sec elapsed 83.60 sec
1GB: (4-way final merge) external sort ended, 171063 disk blocks used:
CPU 4.29s/71.34u sec elapsed 75.69 sec
10GB: N/A
15GB: N/A
1GB (master branch): (3-way final merge) external sort ended, 171064
disk blocks used: CPU 6.19s/163.00u sec elapsed 170.84 sec

100 million tuple table (best of 3):


512MB: (16-way final merge) external sort ended, 342114 disk blocks
used: CPU 8.61s/177.77u sec elapsed 187.03 sec
1GB: (8-way final merge) external sort ended, 342124 disk blocks used:
CPU 8.07s/165.15u sec elapsed 173.70 sec
10GB: N/A
15GB: N/A
1GB (master branch): (5-way final merge) external sort ended, 342129
disk blocks used: CPU 11.68s/358.17u sec elapsed 376.41 sec

250 million tuple table (best of 3):


512MB:  (39-way final merge) external sort ended, 855284 disk blocks
used: CPU 19.96s/486.57u sec elapsed 507.89 sec
1GB: (20-way final merge) external sort ended, 855306 disk blocks
used: CPU 22.63s/475.33u sec elapsed 499.09 sec
10GB: (2-way final merge) external sort ended, 855326 disk blocks
used: CPU 21.99s/341.34u sec elapsed 366.15 sec
15GB: (2-way final merge) external sort ended, 855326 disk blocks
used: CPU 23.23s/322.18u sec elapsed 346.97 sec
1GB (master branch): (11-way final merge) external sort ended, 855315
disk blocks used: CPU 30.56s/973.00u sec elapsed 1015.63 sec

500 million tuple table (best of 3):


512MB: (77-way final merge) external sort ended, 1710566 disk blocks
used: CPU 45.70s/1016.70u sec elapsed 1069.02 sec
1GB: (39-way final merge) external sort ended, 1710613 disk blocks
used: CPU 44.34s/1013.26u sec elapsed 1067.16 sec
10GB: (4-way final merge) external sort ended, 1710649 disk blocks
used: CPU 46.46s/772.97u sec elapsed 841.35 sec
15GB: (3-way final merge) external sort ended, 1710652 disk blocks
used: CPU 51.55s/729.88u sec elapsed 809.68 sec
1GB (master branch): (20-way final merge) external sort ended, 1710632
disk blocks used: CPU 69.35s/2013.21u sec elapsed 2113.82 sec

I attached a detailed account of these benchmarks, for those that
really want to see the nitty-gritty. This includes a 1GB case for
patch without memory prefetching (which is not described in this
message).

[1] http://rhaas.blogspot.com/2012/03/performance-and-scalability-on-ibm.html
[2] https://github.com/petergeoghegan/gensort
-- 
Peter Geoghegan


sort-benchmarks.tar.gz
Description: GNU Zip compressed data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Memory prefetching while sequentially fetching from SortTuple array, tuplestore

2015-11-30 Thread Peter Geoghegan
On Mon, Nov 30, 2015 at 1:04 PM, Peter Geoghegan  wrote:
> For example, the 50 million tuple test has over 8% of its runtime
> shaved off. This seems to be a consistent pattern.

I included the nitty-gritty details of this case in something attached
to a mail I just sent, over in the "Quicksort for every external sort
run" thread, FWIW.

-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: CustomScan in a larger structure (RE: [HACKERS] CustomScan support on readfuncs.c)

2015-11-30 Thread Robert Haas
On Thu, Nov 26, 2015 at 5:27 AM, Kouhei Kaigai  wrote:
> I'm now implementing. The above design perfectly works on ForeignScan.
> On the other hands, I'd like to have deeper consideration for CustomScan.
>
> My recent patch adds LibraryName and SymbolName on CustomScanMethods
> to lookup the method table even if library is not loaded yet.
> However, this ExtensibleNodeMethods relies custom scan provider shall
> be loaded, by parallel infrastructure, prior to the deserialization.
> It means extension has a chance to register itself as well.
>
> My idea is, redefine CustomScanMethod as follows:
>
> typedef struct ExtensibleNodeMethods
> {
> const char *extnodename;
> Sizenode_size;
> Node *(*nodeCopy)(const Node *from);
> bool  (*nodeEqual)(const Node *a, const Node *b);
> void  (*nodeOut)(struct StringInfoData *str, const Node *node);
> void  (*nodeRead)(Node *node);
> } ExtensibleNodeMethods;
>
> typedef struct CustomScanMethods
> {
> union {
> const char *CustomName;
> ExtensibleNodeMethods  xnode;
> };
> /* Create execution state (CustomScanState) from a CustomScan plan node */
> Node   *(*CreateCustomScanState) (struct CustomScan *cscan);
> } CustomScanMethods;
>
> It allows to pull CustomScanMethods using GetExtensibleNodeMethods
> by CustomName; that is alias of extnodename in ExtensibleNodeMethods.
> Thus, we don't need to care about LibraryName and SymbolName.
>
> This kind of trick is not needed for ForeignScan because all the pointer
> stuff are reproducible by catalog, however, method table of CustomScan
> is not.
>
> How about your opinion?

Anonymous unions are not C89, so this is a no-go.

I think we should just drop CustomName and replace it with
ExtensibleNodeMethods.  That will break things for third-party code
that is using the existing CustomScan stuff, but there's a good chance
that nobody other than you has written any such code.  And even if
someone has, updating it for this change will not be very difficult.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Removing Functionally Dependent GROUP BY Columns

2015-11-30 Thread Marko Tiikkaja

On 2015-12-01 05:00, David Rowley wrote:

> We already allow a SELECT's target list to contain non-aggregated columns
> in a GROUP BY query in cases where the non-aggregated column is
> functionally dependent on the GROUP BY clause.
>
> For example a query such as;
>
> SELECT p.product_id,p.description, SUM(s.quantity)
> FROM product p
> INNER JOIN sale s ON p.product_id = s.product_id
> GROUP BY p.product_id;
>
> is perfectly fine in PostgreSQL, as p.description is functionally dependent
> on p.product_id (assuming product_id is the PRIMARY KEY of product).


This has come up before (on other forums, at least), and my main concern 
has been that unlike the case where we go from throwing an error to 
allowing a query, this has a chance to make the planning of currently 
legal queries slower.  Have you tried to measure the impact of this on 
queries where there's no runtime gains to be had?



.m


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Minor comment edits in nodeGather.c

2015-11-30 Thread Amit Langote
On 2015/12/01 3:06, Robert Haas wrote:
> On Tue, Nov 24, 2015 at 9:43 PM, Amit Langote
>  wrote:
>> On 2015/11/25 11:31, Robert Haas wrote:
>>>
>>> Well I think "scanning a plan" is clear enough even if it's
>>> technically a Scan.
>>
>> Okay, ripped that out in the attached.
> 
> Committed, thanks.

Thanks!




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Error with index on unlogged table

2015-11-30 Thread Kyotaro HORIGUCHI
Hello, 

At Tue, 1 Dec 2015 11:53:35 +0900, Michael Paquier  
wrote in 
> On Tue, Dec 1, 2015 at 11:11 AM, Kyotaro HORIGUCHI
>  wrote:
> > Hello, I studied your latest patch.
> 
> Thanks!
> 
> > I feel quite uncomfortable that it solves the problem from a kind
> > of nature of unlogged object by arbitrary flagging which is not
> > fully corresponds to the nature. If we can deduce the necessity
> > of fsync from some nature, it would be preferable.
> 
> INIT_FORKNUM is not something related only to unlogged relations;
> indexes use it as well. If you look for example at BRIN indexes, which do
> not sync their INIT_FORKNUM immediately when the index is created, I think
> that we are still going to need a new flag to control the sync at WAL
> replay, because the startup process cannot know a relation's persistence,
> something we do know when the XLOG_FPI record is created. For BRIN
> indexes, we particularly want not to sync the INIT_FORKNUM when the
> relation is not an unlogged one.

(The comment added in brinbuildempty looks wrong since it
 actually doesn't fsync it immediately..)

Hmm, I've already seen that, and given your explanation I wonder
why brinbuildempty issues WAL for something that does not need to be
persistent at that moment. Isn't that breaking the write-ahead-log
contract? Setting aside that WAL anomaly, INIT_FORKNUM and
unconditional fsync'ing would be just as tightly tied. (Except for
succeeding newpages.)

> > In the current patch, is_sync for log_newpage is generally true
> > for and only for INIT_FORKNUM pages. Exceptions as far as I can
> > see are,
> 
> Yep, that's not always the case.
> 
> > copy_relation_data: called with arbitrary forknum but it doesn't
> >set is_fsync even for coying INIT_FORKNUM. (Is this not a
> >problem?)
> 
> Oh. And actually, it seems that use_wal is broken there as well. I
> think that we would still want to issue a XLOG_FPI record for an
> unlogged relation should it be moved to a new tablespace to copy its
> INIT_FORKNUM correctly to its new home.

Agreed.

> After moving an unlogged relation to a new tablespace, and after
> promoting a standby the standby fails to start because of this error:
> FATAL:  could not create file
> "pg_tblspc/16388/PG_9.6_201511071/16384/16389": File exists
> This could be fixed separately though.
> 
> > spgbuildempty, ginbuildempty: these emits two or three newpage
> >   logs at once so only the last one is set is_fsync for
> >   performance reason.
> 
> It doesn't matter to fsync just at the last one. Each one of them
> would be replayed either way, the last one triggering the sync, no?

Yes, it does (except for some unreal case below). It was just a
confirmation and I didn't see this as a problem at all. Sorry for the
ambiguous writing.

> > In short, it seems to me that the reason to choose using
> > XLOG_FPI_FOR_SYNC here is only performance of processing
> > successive FPIs for INIT_FORKNUM.
> 
> Yeah, there is a one-way link between this WAL record and INIT_FORKNUM.
> However, please note that having an INIT_FORKNUM does not always imply
> that a sync is wanted. copy_relation_data is an example of that.

As I wrote above, I suppose we should fix(?) the irregular
relationship between WAL and the init fork of BRIN and the like.

> > INIT_FORKNUM is generated only for unlogged tables and their
> > belongings. I suppose such successive fsyncs doesn't cause
> > observable performance drop assuming that the number of unlogged
> > tables and belongings is not so high, especially with smarter
> > storages. All we should do is that just fsync only for
> > INIT_FORKNUM's FPIs for the case. If the performance does matter
> > even so, we still can fsync the last md-file when any wal record
> > other than FPI for INIT_FORK comes. (But this would be a bit
> > complex..)
> 
> Hm. If a system uses a bunch of temporary relations with brin index or
> other included I would not say so. For back branches we may have to do
> it unconditionally using INIT_FORKNUM, but having a control flag to
> have it only done for unlogged relations would leverage that.

It could, and should do so. And if we take such systems with a
bunch of temp relations as significant (I agree with this),
XLogRegisterBlock() looks able to register multiple blocks into a
single WAL record, and we could use it to eliminate the arbitrary
flagging on individual FPI records. Is that possible?

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Removing Functionally Dependent GROUP BY Columns

2015-11-30 Thread David Rowley
We already allow a SELECT's target list to contain non-aggregated columns
in a GROUP BY query in cases where the non-aggregated column is
functionally dependent on the GROUP BY clause.

For example a query such as;

SELECT p.product_id,p.description, SUM(s.quantity)
FROM product p
INNER JOIN sale s ON p.product_id = s.product_id
GROUP BY p.product_id;

is perfectly fine in PostgreSQL, as p.description is functionally dependent
on p.product_id (assuming product_id is the PRIMARY KEY of product).

It seems that there's no shortage of relational databases in existence
today which don't support this. These databases would require the GROUP BY
clause to include the p.description column too.

It seems rather unfortunate that people who migrate applications to
PostgreSQL may not be aware that we support this: currently, if the
p.description column is needlessly included, PostgreSQL naively includes
it while grouping. These people could well be incurring a performance
penalty because our planner does not remove the useless items from the
list. If the primary key is present, including any other columns won't
split the groups any further, so all other columns from the *same
relation* can simply be removed from the GROUP BY clause.
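To make that concrete, here is a sketch (reusing the product/sale example
above) of the query as a migrated application might write it, and the
equivalent grouping the planner could use instead:

-- As written by a migrated application:
SELECT p.product_id, p.description, SUM(s.quantity)
FROM product p
INNER JOIN sale s ON p.product_id = s.product_id
GROUP BY p.product_id, p.description;

-- What the planner could effectively group by instead, since p.description
-- is functionally dependent on the primary key p.product_id:
SELECT p.product_id, p.description, SUM(s.quantity)
FROM product p
INNER JOIN sale s ON p.product_id = s.product_id
GROUP BY p.product_id;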

There are in fact also two queries in TPC-H (Q10 and Q18) which are written
to include all of the non-aggregated columns in the GROUP BY list. During a
recent test I witnessed a 50% gain in performance in Q10 by removing the
unneeded columns from the GROUP BY clause.

I've attached a patch which implements this in PostgreSQL.

The patch may need a little more work in order to adjust the targetlist's
tleSortGroupRefs to remove invalid ones and perhaps also remove the gaps.

I'm posting this now so that I can gauge the community interest in this.

Is it something that we'd like to have in PostgreSQL?

--
 David Rowley   http://www.2ndQuadrant.com/

 PostgreSQL Development, 24x7 Support, Training & Services


prune_group_by_clause_2027f512_2015-12-01.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: In-core regression tests for replication, cascading, archiving, PITR, etc.

2015-11-30 Thread Michael Paquier
On Mon, Nov 30, 2015 at 11:51 PM, Alvaro Herrera
 wrote:
> Michael Paquier wrote:
>> On Mon, Nov 30, 2015 at 6:28 AM, Noah Misch  wrote:
>
>> > The proposed code is short on guidance about when to put a function in 
>> > TestLib
>> > versus TestBase.  TestLib has no header comment.  The TestBase header 
>> > comment
>> > would permit, for example, command_ok() in that module.  I would try 
>> > instead
>> > keeping TestLib as the base module and moving into PostgresNode the 
>> > functions
>> > that deal with PostgreSQL clusters (get_new_node teardown_node psql
>> > poll_query_until issues_sql_like).
>>
>> PostgresNode is wanted to be a base representation of what a node is,
>> not of how to operate on it. The ways to perform the tests, which
>> works on a node, is wanted as a higher-level operation.
>>
>> Logging and base configuration of a test set is a lower level of
>> operations than PostgresNode, because cluster nodes need actually to
>> perform system calls, some of those system calls like run_log allowing
>> to log in the centralized log file. I have tried to make the headers
>> of those modules more verbose, please see attached.
>
> I'm not terribly convinced by this argument TBH.  Perhaps we can have
> PostgresNode be one package, and the logging routines be another
> package, and we create a higher-level package whose @ISA=(PostgresNode,
> LoggingWhatever) and then we move the routines suggested by Noah into
> that new package.  Then the tests use that instead of PostgresNode
> directly.

OK... I have merged TestLib and PostgresNode of the previous patch
into PostgresNode in the way suggested by Noah. TestBase has been
renamed back to TestLib, and includes as well the base test functions
like command_ok.
Regards,
-- 
Michael
diff --git a/src/bin/initdb/t/001_initdb.pl b/src/bin/initdb/t/001_initdb.pl
index 299dcf5..f64186d 100644
--- a/src/bin/initdb/t/001_initdb.pl
+++ b/src/bin/initdb/t/001_initdb.pl
@@ -4,6 +4,7 @@
 
 use strict;
 use warnings;
+use PostgresNode;
 use TestLib;
 use Test::More tests => 14;
 
diff --git a/src/bin/pg_basebackup/t/010_pg_basebackup.pl b/src/bin/pg_basebackup/t/010_pg_basebackup.pl
index dc96bbf..c7eb250 100644
--- a/src/bin/pg_basebackup/t/010_pg_basebackup.pl
+++ b/src/bin/pg_basebackup/t/010_pg_basebackup.pl
@@ -2,6 +2,7 @@ use strict;
 use warnings;
 use Cwd;
 use Config;
+use PostgresNode;
 use TestLib;
 use Test::More tests => 51;
 
@@ -9,8 +10,15 @@ program_help_ok('pg_basebackup');
 program_version_ok('pg_basebackup');
 program_options_handling_ok('pg_basebackup');
 
-my $tempdir = tempdir;
-start_test_server $tempdir;
+my $tempdir = TestLib::tempdir;
+
+my $node = get_new_node();
+# Initialize node without replication settings
+$node->init(hba_permit_replication => 0);
+$node->start;
+my $pgdata = $node->data_dir;
+
+$ENV{PGPORT} = $node->port;
 
 command_fails(['pg_basebackup'],
 	'pg_basebackup needs target directory specified');
@@ -26,19 +34,19 @@ if (open BADCHARS, ">>$tempdir/pgdata/FOO\xe0\xe0\xe0BAR")
 	close BADCHARS;
 }
 
-configure_hba_for_replication "$tempdir/pgdata";
-system_or_bail 'pg_ctl', '-D', "$tempdir/pgdata", 'reload';
+$node->set_replication_conf();
+system_or_bail 'pg_ctl', '-D', $pgdata, 'reload';
 
 command_fails(
 	[ 'pg_basebackup', '-D', "$tempdir/backup" ],
 	'pg_basebackup fails because of WAL configuration');
 
-open CONF, ">>$tempdir/pgdata/postgresql.conf";
+open CONF, ">>$pgdata/postgresql.conf";
 print CONF "max_replication_slots = 10\n";
 print CONF "max_wal_senders = 10\n";
 print CONF "wal_level = archive\n";
 close CONF;
-restart_test_server;
+$node->restart;
 
 command_ok([ 'pg_basebackup', '-D', "$tempdir/backup" ],
 	'pg_basebackup runs');
@@ -81,13 +89,13 @@ command_fails(
 
 # Tar format doesn't support filenames longer than 100 bytes.
 my $superlongname = "superlongname_" . ("x" x 100);
-my $superlongpath = "$tempdir/pgdata/$superlongname";
+my $superlongpath = "$pgdata/$superlongname";
 
 open FILE, ">$superlongpath" or die "unable to create file $superlongpath";
 close FILE;
 command_fails([ 'pg_basebackup', '-D', "$tempdir/tarbackup_l1", '-Ft' ],
 	'pg_basebackup tar with long name fails');
-unlink "$tempdir/pgdata/$superlongname";
+unlink "$pgdata/$superlongname";
 
 # The following tests test symlinks. Windows doesn't have symlinks, so
 # skip on Windows.
@@ -98,7 +106,7 @@ SKIP: {
 	# to our physical temp location.  That way we can use shorter names
 	# for the tablespace directories, which hopefully won't run afoul of
 	# the 99 character length limit.
-	my $shorter_tempdir = tempdir_short . "/tempdir";
+	my $shorter_tempdir = TestLib::tempdir_short . "/tempdir";
 	symlink "$tempdir", $shorter_tempdir;
 
 	mkdir "$tempdir/tblspc1";
@@ -120,7 +128,7 @@ SKIP: {
 			"-T$shorter_tempdir/tblspc1=$tempdir/tbackup/tblspc1" ],
 		'plain format with tablespaces succeeds with tablespace mapping');
 	ok(-d "$tempdir/tbackup/tblspc1", 'tablespace was relocated');
-	

Re: [HACKERS] [PROPOSAL] VACUUM Progress Checker.

2015-11-30 Thread Kyotaro HORIGUCHI
Sorry for the confusing description and the chopped sentence.

At Tue, 01 Dec 2015 16:25:57 +0900 (Tokyo Standard Time), Kyotaro HORIGUCHI 
 wrote in 
<20151201.162557.184519961.horiguchi.kyot...@lab.ntt.co.jp>
> Hello,
> 
> At Mon, 30 Nov 2015 19:10:44 -0700 (MST), Vinayak  wrote 
> in <1448935844520-5875614.p...@n5.nabble.com>
> > Thanks for the v7.
> > Please check the comment below.
> > -Table name in the vacuum progress
> > 
> > + snprintf(progress_message[0], PROGRESS_MESSAGE_LENGTH, "%s.%s",
> > schemaname,relname);
> > 
> > In the vacuum progress, column table_name is showing first 30 characters of
> > table name.
> 
> Yeah, it is actually restricted to that length. But if we allow
> the buffer to store the whole qualified name, it will need 64 *
> 2 + 1 + 1 = 130 bytes per slot, i.e. 130 bytes * 10 = 1300 bytes for
> each beentry... It might be acceptable to others, but I don't think
> that is preferable..
> 
> Separating namespace and relation name as below reduces the
> required length of the field but 62 bytes is still too long for
> most of the information and in turn too short for longer messages
> in some cases.
> 
> As a more drastic change in design, since these fields are
> written/read in sequential manner, providing one free buffer of
> the size of.. so.. about 128 bytes for each beentry and storing
> strings delimited with '\0' and numbers in binary format, as an
> example, would do. 

That paragraph probably failed to make sense. I suppose the idea can be
called a 'packed format', as opposed to the current fixed-length format.
Sorry for the poor wording.

> Additional functions to write into/read from
> this buffer safely would be needed, but this gives both the
> ability to store longer messages and a relatively short total
> buffer size, and allows an arbitrary number of parameters limited
> only by the length of the free buffer.
> 
> What do you think about this?
> 
> 
> By the way, how about giving separate columns for relname and
> namespace? I think it is the more usual way to designate a relation
> in this kind of view, and it makes the snprintf that concatenates
> name and schema unnecessary (it's not significant, though). (The
> following example is after pg_stat_all_tables)
> 
> > postgres=# create table test_vacuum_progress_in_postgresql(c1 int,c2 text);
> > postgres=# select * from pg_stat_vacuum_progress ;
> > pid | 12284
> > schemaname  | public
> > relname | test_vacuum_progress_i...
> > phase   | Scanning Heap
> > total_heap_pages| 41667
> ...
> 
> 
> And I have some comments about code.

That line is just something I forgot to delete. I'll mention those
comments later if necessary.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PROPOSAL] VACUUM Progress Checker.

2015-11-30 Thread Kyotaro HORIGUCHI
Hello,

At Mon, 30 Nov 2015 19:10:44 -0700 (MST), Vinayak  wrote 
in <1448935844520-5875614.p...@n5.nabble.com>
> Thanks for the v7.
> Please check the comment below.
> -Table name in the vacuum progress
> 
> + snprintf(progress_message[0], PROGRESS_MESSAGE_LENGTH, "%s.%s",
> schemaname,relname);
> 
> In the vacuum progress, column table_name is showing first 30 characters of
> table name.

Yeah, it is actually restricted to that length. But if we allow
the buffer to store the whole qualified names, it will need 64 *
2 + 1 + 1 = 130 bytes * 10 = 1300 bytes for each beentry... It might
be acceptable to others, but I don't think that is preferable.

Separating namespace and relation name as below reduces the
required length of the field but 62 bytes is still too long for
most of the information and in turn too short for longer messages
in some cases.

As a more drastic change in design, since these fields are
written/read in a sequential manner, providing one free buffer of
about 128 bytes for each beentry and storing strings delimited
with '\0' and numbers in binary format, as an example, would do.
Additional functions to write into/read from this buffer safely
would be needed, but this gives both the ability to store longer
messages and a relatively short total buffer size, and allows an
arbitrary number of parameters limited only by the length of the
free buffer.
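
To make the idea a bit more concrete, here is a minimal sketch of such
helpers; the 128-byte size and all names are only illustrative, not taken
from any existing patch:

#include <stdbool.h>
#include <string.h>

/* Illustrative sketch only: packed progress buffer for one beentry. */
#define PROGRESS_BUF_SIZE 128

typedef struct ProgressBuf
{
	int		used;						/* bytes consumed so far */
	char	data[PROGRESS_BUF_SIZE];	/* '\0'-delimited strings and raw ints */
} ProgressBuf;

/* Append a NUL-terminated string; returns false if it does not fit. */
static bool
progress_put_string(ProgressBuf *buf, const char *s)
{
	int		len = (int) strlen(s) + 1;

	if (buf->used + len > PROGRESS_BUF_SIZE)
		return false;
	memcpy(buf->data + buf->used, s, len);
	buf->used += len;
	return true;
}

/* Append an int in binary format; returns false if it does not fit. */
static bool
progress_put_int(ProgressBuf *buf, int value)
{
	if (buf->used + (int) sizeof(int) > PROGRESS_BUF_SIZE)
		return false;
	memcpy(buf->data + buf->used, &value, sizeof(int));
	buf->used += sizeof(int);
	return true;
}

Corresponding read functions would walk the buffer in the same order the
writer used, which is why the parameters have to be written and read
sequentially.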

What do you think about this?


By the way, how about giving separate columns for relname and
namespace? I think it is the more usual way to designate a relation
in this kind of view, and it makes the snprintf that concatenates
name and schema unnecessary (it's not significant, though). (The
following example is after pg_stat_all_tables)

> postgres=# create table test_vacuum_progress_in_postgresql(c1 int,c2 text);
> postgres=# select * from pg_stat_vacuum_progress ;
> pid | 12284
> schemaname  | public
> relname | test_vacuum_progress_i...
> phase   | Scanning Heap
> total_heap_pages| 41667
...


And I have some comments about code.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Refactoring of LWLock tranches

2015-11-30 Thread Ildus Kurbangaliev
On Mon, 23 Nov 2015 12:12:23 -0500
Robert Haas  wrote:

> On Fri, Nov 20, 2015 at 6:53 AM, Ildus Kurbangaliev
>  wrote:
> > We keep a limited number of LWLocks in base shared memory, so why not
> > keep their tranches in shared memory too? Other tranches can be in
> > local memory; we just have to save the highest id of these
> > tranches somewhere.
> 
> I just don't see it buying us anything.  The tranches are small and
> contain only a handful of values.  The values need to be present in
> shared memory but the tranches themselves don't.
> 
> Now, if it's convenient to put them in shared memory and doesn't cause
> us any other problems, then maybe there's no real downside.  But it's
> not clear to me that there's any upside either.
> 

I see. Since tranche names are always in shared memory or static
strings, maybe we should just create a global static char * array
with a fixed length for the names (something like `char
*LWLockTrancheNames[NUM_LWLOCK_BUILTIN_TRANCHES]`)? It would be
enough for monitoring purposes, and it doesn't matter where a tranche
is located. We would have a name pointer for built-in tranches only, and
the fixed length of this array is not a problem since we know the exact
count of them.
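
For what it's worth, a minimal sketch of what I mean; the array size and
the helper names are placeholders, not existing symbols:

#include <stddef.h>

/* Sketch only: names of built-in tranches, indexed by tranche id. */
#define NUM_LWLOCK_BUILTIN_TRANCHES 32

static const char *LWLockTrancheNames[NUM_LWLOCK_BUILTIN_TRANCHES];

/* Remember the name of a built-in tranche so monitoring can print it. */
static void
LWLockRegisterTrancheName(int tranche_id, const char *name)
{
	if (tranche_id >= 0 && tranche_id < NUM_LWLOCK_BUILTIN_TRANCHES)
		LWLockTrancheNames[tranche_id] = name;
}

/* Look up a name; built-in tranches only, anything else returns NULL. */
static const char *
LWLockTrancheName(int tranche_id)
{
	if (tranche_id >= 0 && tranche_id < NUM_LWLOCK_BUILTIN_TRANCHES)
		return LWLockTrancheNames[tranche_id];
	return NULL;
}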

-- 
Ildus Kurbangaliev
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Revisiting pg_stat_statements and IN() (Was: Re: pg_stat_statements fingerprinting logic and ArrayExpr)

2015-11-30 Thread Shulgin, Oleksandr
On Wed, Nov 25, 2015 at 9:13 AM, Lukas Fittl  wrote:

> On Mon, Nov 23, 2015 at 11:53 PM, Peter Geoghegan  wrote:
>
>> One specific justification he gave for not using pg_stat_statements was:
>>
>> "Doesn’t merge bind vars in IN()" (See slide #11)
>>
>> I wonder:
>>
>> * How do other people feel about this? Personally, I've seen enough
>> problems of this kind in the field that "slippery slope" arguments
>> against this don't seem very compelling.
>>
>
> As someone who runs a little monitoring service that's solely based on
> pg_stat_statements data, ignoring IN list length would certainly be a good
> change.
>
> We currently do this in post-processing, together with other data cleanup
> (e.g. ignoring the length of a VALUES list in an INSERT statement).
>
> Given the fact that pgss data is normalized & you don't know which plan
> was chosen, I don't think distinguishing based on the length of the list
> helps anyone really.
>
> I see pg_stat_statements as a high-level overview of which queries have
> run, and which ones you might want to look into closer using e.g.
> auto_explain.
>

I still plan to try to marry pg_stat_statements and auto_explain
for the next iteration of the "online query plans" extension I was proposing
a few months ago, and the first thing I was going to look into is rectifying
this problem with IN() jumbling.  So you have a +1 from me.
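
Just to illustrate the sort of text-level post-processing Lukas describes,
a rough sketch follows; this is not how pg_stat_statements' own jumbling
works, and the function name is made up:

#include <stdio.h>
#include <string.h>

/*
 * Collapse "(?, ?, ..., ?)" runs in an already-normalized query string
 * down to "(?)".  Text-level cleanup sketch only.
 */
static void
collapse_param_lists(const char *src, char *dst, size_t dstlen)
{
	size_t		j = 0;

	if (dstlen == 0)
		return;

	while (*src && j + 1 < dstlen)
	{
		dst[j++] = *src;
		if (*src == '?')
		{
			const char *p = src + 1;

			/* swallow any ", ?" continuations if they close the list */
			while (strncmp(p, ", ?", 3) == 0)
				p += 3;
			if (*p == ')')
			{
				src = p;	/* next iteration copies the ')' */
				continue;
			}
		}
		src++;
	}
	dst[j] = '\0';
}

int
main(void)
{
	char		buf[256];

	collapse_param_lists("SELECT * FROM t WHERE id IN (?, ?, ?, ?)",
						 buf, sizeof(buf));
	puts(buf);	/* prints: SELECT * FROM t WHERE id IN (?) */
	return 0;
}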

--
Alex


Re: [HACKERS] Proposal: Trigonometric functions in degrees

2015-11-30 Thread Michael Paquier
On Mon, Nov 30, 2015 at 10:39 PM, Michael Paquier
 wrote:
> On Mon, Nov 30, 2015 at 10:36 PM, Michael Paquier wrote:
>> Instinctively, it seems to me that we had better return NaN from the
>> new asind and acosd when out of range on OSX; Linux will
>> complain about an out-of-range error, so the code is right in this
>> case.
>
> This is still mentioned upthread btw. And it does not seem to be a
> good idea to change this platform dependent behavior by throwing an
> error on the old functions, neither does it seem user-friendly to have
> inconsistent results for the XXX function and its XXXd equivalent.

At this stage, it seems wiser to me to mark this patch as "returned
with feedback". Other opinions of course are welcome.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Proposal: Trigonometric functions in degrees

2015-11-30 Thread Tom Lane
Michael Paquier  writes:
> On Mon, Nov 30, 2015 at 10:36 PM, Michael Paquier wrote:
>> Instinctively, it seems to me that we had better return NaN from the
>> new asind and acosd when out of range on OSX; Linux will
>> complain about an out-of-range error, so the code is right in this
>> case.

> This is still mentioned upthread btw. And it does not seem to be a
> good idea to change this platform dependent behavior by throwing an
> error on the old functions, neither does it seem user-friendly to have
> inconsistent results for the XXX function and its XXXd equivalent.

FWIW, I think that probably the best course of action is to go ahead
and install POSIX-compliant error checking in the existing functions
too.  POSIX:2008 is quite clear about this:

: An application wishing to check for error situations should set errno to
: zero and call feclearexcept(FE_ALL_EXCEPT) before calling these
: functions. On return, if errno is non-zero or fetestexcept(FE_INVALID |
: FE_DIVBYZERO | FE_OVERFLOW | FE_UNDERFLOW) is non-zero, an error has
: occurred.

Although I'm not too sure about Windows, the inconsistent results
we're getting on OS X are certainly from failure to adhere to the spec.
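
For illustration, the calling pattern the spec describes would look
something like this; a standalone sketch, not the actual float.c change:

#include <errno.h>
#include <fenv.h>
#include <math.h>
#include <stdio.h>

/* POSIX-style error checking around asin(), per the quoted spec text. */
/* compile with something like: cc check_asin.c -lm */
int
main(void)
{
	double		x = 1.1;
	double		result;

	errno = 0;
	feclearexcept(FE_ALL_EXCEPT);

	result = asin(x);

	if (errno != 0 ||
		fetestexcept(FE_INVALID | FE_DIVBYZERO | FE_OVERFLOW | FE_UNDERFLOW))
		fprintf(stderr, "input is out of range\n");
	else
		printf("asin(%g) = %g\n", x, result);

	return 0;
}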

I concur with Peter's opinion that this is material for a separate
patch, but it seems like it's one that had better go in first.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: multiple psql option -c

2015-11-30 Thread Michael Paquier
On Thu, Nov 26, 2015 at 4:21 AM, Pavel Stehule wrote:
> Attached patch per Tom Lane proposal.
>
> * multiple -c -f options are supported, the order of options is respected
> * the statements for one -c options are executed in transactions
> * Iacob's doc patch merged

 enum _actions
 {
ACT_NOTHING = 0,
-   ACT_SINGLE_SLASH,
ACT_LIST_DB,
-   ACT_SINGLE_QUERY,
-   ACT_FILE
+   ACT_FILE_STDIN
 };

Removing some items from the list of potential actions and creating a
new sublist of action types is a bit weird. Why not group them
together and allow, for example, -l as well in the list of things
considered repeatable actions? It seems to me that we could
simplify the code this way, and instead of ACT_NOTHING we could check
whether the list of actions is empty or not.
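
To be more concrete, something like the following is what I have in mind;
only a sketch, the struct and field names are made up for illustration:

/* Sketch: one entry per -c/-f/-l option, kept in the order given. */
typedef enum IllustrativeAction
{
	ACT_SINGLE_QUERY,			/* -c */
	ACT_FILE,					/* -f */
	ACT_LIST_DB					/* -l */
} IllustrativeAction;

typedef struct SimpleActionListCell
{
	struct SimpleActionListCell *next;
	IllustrativeAction action;
	char	   *val;			/* query string or file name, NULL for -l */
} SimpleActionListCell;

typedef struct SimpleActionList
{
	SimpleActionListCell *head;
	SimpleActionListCell *tail;
} SimpleActionList;

/*
 * The main loop would then just check whether the list is empty instead
 * of testing for ACT_NOTHING, and process the actions in order.
 */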
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Potential pointer dereference in plperl.c (caused by transforms patch)

2015-11-30 Thread Michael Paquier
On Sat, Nov 28, 2015 at 5:29 AM, Noah Misch wrote:
> fcinfo->flinfo->fn_oid==InvalidOid implies an inline block, and those have no
> arguments.  If it placates Coverity, I lean toward an assert-only change:

Oh, thanks. I missed this point.

> --- a/src/pl/plperl/plperl.c
> +++ b/src/pl/plperl/plperl.c
> @@ -2112,6 +2112,8 @@ plperl_call_perl_func(plperl_proc_desc *desc, 
> FunctionCallInfo fcinfo)
> EXTEND(sp, desc->nargs);
>
> +   /* Get signature for true functions; inline blocks have no args. */
> if (fcinfo->flinfo->fn_oid)
> get_func_signature(fcinfo->flinfo->fn_oid, &argtypes, &nargs);
> +   Assert(nargs == desc->nargs);
>
> for (i = 0; i < desc->nargs; i++)

Yeah that looks fine.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PROPOSAL] VACUUM Progress Checker.

2015-11-30 Thread Rahila Syed
Hello,

Thank you for your comments.
Please find attached patch addressing following comments ,

>- duplicate_oids error in HEAD.
Check.

>- a compiler warning:
>pgstat.c:2898: warning: no previous prototype for
‘pgstat_reset_activityflag’
Check.

>One more change you could do is 's/activityflag/activity_flag/g',
Check.

>Type of the flag of vacuum activity.
The flag variable is an integer to incorporate more commands in future.

>Type of st_progress_param and so.
st_progress_param is also given a generic name to incorporate different
parameters reported from various commands.

>st_progress_param_float is currently totally useless.
Float parameter has currently been removed from the patch.

>Definition of progress_message.
>The definition of progress_message in lazy_scan_heap is "char
>[PROGRESS_MESSAGE_LENGTH][N_PROGRESS_PARAM]" which looks to be
>inversed.
Corrected.

>The current code subtracts the number of blocks when
>skipping_all_visible_blocks is set in two places. But I think
>it is enough to decrement when skipping.
In both places, the pages are being skipped, hence the total count was
decremented.

>He suggested to keep total_heap_pages fixed while adding number
>of skipped pages to that of scanned pages.
This has been done in the attached.

>snprintf(progress_message[0], PROGRESS_MESSAGE_LENGTH, "%s.%s",
>  get_namespace_name(RelationGetNamespace(rel)),
>  relname);
Check.

The previous implementation used to add the total number of pages across all
indexes of a relation to total_index_pages on every scan of the
indexes, to account for the total pages scanned. Thus, it was equal to the
number of scans * total_index_pages.

In the attached patch, total_index_pages reflects the total number of pages
across all indexes of a relation.
A column reporting passes through the indexes (phase 2) has been added to
account for the number of passes of index and heap vacuuming.
The number of scanned index pages is reset at the end of each pass.
This makes the reporting clearer.
The percent complete does not account for index pages; it just denotes the
percentage of the heap scanned.
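
In other words, the reported value is derived only from the heap counters,
roughly like this (a sketch, not the exact code in the patch):

/* Sketch only: percent_complete is derived from heap pages alone. */
static int
vacuum_percent_complete(long scanned_heap_pages, long total_heap_pages)
{
	if (total_heap_pages <= 0)
		return 0;
	return (int) (scanned_heap_pages * 100 / total_heap_pages);
}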

>Spotted a potential oversight regarding report of scanned_pages. It
>seems pages that are skipped because of not getting a pin, being new,
>being empty could be left out of the progress equation.
Corrected.


>It's better to
>report that some other way, like use one of the strings to report a
>"phase" of processing that we're currently performing.
Has been included in the attached.

Some more comments still need to be addressed, which include renaming the
activity flag, reporting only changed parameters to shared memory, the
ACTIVITY_IS_VACUUM flag being set unnecessarily for ANALYZE and FULL
commands, and documentation for the new view.
Also, finer-grained reporting from the index and heap-truncate phases is yet
to be incorporated into the patch.

Thank you,
Rahila Syed
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index ccc030f..d53833e 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -631,6 +631,19 @@ CREATE VIEW pg_stat_activity AS
 WHERE S.datid = D.oid AND
 S.usesysid = U.oid;
 
+CREATE VIEW pg_stat_vacuum_progress AS
+SELECT
+			S.pid,
+			S.table_name,
+			S.phase,
+			S.total_heap_pages,
+			S.scanned_heap_pages,
+S.percent_complete,
+S.total_index_pages,
+S.scanned_index_pages,
+S.index_scan_count
+FROM pg_stat_get_vacuum_progress() AS S;
+
 CREATE VIEW pg_stat_replication AS
 SELECT
 S.pid,
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index 7c4ef58..e27a8f3 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -284,6 +284,7 @@ vacuum(int options, RangeVar *relation, Oid relid, VacuumParams *params,
 		VacuumPageMiss = 0;
 		VacuumPageDirty = 0;
 
+		pgstat_report_activity_flag(ACTIVITY_IS_VACUUM);
 		/*
 		 * Loop to process each selected relation.
 		 */
@@ -325,6 +326,7 @@ vacuum(int options, RangeVar *relation, Oid relid, VacuumParams *params,
 	{
 		in_vacuum = false;
 		VacuumCostActive = false;
+		pgstat_reset_activity_flag();
 		PG_RE_THROW();
 	}
 	PG_END_TRY();
@@ -355,6 +357,7 @@ vacuum(int options, RangeVar *relation, Oid relid, VacuumParams *params,
 		vac_update_datfrozenxid();
 	}
 
+	pgstat_reset_activity_flag();
 	/*
 	 * Clean up working storage --- note we must do this after
 	 * StartTransactionCommand, else we might be trying to delete the active
diff --git a/src/backend/commands/vacuumlazy.c b/src/backend/commands/vacuumlazy.c
index 2429889..1c74b51 100644
--- a/src/backend/commands/vacuumlazy.c
+++ b/src/backend/commands/vacuumlazy.c
@@ -439,9 +439,14 @@ lazy_scan_heap(Relation onerel, LVRelStats *vacrelstats,
 			   Relation *Irel, int nindexes, bool scan_all)
 {
 	BlockNumber nblocks,
-blkno;
+blkno,
+total_heap_pages,
+scanned_heap_pages = 0,
+

Re: [HACKERS] Proposal: Trigonometric functions in degrees

2015-11-30 Thread Michael Paquier
On Fri, Nov 27, 2015 at 6:33 PM, Dean Rasheed  wrote:
>
> On 11 November 2015 at 11:45, Dean Rasheed  wrote:
> > Thanks for testing. I'll post an updated patch sometime soon.
> >
>
> I finally got round to looking at this again, and here is an updated patch.

Cool, thanks!

> I've reverted the changes to the CHECKFLOATVAL() checks in the
> existing functions, so that can be a separate discussion. As for the
> checks in the new functions added by this patch, I've left them as
> they were on the grounds that the checks are taking place after
> argument reduction, so the arguments will not be infinite  at that
> point (and besides, I think the checks are correct as written anyway).

On this one, I agree less for the new code, but I am fine with letting the
committer take the final shot. Making things similar to the existing
functions seems the best approach to me, though.

> I've reduced the regression tests down to checks of the cases where
> the results should be exact, so they should now be
> platform-independent. Many of the original tests were useful during
> development to ensure that sane-looking answers were being returned,
> but they didn't really add anything as regression tests, other than
> extra complication due to platform variations.

That's definitely a better thing to do. I got surprised as well for
example by how things behave differently on OSX, Linux and Windows
when testing your patch :)

+* Stitch together inverse sine and cosine functions for the ranges
+* [0, 0.5] and [0.5, 1].  Each expression below is guaranteed to return
+* exactly 30 for x=0.5, so the result is a continuous
monotonic function
+* over the full range.

+* Stitch together inverse sine and cosine functions for the ranges
+* [0, 0.5] and [0.5, 1].  Each expression below is guaranteed to return
+* exactly 60 for x=0.5, so the result is a continuous
monotonic function
+* over the full range.

Those two should mention [0,0.5] and (0.5,1]. 0.5 is only part of the
first portion. There are a couple of places where that's not exact as
well, but no real big deal.

Now on OSX the following things are inconsistent:
1) acos:
=# select acosd(1.1);
ERROR:  22003: input is out of range
LOCATION:  dacosd, float.c:1752
Time: 0.623 ms
=# select acos(1.1);
 acos
--
  NaN
(1 row)
=# select asind('Infinity'::float8);
ERROR:  22003: input is out of range
LOCATION:  dasind, float.c:1778
2) asin:
=# select asind(1.1);
ERROR:  22003: input is out of range
LOCATION:  dasind, float.c:1778
=# select asin(1.1);
 asin
--
  NaN
(1 row)
Instinctively, it seems to me that we had better return NaN from the
new asind and acosd when out of range on OSX; Linux will
complain about an out-of-range error, so the code is right in this
case.

3) those ones as well are interesting:
=# select tand(180);
 tand
--
   -0
(1 row)
=# select cotd(-90);
 cotd
--
   -0

Regards,
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] [PoC] Asynchronous execution again (which is not parallel)

2015-11-30 Thread Kyotaro HORIGUCHI
Hello. Now that parallel scan has started working, I'd like to re-propose
'asynchronous execution' or 'early execution'.

In the previous proposal, I had only foreign scan as a workable
example, but now I can use parallel execution instead to make
this distinct from parallel execution itself.

I could have put more work into this before proposing it, but I'd like to
show it at this point in order to judge whether it deserves
further work.


 Overview of asynchronous execution

"Asynchronous execution" is a feature to start substantial work
of nodes before doing Exec*. This can reduce total startup time
by folding startup time of multiple execution nodes. Especially
effective for the combination of joins or appends and their
multiple children that needs long time to startup.

This patch does that by inserting another phase "Start*" between
ExecInit* and Exec* to launch parallel processing including
pgworker and FDWs before requesting the very first tuple of the
result.
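
As a rough illustration of the intended flow (the names below are only for
explanation and do not exactly match the attached patches):

/* Sketch of the node run states introduced by the patch set. */
typedef enum NodeRunState
{
	NRS_INITED,		/* ExecInitNode() done, nothing started yet */
	NRS_STARTED,	/* Start* phase issued the substantial work */
	NRS_RUNNING,	/* Exec* is returning tuples */
	NRS_DONE		/* no more tuples */
} NodeRunState;

/*
 * Sketch of the intended calling sequence in the executor:
 *
 *   ExecInitNode(plan);     -- build the node tree as today
 *   ExecStartNode(plan);    -- new phase: kick off bgworkers, FDW requests
 *   ...                     -- other nodes can be started meanwhile
 *   ExecProcNode(plan);     -- first tuple; startup costs already overlap
 */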

 About this patch

As a proof of concept, the first three patches add such a start
phase to the executor and add a facility to trace node status for
almost all kinds of executor nodes (part of this would be
useless, though). The last two implement an example usage of
the infrastructure.

The two introduced GUCs, enable_parasortmerge and
enable_asyncexec, respectively control whether to use Gather for
sorts under a merge join and whether to make asynchronous execution
effective.

For evaluation, I made merge join use a bgworker under some
conditions as an example. It is a mere mock implementation, but enough to
show the difference between parallel execution and async
execution (more appropriate names are welcome) and its
effectiveness. Thanks to Amit for his great work.

 Performance test

Apply all the patches, then run the following in order. Of course
this test is artificially constructed so that this patch wins :)

CREATE TABLE t1 (a int, b int);
CREATE TABLE t2 (a int, b int);
CREATE TABLE t3 (a int, b int);
INSERT INTO t1 (SELECT (a / 1000) + (a % 1000) * 1000, a FROM 
generate_series(0, 99) a);
INSERT INTO t2 (SELECT (a / 1000) + (a % 1000) * 1000, a FROM 
generate_series(0, 99) a);
INSERT INTO t3 (SELECT (a / 1000) + (a % 1000) * 1000, a FROM 
generate_series(0, 99) a);
ANALYZE t1;
ANALYZE t2;
ANALYZE t3;
SET enable_nestloop TO true;
SET enable_hashjoin TO true;
SET enable_material TO true;
SET enable_parasortmerge TO false;
SET enable_asyncexec TO false;
EXPLAIN (COSTS off, ANALYZE) SELECT * FROM t1 JOIN t2 ON (t1.a = t2.a) JOIN t3 
on (t1.a = t3.a) ORDER BY t1.a LIMIT 10;
SET enable_nestloop TO false;
SET enable_hashjoin TO false;
SET enable_material TO false;
EXPLAIN (COSTS off, ANALYZE) SELECT * FROM t1 JOIN t2 ON (t1.a = t2.a) JOIN t3 
on (t1.a = t3.a) ORDER BY t1.a LIMIT 10;
SET enable_parasortmerge TO true;
EXPLAIN (COSTS off, ANALYZE) SELECT * FROM t1 JOIN t2 ON (t1.a = t2.a) JOIN t3 
on (t1.a = t3.a) ORDER BY t1.a LIMIT 10;
SET enable_asyncexec TO true;
EXPLAIN (COSTS off, ANALYZE) SELECT * FROM t1 JOIN t2 ON (t1.a = t2.a) JOIN t3 
on (t1.a = t3.a) ORDER BY t1.a LIMIT 10;

 Test results

In my environment, the following results were obtained.

- The first attempt: the planner chooses a hash join plan, which takes about 3.3s.

- The second: merge joins are done in a single backend, taking about 5.1s.

- The third: simply using parallel execution of the MJ takes about 5.8s.

- The fourth: starting execution of the MJ asynchronously takes about 3.0s.

So asynchronous execution at least accelerates parallel execution
in this case, ending up even faster than the current fastest (maybe) plan.


== TODO or random thoughts, not restricted to this patch.

- This patch doesn't contain the planner part; the planner must be aware of
  async execution in order for this to be effective.

- Some measure to control execution on the bgworker would be
  needed. At least merge join requires position mark/reset
  functions.

- Currently, more tuples reduce the effectiveness of parallel
  execution; some method to transfer tuples in larger units would
  be needed, or would it be good to have shared work_mem?

- The term "asynchronous execution" looks a little confusing with
  paralle execution. Early execution/start might be usable but
  I'm not so confident.


Any suggestions? thoughts?

I must apologize for the incomplete proposal and cluttered thoughts.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
>From 415d4d0784e45c066b727a0a18716dd449aea044 Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi 
Date: Wed, 8 Jul 2015 11:48:12 +0900
Subject: [PATCH 1/5] Add infrastructure for executor node run state.

This infrastructure expands the node state from what ResultNode did to
a general form having four states.

The states are Inited, Started, Running and Done. Running and Done are
the same as what rs_done of ResultNode indicated. The Inited state
indicates that the node has been initialized but not executed. The Started
state indicates that the node has been executed 

Re: [HACKERS] Proposal: Trigonometric functions in degrees

2015-11-30 Thread Michael Paquier
On Mon, Nov 30, 2015 at 10:36 PM, Michael Paquier wrote:
> Instinctively, it seems to me that we had better return NaN from the
> new asind and acosd when out of range on OSX; Linux will
> complain about an out-of-range error, so the code is right in this
> case.

This is still mentioned upthread btw. And it does not seem to be a
good idea to change this platform dependent behavior by throwing an
error on the old functions, neither does it seem user-friendly to have
inconsistent results for the XXX function and its XXXd equivalent.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Proposal: Trigonometric functions in degrees

2015-11-30 Thread Michael Paquier
On Mon, Nov 30, 2015 at 11:11 PM, Tom Lane  wrote:
> Michael Paquier  writes:
>> On Mon, Nov 30, 2015 at 10:36 PM, Michael Paquier wrote:
>>> Instinctively, it seems to me that we had better return NaN from the
>>> new asind and acosd when out of range on OSX; Linux will
>>> complain about an out-of-range error, so the code is right in this
>>> case.
>
>> This is still mentioned upthread btw. And it does not seem to be a
>> good idea to change this platform dependent behavior by throwing an
>> error on the old functions, neither does it seem user-friendly to have
>> inconsistent results for the XXX function and its XXXd equivalent.
>
> FWIW, I think that probably the best course of action is to go ahead
> and install POSIX-compliant error checking in the existing functions
> too.  POSIX:2008 is quite clear about this:
>
> : An application wishing to check for error situations should set errno to
> : zero and call feclearexcept(FE_ALL_EXCEPT) before calling these
> : functions. On return, if errno is non-zero or fetestexcept(FE_INVALID |
> : FE_DIVBYZERO | FE_OVERFLOW | FE_UNDERFLOW) is non-zero, an error has
> : occurred.

OK, I have to admit I didn't know this part.

> Although I'm not too sure about Windows, the inconsistent results
> we're getting on OS X are certainly from failure to adhere to the spec.

Windows complains about out-of-range errors, contrary to OSX, for
example on asin or acos. So for once Linux and Windows agree with each
other.

> I concur with Peter's opinion that this is material for a separate
> patch, but it seems like it's one that had better go in first.

(I think you mean Dean here and not Peter).
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Some questions about the array.

2015-11-30 Thread YUriy Zhuravlev
The new version of the patch.

On Friday 27 November 2015 17:23:35 Teodor Sigaev wrote:
> 1
> Documentation isn't very informative
Added example with different results.

> 2
> Seems the error messages are too inconsistent. If you forbid omitting a bound in
> assignment, then in all cases the error message should be the same or close.
Done.  Skipping the lower boundary is no longer an error.

Thank you for your review.

-- 
YUriy Zhuravlev
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Companydiff --git a/doc/src/sgml/array.sgml b/doc/src/sgml/array.sgml
index 4385a09..5a51e07 100644
--- a/doc/src/sgml/array.sgml
+++ b/doc/src/sgml/array.sgml
@@ -257,6 +257,25 @@ SELECT schedule[1:2][1:1] FROM sal_emp WHERE name = 'Bill';
 (1 row)
 
 
+  You can omit the lower-bound or upper-bound
+  to get the first or last element in a slice.
+
+
+SELECT schedule[:][:] FROM sal_emp WHERE name = 'Bill';
+
+schedule
+
+ {{meeting,lunch},{training,presentation}}
+(1 row)
+
+SELECT schedule[:2][2:] FROM sal_emp WHERE name = 'Bill';
+
+schedule
+
+ {{lunch},{presentation}}
+(1 row)
+
+
   If any dimension is written as a slice, i.e., contains a colon, then all
   dimensions are treated as slices.  Any dimension that has only a single
   number (no colon) is treated as being from 1
diff --git a/src/backend/executor/execQual.c b/src/backend/executor/execQual.c
index 29f058c..6643714 100644
--- a/src/backend/executor/execQual.c
+++ b/src/backend/executor/execQual.c
@@ -268,10 +268,12 @@ ExecEvalArrayRef(ArrayRefExprState *astate,
 	bool		eisnull;
 	ListCell   *l;
 	int			i = 0,
-j = 0;
+j = 0,
+indexexpr;
 	IntArray	upper,
 lower;
 	int		   *lIndex;
+	AnyArrayType *arrays;
 
 	array_source = ExecEvalExpr(astate->refexpr,
 econtext,
@@ -293,6 +295,7 @@ ExecEvalArrayRef(ArrayRefExprState *astate,
 	foreach(l, astate->refupperindexpr)
 	{
 		ExprState  *eltstate = (ExprState *) lfirst(l);
+		eisnull = false;
 
 		if (i >= MAXDIM)
 			ereport(ERROR,
@@ -300,10 +303,23 @@ ExecEvalArrayRef(ArrayRefExprState *astate,
 	 errmsg("number of array dimensions (%d) exceeds the maximum allowed (%d)",
 			i + 1, MAXDIM)));
 
-		upper.indx[i++] = DatumGetInt32(ExecEvalExpr(eltstate,
-	 econtext,
-	 &eisnull,
-	 NULL));
+		if (eltstate == NULL && astate->refattrlength <= 0)
+		{
+			if (isAssignment)
+ereport(ERROR,
+	(errcode(ERRCODE_ARRAY_SUBSCRIPT_ERROR),
+	 errmsg("cannot determine upper index for empty array")));
+			arrays = (AnyArrayType *)DatumGetArrayTypeP(array_source);
+			indexexpr = AARR_LBOUND(arrays)[i] + AARR_DIMS(arrays)[i] - 1;
+		}
+		else
+			indexexpr = DatumGetInt32(ExecEvalExpr(eltstate,
+   econtext,
+   &eisnull,
+   NULL));
+
+		upper.indx[i++] = indexexpr;
+
 		/* If any index expr yields NULL, result is NULL or error */
 		if (eisnull)
 		{
@@ -321,6 +337,7 @@ ExecEvalArrayRef(ArrayRefExprState *astate,
 		foreach(l, astate->reflowerindexpr)
 		{
 			ExprState  *eltstate = (ExprState *) lfirst(l);
+			eisnull = false;
 
 			if (j >= MAXDIM)
 ereport(ERROR,
@@ -328,10 +345,20 @@ ExecEvalArrayRef(ArrayRefExprState *astate,
 		 errmsg("number of array dimensions (%d) exceeds the maximum allowed (%d)",
 j + 1, MAXDIM)));
 
-			lower.indx[j++] = DatumGetInt32(ExecEvalExpr(eltstate,
-		 econtext,
-		 &eisnull,
-		 NULL));
+			if (eltstate == NULL)
+			{
+arrays = (AnyArrayType *)DatumGetArrayTypeP(array_source);
+indexexpr = AARR_LBOUND(arrays)[j];
+			}
+			else
+indexexpr = DatumGetInt32(ExecEvalExpr(eltstate,
+	   econtext,
+	   &eisnull,
+	   NULL));
+
+			lower.indx[j++] = indexexpr;
+
+
 			/* If any index expr yields NULL, result is NULL or error */
 			if (eisnull)
 			{
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index 26264cb..a761263 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -2417,6 +2417,8 @@ _copyAIndices(const A_Indices *from)
 
 	COPY_NODE_FIELD(lidx);
 	COPY_NODE_FIELD(uidx);
+	COPY_SCALAR_FIELD(lidx_default);
+	COPY_SCALAR_FIELD(uidx_default);
 
 	return newnode;
 }
diff --git a/src/backend/nodes/equalfuncs.c b/src/backend/nodes/equalfuncs.c
index aa6e102..e75b448 100644
--- a/src/backend/nodes/equalfuncs.c
+++ b/src/backend/nodes/equalfuncs.c
@@ -2162,6 +2162,8 @@ _equalAIndices(const A_Indices *a, const A_Indices *b)
 {
 	COMPARE_NODE_FIELD(lidx);
 	COMPARE_NODE_FIELD(uidx);
+	COMPARE_SCALAR_FIELD(lidx_default);
+	COMPARE_SCALAR_FIELD(uidx_default);
 
 	return true;
 }
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index 012c14b..ed77c75 100644
--- a/src/backend/nodes/outfuncs.c
+++ b/src/backend/nodes/outfuncs.c
@@ -2773,6 +2773,8 @@ _outA_Indices(StringInfo str, const A_Indices *node)
 
 	WRITE_NODE_FIELD(lidx);
 	WRITE_NODE_FIELD(uidx);
+	

Re: [HACKERS] Re: In-core regression tests for replication, cascading, archiving, PITR, etc.

2015-11-30 Thread Alvaro Herrera
Michael Paquier wrote:
> On Mon, Nov 30, 2015 at 6:28 AM, Noah Misch  wrote:

> > The proposed code is short on guidance about when to put a function in 
> > TestLib
> > versus TestBase.  TestLib has no header comment.  The TestBase header 
> > comment
> > would permit, for example, command_ok() in that module.  I would try instead
> > keeping TestLib as the base module and moving into PostgresNode the 
> > functions
> > that deal with PostgreSQL clusters (get_new_node teardown_node psql
> > poll_query_until issues_sql_like).
> 
> PostgresNode is meant to be a base representation of what a node is,
> not of how to operate on it. The ways to perform the tests, which
> work on a node, are meant to be higher-level operations.
> 
> Logging and base configuration of a test set are a lower level of
> operations than PostgresNode, because cluster nodes actually need to
> perform system calls, some of them, like run_log, allowing logging
> to the centralized log file. I have tried to make the headers
> of those modules more verbose; please see attached.

I'm not terribly convinced by this argument TBH.  Perhaps we can have
PostgresNode be one package, and the logging routines be another
package, and we create a higher-level package whose @ISA=(PostgresNode,
LoggingWhatever) and then we move the routines suggested by Noah into
that new package.  Then the tests use that instead of PostgresNode
directly.


> > > --- a/src/bin/pg_rewind/t/001_basic.pl
> > > +++ b/src/bin/pg_rewind/t/001_basic.pl
> > > @@ -1,9 +1,11 @@
> > > +# Basic pg_rewind test.
> > > +
> > >  use strict;
> > >  use warnings;
> > > -use TestLib;
> > > -use Test::More tests => 8;
> > >
> > >  use RewindTest;
> > > +use TestLib;
> > > +use Test::More tests => 8;
> >
> > Revert all changes to this file.  Audit the rest of the patch for whitespace
> > change unrelated to the subject.
> 
> Done.

I perltidied several files, though not consistently.  Regarding this
particular hunk, what is going on here is that I moved "use strict;use
warnings" as one stanza, followed by all the other "use" lines as
another stanza, alphabetically.  It was previously a bit messy, with
@EXPORTS and other stuff in between "use" lines.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers