Re: code of partition split

2018-07-22 Thread Chenxi Li
Hi,

Sorry, I misunderstood. The feature is provided in the EnterpriseDB version
of PostgreSQL.

BTW, do you think it is needed?

Best Regards,
Chenxi Li

2018-07-23 13:05 GMT+08:00 amul sul :

> On Mon, Jul 23, 2018 at 8:38 AM Chenxi Li  wrote:
> >
> > Hi hackers,
> >
> > I'm interested in partition split and want to figure out how it works.
> Could someone tell me where the code is located? Thanks very much.
> >
> AFAIK, we don't have this SPLIT partition feature, yet.
>
> Regards,
> Amul
>


Re: JIT breaks PostGIS

2018-07-22 Thread Andres Freund
Hi,

On 2018-07-21 23:14:47 +0300, Darafei "Komяpa" Praliaskouski wrote:
> Today I spent some time closing PostGIS tickets in preparation to Monday's
> release.
> 
> One of the blockers, https://trac.osgeo.org/postgis/ticket/4125, was filed
> by Postgres APT repository maintainer Christoph Berg who noticed a test
> suite failure on Debian Stretch with Postgres 11.
> 
> Upon investigation we found:
>  - A build for Ubuntu Bionic installed on Debian Stretch passes the suite,
> requiring llvm6;
>  - A build for Debian Stretch fails the suite on a call to external library
> GEOS, showing no traces of JIT in the stacktrace;
>  - Setting jit=off lets the suite pass;
>  - The query called in clean session by itself does not crash Postgres.
> Queries above it are required to reproduce the crash;
>  - The crash affects not only Stretch, but also a custom-built Postgres 11 /
> clang 3.9 on Travis CI running Ubuntu Trusty:
> https://github.com/postgis/postgis/pull/262.
> 
> I suspect that a fix would require bisecting for the llvm/clang version at
> which this behavior stops, and making that the new minimum for JIT, if this
> is not a symptom of a bigger (memory management?) problem.

It looks to me like it's an LLVM issue, specifically
https://bugs.llvm.org/show_bug.cgi?id=34424
fixed in LLVM 5+.

It'll only be an issue for extensions that throw C++-style exceptions. I
don't think that rises to the level of disallowing any LLVM version <
5.0. I suggest PostGIS add an error check to its build process that
refuses to run if jit is enabled and too old a version is used?
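
One possible shape for such a check, purely as a sketch: PG_USE_JIT and
PG_LLVM_MAJOR below are hypothetical macro names standing in for whatever
the build system actually detects, not existing symbols.

#if defined(PG_USE_JIT) && PG_LLVM_MAJOR < 5
#error "JIT with LLVM < 5 miscompiles C++ exception unwinding (LLVM bug 34424)"
#endif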

Greetings,

Andres Freund



Re: code of partition split

2018-07-22 Thread amul sul
On Mon, Jul 23, 2018 at 8:38 AM Chenxi Li  wrote:
>
> Hi hackers,
>
> I'm interested in partition split and want to figure out how it works. Could 
> someone tell me where the code is located? Thanks very much.
>
AFAIK, we don't have this SPLIT partition feature, yet.

Regards,
Amul



Re: code of partition split

2018-07-22 Thread Amit Langote
Hi.

On 2018/07/23 12:08, Chenxi Li wrote:
> Hi hackers,
> 
> I'm interested in partition split and want to figure out how it works.
> Could someone tell me where the code is located? Thanks very much.

Can you tell us what you mean by "partition split"?  I don't know of an
existing feature / command that can be used to split a partition.

Thanks,
Amit




Fix calculations on WAL recycling.

2018-07-22 Thread Kyotaro HORIGUCHI
I'll register this to the next commitfest.

https://www.postgresql.org/message-id/20180719.125117.155470938.horiguchi.kyot...@lab.ntt.co.jp

> While considering this, I found a bug in 4b0d28de06, which
> removed the prior checkpoint from the control file. It actually trims
> the segments before the last checkpoint's redo segment, but recycling
> is still computed based on the *previous* checkpoint. As a result,
> min_wal_size doesn't work as advertised.  Specifically, setting
> min/max_wal_size to 48MB, advancing four or more segments, and then
> running two checkpoints leaves just one segment, which is less than
> min_wal_size.
> 
> The attached patch fixes that. One arguable point on this would
> be the removal of the old behavior when calling RemoveXlogFile(name,
> InvalidXLogRecPtr, ...).
> 
> The only place calling the function with that parameter is
> timeline switching. Previously, 10 segments were unconditionally
> recycled after the switch point, but the reason for that behavior is
> that we didn't have the information on the previous checkpoint at hand
> at the time. Now we can use the timeline switch point as an
> approximation of the last checkpoint's redo point, and this allows
> us to use min/max_wal_size properly at that time.
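
For concreteness, a standalone toy evaluation of the bound the patched
XLOGfileslop() computes below (the settings and redo location are invented;
only the formula mirrors the patch):

#include <stdio.h>
#include <stdint.h>

int
main(void)
{
	uint64_t	seg_size = 16 * 1024 * 1024;	/* default wal_segment_size */
	uint64_t	min_wal_size_mb = 48;			/* i.e. 3 segments of 16MB */
	uint64_t	redo_segno = 80;				/* segment of the last redo point */

	/* ConvertToXSegs: a size-in-MB setting expressed in whole segments */
	uint64_t	min_segs = (min_wal_size_mb * 1024 * 1024) / seg_size;	/* 3 */

	/* minSegNo from the patch: keep/recycle WAL through this segment */
	uint64_t	minSegNo = redo_segno + min_segs - 1;	/* 82 */

	printf("recycle up to segment %llu\n", (unsigned long long) minSegNo);
	return 0;
}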

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center

From f2b1a0b6360263d4ddf725075daf4b56800e3e18 Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi 
Date: Thu, 19 Jul 2018 12:13:56 +0900
Subject: [PATCH] Fix calculation base of WAL recycling

Commit 4b0d28de06 removed the prior checkpoint and related things,
but left WAL recycling based on the prior checkpoint. This makes
max_wal_size and min_wal_size work incorrectly. This patch bases
WAL recycling on the last checkpoint instead.
---
 src/backend/access/transam/xlog.c | 37 +
 1 file changed, 17 insertions(+), 20 deletions(-)

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 4049deb968..d7a61af8f1 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -2287,7 +2287,7 @@ assign_checkpoint_completion_target(double newval, void *extra)
  * XLOG segments? Returns the highest segment that should be preallocated.
  */
 static XLogSegNo
-XLOGfileslop(XLogRecPtr PriorRedoPtr)
+XLOGfileslop(XLogRecPtr RedoRecPtr)
 {
 	XLogSegNo	minSegNo;
 	XLogSegNo	maxSegNo;
@@ -2299,9 +2299,9 @@ XLOGfileslop(XLogRecPtr PriorRedoPtr)
 	 * correspond to. Always recycle enough segments to meet the minimum, and
 	 * remove enough segments to stay below the maximum.
 	 */
-	minSegNo = PriorRedoPtr / wal_segment_size +
+	minSegNo = RedoRecPtr / wal_segment_size +
 		ConvertToXSegs(min_wal_size_mb, wal_segment_size) - 1;
-	maxSegNo = PriorRedoPtr / wal_segment_size +
+	maxSegNo = RedoRecPtr / wal_segment_size +
 		ConvertToXSegs(max_wal_size_mb, wal_segment_size) - 1;
 
 	/*
@@ -2316,7 +2316,7 @@ XLOGfileslop(XLogRecPtr PriorRedoPtr)
 	/* add 10% for good measure. */
 	distance *= 1.10;
 
-	recycleSegNo = (XLogSegNo) ceil(((double) PriorRedoPtr + distance) /
+	recycleSegNo = (XLogSegNo) ceil(((double) RedoRecPtr + distance) /
 	wal_segment_size);
 
 	if (recycleSegNo < minSegNo)
@@ -3896,12 +3896,12 @@ RemoveTempXlogFiles(void)
 /*
  * Recycle or remove all log files older or equal to passed segno.
  *
- * endptr is current (or recent) end of xlog, and PriorRedoRecPtr is the
- * redo pointer of the previous checkpoint. These are used to determine
+ * endptr is current (or recent) end of xlog, and RedoRecPtr is the
+ * redo pointer of the last checkpoint. These are used to determine
  * whether we want to recycle rather than delete no-longer-wanted log files.
  */
 static void
-RemoveOldXlogFiles(XLogSegNo segno, XLogRecPtr PriorRedoPtr, XLogRecPtr endptr)
+RemoveOldXlogFiles(XLogSegNo segno, XLogRecPtr RedoRecPtr, XLogRecPtr endptr)
 {
 	DIR		   *xldir;
 	struct dirent *xlde;
@@ -3944,7 +3944,7 @@ RemoveOldXlogFiles(XLogSegNo segno, XLogRecPtr PriorRedoPtr, XLogRecPtr endptr)
 /* Update the last removed location in shared memory first */
 UpdateLastRemovedPtr(xlde->d_name);
 
-RemoveXlogFile(xlde->d_name, PriorRedoPtr, endptr);
+RemoveXlogFile(xlde->d_name, RedoRecPtr, endptr);
 			}
 		}
 	}
@@ -4006,9 +4006,11 @@ RemoveNonParentXlogFiles(XLogRecPtr switchpoint, TimeLineID newTLI)
 			 * remove it yet. It should be OK to remove it - files that are
 			 * not part of our timeline history are not required for recovery
 			 * - but seems safer to let them be archived and removed later.
+			 * Here, switchpoint is a good approximation of RedoRecPtr for
+			 * RemoveXlogFile since we have just done timeline switching.
 			 */
 			if (!XLogArchiveIsReady(xlde->d_name))
-RemoveXlogFile(xlde->d_name, InvalidXLogRecPtr, switchpoint);
+RemoveXlogFile(xlde->d_name, switchpoint, switchpoint);
 		}
 	}
 
@@ -4018,14 +4020,12 @@ RemoveNonParentXlogFiles(XLogRecPtr switchpoint, TimeLineID newTLI)
 /*
  * Recycle or remove a log file that's no longer 

Re: Possible performance regression in version 10.1 with pgbench read-write tests.

2018-07-22 Thread Thomas Munro
On Mon, Jul 23, 2018 at 3:40 PM, Thomas Munro
 wrote:
> I did some testing on 2-node, 4-node and 8-node systems running Linux
> 3.10.something (slightly newer but still ancient).  Only the 8-node
> box (= same one Mithun used) shows the large effect (the 2-node box
> may be a tiny bit faster patched but I'm calling that noise for now...
> it's not slower, anyway).

(I forgot to add that the 4 node system that showed no change is POWER
architecture.)

-- 
Thomas Munro
http://www.enterprisedb.com



Re: Should contrib modules install .h files?

2018-07-22 Thread Alvaro Herrera
On 2018-Jul-23, Tom Lane wrote:

> Michael Paquier  writes:
> > On Sun, Jul 22, 2018 at 09:42:08PM -0400, Stephen Frost wrote:
> >> So, +1 from me for having a directory for each extension.
> 
> > So, like Stephen, that's a +1 from me.
> 
> Same here.  One-file-per-extension is too strongly biased to tiny
> extensions (like most of our contrib examples).
> 
> I don't have a real strong opinion on whether it's too late to
> push this into v11.  I do not think it'd break anything other than
> packagers' lists of files to be installed ... but it does seem
> like a new feature, and we're past feature freeze.

Frankly, I'd rather make things as easy as possible for third-party
extension writers.  I'd go as far as backpatching further (considering
transforms were introduced in 9.5) but I hesitate on that, because of
the packagers argument.  pg11 seems fair game to me, though.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Should contrib modules install .h files?

2018-07-22 Thread Tom Lane
Michael Paquier  writes:
> On Sun, Jul 22, 2018 at 09:42:08PM -0400, Stephen Frost wrote:
>> So, +1 from me for having a directory for each extension.

> So, like Stephen, that's a +1 from me.

Same here.  One-file-per-extension is too strongly biased to tiny
extensions (like most of our contrib examples).

I don't have a real strong opinion on whether it's too late to
push this into v11.  I do not think it'd break anything other than
packagers' lists of files to be installed ... but it does seem
like a new feature, and we're past feature freeze.

regards, tom lane



Re: Non-portable shell code in pg_upgrade tap tests

2018-07-22 Thread Tom Lane
Michael Paquier  writes:
> On Sun, Jul 22, 2018 at 02:53:51PM -0400, Tom Lane wrote:
>> A quick trawl of the buildfarm logs says most of our animals compute
>> SHELL = /bin/sh anyway, and so would be unaffected.  There's a sizable
>> population that find /bin/bash though, and one active critter that finds
>> /bin/ksh.

> Except for the FreeBSD boxes, right?  I thought that using /bin/
> directly, and not /usr/local/bin/, was considered a Linux-ism in their
> universe.

No, I'm pretty sure that "sh" wins a place in /bin even on FreeBSD ;-)

regards, tom lane



Re: Tips on committing

2018-07-22 Thread Tom Lane
Noah Misch  writes:
> I agree we won't all want the exact same checklist.  Still, it wouldn't hurt
> to have a wiki page of checklist entry ideas from which folks cherry-pick the
> entries they like.

013f320dc reminds me of something I check for religiously: look for
alternative output files for any regression test you're updating the
output of.

Actually updating said files, once you notice you need to, can be tricky
in itself.  Most of the time it works to just apply the same patch to the
other variants as the delta you're observing for the output file that's
relevant to your own platform.  Or you may be able to change configuration
so as to replicate the correct output for another alternative file.  But
sometimes you just have to guess (and then watch the buildfarm to see if
you guessed right).

regards, tom lane



Re: Possible performance regression in version 10.1 with pgbench read-write tests.

2018-07-22 Thread Thomas Munro
On Sun, Jul 22, 2018 at 8:19 AM, Tom Lane  wrote:
> Andres Freund  writes:
>> On 2018-07-20 16:43:33 -0400, Tom Lane wrote:
>>> On my RHEL6 machine, with unmodified HEAD and 8 sessions (since I've
>>> only got 8 cores) but other parameters matching Mithun's example,
>>> I just got
>
>> It's *really* common to have more actual clients than cpus for oltp
>> workloads, so I don't think it's insane to test with more clients.
>
> I finished a set of runs using similar parameters to Mithun's test except
> for using 8 clients, and another set using 72 clients (but, being
> impatient, 5-minute runtime) just to verify that the results wouldn't
> be markedly different.  I got TPS numbers like this:
>
>                       8 clients   72 clients
>
> unmodified HEAD           16112        16284
> with padding patch        16096        16283
> with SysV semas           15926        16064
> with padding+SysV         15949        16085
>
> This is on RHEL6 (kernel 2.6.32-754.2.1.el6.x86_64), hardware is dual
> 4-core Intel E5-2609 (Sandy Bridge era).  This hardware does show NUMA
> effects, although no doubt less strongly than Mithun's machine.
>
> I would like to see some other results with a newer kernel.  I tried to
> repeat this test on a laptop running Fedora 28, but soon concluded that
> anything beyond very short runs was mainly going to tell me about thermal
> throttling :-(.  I could possibly get repeatable numbers from, say,
> 1-minute SELECT-only runs, but that would be a different test scenario,
> likely one with a lot less lock contention.

I did some testing on 2-node, 4-node and 8-node systems running Linux
3.10.something (slightly newer but still ancient).  Only the 8-node
box (= same one Mithun used) shows the large effect (the 2-node box
may be a tiny bit faster patched but I'm calling that noise for now...
it's not slower, anyway).

On the problematic box, I also tried some different strides (char
padding[N - sizeof(sem_t)]) and was surprised by the result:

Unpatched = ~35k TPS
64 byte stride = ~35k TPS
128 byte stride = ~42k TPS
4096 byte stride = ~47k TPS

Huh.  PG_CACHE_LINE_SIZE is 128, but the true cache line size on this
system is 64 bytes.  That exaggeration turned out to do something
useful, though I can't explain it.
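
The stride experiment above amounts to something like this sketch (the
struct name and shape are assumptions for illustration; PostgreSQL's real
POSIX-semaphore type lives in src/backend/port/posix_sema.c):

#include <semaphore.h>

#define SEM_STRIDE 128			/* values tried above: 64, 128, 4096 */

typedef struct PaddedSemaphore
{
	sem_t		sem;
	char		padding[SEM_STRIDE - sizeof(sem_t)];	/* one semaphore per stride */
} PaddedSemaphore;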

While looking for discussion of 128 byte cache effects I came across
the Intel "L2 adjacent cache line prefetcher"[1].  Maybe this, or some
of the other prefetchers (enabled in the BIOS) or related stuff could
be at work here.  It could be microarchitecture-dependent (this is an
old Westmere box), though I found a fairly recent discussion about a
similar effect[2] that mentions more recent hardware.  The spatial
prefetcher reference can be found in the Optimization Manual[3].

[1] 
https://software.intel.com/en-us/articles/disclosure-of-hw-prefetcher-control-on-some-intel-processors
[2] 
https://groups.google.com/forum/#!msg/mechanical-sympathy/i3-M2uCYTJE/P7vyoOTIAgAJ
[3] 
https://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-optimization-manual.pdf

-- 
Thomas Munro
http://www.enterprisedb.com



Re: Non-portable shell code in pg_upgrade tap tests

2018-07-22 Thread Michael Paquier
On Sun, Jul 22, 2018 at 02:53:51PM -0400, Tom Lane wrote:
> Oh!  Hm, I wonder whether we shouldn't do that, ie add SHELL = @SHELL@
> to Makefile.global.in.

That sounds like a good idea to me.

> A quick trawl of the buildfarm logs says most of our animals compute
> SHELL = /bin/sh anyway, and so would be unaffected.  There's a sizable
> population that find /bin/bash though, and one active critter that finds
> /bin/ksh.

Except for the FreeBSD boxes, right?  I thought that using /bin/
directly, and not /usr/local/bin/, was considered a Linux-ism in their
universe.
--
Michael




Re: pgbench: improve --help and --version parsing

2018-07-22 Thread Michael Paquier
On Sun, Jul 22, 2018 at 10:19:50AM -0400, Fabien COELHO wrote:
>> Agreed. I have changed the handling of the --help and --version options
>> in all apps where they exhibit the problem described, with the exception
>> of pg_archivecleanup, where getopt is used instead of getopt_long. A
>> separate patch will be proposed to address it.
>> 
>> The patch is against current master. All tests pass.
> 
> I doubt that -V & -? are heavily tested:-) Patch works for me, though.

They are not, and the patch misses this area.

I don't think that it is a bad idea to improve things the way you are
doing, however you should extend program_version_ok() and
program_help_ok() in src/test/perl/TestLib.pm so that short options are
tested, for two reasons:
1) We can make sure that everything is consistent and works properly
without manually testing your patch, which is a pain for anybody
looking at the patch.
2) Any new standalone binary added to the core tree would be able to
check natively whether it handles common options correctly with the TAP
tests added.

This will need tweaks for the number of tests in a couple of TAP files,
but that's worth the effort in the long term.
--
Michael




Re: Should contrib modules install .h files?

2018-07-22 Thread Michael Paquier
On Sun, Jul 22, 2018 at 09:42:08PM -0400, Stephen Frost wrote:
> When I think about the demands of extensions, I tend to consider PostGIS
> the prime example and I certainly would understand if they wanted to
> install multiple headers (they have some 72 .h files from what I'm
> seeing...).
> 
> So, +1 from me for having a directory for each extension.

Definitely.  If we went with the one-file-per-extension option, most
large extension maintainers would logically scream at us.  If, for
example, you look at Citus, there are a bunch of headers in
src/include/distributed.  Based on that, folks could always tweak their
CFLAGS to point to the path of the extension if they need to.

We cannot ensure either that multiple extensions do not use the same
header file names, which rules out any design using a single installation
location with multiple files.

So, like Stephen, that's a +1 from me.
--
Michael




Re: Add SKIP LOCKED to VACUUM and ANALYZE

2018-07-22 Thread Michael Paquier
On Thu, Jul 19, 2018 at 10:54:23AM +0900, Michael Paquier wrote:
> If you can get this refactoring done, let's look into getting those two
> parts committed first to simplify the next steps.  No need to extend the
> grammar of cluster either.  The set of options for CLUSTER should be a
> separate enum list on top of ClusterStmt in parsenodes.h, as VACUUM has
> no recheck option.  I would recommend getting cluster_rel reshaped first,
> and the ereport() calls done after.

The refactoring for CLUSTER is pretty obvious and makes the API a bit
cleaner, so attached is a proposed patch to do so.  Thoughts?
--
Michael
From ce4cd47c78f917827d94b13bac99457bc9d17223 Mon Sep 17 00:00:00 2001
From: Michael Paquier 
Date: Mon, 23 Jul 2018 12:10:15 +0900
Subject: [PATCH] Refactor ClusterStmt to handle more options

This extends cluster_rel() in such a way that more options can be added
in the future, which will reduce the code churn for an upcoming
SKIP_LOCKED option aimed at VACUUM.  As VACUUM FULL is a different
flavor of CLUSTER, we want to make that extensible as well to ease
integration.

This only reworks the API and its callers, without providing anything
user-facing.  Two options are present now: verbose mode and relation
recheck when doing the cluster work across multiple transactions.
---
 src/backend/commands/cluster.c |  9 ++---
 src/backend/commands/vacuum.c  |  8 ++--
 src/backend/nodes/copyfuncs.c  |  2 +-
 src/backend/nodes/equalfuncs.c |  2 +-
 src/backend/parser/gram.y  | 12 +---
 src/include/commands/cluster.h |  3 +--
 src/include/nodes/parsenodes.h |  8 +++-
 7 files changed, 31 insertions(+), 13 deletions(-)

diff --git a/src/backend/commands/cluster.c b/src/backend/commands/cluster.c
index 0112a87224..68be470977 100644
--- a/src/backend/commands/cluster.c
+++ b/src/backend/commands/cluster.c
@@ -186,7 +186,7 @@ cluster(ClusterStmt *stmt, bool isTopLevel)
 		heap_close(rel, NoLock);
 
 		/* Do the job. */
-		cluster_rel(tableOid, indexOid, false, stmt->verbose);
+		cluster_rel(tableOid, indexOid, stmt->options);
 	}
 	else
 	{
@@ -234,7 +234,8 @@ cluster(ClusterStmt *stmt, bool isTopLevel)
 			/* functions in indexes may want a snapshot set */
 			PushActiveSnapshot(GetTransactionSnapshot());
 			/* Do the job. */
-			cluster_rel(rvtc->tableOid, rvtc->indexOid, true, stmt->verbose);
+			cluster_rel(rvtc->tableOid, rvtc->indexOid,
+		stmt->options | CLUOPT_RECHECK);
 			PopActiveSnapshot();
 			CommitTransactionCommand();
 		}
@@ -265,9 +266,11 @@ cluster(ClusterStmt *stmt, bool isTopLevel)
  * and error messages should refer to the operation as VACUUM not CLUSTER.
  */
 void
-cluster_rel(Oid tableOid, Oid indexOid, bool recheck, bool verbose)
+cluster_rel(Oid tableOid, Oid indexOid, int options)
 {
 	Relation	OldHeap;
+	bool		verbose = ((options & CLUOPT_VERBOSE) != 0);
+	bool		recheck = ((options & CLUOPT_RECHECK) != 0);
 
 	/* Check for user-requested abort. */
 	CHECK_FOR_INTERRUPTS();
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index bd0f04c7e2..5736f12b8f 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -1551,13 +1551,17 @@ vacuum_rel(Oid relid, RangeVar *relation, int options, VacuumParams *params)
 	 */
 	if (options & VACOPT_FULL)
 	{
+		int			cluster_options = 0;
+
 		/* close relation before vacuuming, but hold lock until commit */
 		relation_close(onerel, NoLock);
 		onerel = NULL;
 
+		if ((options & VACOPT_VERBOSE) != 0)
+			cluster_options |= CLUOPT_VERBOSE;
+
 		/* VACUUM FULL is now a variant of CLUSTER; see cluster.c */
-		cluster_rel(relid, InvalidOid, false,
-	(options & VACOPT_VERBOSE) != 0);
+		cluster_rel(relid, InvalidOid, cluster_options);
 	}
 	else
 		lazy_vacuum_rel(onerel, options, params, vac_strategy);
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index 96836ef19c..17b650b8cb 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -3284,7 +3284,7 @@ _copyClusterStmt(const ClusterStmt *from)
 
 	COPY_NODE_FIELD(relation);
 	COPY_STRING_FIELD(indexname);
-	COPY_SCALAR_FIELD(verbose);
+	COPY_SCALAR_FIELD(options);
 
 	return newnode;
 }
diff --git a/src/backend/nodes/equalfuncs.c b/src/backend/nodes/equalfuncs.c
index 6a971d0141..378f2facb8 100644
--- a/src/backend/nodes/equalfuncs.c
+++ b/src/backend/nodes/equalfuncs.c
@@ -1206,7 +1206,7 @@ _equalClusterStmt(const ClusterStmt *a, const ClusterStmt *b)
 {
 	COMPARE_NODE_FIELD(relation);
 	COMPARE_STRING_FIELD(indexname);
-	COMPARE_SCALAR_FIELD(verbose);
+	COMPARE_SCALAR_FIELD(options);
 
 	return true;
 }
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 90dfac2cb1..87f5e95827 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -10478,7 +10478,9 @@ ClusterStmt:
 	ClusterStmt *n = makeNode(ClusterStmt);
 	n->relation = $3;
 	n->indexname = $4;
-	n->verbose = $2;
+	n->options = 0;
+	if ($2)
+		n->options |= 

code of partition split

2018-07-22 Thread Chenxi Li
Hi hackers,

I'm interested in partition split and want to figure out how it works.
Could someone tell me where the code is located? Thanks very much.

Best Regards,
Chenxi li


Re: Get Columns from Plan

2018-07-22 Thread Isaac Morland
On 22 July 2018 at 21:56, Ed Behn  wrote:

> I'm tinkering with the idea of creating a Procedural Language plugin for
> Haskell. As such I'm reading up on the SPI and prepared statements. The
> idea is that a statement will be prepared at compile time and executed at
> run-time. Therefore, I want to be able to determine the columns (names and
> types) that are returned by a plan without executing it. It seems like this
> should be a straight-forward task, but there doesn't seem to be a mechanism
> to do this.
>
> Is there a way to get the columns for a plan at compile time? If not, why?
>

It looks to me like PQdescribePrepared() gives you most of what you want:

https://www.postgresql.org/docs/current/static/libpq-exec.html

You can get the types of the columns. However, it's not immediately obvious
to me how to get the column names. For query results there is PQfname() to
get the column names, but I believe that requires running the query. I
suppose you could add "LIMIT 0" to the end of the query and run it, but
that doesn't feel ideal.
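
For what it's worth, the libpq documentation says PQnfields(), PQfname()
and PQftype() can also be applied to the PGresult returned by
PQdescribePrepared(), so the column names should be available without
running the query. A minimal sketch (error handling trimmed, connection
string assumed):

#include <stdio.h>
#include <libpq-fe.h>

int
main(void)
{
	PGconn	   *conn = PQconnectdb("dbname=postgres");
	PGresult   *res;

	res = PQprepare(conn, "stmt", "SELECT 1 AS a, now() AS b", 0, NULL);
	PQclear(res);

	res = PQdescribePrepared(conn, "stmt");		/* describes, does not execute */
	for (int i = 0; i < PQnfields(res); i++)
		printf("column \"%s\" has type oid %u\n",
			   PQfname(res, i), PQftype(res, i));

	PQclear(res);
	PQfinish(conn);
	return 0;
}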


Get Columns from Plan

2018-07-22 Thread Ed Behn
I'm tinkering with the idea of creating a Procedural Language plugin for
Haskell. As such I'm reading up on the SPI and prepared statements. The
idea is that a statement will be prepared at compile time and executed at
run-time. Therefore, I want to be able to determine the columns (names and
types) that are returned by a plan without executing it. It seems like this
should be a straight-forward task, but there doesn't seem to be a mechanism
to do this.

Is there a way to get the columns for a plan at compile time? If not, why?
   -Ed


Re: Should contrib modules install .h files?

2018-07-22 Thread Stephen Frost
Greetings,

* Andrew Gierth (and...@tao11.riddles.org.uk) wrote:
> > "Michael" == Michael Paquier  writes:
> 
>  >> No response to my followup to you in 2+ weeks. Last call?
> 
>  Michael> FWIW, I don't have any objections with experimenting on HEAD,
>  Michael> but I would vote for letting 11 out of this.
> 
> Why?

While I agree that this patch is a good improvement for us to have,
it's not something we'd back-patch to released versions, and while we can
play a little looser with v11 since it's in beta, I tend to agree with
Michael that this change isn't really appropriate to include in v11 now
that we're in beta.

The current situation isn't great, but it's what people are familiar
with and may already be expecting in v11.  Ideally, we want extension
authors and others testing v11 to find actual bugs, and breaking things
for them post feature-freeze and beta release isn't going to encourage
them to play with beta versions in the future, meaning we'll end up with
folks waiting until the release to do testing, and end up with a poorer
release for it.

Thanks!

Stephen




Re: Should contrib modules install .h files?

2018-07-22 Thread Stephen Frost
Greetings,

* Andrew Gierth (and...@tao11.riddles.org.uk) wrote:
> > "Peter" == Peter Eisentraut  writes:
> 
>  > On 02.07.18 15:26, Tom Lane wrote:
>  >> FWIW, I agree with Andres' thought that each contrib module should
>  >> have its own subdirectory under $(includedir_server). Otherwise
>  >> we're going to be faced with questions about whether .h files need
>  >> to be renamed because they're not globally unique enough.
> 
>  Peter> Then they perhaps should be renamed. That seems like a much
>  Peter> simpler solution.
> 
> Personally I think that more -I options is less pain than having to
> rename things or deal with conflicts.

Yeah, I don't care for the idea that we should expect all extensions,
forever going forward, to provide one single .h file which has to be
unique and non-conflicting with all other extensions, ever.

When I think about the demands of extensions, I tend to consider PostGIS
the prime example and I certainly would understand if they wanted to
install multiple headers (they have some 72 .h files from what I'm
seeing...).

So, +1 from me for having a directory for each extension.

> Where exactly are you suggesting that they should be installed? Directly
> in $(includedir_server), or in $(includedir_server)/extension or
> equivalent?

I certainly wouldn't want extension headers being mixed in with server
headers.  I haven't got any great ideas about contrib-vs-extension, but
I'm, at least, leaning towards 'extension' as being the best answer
here.

>  Peter> The use case being discussed here is installing a data type
>  Peter> extension's header so you can write a transform for it. The
>  Peter> extension's name as well as the data type's own name already
>  Peter> have to be pretty much globally unique if you want it to be
>  Peter> useful. So it doesn't seem very difficult to me to have the
>  Peter> extension install a single header file with that same name.
> 
> That's assuming a single header file, which might be a bit more
> restrictive than absolutely necessary.

Agreed that having a single header file is overly and unnecessarily
restrictive.

>  Peter> The other side of this is that the PLs have to install their
>  Peter> header files. Which the in-core PLs already do. Would we we want
>  Peter> to move their header files under a new per-extension directory
>  Peter> scheme?
> 
> The in-core PLs could reasonably be grandfathered in in their current
> locations, at least for now.

Grandfathering them seems fine to me, but I don't hold that position
very strongly.

Thanks!

Stephen




RE: [bug fix] Produce a crash dump before main() on Windows

2018-07-22 Thread Tsunakawa, Takayuki
From: Robert Haas [mailto:robertmh...@gmail.com]
> On Wed, Jul 18, 2018 at 4:38 AM, Tsunakawa, Takayuki
>  wrote:
> > IIRC, the pop-up doesn't appear under a Windows service.  If you start the
> database server with pg_ctl start on the command prompt, the pop-up will
> appear, which I think is not bad.
> 
> Hmm.  I think that might be bad.  What makes you think it isn't?

First, as Hari-san said, starting the server with "pg_ctl start" on the command
prompt feels like a test environment rather than production use, and that
usage seems mostly interactive.  Second, current PostgreSQL is not exempt
from the same problem: if the postmaster or a standalone backend crashes
before main(), the pop-up would appear.


Regards
Takayuki Tsunakawa



Re: Should contrib modules install .h files?

2018-07-22 Thread Andres Freund
Hi,

On 2018-07-23 02:02:51 +0100, Andrew Gierth wrote:
> > "Michael" == Michael Paquier  writes:
> 
>  >> No response to my followup to you in 2+ weeks. Last call?
> 
>  Michael> FWIW, I don't have any objections with experimenting on HEAD,
>  Michael> but I would vote for letting 11 out of this.
> 
> Why?

Because that's the default assumption when considering whether to commit
a new feature to a feature-frozen branch?  I don't really have a strong
feeling about this specific case either way, personally. But it's still
the default assumption.

Greetings,

Andres Freund



Re: JIT breaks PostGIS

2018-07-22 Thread Stephen Frost
Greetings,

* Tom Lane (t...@sss.pgh.pa.us) wrote:
> Christoph Berg  writes:
> > The question will be coming up eventually, though, and I think the
> > options on the packaging side are:
> 
> > 1) Disable jit completely
> > 2) Compile --with-llvm, but disable jit in the config by default
> > 3) Compile --with-llvm, but disable jit for older llvm versions
> > 4) Enable jit everywhere where llvm >= 3.9 is available
> 
> > Option 4 is what the Debian packages implement now, but it might make
> > sense to go to 2 or 3 for PG11 (only).
> 
> Well, right now JIT is certainly beta-quality code, so you ought
> to expect bugs.  We have an open item at
> https://wiki.postgresql.org/wiki/PostgreSQL_11_Open_Items
> to decide whether to ship v11 with JIT enabled by default or not,
> but I don't expect that decision will be taken until much closer
> to release.  Until then, I think you should be doing (4) so that
> we can gather data to inform the eventual decision.

+1.

Thanks!

Stephen




Re: Should contrib modules install .h files?

2018-07-22 Thread Andrew Gierth
> "Michael" == Michael Paquier  writes:

 >> No response to my followup to you in 2+ weeks. Last call?

 Michael> FWIW, I don't have any objections with experimenting on HEAD,
 Michael> but I would vote for letting 11 out of this.

Why?

-- 
Andrew.



Re: Should contrib modules install .h files?

2018-07-22 Thread Michael Paquier
On Mon, Jul 23, 2018 at 01:48:21AM +0100, Andrew Gierth wrote:
> > "Peter" == Peter Eisentraut  writes:
> 
>>> Anyone have any objection to putting this into 11beta if it works,
>>> as well as 12devel?
> 
>  Peter> Yes, I have expressed concerns about this approach elsewhere in
>  Peter> this thread.
> 
> No response to my followup to you in 2+ weeks. Last call?

FWIW, I don't have any objections with experimenting on HEAD, but I
would vote for letting 11 out of this.
--
Michael




Re: Should contrib modules install .h files?

2018-07-22 Thread Andrew Gierth
> "Peter" == Peter Eisentraut  writes:

 >> Anyone have any objection to putting this into 11beta if it works,
 >> as well as 12devel?

 Peter> Yes, I have expressed concerns about this approach elsewhere in
 Peter> this thread.

No response to my followup to you in 2+ weeks. Last call?

-- 
Andrew (irc:RhodiumToad)



Re: More consistency for some file-related error message

2018-07-22 Thread Michael Paquier
On Fri, Jul 20, 2018 at 03:41:08PM +0900, Michael Paquier wrote:
> Okay, I just looked again at this point, and among the new messages only
> what's in XLogFileCopy has been bothering setting errno to 0 (see
> 811b6e3), so let's remove it in this case.

So, I have been through the patch set once again and pushed the patch to
make more error messages consistent, as well as the patch to set up
proper errcodes for new error messages.  There are perhaps more
improvements which could be done for the error messages, but that's not
entirely clear either, as the context used is actually useful in the
remaining ones.
--
Michael




Re: wrong query result with jit_above_cost= 0

2018-07-22 Thread Andres Freund
Hi,

On 2018-07-07 21:41:05 +0200, Dmitry Dolgov wrote:
> Ok, looks like I found the issue. v_aggcontext & v_curaggcontext have nothing
> to do with it (this change was just masking the problem by changing a memory
> context so that the wrong one would never be used). The problem was that in
> llvmjit_expr, in the AGG_INIT_TRANS section, we need to set the current
> memory context from op->d.agg_init_trans.aggcontext (see the attached
> patch); otherwise we end up in a situation where the current memory context
> is hashcontext instead of aggcontext.

Nice catch! I pushed your fix, but I also made it set current_set.


> Also, I found one suspicious thing: in the AGG_PLAIN_TRANS section we don't
> switch the memory context back in the branch with ExecAggTransReparent. I
> never found any consequences of that, but just in case I believe it makes
> sense to do so.

I'll look at that next.


> And the last thing - where can I find documentation about how to properly
> apply the patches for GDB & perf support to llvm? I remember they were
> posted here, and I found some of them here [1] from Andres, but apparently
> part of them was already applied on top of llvm. Looks like for the gdb
> support I need to apply 0006-ORC-JIT-event-listener-support (since there is
> a gdb listener mentioned there), but with this patch I get an error:
> 
> error: ‘ObjHandleT’ was not declared in this scope
> 
> So I'm confused about how it should be.

I've merged the GDB part into LLVM, and am about to merge the perf part
too. I plan to push a fix to PG adapting it to use the agreed upon /
merged LLVM APIs. Then you'll just need a recent LLVM
checkout. Unfortunately the relevant LLVM internal APIs have changed
quite rapidly over the last few releases (and a lot within individual
releases), so it's not easy to provide a patch for the individual versions.

Greetings,

Andres Freund



pgbench - remove double declaration of hash functions

2018-07-22 Thread Fabien COELHO


I noticed that hash functions appear twice in the list of pgbench
functions, although once is enough. The code is functional nevertheless,
but it looks silly. This was added by "e51a04840a1" back in March, so it
should be removed from 11 and 12dev.


--
Fabien.

diff --git a/src/bin/pgbench/exprparse.y b/src/bin/pgbench/exprparse.y
index 8447e14d14..66288632d1 100644
--- a/src/bin/pgbench/exprparse.y
+++ b/src/bin/pgbench/exprparse.y
@@ -366,15 +366,6 @@ static const struct
{
"hash_fnv1a", PGBENCH_NARGS_HASH, PGBENCH_HASH_FNV1A
},
-   {
-   "hash", PGBENCH_NARGS_HASH, PGBENCH_HASH_MURMUR2
-   },
-   {
-   "hash_murmur2", PGBENCH_NARGS_HASH, PGBENCH_HASH_MURMUR2
-   },
-   {
-   "hash_fnv1a", PGBENCH_NARGS_HASH, PGBENCH_HASH_FNV1A
-   },
/* keep as last array element */
{
NULL, 0, 0


Re: [HACKERS] Bug in to_timestamp().

2018-07-22 Thread Alexander Korotkov
On Sun, Jul 22, 2018 at 6:22 PM David G. Johnston
 wrote:
> On Sat, Jul 21, 2018 at 2:34 PM, Alexander Korotkov 
>  wrote:
>>
>> So, as I understand from this thread, the current patch brings these
>> functions very close to Oracle's behavior.  The only divergence found yet
>> relates to the handling of unmatched tails of input strings (Oracle throws
>> an error while we swallow that silently) [1].  But this issue can be
>> addressed by a separate patch.
>>
>> The patch seems to be in pretty good shape.  So, the question is whether
>> there is a consensus that we want the backward-incompatible behavior
>> change which this patch makes.
>>
>> My personal opinion is +1 for committing this, because this patch has
>> taken a lot of community effort on discussion, coding, review etc.  All
>> that effort has produced a substantial result in a patch which makes the
>> behavior more Oracle-compatible and (IMHO) clearer.
>>
>> However, other opinions were expressed in this thread.  In particular,
>> David G. Johnston expressed the opinion that we shouldn't change the
>> behavior of existing functions; alternatively, we could introduce new
>> functions with the new behavior.  However, I see David hasn't
>> participated in this thread for quite a long time.
>
> Been lurking about...I'm still of the opinion that this capability should
> exist in PostgreSQL as "our" function and not just as an Oracle
> compatibility feature.

For sure, we're not going to copy Oracle's behavior "bug for bug".  But
when users find our behavior confusing, there is nothing wrong with
looking at Oracle's behavior and accepting it if it looks good.

> That said the thing I'm most curious to read is the release note entry that 
> would go along with this change that informs users what will be changing: 
> silently, visibly, or otherwise.

I'm sure the release note entry should directly inform users about
the backward-incompatible changes to the to_timestamp()/to_date()
functions.  Users need to be advised to test their applications before
porting them to a new major release of PostgreSQL.

> Knowing how much we (don't) diverge from our chosen "standard" after making 
> this change is important but the change in behavior from current is, IMO, 
> more important in deciding whether this particular change is worth making.
> My re-read of the thread the other day left me with a feeling of contentment 
> that this was an acceptable change but I also get the feeling like I'm 
> missing the downside trade-off too...I was hoping your review would help in 
> that regard but as it did not speak to specific incompatibilities it has not.

So, if I understand you correctly, the downside of the trade-off is
use-cases where the current behavior looks correct while the patched
behavior looks incorrect.  Indeed, looking through this thread we can't
find such use-cases.  Intuitively, they should exist.  But I have thought
about it and haven't found one yet...

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



Re: patch to allow disable of WAL recycling

2018-07-22 Thread Tomas Vondra
On 07/21/2018 12:04 AM, Jerry Jelinek wrote:
> Thomas,
> 
> Thanks for your offer to run some tests on different OSes and
> filesystems that you have. Anything you can provide here would be much
> appreciated. I don't have anything other than our native SmartOS/ZFS
> based configurations, but I might be able to setup some VMs and get
> results that way. I should be able to setup a VM running FreeBSD. If you
> have a chance to collect some data, just let me know the exact
> benchmarks you ran and I'll run the same things on the FreeBSD VM.
> Obviously you're under no obligation to do any of this, so if you don't
> have time, just let me know and I'll see what I can do on my own.
> 

Sounds good. I plan to start with the testing in a couple of days - the
boxes are running some other tests at the moment. Once I have some
numbers I'll share them here, along with the test scripts etc.

regards

-- 
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: pgbench-ycsb

2018-07-22 Thread Fabien COELHO


> Just to clarify - if I understand Anthony correctly, this proposal is
> not about implementing exactly YCSB as it is, but more about using a
> zipfian distribution for an id in the regular pgbench table structure,
> in conjunction with a read/write balance, to simulate something similar
> to it.


Ok, I misunderstood. My 0.02€: If it does not implement YCSB, and the
point is not to implement YCSB, then do not call it YCSB:-)

Maybe there could be other, simpler builtins using non-uniform
distributions: {zipf,exp,...}-{simple,select}, with default values
(exp_param, zipf_param?) for the random distribution parameters.

  \set id random_zipfian(1, 10*:scale, :zipf_param)
  \set val random(-5000, 5000)
  UPDATE pgbench_whatever ...;

Then

  pgbench -b zipf-se@1 -b zipf-si@1 [ -D zipf_param=1.1 ... ] -T 1 ...

> And probably instead of implementing the exact YCSB workload inside
> pgbench, it makes more sense to add PostgreSQL jsonb as one of the
> options into the framework itself (I was in the middle of that a few
> years ago, but then was distracted by some interesting benchmarking
> results).


Sure.


> Hello,
> thank you for your interest. I'm still improving this idea and the patch,
> and I'm very happy about the discussion we are having. It really helps.
> 
> The idea was to implement the workloads as close to YCSB as possible
> using pgbench.


Basically I'm against having something called YCSB if it is not YCSB;-)


> So, the schema it should be applied to is the default schema generated by
> pgbench -i (pgbench_accounts).


This is a contradiction, because the pgbench_accounts table is in no way,
even remotely, conformant to the YCSB benchmark test table.


So for me there are three possibilities:

(1) do nothing, always an option as committers may be against extending 
pgbench in this direction anyway. Personally I'm fine with having it.


(2) implement YCSB cleanly, i.e. both initialization and operations, at
least if this is "reasonable" (i.e. it does not result in 2000 lines of
new code). ISTM that it can be done, given that the YCSB schema is very
simple, hence I suggested "pgbench -i --schema ycsb" to trigger a
non-default initialization.


(3) if you are interested in demonstrating non uniform distribution on 
pgbench_accounts, I'm also fine with it, just do so, but do *NOT* call it 
YCSB.


Also it seems that the YCSB bench uses some hashing to mix keys and avoid
having 1 as the most frequent, 2 as the second, and so on. There is a hash
function in pgbench which can be used (although the solution is not
perfect, as some values cannot be reached), since it is the one used by
YCSB. Otherwise I'm planning to submit a pseudo-random permutation
function to ease this some day, provided that the size of the table stays
constant.


--
Fabien.

Re: [HACKERS] plpgsql - additional extra checks

2018-07-22 Thread Tomas Vondra



On 07/19/2018 02:50 PM, Pavel Stehule wrote:
> 
> 
> 2018-07-15 1:38 GMT+02:00 Tomas Vondra  >:
> 
> Hi,
> 
> I've been looking at the version submitted on Thursday, planning to
> commit it, but I think it needs a wee bit more work on the error
> messages.
> 
> 1) The example in sgml docs does not work, because it's missing a
> semicolon after the CREATE FUNCTION. And the output was not updated
> after tweaking the messages, so it still has uppercase+comma.
> 
> 
> fixed
>  
> 
> 2) The
> 
>   /* translator: %s represent a name of extra check */
> 
> comment should be
> 
>   /* translator: %s represents a name of an extra check */
> 
> 3) Both Andres and Alvaro suggested to pass extra_errors/extra_warnings
> as a variable too, turning the errhint into
> 
>   errhint("%s check of %s is active.",
>           "too_many_rows",
>           (too_many_rows_level == ERROR) ? "extra_errors" :
>                                            "extra_warnings");
> 
> or something like that. Any particular reason not to do that?
> 
> 
> errdetail was already used in the first place.
> 
> Now I skip the detail in this first case, and elsewhere I moved this info
> to the detail, and the hint has the text that you proposed.
> 
> 
> Sorry for the bikeshedding, but I'd like my first commit not to be
> immediately torn apart by wolves ;-)
> 
> 
>  no problem
> 
> 
> 4) I think the errhint() is used incorrectly. The message should be
> about how to fix the issue, but messages like
> 
>   HINT:  strict_multi_assignement check of extra_warnings is active.
> 
> clearly do not help in this regard. This information should either go
> into errdetail() if deemed useful, or be left out entirely. And the
> errhint() should say something like:
> 
>   errhint("Make sure the query returns a single row, or use LIMIT 1.")
> 
> and
> 
>   errhint("Make sure the query returns the exact list of columns.")
> 
> 
> should be fixed too
> 

Seems OK to me. I'll go over the patch once more tomorrow and I plan to
get it committed.
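
For reference, the agreed message shape then looks roughly like this (a
sketch only; the errcode and the exact wording are assumptions):

	ereport(too_many_rows_level,
			(errcode(ERRCODE_TOO_MANY_ROWS),
			 errmsg("query returned more than one row"),
			 /* translator: %s represents a name of an extra check */
			 errdetail("%s check of %s is active.",
					   "too_many_rows",
					   too_many_rows_level == ERROR ?
					   "extra_errors" : "extra_warnings"),
			 errhint("Make sure the query returns a single row, or use LIMIT 1.")));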

regards

-- 
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: cost_sort() improvements

2018-07-22 Thread Tomas Vondra
On 07/12/2018 04:31 PM, Teodor Sigaev wrote:
>> At least [1] and [2] ran into those issues and have
>> objections/questions about the correctness of the cost_sort
>> estimation.  The suggested patch tries to improve the current
>> estimation and solve those issues.
> 
> Sorry for the long delay, but the issue was even more complicated than I
> thought. When I tried to add cost_sort to the GROUP BY patch I found it
> doesn't work as well as I hoped :(. The root of the problem is that the
> suggested cost_sort improvement doesn't take into account the uniqueness
> of the first column, so its cost is always the same. I experimented with
> an all-identical and an all-unique column and found that execution time
> can differ significantly (up to 3 times on 1e7 randomly generated integer
> values). So I went to read books and papers.
> 
> So, I suggest a new algorithm based on ideas in [3], [4]. In terms of
> those papers, let me use the designations from my previous email, with
> Xi the number of tuples with key Ki; then the estimate is:
> log(N! / (X1! * X2! * ..))  ~  sum(Xi * log(N/Xi))
> In our case all Xi are the same, because we don't have an estimate of
> group sizes, only an estimate of the number of groups. In this case the
> formula becomes: N * log(NumberOfGroups). Next, to support correct
> estimation of multicolumn sorts we need to compute each column
> separately. Let k be a column number and Gk the number of groups defined
> by k columns:
> N * sum( FCk * log(Gk) )
> where FCk is sum(Cj) over the first k columns. Gk+1 is defined as
> estimate_num_groups(NGk), i.e. it's recursive, which means that the
> comparison of the k-th column includes the comparisons of all columns j < k.
> 
> Next, I found that this formula gives a significant underestimate for
> N < 1e4, and using [5] (see Chapter 8.2 and Theorem 4.1) I found that we
> can use the formula N + N * log(N), which effectively adds a multiplier
> in the simple case; but it's unclear to me how to add that multiplier to
> the multicolumn formula, so I just added a separate term proportional to N.
> 
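
For scale, a purely numeric evaluation of the proposed
N * sum(FCk * log(Gk)) estimate (a standalone sketch; the group counts and
cost factors are invented inputs, not from the patch):

#include <math.h>
#include <stdio.h>

int
main(void)
{
	double		N = 1e7;			/* tuples to sort */
	double		G[] = {100, 5000};	/* groups after 1st key, then 1st+2nd */
	double		FC[] = {1.0, 1.0};	/* per-column comparison cost factors */
	double		sum = 0.0;

	for (int k = 0; k < 2; k++)
		sum += FC[k] * log(G[k]);
	printf("comparison cost ~ %g\n", N * sum);	/* ~1.31e8 */
	return 0;
}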

I'm sorry, but I'm getting lost in the notation and how you came up with
those formulas - I don't know where to look in the papers, books etc.

Could you perhaps summarize the reasoning or at least point me to the
relevant parts of the sources, so that I know which parts to focus on?

> As a demonstration, I attach the results of some tests; the *sh and *plt
> files are scripts to reproduce the results. Note all charts are
> normalized, because the computed cost is not in milliseconds. p.pl is a
> parser for the JSON format of EXPLAIN ANALYZE.
> 
> N test - sorting unique values with different numbers of tuples
> NN test - same as the previous one, but sorting a single value (all
> values the same)
> U test - fixed total number of values (1e7) but differing numbers of
> unique values
> 
> Hopefully the new estimate is much closer to real sort timing. The new
> estimation function gives results close to the old estimation function
> on trivial examples, but is ~10% more expensive, and three regression
> tests don't pass; I will look closer later. The patch doesn't include
> regression test changes.

Interesting. I wonder if there's a way to address the difference at the
lower end?

regards

-- 
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: pgbench: improve --help and --version parsing

2018-07-22 Thread Fabien COELHO


Hello

My 0.02€:


> Thanks! I made the same modification in src/interfaces/ecpg/preproc/ecpg.c,
> but in other cases it's either not a problem (as with src/bin/psql/startup.c
> or src/timezone/zic.c) or the solution is too complex to be added to the
> current patch: e.g. the tools in src/bin/scripts use the
> handle_help_version_opts function from src/bin/scripts/common.c, which
> cannot be refactored so straightforwardly.


I think this function could be simply removed: it just calls its third
argument on --help, and all of them print the version with the command
name on --version.
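
For reference, the pre-getopt check most src/bin programs use follows this
pattern (sketched here with pgbench's banner; usage() stands in for each
program's own help routine):

if (argc > 1)
{
	if (strcmp(argv[1], "--help") == 0 || strcmp(argv[1], "-?") == 0)
	{
		usage();
		exit(0);
	}
	if (strcmp(argv[1], "--version") == 0 || strcmp(argv[1], "-V") == 0)
	{
		puts("pgbench (PostgreSQL) " PG_VERSION);
		exit(0);
	}
}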



> It's possible to remove it and modify each tool to parse the --help and
> --version args for themselves, but I would not include that in the same
> patch, as it's already not too short for a "fix" patch and these changes
> are better discussed separately IMHO. Do you think the
> handle_help_version_opts function should be removed and these args should
> be parsed in each src/bin/scripts app?


Given the contents of "handle_help_version_opts", I see no issue with
managing an update in the same patch as the other commands, if this is to
be done.



> There are several cases where separate comparisons of argv[1] are made to
> detect "--help" or "--version" before the non-root check is enforced (to
> make it possible for the root user to check the version), e.g.
> src/bin/pg_upgrade/option.c. In this case I left these comparisons
> untouched while expanding the getopt_long switch statement, so a non-root
> user sees the correct behavior and root still sees the "cannot be run as
> root" error when trying # pg_upgrade -v --help.


This seems reasonable.

> The alternative is to wrap these argv[...] comparisons in a for
> statement to iterate through all the arguments. Another option is to
> enforce non-root after getopt_long parsing, but then it's harder to be
> sure that the patch does not alter the apps' behavior in unexpected ways.


Personally I would not mind that a must-not-be-root command does not
process its options when called as root, but people may object to this
one.



> There are also a few apps where getopt is used instead of getopt_long; I
> decided not to fix these in the current patch because it's not so obvious.
> It's possible either to replace getopt with getopt_long (and add long
> options, which may be nice) or to wrap the --help/--version parsing in a
> for loop.


I'd go for homogeneity.

About the patch.

Some commands take care to keep their option struct and/or switch cases
more or less in alphabetical order (with errors, e.g. dbname/d in
pg_dumpall is misplaced), and you should strive to be consistent with
that. You missed some cases though: pg_test_fsync, pg_test_timing, ecpg.

For some commands (initdb, pg_basebackup, pg_receivewal...), I see that 
?/V are already in the option struct but the option management is missing 
from the switch, so this is also fixing minor bugs.


You have changed the behavior of some commands: e.g. -? in pg_rewind used
to point to --help; now it directly provides said help. I'm fine with that.


For commands which do not use getopt_long, the change probably belongs in
another patch.


--
Fabien.

Re: cost_sort() improvements

2018-07-22 Thread Tomas Vondra


On 07/12/2018 05:00 PM, Teodor Sigaev wrote:
> 
>> One more thought about estimating the worst case - I wonder if simply
>> multiplying the per-tuple cost by 1.5 is the right approach. It does not
>> seem particularly principled, and it's trivially simple to construct
>> counter-examples defeating it (imagine columns with 99% of the rows
>> having the same value, and then many values in exactly one row - that
>> will defeat the ndistinct-based group size estimates).
> 
> Agree. But right now that is the best we have, and the constant 1.5 could
> easily be changed to 1 or 10 - it is just an arbitrary value, intuitively
> chosen.  As I mentioned in the v7 patch description, the new estimation
> function gives a ~10% bigger estimate, and this constant should not be
> very large, because we could easily get an overestimate.
> 
>>
>> The reason why constructing such counter-examples is so simple is that
>> this relies entirely on the ndistinct stats, ignoring the other stats we
>> already have. I think this might leverage the per-column MCV lists (and
>> eventually multi-column MCV lists, if that ever gets committed).
>>
>> We're estimating the number of tuples in group (after fixing values in
>> the preceding columns), because that's what determines the number of
>> comparisons we need to make at a given stage. The patch does this by
>> estimating the average group size, by estimating ndistinct of the
>> preceding columns G(i-1), and computing ntuples/G(i-1).
>>
>> But consider that we also have MCV lists for each column - if there is a
>> very common value, it should be there. So let's say M(i) is a frequency
>> of the most common value in i-th column, determined either from the MCV
>> list or as 1/ndistinct for that column. Then if m(i) is minimum of M(i)
>> for the fist i columns, then the largest possible group of values when
>> comparing values in i-th column is
>>
>>  N * m(i-1)
>>
>> This seems like a better way to estimate the worst case. I don't know if
>> this should be used instead of NG(i), or if those two estimates should
>> be combined in some way.
> Agree, we definitely need an estimate of the largest group size to use in
> cost_sort. But I don't feel up to doing that; please, could you attack the
> group size computation issue? Thank you.
> 
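
A toy evaluation of that worst-case bound, N * m(i-1), where m(i) is the
running minimum of the per-column maximum frequencies M(i) (all the MCV
numbers below are invented):

#include <stdio.h>

int
main(void)
{
	double		N = 1e7;
	double		M[] = {0.99, 0.10, 0.05};	/* max frequency per column */
	double		m = 1.0;					/* running minimum, m(-1) = 1 */

	for (int i = 0; i < 3; i++)
	{
		/* largest group faced when comparing values in column i */
		printf("col %d: worst-case group = %.0f rows\n", i, N * m);
		if (M[i] < m)
			m = M[i];
	}
	return 0;
}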

Attached is a simple patch illustrating this MCV-based approach. I don't
claim it's finished / RFC, but hopefully it's sufficient to show what I
meant. I'm not sure how to plug it into the v8 of your patch at this
point, so I've simply added an elog to estimate_num_groups to also print
the estimate of the largest group for GROUP BY queries.

It's largely a simplified copy of estimate_num_groups() and there's a
couple of comments of how it might be improved further.

>>
>> I think estimating the largest group we need to sort should be helpful
>> for the incremental sort patch, so I'm adding Alexander to this
>> thread.
> Agree, I think so. The suggested estimation algorithm should be modified
> to support leading preordered keys, and this form could be used directly
> in the GROUP BY and incremental sort patches.

Right. What I think the function estimating the group size could do in
the case of incremental sort is produce two values - the maximum size of
the leading group, and the maximum group size overall. The first one
would be useful for startup cost, the second one for total cost.

regards

-- 
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c
index f1c78ffb65..2d91c17866 100644
--- a/src/backend/utils/adt/selfuncs.c
+++ b/src/backend/utils/adt/selfuncs.c
@@ -3424,6 +3424,9 @@ estimate_num_groups(PlannerInfo *root, List *groupExprs, double input_rows,
 	ListCell   *l;
 	int			i;
 
+	/* DEBUG */
+	elog(WARNING, "estimate_largest_group (%d exprs) %f", list_length(groupExprs), estimate_largest_group(root, groupExprs, input_rows));
+
 	/*
 	 * We don't ever want to return an estimate of zero groups, as that tends
 	 * to lead to division-by-zero and other unpleasantness.  The input_rows
@@ -3735,6 +3738,265 @@ estimate_num_groups(PlannerInfo *root, List *groupExprs, double input_rows,
 }
 
 /*
+ * estimate_largest_group	- Estimate size of largest group of rows
+ *
+ * Given a query having a GROUP BY or ORDER BY clause, estimate the size
+ * of the largest group.
+ *
+ * Inputs:
+ *	root - the query
+ *	groupExprs - list of expressions being grouped by
+ *	input_rows - number of rows estimated to arrive at the group/unique
+ *		filter step
+ *
+ * Estimating average size of a group can be done simply as 1/N where
+ * N is the number of groups obtained from estimate_num_groups. But for
+ * various places we are more interested in a worst-case scenario, i.e.
+ * the size of the largest group of rows (which may be significantly
+ * larger than the average group).
+ *
+ * To estimate size of the largest group, we use MCV (and additional
+ * 

Re: JIT breaks PostGIS

2018-07-22 Thread Tom Lane
Christoph Berg  writes:
> The question will be coming up eventually, though, and I think the
> options on the packaging side are:

> 1) Disable jit completely
> 2) Compile --with-llvm, but disable jit in the config by default
> 3) Compile --with-llvm, but disable jit for older llvm versions
> 4) Enable jit everywhere where llvm >= 3.9 is available

> Option 4 is what the Debian packages implement now, but it might make
> sense to go to 2 or 3 for PG11 (only).

Well, right now JIT is certainly beta-quality code, so you ought
to expect bugs.  We have an open item at
https://wiki.postgresql.org/wiki/PostgreSQL_11_Open_Items
to decide whether to ship v11 with JIT enabled by default or not,
but I don't expect that decision will be taken until much closer
to release.  Until then, I think you should be doing (4) so that
we can gather data to inform the eventual decision.

regards, tom lane



Re: JIT breaks PostGIS

2018-07-22 Thread Christoph Berg
Re: Andres Freund 2018-07-22 <20180722185106.qxc5ie745tqda...@alap3.anarazel.de>
> > Does that mean JIT is not ready for prime time yet and should be
> > disabled in default installations? Or does it just mean that llvm 4.0
> > is old?
> 
> I don't know yet, it's probably just some small bug. But let me debug it
> first.

Sure.

The question will be coming up eventually, though, and I think the
options on the packaging side are:

1) Disable jit completely
2) Compile --with-llvm, but disable jit in the config by default
3) Compile --with-llvm, but disable jit for older llvm versions
4) Enable jit everywhere where llvm >= 3.9 is available

Option 4 is what the Debian packages implement now, but it might make
sense to go to 2 or 3 for PG11 (only).

Christoph



Re: Non-portable shell code in pg_upgrade tap tests

2018-07-22 Thread Tom Lane
Noah Misch  writes:
> On Sun, Jul 22, 2018 at 10:46:03AM -0400, Tom Lane wrote:
>> The pg_upgrade makefile does in fact use $(SHELL), so it will default to
>> whatever shell configure used.

> It will not, because we don't set $(SHELL) anywhere.  $(SHELL) is not @SHELL@.
> In our makefiles, $(SHELL) is always /bin/sh, the GNU make default.

Oh!  Hm, I wonder whether we shouldn't do that, ie add SHELL = @SHELL@
to Makefile.global.in.

A quick trawl of the buildfarm logs says most of our animals compute
SHELL = /bin/sh anyway, and so would be unaffected.  There's a sizable
population that find /bin/bash though, and one active critter that finds
/bin/ksh.

regards, tom lane



Re: JIT breaks PostGIS

2018-07-22 Thread Andres Freund
On 2018-07-22 20:49:51 +0200, Christoph Berg wrote:
> Re: To Andres Freund 2018-07-21 <20180721203644.gb10...@msg.df7cb.de>
> > That said, I'm just rebuilding postgresql-11 on stretch to use
> > llvm-4.0 instead of 3.9.
> 
> This didn't change anything, it's still crashing on the same query
> from tickets.sql.
> 
> Does that mean JIT is not ready for prime time yet and should be
> disabled in default installations? Or does it just mean that llvm 4.0
> is old?

I don't know yet, it's probably just some small bug. But let me debug it
first.

Greetings,

Andres Freund



Re: JIT breaks PostGIS

2018-07-22 Thread Christoph Berg
Re: To Andres Freund 2018-07-21 <20180721203644.gb10...@msg.df7cb.de>
> That said, I'm just rebuilding postgresql-11 on stretch to use
> llvm-4.0 instead of 3.9.

This didn't change anything, it's still crashing on the same query
from tickets.sql.

Does that mean JIT is not ready for prime time yet and should be
disabled in default installations? Or does it just mean that llvm 4.0
is old?

Christoph



Re: Non-portable shell code in pg_upgrade tap tests

2018-07-22 Thread Noah Misch
On Sun, Jul 22, 2018 at 10:46:03AM -0400, Tom Lane wrote:
> Noah Misch  writes:
> > I'd say the right way to fix this is the one specified in
> > https://www.gnu.org/software/autoconf/manual/autoconf-2.69/html_node/The-Make-Macro-SHELL.html,
> > in particular:
> 
> >   Using @SHELL@ means that your makefile will benefit from the same improved
> >   shell, such as bash or ksh, that was discovered during configure, so that
> >   you aren't fighting two different sets of shell bugs between the two
> >   contexts.
> 
> The pg_upgrade makefile does in fact use $(SHELL), so it will default to
> whatever shell configure used.

It will not, because we don't set $(SHELL) anywhere.  $(SHELL) is not @SHELL@.
In our makefiles, $(SHELL) is always /bin/sh, the GNU make default.

> The bigger picture here is that it's worth maintaining a common coding
> style in this script.  Randomly using `...` in some places and $(...)
> in others just increases the reader's cognitive load, as does spelling
> conditional blocks in multiple styles.  So I'd have felt these changes
> were appropriate on style grounds even if there were no portability
> complaint.

For 986127e narrowly, I agree.  However, I do want folks able to use $(...)
where it saves backslashes; that's more valuable than `...` uniformity.  For
commit 0426932, as the comment above the changed code indicates, the
conditional blocks are copies of "configure" code.  Consistency with their
source was more valuable than style, particularly since the code is
security-critical.



Re: pgbench-ycsb

2018-07-22 Thread a.bykov

On 2018-07-22 16:56, Fabien COELHO wrote:
>> Just to clarify - if I understand Anthony correctly, this proposal is
>> not about implementing exactly YCSB as it is, but more about using
>> zipfian distribution for an id in the regular pgbench table structure
>> in conjunction with read/write balance to simulate something similar
>> to it.
>
> Ok, I misunderstood. My 0.02€: If it does not implement YCSB, and the
> point is not to implement YCSB, then do not call it YCSB:-)
>
> Maybe there could be other simpler builtins to use non uniform
> distributions: {zipf,exp,...}-{simple,select} and default values
> (exp_param, zipf_param?) for the random distribution parameters.
>
>   \set id random_zipfian(1, 10*:scale, :zipf_param)
>   \set val random(-5000, 5000)
>   UPDATE pgbench_whatever ...;
>
> Then
>
>   pgbench -b zipf-se@1 -b zipf-si@1 [ -D zipf_param=1.1 ... ] -T 1 ...
>
>> And probably instead of implementing the exact YCSB workload inside
>> pgbench, it makes more sense to add PostgreSQL Jsonb as one of the
>> options into the framework itself (I was in the middle of it few years
>> ago, but then was distracted by some interesting benchmarking results).
>
> Sure.


Hello,
thank you for your interest. I'm still improving this idea and the patch,
and I'm very happy about the discussion we are having. It really helps.

The idea was to implement the workloads as close to YCSB as possible
using pgbench.

So, the schema it should be applied to is the default schema generated by
pgbench -i (pgbench_accounts).

--
Anthony Bykov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



Re: pgbench: improve --help and --version parsing

2018-07-22 Thread Andrei Korigodski
Fabien COELHO  writes:
> There seem to be other instances as well

Thanks! I made the same modification to src/interfaces/ecpg/preproc/ecpg.c, but
in the other cases it's either not a problem (as with src/bin/psql/startup.c or
src/timezone/zic.c) or the solution is too complex to be added to the current
patch: e.g. the tools in src/bin/scripts use the handle_help_version_opts
function from src/bin/scripts/common.c, which cannot be refactored so
straightforwardly. It's possible to remove it and modify each tool to parse the
--help and --version args itself, but I would not include that in the same
patch, as it's already not too short for a "fix" patch and these changes are
better discussed separately IMHO. Do you think the handle_help_version_opts
function should be removed and these args parsed in each src/bin/scripts app?

There are several cases where separate comparisons of argv[1] are made to
detect "--help" or "--version" before the non-root check is enforced (to make
it possible for the root user to check the version), e.g.
src/bin/pg_upgrade/option.c. In this case I left those comparisons untouched
while expanding the switch statement of getopt_long, so a non-root user sees
the correct behavior and root still sees the "cannot be run as root" error
when trying # pg_upgrade -v --help. The alternative is to wrap these argv[...]
comparisons in a for statement that iterates through all the arguments, as in
the sketch below. Another option is to enforce non-root after getopt_long
parsing, but then it's harder to be sure that the patch does not alter the
apps' behavior in an unexpected way.
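
For illustration, the for-statement variant could look roughly like this (a
sketch with a hypothetical check_help_version() helper, not code from the
patch):

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

static void
usage(void)
{
	puts("usage: pg_upgrade [OPTION]...");
}

/*
 * Scan all arguments, not just argv[1], for --help/--version before any
 * non-root enforcement, so e.g. "pg_upgrade -v --help" still prints help.
 */
static void
check_help_version(int argc, char *argv[])
{
	int			i;

	for (i = 1; i < argc; i++)
	{
		if (strcmp(argv[i], "--help") == 0 || strcmp(argv[i], "-?") == 0)
		{
			usage();
			exit(0);
		}
		if (strcmp(argv[i], "--version") == 0 || strcmp(argv[i], "-V") == 0)
		{
			puts("pg_upgrade (PostgreSQL)");
			exit(0);
		}
	}
}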

There are also a few apps where getopt is used instead of getopt_long, so I
decided not to fix these in the current patch because the fix is not so
obvious. It's possible either to replace getopt with getopt_long (and add long
options, which may be nice) or to wrap the --help/--version parsing in a for
loop.

--
Andrei
diff --git a/src/bin/initdb/initdb.c b/src/bin/initdb/initdb.c
index 3f203c6ca6..7a23c0b1ab 100644
--- a/src/bin/initdb/initdb.c
+++ b/src/bin/initdb/initdb.c
@@ -3130,23 +3130,9 @@ main(int argc, char *argv[])
 	progname = get_progname(argv[0]);
 	set_pglocale_pgservice(argv[0], PG_TEXTDOMAIN("initdb"));
 
-	if (argc > 1)
-	{
-		if (strcmp(argv[1], "--help") == 0 || strcmp(argv[1], "-?") == 0)
-		{
-			usage(progname);
-			exit(0);
-		}
-		if (strcmp(argv[1], "--version") == 0 || strcmp(argv[1], "-V") == 0)
-		{
-			puts("initdb (PostgreSQL) " PG_VERSION);
-			exit(0);
-		}
-	}
-
 	/* process command-line options */
 
-	while ((c = getopt_long(argc, argv, "dD:E:kL:nNU:WA:sST:X:g", long_options, &option_index)) != -1)
+	while ((c = getopt_long(argc, argv, "dD:E:kL:nNU:WA:sST:X:gV?", long_options, &option_index)) != -1)
 	{
 		switch (c)
 		{
@@ -3243,6 +3229,14 @@ main(int argc, char *argv[])
 			case 'g':
 				SetDataDirectoryCreatePerm(PG_DIR_MODE_GROUP);
 				break;
+			case 'V':
+				puts("initdb (PostgreSQL) " PG_VERSION);
+				exit(0);
+				break;
+			case '?':
+				usage(progname);
+				exit(0);
+				break;
 			default:
 				/* getopt_long already emitted a complaint */
 				fprintf(stderr, _("Try \"%s --help\" for more information.\n"),
diff --git a/src/bin/pg_basebackup/pg_basebackup.c b/src/bin/pg_basebackup/pg_basebackup.c
index ef4cfc4384..03f7875514 100644
--- a/src/bin/pg_basebackup/pg_basebackup.c
+++ b/src/bin/pg_basebackup/pg_basebackup.c
@@ -2169,24 +2169,9 @@ main(int argc, char **argv)
 	progname = get_progname(argv[0]);
 	set_pglocale_pgservice(argv[0], PG_TEXTDOMAIN("pg_basebackup"));
 
-	if (argc > 1)
-	{
-		if (strcmp(argv[1], "--help") == 0 || strcmp(argv[1], "-?") == 0)
-		{
-			usage();
-			exit(0);
-		}
-		else if (strcmp(argv[1], "-V") == 0
-				 || strcmp(argv[1], "--version") == 0)
-		{
-			puts("pg_basebackup (PostgreSQL) " PG_VERSION);
-			exit(0);
-		}
-	}
-
 	atexit(cleanup_directories_atexit);
 
-	while ((c = getopt_long(argc, argv, "CD:F:r:RS:T:X:l:nNzZ:d:c:h:p:U:s:wWkvP",
+	while ((c = getopt_long(argc, argv, "CD:F:r:RS:T:X:l:nNzZ:d:c:h:p:U:s:wWkvPV?",
 			long_options, &option_index)) != -1)
 	{
 		switch (c)
@@ -2328,6 +2313,14 @@ main(int argc, char **argv)
 			case 'P':
 				showprogress = true;
 				break;
+			case 'V':
+				puts("pg_basebackup (PostgreSQL) " PG_VERSION);
+				exit(0);
+				break;
+			case '?':
+				usage();
+				exit(0);
+				break;
 			case 3:
 				verify_checksums = false;
 				break;
diff --git a/src/bin/pg_basebackup/pg_receivewal.c b/src/bin/pg_basebackup/pg_receivewal.c
index 8be8d48a8a..5318454d67 100644
--- a/src/bin/pg_basebackup/pg_receivewal.c
+++ b/src/bin/pg_basebackup/pg_receivewal.c
@@ -509,22 +509,7 @@ main(int argc, char **argv)
 	progname = get_progname(argv[0]);
 	set_pglocale_pgservice(argv[0], PG_TEXTDOMAIN("pg_basebackup"));
 
-	if (argc > 1)
-	{
-		if (strcmp(argv[1], "--help") == 0 || strcmp(argv[1], "-?") == 0)
-		{
-			usage();
-			exit(0);
-		}
-		else if (strcmp(argv[1], "-V") == 0 ||
-				 strcmp(argv[1], "--version") == 0)
-		{
-			puts("pg_receivewal (PostgreSQL) " PG_VERSION);
-			

Re: [HACKERS] Bug in to_timestamp().

2018-07-22 Thread David G. Johnston
On Sat, Jul 21, 2018 at 2:34 PM, Alexander Korotkov <
a.korot...@postgrespro.ru> wrote:

> So, as I understand from this thread, the current patch brings these
> functions very close to Oracle behavior.  The only divergence found so far
> is related to handling of unmatched tails of input strings (Oracle throws
> an error while we swallow that silently) [1].  But this issue can be
> addressed by a separate patch.
>
> The patch seems to be in pretty good shape.  So, the question is whether
> there is a consensus that we want the backward-incompatible behavior
> change that this patch introduces.
>
> My personal opinion is +1 for committing this, because I see that this
> patch took a lot of community effort on discussion, coding, review etc.
> All these efforts gave a substantial result in a patch which makes the
> behavior more Oracle-compatible and (IMHO) clearer.
>
> However, in this thread other opinions were expressed.  In
> particular, David G. Johnston expressed the opinion that we shouldn't
> change the behavior of existing functions; alternatively we could
> introduce new functions with the new behavior.  However, I see David
> hasn't participated in this thread for quite a long time.
>

Been lurking about... I'm still of the opinion that this capability should
exist in PostgreSQL as "our" function and not just as an Oracle
compatibility.

That said, the thing I'm most curious to read is the release note entry that
would go along with this change that informs users what will be changing:
silently, visibly, or otherwise.  Knowing how much we (don't) diverge from
our chosen "standard" after making this change is important but the change
in behavior from current is, IMO, more important in deciding whether this
particular change is worth making.

My re-read of the thread the other day left me with a feeling of
contentment that this was an acceptable change but I also get the feeling
like I'm missing the downside trade-off too...I was hoping your review
would help in that regard but as it did not speak to specific
incompatibilities it has not.

David J.


Re: Non-portable shell code in pg_upgrade tap tests

2018-07-22 Thread Tom Lane
Noah Misch  writes:
> I'd say the right way to fix this is the one specified in
> https://www.gnu.org/software/autoconf/manual/autoconf-2.69/html_node/The-Make-Macro-SHELL.html,
> in particular:

>   Using @SHELL@ means that your makefile will benefit from the same improved
>   shell, such as bash or ksh, that was discovered during configure, so that
>   you aren't fighting two different sets of shell bugs between the two
>   contexts.

The pg_upgrade makefile does in fact use $(SHELL), so it will default to
whatever shell configure used.  However, I'd expect the Autoconf guys
to target a pretty darn low common denominator shell-wise.  It's unlikely
that the patches we just repaired were completely port-tested before
commit, and we now know that the buildfarm has been falling down on the
job as well.  (Partly my fault, since gaur/pademelon don't run the
TAP tests either.  I'm hesitant to triple their already long cycle
times, but ...)

The bigger picture here is that it's worth maintaining a common coding
style in this script.  Randomly using `...` in some places and $(...)
in others just increases the reader's cognitive load, as does spelling
conditional blocks in multiple styles.  So I'd have felt these changes
were appropriate on style grounds even if there were no portability
complaint.

In the long run, no doubt it'd be better to rewrite test.sh in Perl,
but this is what we've got today.

regards, tom lane



Re: pgbench: improve --help and --version parsing

2018-07-22 Thread Fabien COELHO




> Agreed. I have changed the handling of the --help and --version options in
> all apps where they exhibit the problem described, with the exception of
> pg_archivecleanup, where getopt is used instead of getopt_long. A separate
> patch will be proposed to address it.
>
> The patch is against current master. All tests pass.


I doubt that -V & -? are heavily tested:-) Patch works for me, though.

There seem to be other instances as well:

 ./src/interfaces/ecpg/preproc/ecpg.c:   while ((c = getopt_long(argc, argv, "vcio:I:tD:dC:r:h", ecpg_options, NULL)) != -1)
 ./src/bin/scripts/clusterdb.c:  while ((c = getopt_long(argc, argv, "h:p:U:wWeqd:at:v", long_options, &optindex)) != -1)
 ./src/bin/scripts/createdb.c:   while ((c = getopt_long(argc, argv, "h:p:U:wWeO:D:T:E:l:", long_options, &optindex)) != -1)
 ./src/bin/scripts/dropuser.c:   while ((c = getopt_long(argc, argv, "h:p:U:wWei", long_options, &optindex)) != -1)
 ./src/bin/scripts/pg_isready.c: while ((c = getopt_long(argc, argv, "d:h:p:qt:U:", long_options, NULL)) != -1)
 ./src/bin/scripts/dropdb.c: while ((c = getopt_long(argc, argv, "h:p:U:wWei", long_options, &optindex)) != -1)
 ./src/bin/scripts/vacuumdb.c:   while ((c = getopt_long(argc, argv, "h:p:U:wWeqd:zZFat:fvj:", long_options, &optindex)) != -1)
 ./src/bin/scripts/createuser.c: while ((c = getopt_long(argc, argv, "h:p:U:g:wWedDsSaArRiIlLc:PE",
 ./src/bin/scripts/reindexdb.c:  while ((c = getopt_long(argc, argv, "h:p:U:wWeqS:d:ast:i:v", long_options, &optindex)) != -1)

 ./src/interfaces/ecpg/preproc/ecpg.c:   if (strcmp(argv[1], "--help") == 0 || strcmp(argv[1], "-?") == 0)
 ./src/timezone/zic.c:   else if (strcmp(argv[k], "--help") == 0)
 ./src/backend/main/main.c:  if (strcmp(argv[1], "--help") == 0 || strcmp(argv[1], "-?") == 0)
 ./src/bin/pg_archivecleanup/pg_archivecleanup.c:if (strcmp(argv[1], "--help") == 0 || strcmp(argv[1], "-?") == 0)
 ./src/bin/pg_upgrade/option.c:  if (strcmp(argv[1], "--help") == 0 || strcmp(argv[1], "-?") == 0)
 ./src/bin/pg_ctl/pg_ctl.c:  if (strcmp(argv[1], "--help") == 0 || strcmp(argv[1], "-?") == 0)
 ./src/bin/psql/startup.c:   if ((strcmp(argv[1], "-?") == 0) || (argc == 2 && (strcmp(argv[1], "--help") == 0)))
 ./src/bin/scripts/common.c: if (strcmp(argv[1], "--help") == 0 || strcmp(argv[1], "-?") == 0)
=> implementation shared by many C "scripts".
 ./src/bin/pg_config/pg_config.c:if (strcmp(argv[i], "--help") == 0 || strcmp(argv[i], "-?") == 0)
 ./contrib/oid2name/oid2name.c:  if (strcmp(argv[1], "--help") == 0 || strcmp(argv[1], "-?") == 0)
 ./contrib/pg_standby/pg_standby.c:  if (strcmp(argv[1], "--help") == 0 || strcmp(argv[1], "-?") == 0)
 ./contrib/vacuumlo/vacuumlo.c:  if (strcmp(argv[1], "--help") == 0 || strcmp(argv[1], "-?") == 0)

--
Fabien.



Re: pgbench-ycsb

2018-07-22 Thread Fabien COELHO



> Just to clarify - if I understand Anthony correctly, this proposal is not about
> implementing exactly YCSB as it is, but more about using zipfian distribution
> for an id in the regular pgbench table structure in conjunction with read/write
> balance to simulate something similar to it.


Ok, I misunderstood. My 0.02€: If it does not implement YCSB, and the 
point is not to implement YCSB, then do not call it YCSB:-)


Maybe there could be other simpler builtins to use non uniform 
distributions: {zipf,exp,...}-{simple,select} and default values 
(exp_param, zipf_param?) for the random distribution parameters.


  \set id random_zipfian(1, 10*:scale, :zipf_param)
  \set val random(-5000, 5000)
  UPDATE pgbench_whatever ...;

Then

  pgbench -b zipf-se@1 -b zipf-si@1 [ -D zipf_param=1.1 ... ] -T 1 ...
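
For reference, zipfian sampling itself is cheap. A generic rejection-method
sketch for parameter s > 1, after Gray et al., "Quickly Generating
Billion-Record Synthetic Databases" (an illustration, not pgbench's exact
implementation):

#include <math.h>
#include <stdlib.h>

/* Draw from a zipfian distribution on [1, n] with parameter s > 1. */
static long
zipf_sample(long n, double s)
{
	double		b = pow(2.0, s - 1.0);
	double		u, v, x, t;

	for (;;)
	{
		u = (double) rand() / ((double) RAND_MAX + 1.0);
		v = (double) rand() / ((double) RAND_MAX + 1.0);
		x = floor(pow(u, -1.0 / (s - 1.0)));	/* candidate value >= 1 */
		t = pow(1.0 + 1.0 / x, s - 1.0);
		/* accept with the probability that corrects the envelope */
		if (x <= (double) n && v * x * (t - 1.0) / (b - 1.0) <= t / b)
			return (long) x;
	}
}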


> And probably instead of implementing the exact YCSB workload inside pgbench, it
> makes more sense to add PostgreSQL Jsonb as one of the options into the
> framework itself (I was in the middle of it few years ago, but then was
> distracted by some interesting benchmarking results).


Sure.

--
Fabien.

Re: Reduce amount of WAL generated by CREATE INDEX for gist, gin and sp-gist

2018-07-22 Thread Alexander Korotkov
On Tue, Jul 10, 2018 at 10:03 PM Heikki Linnakangas  wrote:
>
> On 28/02/18 18:03, Anastasia Lubennikova wrote:
> > I want to propose a bunch of patches which allow to reduce WAL traffic
> > generated by CREATE INDEX for GiST, GIN and SP-GiST. Similarly to b-tree
> > and RUM, we can now log index pages of other access methods only once
> > in the end of indexbuild process.
>
> Makes sense. Is there any scenario where the current method is better?
> In theory, doing another pass through the whole index, to create the WAL
> records, requires more I/O. But in practice, it seems that the reduction
> in WAL is almost certainly a win.

It's frequently advised to move WAL to a separate storage device.
Then it's not hard to imagine how the current method can be faster, when
the storage device containing the index is much more loaded than the
storage device containing WAL.  But despite being faster, it would still
consume far more overall I/O resources.  I'm not sure whether we should
provide some option (or heuristic) to switch between WAL methods...
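
For context, the approach amounts to something like the following simplified
sketch (an illustration, not the submitted patch; real code must also get the
critical-section, locking and buffer-dirtying details right):

#include "postgres.h"
#include "access/xloginsert.h"
#include "storage/bufmgr.h"
#include "utils/rel.h"

/*
 * WAL-log every page of a just-built index once, at the end of the build,
 * instead of logging each page modification as it happens.
 */
static void
log_index_pages(Relation index)
{
	BlockNumber nblocks = RelationGetNumberOfBlocks(index);
	BlockNumber blkno;

	for (blkno = 0; blkno < nblocks; blkno++)
	{
		Buffer		buf = ReadBufferExtended(index, MAIN_FORKNUM, blkno,
											 RBM_NORMAL, NULL);

		LockBuffer(buf, BUFFER_LOCK_EXCLUSIVE);
		MarkBufferDirty(buf);
		/* emit a full-page image; log_newpage also sets the page LSN */
		log_newpage(&index->rd_node, MAIN_FORKNUM, blkno,
					BufferGetPage(buf), true);
		UnlockReleaseBuffer(buf);
	}
}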

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



Re: pgbench: improve --help and --version parsing

2018-07-22 Thread Andrei Korigodski
Tom Lane  writes:
> Andres Freund  writes:
> > On July 21, 2018 11:15:51 PM PDT, Tom Lane  wrote:
> > > This is, in fact, how it's done in all PG apps.
> > Think there's a fair argument that we should improve that at some point...
> Perhaps.  Peter E. might remember why it's like that.

It was done this way because back then there was a HAVE_GETOPT_LONG define for
systems that don't support getopt_long, see commit 41fde5460387 ("Polish help
output. Allow --help to work with BSD getopts.", Peter Eisentraut, 2001-01-06).
Now this define is not used by any app in src/bin, so I believe there is no
need for this workaround anymore.

By the way, this approach is already not used in pg_waldump, and psql handles
the arguments in a more complex way to avoid the problem under discussion.

> But I'm dubious about changing it in only one app.

Agreed. I have changed the handling of the --help and --version options in
all apps where they exhibit the problem described, with the exception of
pg_archivecleanup, where getopt is used instead of getopt_long. A separate
patch will be proposed to address it.

The patch is against current master. All tests pass.

diff --git a/src/bin/initdb/initdb.c b/src/bin/initdb/initdb.c
index 3f203c6ca6..7a23c0b1ab 100644
--- a/src/bin/initdb/initdb.c
+++ b/src/bin/initdb/initdb.c
@@ -3130,23 +3130,9 @@ main(int argc, char *argv[])
 	progname = get_progname(argv[0]);
 	set_pglocale_pgservice(argv[0], PG_TEXTDOMAIN("initdb"));
 
-	if (argc > 1)
-	{
-		if (strcmp(argv[1], "--help") == 0 || strcmp(argv[1], "-?") == 0)
-		{
-			usage(progname);
-			exit(0);
-		}
-		if (strcmp(argv[1], "--version") == 0 || strcmp(argv[1], "-V") == 0)
-		{
-			puts("initdb (PostgreSQL) " PG_VERSION);
-			exit(0);
-		}
-	}
-
 	/* process command-line options */
 
-	while ((c = getopt_long(argc, argv, "dD:E:kL:nNU:WA:sST:X:g", long_options, &option_index)) != -1)
+	while ((c = getopt_long(argc, argv, "dD:E:kL:nNU:WA:sST:X:gV?", long_options, &option_index)) != -1)
 	{
 		switch (c)
 		{
@@ -3243,6 +3229,14 @@ main(int argc, char *argv[])
 			case 'g':
 				SetDataDirectoryCreatePerm(PG_DIR_MODE_GROUP);
 				break;
+			case 'V':
+				puts("initdb (PostgreSQL) " PG_VERSION);
+				exit(0);
+				break;
+			case '?':
+				usage(progname);
+				exit(0);
+				break;
 			default:
 				/* getopt_long already emitted a complaint */
 				fprintf(stderr, _("Try \"%s --help\" for more information.\n"),
diff --git a/src/bin/pg_basebackup/pg_basebackup.c b/src/bin/pg_basebackup/pg_basebackup.c
index ef4cfc4384..03f7875514 100644
--- a/src/bin/pg_basebackup/pg_basebackup.c
+++ b/src/bin/pg_basebackup/pg_basebackup.c
@@ -2169,24 +2169,9 @@ main(int argc, char **argv)
 	progname = get_progname(argv[0]);
 	set_pglocale_pgservice(argv[0], PG_TEXTDOMAIN("pg_basebackup"));
 
-	if (argc > 1)
-	{
-		if (strcmp(argv[1], "--help") == 0 || strcmp(argv[1], "-?") == 0)
-		{
-			usage();
-			exit(0);
-		}
-		else if (strcmp(argv[1], "-V") == 0
-				 || strcmp(argv[1], "--version") == 0)
-		{
-			puts("pg_basebackup (PostgreSQL) " PG_VERSION);
-			exit(0);
-		}
-	}
-
 	atexit(cleanup_directories_atexit);
 
-	while ((c = getopt_long(argc, argv, "CD:F:r:RS:T:X:l:nNzZ:d:c:h:p:U:s:wWkvP",
+	while ((c = getopt_long(argc, argv, "CD:F:r:RS:T:X:l:nNzZ:d:c:h:p:U:s:wWkvPV?",
 			long_options, &option_index)) != -1)
 	{
 		switch (c)
@@ -2328,6 +2313,14 @@ main(int argc, char **argv)
 			case 'P':
 				showprogress = true;
 				break;
+			case 'V':
+				puts("pg_basebackup (PostgreSQL) " PG_VERSION);
+				exit(0);
+				break;
+			case '?':
+				usage();
+				exit(0);
+				break;
 			case 3:
 				verify_checksums = false;
 				break;
diff --git a/src/bin/pg_basebackup/pg_receivewal.c b/src/bin/pg_basebackup/pg_receivewal.c
index 8be8d48a8a..5318454d67 100644
--- a/src/bin/pg_basebackup/pg_receivewal.c
+++ b/src/bin/pg_basebackup/pg_receivewal.c
@@ -509,22 +509,7 @@ main(int argc, char **argv)
 	progname = get_progname(argv[0]);
 	set_pglocale_pgservice(argv[0], PG_TEXTDOMAIN("pg_basebackup"));
 
-	if (argc > 1)
-	{
-		if (strcmp(argv[1], "--help") == 0 || strcmp(argv[1], "-?") == 0)
-		{
-			usage();
-			exit(0);
-		}
-		else if (strcmp(argv[1], "-V") == 0 ||
-				 strcmp(argv[1], "--version") == 0)
-		{
-			puts("pg_receivewal (PostgreSQL) " PG_VERSION);
-			exit(0);
-		}
-	}
-
-	while ((c = getopt_long(argc, argv, "D:d:E:h:p:U:s:S:nwWvZ:",
+	while ((c = getopt_long(argc, argv, "D:d:E:h:p:U:s:S:nwWvVZ:?",
 			long_options, &option_index)) != -1)
 	{
 		switch (c)
@@ -584,6 +569,10 @@ main(int argc, char **argv)
 			case 'v':
 				verbose++;
 				break;
+			case 'V':
+				puts("pg_receivewal (PostgreSQL) " PG_VERSION);
+				exit(0);
+				break;
 			case 'Z':
 				compresslevel = atoi(optarg);
 				if (compresslevel < 0 || compresslevel > 9)
@@ -593,6 +582,10 @@ main(int argc, char **argv)
 					exit(1);
 				}
 				break;
+			case '?':
+				usage();
+				exit(0);
+				break;
 			/* action */
 			case 1:
 				do_create_slot = true;
diff --git a/src/bin/pg_basebackup/pg_recvlogical.c 

Re: pgbench-ycsb

2018-07-22 Thread Dmitry Dolgov
> On Sat, 21 Jul 2018 at 22:41, Fabien COELHO  wrote:
>
> >> Could you provide a link to the specification?
> >>
> >> I cannot find something simple, and I was kind of hoping to avoid diving
> >> into the source code of the java tool on github:-) In particular, I'm
> >> looking for a description of the expected underlying schema and its size
> >> (scale) parameters.
> >
> > There are the description files for different workloads, like [1], (with the
> > custom amount of records, of course) and the schema [2]. Would this
> > information be enough?
> >
> > [1]: 
> > https://github.com/brianfrankcooper/YCSB/blob/master/workloads/workloada
> > [2]: 
> > https://github.com/brianfrankcooper/YCSB/blob/master/jdbc/src/main/resources/sql/create_table.sql
>
> The second link is a start.
>
> I notice that the submitted patch transactions do not apply to this
> schema, which is significantly different from the pgbench TPC-B (like)
> benchmark.
>
> The YCSB schema is key -> fields[0-9], all of them TEXT, somehow expected
> to be 100 bytes each, and update is expected to update one of these
> fields.
>
> This suggest that maybe a -i extension would be in order. Possibly
>
> pgbench -i -s 1 --layout={tpcb,ycsb} (or schema ?)
>
> where "tpcb" would be the default?
>
> I'm sceptical about using a textual primary key as it corresponds more to
> NoSQL limitations than to an actual design choice. I'd be okay with INT8
> as a pkey.
>
> I find the YCSB tablename "usertable" especially unhelpful. Maybe
> "pgbench_ycsb"?

Just to clarify - if I understand Anthony correctly, this proposal is not about
implementing exactly YCSB as it is, but more about using zipfian distribution
for an id in the regular pgbench table structure in conjunction with read/write
balance to simulate something similar to it.

And probably instead of implementing the exact YCSB workload inside pgbench, it
makes more sense to add PostgreSQL Jsonb as one of the options into the
framework itself (I was in the middle of it few years ago, but then was
distracted by some interesting benchmarking results).



Re: Non-portable shell code in pg_upgrade tap tests

2018-07-22 Thread Noah Misch
On Fri, Jul 20, 2018 at 11:09:13AM -0400, Tom Lane wrote:
> Victor Wagner  writes:
> > Tom Lane  wrote:
> >> Please send a patch.  Most of us do not have access to old shells
> 
> > Here it goes. Previous letter was written before fixed tests were
> > completed, because this old machine is slow.
> 
> Thanks.  Will check on my own dinosaurs, and push if I don't find
> a problem.

I'd say the right way to fix this is the one specified in
https://www.gnu.org/software/autoconf/manual/autoconf-2.69/html_node/The-Make-Macro-SHELL.html,
in particular:

  Using @SHELL@ means that your makefile will benefit from the same improved
  shell, such as bash or ksh, that was discovered during configure, so that
  you aren't fighting two different sets of shell bugs between the two
  contexts.

The Solaris 10 /bin/sh can't even run most of "configure", but Solaris 10 also
provides /bin/ksh and /usr/xpg4/bin/sh with the usual modern features.
("configure" works on Solaris 10 by finding a good shell and re-execing
itself.)

Maintaining shell scripts to an even lower common denominator than Autoconf
would be a good deal less reliable.



Re: pgbench: improve --help and --version parsing

2018-07-22 Thread Andres Freund



On July 21, 2018 11:52:05 PM PDT, Tom Lane  wrote:
>But I'm dubious
>about changing it in only one app.

Agreed.
-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.



Re: pgbench: improve --help and --version parsing

2018-07-22 Thread Tom Lane
Andres Freund  writes:
> On July 21, 2018 11:15:51 PM PDT, Tom Lane  wrote:
>> This is, in fact, how it's done in all PG apps.

> Think there's a fair argument that we should improve that at some point...

Perhaps.  Peter E. might remember why it's like that.  But I'm dubious
about changing it in only one app.

regards, tom lane



Re: pgbench: improve --help and --version parsing

2018-07-22 Thread Andres Freund



On July 21, 2018 11:15:51 PM PDT, Tom Lane  wrote:
>Andrei Korigodski  writes:
>> There is a small catch in the parsing of --help and --version args by
>pgbench:
>> they are processed correctly only as the first argument.
>
>This is, in fact, how it's done in all PG apps.

Think there's a fair argument that we should improve that at some point...

Andres
-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.



Re: pgbench: improve --help and --version parsing

2018-07-22 Thread Tom Lane
Andrei Korigodski  writes:
> There is a small catch in the parsing of --help and --version args by pgbench:
> they are processed correctly only as the first argument.

This is, in fact, how it's done in all PG apps.

regards, tom lane