Re: [HACKERS] PATCH: multivariate histograms and MCV lists

2019-01-11 Thread Dean Rasheed
On Fri, 11 Jan 2019, 21:18 Tomas Vondra wrote:
> On 1/10/19 4:20 PM, Dean Rasheed wrote:
> > ...
> >
> > So perhaps what we should do for multivariate stats is simply use the
> > relative standard error approach (i.e., reuse the patch in [2] with a
> > 20% RSE cutoff). That had a lot of testing at the time, against a wide
> > range of data distributions, and proved to be very good, not to
> > mention being very simple.
> >
> > That approach would encompass both groups more and less common than
> > the base frequency, because it relies entirely on the group appearing
> > enough times in the sample to infer that any errors on the resulting
> > estimates will be reasonably well controlled. It wouldn't actually
> > look at the base frequency at all in deciding which items to keep.
> >
>
> I've been looking at this approach today, and I'm a bit puzzled. That
> patch essentially uses the RSE to compute mincount like this:
>
> mincount = n*(N-n) / (N-n+0.04*n*(N-1))
>
> and then includes all items more common than this threshold.


Right.

> How could
> that handle items significantly less common than the base frequency?
>

Well, what I meant was that it will *allow* items significantly less common
than the base frequency, because it's not even looking at the base
frequency. For example, if the table size were N=100,000 and we sampled
n=10,000 rows from that, mincount would work out as 22. So it's easy to
construct allowed items more common than that and still significantly less
common than their base frequency.
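
For concreteness, here is a tiny standalone sketch (my illustration, not
patch code) of that cutoff; the 0.04 constant is presumably the 20% RSE
squared, and plugging in the numbers above reproduces the threshold of
about 22:

#include <stdio.h>

/*
 * RSE-based cutoff from the patch quoted above: N is the table size,
 * n the sample size; 0.04 is presumably the 20% RSE cutoff squared.
 */
static double
mincount(double N, double n)
{
	return n * (N - n) / (N - n + 0.04 * n * (N - 1));
}

int
main(void)
{
	/* the example above: N = 100,000 rows, n = 10,000 sampled */
	printf("mincount = %.1f\n", mincount(100000.0, 10000.0));	/* ~22.4 */
	return 0;
}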

A possible refinement would be to say that if there are more than
stats_target items more common than this mincount threshold, rather than
excluding the least common ones to get the target number of items, exclude
the ones closest to their base frequencies, on the grounds that those are
the ones for which the MCV stats will make the least difference. That might
complicate the code somewhat though -- I don't have it in front of me, so I
can't remember if it even tracks more than stats_target items.
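
A rough sketch of what that refinement could look like, with all names
invented for illustration (again, I don't have the patch in front of me):

#include <math.h>
#include <stdlib.h>

/*
 * Invented illustration of the refinement above: when more than
 * stats_target items pass the mincount threshold, sort by how far each
 * item's sample frequency is from its base frequency and keep the ones
 * that deviate most, since those are where MCV stats help the most.
 */
typedef struct McvItem
{
	double		frequency;		/* observed frequency in the sample */
	double		base_frequency; /* product of per-column frequencies */
} McvItem;

static int
compare_deviation(const void *a, const void *b)
{
	double		da = fabs(((const McvItem *) a)->frequency -
						  ((const McvItem *) a)->base_frequency);
	double		db = fabs(((const McvItem *) b)->frequency -
						  ((const McvItem *) b)->base_frequency);

	/* descending: largest deviation (most useful item) first */
	if (da > db)
		return -1;
	if (da < db)
		return 1;
	return 0;
}

static void
keep_most_useful(McvItem *items, size_t nitems, size_t stats_target)
{
	if (nitems > stats_target)
		qsort(items, nitems, sizeof(McvItem), compare_deviation);
	/* caller then truncates the list to stats_target entries */
}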

Regards,
Dean


Re: Early WIP/PoC for inlining CTEs

2019-01-11 Thread Pavel Stehule
On Fri, Jan 11, 2019 at 20:11, Robert Haas wrote:

> On Fri, Jan 11, 2019 at 2:04 PM Andres Freund  wrote:
> > > Maybe we could consider a more extensible syntax that is attached to
> > > the contained SELECT rather than the containing WITH.  Then CTEs would
> > > be less special; there'd be a place to put hints controlling top-level
> > > queries, subselects, views etc too (perhaps eventually join hints,
> > > parallelism hints etc, but "materialize this" would be just another
> > > one of those things).  That'd be all-in.
> >
> > I think you have some purity arguments here, but the likelihood of us
> > developing a full-blown solution is not that high, and the lack of
> > inlinable CTEs is *really* hurting us. As long as the design doesn't
> > block a full solution, if we go there, I think it's a very acceptable
> > blemish in comparison to the benefits we'd get.
>
> Also, it seems to me that this is properly a property of the
> individual WITH clause, not the query as a whole.
>
> I mean I suppose we could do
>
> WITH or_with_out_you OPTIONS (materialized false) AS (SELECT 'mariah
> carey') SELECT ...
>
> That'd allow for extensibility, have the right scope, and look like
> what we do elsewhere.  It looks a little less elegant than
>
> WITH cte_name [[NOT] MATERIALIZED] AS (query) main_query...
>
> ...but maybe elegance for extensibility is a good trade.
>

I like this explicit syntax (both variants can be used). From my
perspective, it is much better than hints in comments.

Regards

Pavel

>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>
>


O_DIRECT for relations and SLRUs (Prototype)

2019-01-11 Thread Michael Paquier
Hi all,
(Added Kevin in CC)

Over the years there have been discussions about getting better
O_DIRECT support to close the gap with other players in the database
market, but I have not actually seen on these lists a patch which
makes use of O_DIRECT for relations and SLRUs (perhaps I missed one;
in any case it would most likely conflict by now).

Attached is a toy patch that I have begun using for tests in this
area.  It's nothing really serious at this stage, but you can use it
if you would like to see the impact of O_DIRECT.  Of course, things
get significantly slower.  The patch compiles, passes regression
tests, and looks stable, so it's usable for experiments.  The patch
adds a GUC called direct_io, which defaults to true to ease
regression testing while the patch is applied.

Note that pg_attribute_aligned() cannot be used, as that's not an
option with clang and a couple of other compilers as far as I know, so
the patch uses a simple set of placeholder buffers large enough to be
aligned with the OS pages, which should be 4kB on Linux by the way,
and not BLCKSZ.  For WAL's O_DIRECT we don't really care much about
such details.

If there is interest for such things, perhaps we could get a patch
sorted out, with some angles of attack like:
- Move to use of page-aligned buffers for relations and SLRUs.
- Split use of O_DIRECT for SLRU and relations into separate GUCs.
- Perhaps other things.
However, this is a large and very controversial topic, and of course
more complex than the attached experiment; still, this prototype is
fun to play with.
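
As background on why those placeholder buffers exist at all: with
O_DIRECT the kernel typically rejects transfers into buffers that are
not aligned to the storage block size.  A minimal standalone sketch of
the required dance (my illustration, not patch code; posix_memalign and
the 4096-byte alignment are platform assumptions):

#define _GNU_SOURCE				/* for O_DIRECT on Linux */
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Read one block with O_DIRECT.  Unaligned buffers typically fail with
 * EINVAL, hence posix_memalign() here; the patch instead over-allocates
 * a static buffer and aligns a pointer into it with TYPEALIGN().
 */
static int
read_block_direct(const char *path, off_t offset, char *out, size_t blcksz)
{
	void	   *buf;
	int			fd;
	ssize_t		nread = -1;

	if (posix_memalign(&buf, 4096, blcksz) != 0)
		return -1;

	fd = open(path, O_RDONLY | O_DIRECT);
	if (fd >= 0)
	{
		nread = pread(fd, buf, blcksz, offset);
		close(fd);
	}

	if (nread == (ssize_t) blcksz)
		memcpy(out, buf, blcksz);

	free(buf);
	return (nread == (ssize_t) blcksz) ? 0 : -1;
}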

Thanks for reading!
--
Michael
diff --git a/src/backend/access/transam/slru.c b/src/backend/access/transam/slru.c
index 3623352b9c..a3ba8cddba 100644
--- a/src/backend/access/transam/slru.c
+++ b/src/backend/access/transam/slru.c
@@ -120,6 +120,13 @@ typedef enum
 	SLRU_CLOSE_FAILED
 } SlruErrorCause;
 
+/*
+ * Static area used as a placeholder for O_DIRECT operations.  Make sure
+ * to keep its size at twice the block size so that it can be aligned
+ * with the OS pages.
+ */
+static char direct_buf[BLCKSZ + BLCKSZ];
+
 static SlruErrorCause slru_errcause;
 static int	slru_errno;
 
@@ -644,6 +651,12 @@ SlruPhysicalReadPage(SlruCtl ctl, int pageno, int slotno)
 	int			offset = rpageno * BLCKSZ;
 	char		path[MAXPGPATH];
 	int			fd;
+	int			direct_flag = 0;
+	char	   *aligned_buf = (char *) TYPEALIGN(BLCKSZ, direct_buf);
+
+	/* Use direct I/O if configured as such */
+	if (direct_io)
+		direct_flag |= O_DIRECT | O_SYNC;
 
 	SlruFileName(ctl, path, segno);
 
@@ -654,7 +667,7 @@ SlruPhysicalReadPage(SlruCtl ctl, int pageno, int slotno)
 	 * SlruPhysicalWritePage).  Hence, if we are InRecovery, allow the case
 	 * where the file doesn't exist, and return zeroes instead.
 	 */
-	fd = OpenTransientFile(path, O_RDWR | PG_BINARY);
+	fd = OpenTransientFile(path, O_RDWR | PG_BINARY | direct_flag);
 	if (fd < 0)
 	{
 		if (errno != ENOENT || !InRecovery)
@@ -681,7 +694,7 @@ SlruPhysicalReadPage(SlruCtl ctl, int pageno, int slotno)
 
 	errno = 0;
 	pgstat_report_wait_start(WAIT_EVENT_SLRU_READ);
-	if (read(fd, shared->page_buffer[slotno], BLCKSZ) != BLCKSZ)
+	if (read(fd, aligned_buf, BLCKSZ) != BLCKSZ)
 	{
 		pgstat_report_wait_end();
 		slru_errcause = SLRU_READ_FAILED;
@@ -698,6 +711,9 @@ SlruPhysicalReadPage(SlruCtl ctl, int pageno, int slotno)
 		return false;
 	}
 
+	/* copy contents back to the slot */
+	memcpy(shared->page_buffer[slotno], aligned_buf, BLCKSZ);
+
 	return true;
 }
 
@@ -724,6 +740,12 @@ SlruPhysicalWritePage(SlruCtl ctl, int pageno, int slotno, SlruFlush fdata)
 	int			offset = rpageno * BLCKSZ;
 	char		path[MAXPGPATH];
 	int			fd = -1;
+	int			direct_flag = 0;
+	char	   *aligned_buf = (char *) TYPEALIGN(BLCKSZ, direct_buf);
+
+	/* Use direct I/O if configured as such */
+	if (direct_io)
+		direct_flag |= O_DIRECT | O_SYNC;
 
 	/*
 	 * Honor the write-WAL-before-data rule, if appropriate, so that we do not
@@ -804,7 +826,7 @@ SlruPhysicalWritePage(SlruCtl ctl, int pageno, int slotno, SlruFlush fdata)
 		 * don't use O_EXCL or O_TRUNC or anything like that.
 		 */
 		SlruFileName(ctl, path, segno);
-		fd = OpenTransientFile(path, O_RDWR | O_CREAT | PG_BINARY);
+		fd = OpenTransientFile(path, O_RDWR | O_CREAT | PG_BINARY | direct_flag);
 		if (fd < 0)
 		{
 			slru_errcause = SLRU_OPEN_FAILED;
@@ -840,9 +862,12 @@ SlruPhysicalWritePage(SlruCtl ctl, int pageno, int slotno, SlruFlush fdata)
 		return false;
 	}
 
+	/* use aligned buffer for O_DIRECT */
+	memcpy(aligned_buf, shared->page_buffer[slotno], BLCKSZ);
+
 	errno = 0;
 	pgstat_report_wait_start(WAIT_EVENT_SLRU_WRITE);
-	if (write(fd, shared->page_buffer[slotno], BLCKSZ) != BLCKSZ)
+	if (write(fd, aligned_buf, BLCKSZ) != BLCKSZ)
 	{
 		pgstat_report_wait_end();
 		/* if write didn't set errno, assume problem is no disk space */
@@ -858,12 +883,13 @@ SlruPhysicalWritePage(SlruCtl ctl, int pageno, int slotno, SlruFlush fdata)
 
 	/*
 	 * If not part of Flush, need to fsync now.  We assume this happens
-	 * 

Re: Checksum errors in pg_stat_database

2019-01-11 Thread David Steele

On 1/11/19 10:25 PM, Magnus Hagander wrote:
On Fri, Jan 11, 2019 at 9:20 PM Tomas Vondra wrote:
On 1/11/19 7:40 PM, Robert Haas wrote:

 > But I'm tentatively in favor of your proposal anyway, because it's
 > pretty simple and cheap and might help people, and doing something
 > noticeably better is probably annoyingly complicated.
 >

+1

Yeah, that's the idea behind it -- it's cheap, and an 
early-warning-indicator.


+1

--
-David
da...@pgmasters.net



Re: Pluggable Storage - Andres's take

2019-01-11 Thread Andres Freund
Hi,

On 2019-01-12 01:35:06 +0100, Dmitry Dolgov wrote:
> > On Sat, Dec 15, 2018 at 8:37 PM Andres Freund  wrote:
> >
> > We need to dump the table access method at dump time, otherwise we lose
> > that information.
> 
> As a result of the discussion in [1] (btw, thanks for starting it), here is
> a proposed solution that tracks the current default_table_access_method.
> Next I'll tackle a similar issue for psql and probably add some tests for
> both patches.

Thanks!

> +/*
> + * Set the proper default_table_access_method value for the table.
> + */
> +static void
> +_selectTableAccessMethod(ArchiveHandle *AH, const char *tableam)
> +{
> + PQExpBuffer cmd = createPQExpBuffer();
> + const char *want, *have;
> +
> + have = AH->currTableAm;
> + want = tableam;
> +
> + if (!want)
> + return;
> +
> + if (have && strcmp(want, have) == 0)
> + return;
> +
> +
> + appendPQExpBuffer(cmd, "SET default_table_access_method = %s;", tableam);

This needs escaping, at the very least with "", but better with proper
routines for dealing with identifiers.
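
For what it's worth, one way to do that might be fmtId() from
fe_utils/string_utils.h, which pg_dump already uses for identifier
quoting elsewhere (a sketch, not necessarily the final fix):

	/* quote the access method name as an identifier, so names needing
	 * quoting (or hostile ones) cannot break out of the SET command */
	appendPQExpBuffer(cmd, "SET default_table_access_method = %s;",
					  fmtId(tableam));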



> @@ -5914,7 +5922,7 @@ getTables(Archive *fout, int *numTables)
> "tc.relfrozenxid AS 
> tfrozenxid, "
> "tc.relminmxid AS tminmxid, "
> "c.relpersistence, 
> c.relispopulated, "
> -   "c.relreplident, c.relpages, "
> +   "c.relreplident, c.relpages, 
> am.amname AS amname, "

That AS doesn't do anything, does it?


>   /* other fields were zeroed above */
>  
> @@ -9355,7 +9370,7 @@ dumpComment(Archive *fout, const char *type, const char 
> *name,
>* post-data.
>*/
>   ArchiveEntry(fout, nilCatalogId, createDumpId(),
> -  tag->data, namespace, NULL, owner,
> +  tag->data, namespace, NULL, owner, NULL,
>"COMMENT", SECTION_NONE,
>query->data, "", NULL,
>&(dumpId), 1,

We really ought to move the arguments to a struct, so we don't generate
quite as many useless diff lines whenever we change something around one
of these...

Greetings,

Andres Freund



Re: Unified logging system for command-line programs

2019-01-11 Thread Donald Dong


> On Jan 11, 2019, at 9:14 AM, Peter Eisentraut wrote:
> 
>> The patch cannot be applied directly on HEAD. So I patched it on top of 
>> 60d99797bf.
> 
> Here is an updated patch with the merge conflicts of my own design
> resolved.  No functionality changes.
> 
>> When I call pg_log_error() in initdb, I see
>> 
>> Program received signal SIGSEGV, Segmentation fault.
>> __strlen_avx2 () at ../sysdeps/x86_64/multiarch/strlen-avx2.S:62
>> 62  ../sysdeps/x86_64/multiarch/strlen-avx2.S: No such file or directory.
>> (gdb) bt
>> #0  __strlen_avx2 () at ../sysdeps/x86_64/multiarch/strlen-avx2.S:62
>> #1  0x55568f96 in dopr.constprop ()
>> #2  0x55569ddb in pg_vsnprintf ()
>> #3  0x55564236 in pg_log_generic ()
>> #4  0xc240 in main ()
> 
> What do you mean exactly by "I call pg_log_error()"?  The existing calls
> in initdb clearly work, at least some of them, that is covered by the
> test suite.  Are you adding new calls?

Thank you. I did add a new call for my local testing. There are no more errors
after re-applying the patch on master.

>> I'm not sure what would be causing this behavior. I would appreciate
>> references or docs for testing and debugging patches more efficiently.
>> Now I'm having difficulties loading symbols of initdb in gdb.
> 
> The above looks like you'd probably get a better insight by compiling
> with -O0 or some other lower optimization setting.
> 
> There is also this:
> https://wiki.postgresql.org/wiki/Getting_a_stack_trace_of_a_running_PostgreSQL_backend_on_Linux/BSD

Thank you for the reference. That's very helpful!


I noticed that in some places, such as

pg_log_error("no data directory specified");
fprintf(stderr,
        _("You must identify the directory where the data for this database system\n"
...

and

pg_log_warning("enabling \"trust\" authentication for local connections");
fprintf(stderr, _("You can change this by editing pg_hba.conf or using the option -A, or\n"
                  "--auth-local and --auth-host, the next time you run initdb.\n"));

pg_log does not completely replace fprintf.  Would it be better to use pg_log
so the logging level can also filter these messages?
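
For illustration, the second example could then hypothetically become
something like this (pg_log_info is an assumption about the patch's API;
the point is just that the detail text would go through the same level
filtering):

pg_log_warning("enabling \"trust\" authentication for local connections");
/* hypothetical: detail text routed through the logging API instead of a
 * bare fprintf, so the logging level can filter it too */
pg_log_info("You can change this by editing pg_hba.conf or using the option -A, or\n"
            "--auth-local and --auth-host, the next time you run initdb.");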




Re: Pluggable Storage - Andres's take

2019-01-11 Thread Dmitry Dolgov
> On Sat, Dec 15, 2018 at 8:37 PM Andres Freund  wrote:
>
> We need to dump the table access method at dump time, otherwise we lose
> that information.

As a result of the discussion in [1] (btw, thanks for starting it), here is
a proposed solution that tracks the current default_table_access_method.
Next I'll tackle a similar issue for psql and probably add some tests for
both patches.

[1]: 
https://www.postgresql.org/message-id/flat/20190107235616.6lur25ph22u5u5av%40alap3.anarazel.de


pg_dump_access_method.patch
Description: Binary data


Re: could recovery_target_timeline=latest be the default in standby mode?

2019-01-11 Thread Michael Paquier
On Fri, Jan 11, 2019 at 11:17:48AM +0100, Peter Eisentraut wrote:
> Attached revised 0002 with those changes.

This one looks fine.

> In that test, if I change the 'current' to 'latest', the test doesn't
> fail, so it's probably not a good test.

I can see your point.  You would need a diverging timeline to test for
'latest', which can surely be done as part of 003_recovery_targets.pl.
It seems to me that the test has some value in making sure that we
replay up to the end of the produced timeline's data, which is
something untested now, as the script only has restart points set to
before the end of the timeline.  If you think that's not a good
addition now, I am also fine with not including it.
--
Michael




Re: Prevent extension creation in temporary schemas

2019-01-11 Thread Michael Paquier
On Sat, Jan 12, 2019 at 08:34:37AM +0900, Michael Paquier wrote:
> Then the extension shows up as present for other users.  I am mainly
> wondering if this case has actually been thought about or discussed in
> the past, and whether we need to do something about it.

The point here is about the visibility in \dx.
--
Michael




Re: Prevent extension creation in temporary schemas

2019-01-11 Thread Michael Paquier
On Fri, Jan 11, 2019 at 02:22:01PM -0500, Robert Haas wrote:
> On Sun, Jan 6, 2019 at 10:26 PM Michael Paquier  wrote:
>> This combination makes no actual sense, so wouldn't it be better to
>> restrict the case?
> 
> Hmm.  What exactly doesn't make sense about it?

In my mind, extensions are designed to be database-wide objects which
are visible to all sessions.  Perhaps I am just confused by what I
think they should be, and I can see no trace on the archives about
concept of extensions + temp schema as base (adding Dimitri in CC if
he has an idea).  I can see as well that there have been stuff about
using temporary objects in extension script though ("Fix bugs with
temporary or transient tables used in extension scripts" in release
notes of 9.1).

For most extensions, this can end with strange error messages, like
this one:
=# create extension file_fdw with schema pg_temp_3;
ERROR:  42883: function file_fdw_handler() does not exist
LOCATION:  LookupFuncName, parse_func.c:2088

There are cases where the extension can be created:
=# create extension pgcrypto with schema pg_temp_3;
CREATE EXTENSION
Time: 36.567 ms
=# \dx pgcrypto
  List of installed extensions
   Name   | Version |  Schema   |       Description
----------+---------+-----------+-------------------------
 pgcrypto | 1.3     | pg_temp_3 | cryptographic functions
(1 row)

Then the extension shows up as present for other users.  I am mainly
wondering if this case has actually been thought about or discussed in
the past, and whether we need to do something about it.  Temporary
extensions can exist as long as the extension script does not include,
for example, REVOKE queries on the functions it creates (which should
actually work?).  There is a separate thread about restricting 2PC when
touching the temporary namespace for the creation of many objects, and
extensions are one of the cases discussed there.  Still, the concept
looks a bit wider, so I spawned a separate thread.
--
Michael




Re: port of INSTALL file generation to XSLT

2019-01-11 Thread Mitar
Hi!

On Fri, Jan 11, 2019 at 1:05 PM Tom Lane  wrote:
> Failure would leave a .tmp file behind, but I doubt we care enough
> about that to work harder than this.

Maybe just make sure that "make clean" removes it?


Mitar

-- 
http://mitar.tnode.com/
https://twitter.com/mitar_m



Re: Three animals fail test-decoding-check on REL_10_STABLE

2019-01-11 Thread Tom Lane
I wrote:
> So it appears that in v10,
>   ./configure ... --enable-tap-tests ...
>   make
>   make install
>   cd contrib/test_decoding
>   make check
> fails due to failure to install test_decoding into the tmp_install
> tree, while it works in v11.  Moreover, that's not specific to
> gaur: it happens on my Linux box too.  I'm not very sure why only
> three buildfarm animals are unhappy --- maybe in the buildfarm
> context it requires a specific combination of options to show the
> problem.

While I think I've fixed this bug, I'm still quite confused about why
only some buildfarm animals showed the problem.  Comparing log files,
it seems that the ones that were working were relying on having
done a complete temp-install at a higher level, while the ones that
were failing were trying to make a temp install from scratch in
contrib/test_decoding and hence seeing the bug.  For example,
longfin's test-decoding-check log starts out

Snapshot: 2019-01-11 21:12:17

/Applications/Xcode.app/Contents/Developer/usr/bin/make -C 
../../src/test/regress all
/Applications/Xcode.app/Contents/Developer/usr/bin/make -C ../../../src/port all
/Applications/Xcode.app/Contents/Developer/usr/bin/make -C ../backend 
submake-errcodes
make[3]: Nothing to be done for `submake-errcodes'.

while gaur's starts out

Snapshot: 2019-01-11 07:30:45

rm -rf '/home/bfarm/bf-data/REL_10_STABLE/pgsql.build'/tmp_install
/bin/sh ../../config/install-sh -c -d 
'/home/bfarm/bf-data/REL_10_STABLE/pgsql.build'/tmp_install/log
make -C '../..' 
DESTDIR='/home/bfarm/bf-data/REL_10_STABLE/pgsql.build'/tmp_install install 
>'/home/bfarm/bf-data/REL_10_STABLE/pgsql.build'/tmp_install/log/install.log 
2>&1
make -j1  checkprep 
>>'/home/bfarm/bf-data/REL_10_STABLE/pgsql.build'/tmp_install/log/install.log 
2>&1
make -C ../../src/test/regress all
make[1]: Entering directory 
`/home/bfarm/bf-data/REL_10_STABLE/pgsql.build/src/test/regress'
make -C ../../../src/port all
make[2]: Entering directory 
`/home/bfarm/bf-data/REL_10_STABLE/pgsql.build/src/port'
make -C ../backend submake-errcodes
make[3]: Entering directory 
`/home/bfarm/bf-data/REL_10_STABLE/pgsql.build/src/backend'
make[3]: Nothing to be done for `submake-errcodes'.

These two animals are running the same buildfarm client version,
and I don't see any relevant difference in their configurations,
so why are they behaving differently?  Andrew, any ideas?

regards, tom lane



Re: Making WAL receiver startup rely on GUC context for primary_conninfo and primary_slot_name

2019-01-11 Thread Michael Paquier
On Fri, Jan 11, 2019 at 12:52:08PM -0500, Robert Haas wrote:
> With the patch, the PrimaryConnInfo and PrimarySlotName arguments are
> removed from RequestXLogStreaming.  That means that the new
> walreceiver could come to a different conclusion about the values of
> those arguments than the startup process.  In particular, it could end
> up thinking that primary_conninfo is empty when, if the startup
> process had thought that, the walreceiver would never have been
> launched in the first place.  But it's not obvious that you've added
> any logic in WALReceiverMain or elsewhere to compensate for this
> possibility -- what would happen in that case?  Would we crash?
> Connect to the wrong server?

If I contemplate the patch this morning there is this bit:
@@ -291,32 +295,40 @@ WalReceiverMain(void)
	/* Unblock signals (they were blocked when the postmaster forked us) */
	PG_SETMASK(&UnBlockSig);

+   /*
+* Fail immediately if primary_conninfo goes missing, better safe than
+* sorry.
+*/
+   if (PrimaryConnInfo == NULL || strcmp(PrimaryConnInfo, "") == 0)
+   ereport(ERROR,
+   (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+errmsg("cannot connect to the primary server as 
\"primary_conninfo\" is not defined")));

So the answer to your question is: the WAL receiver fails to start.

> I might be wrong here, but I'm inclined to think that this scenario
> hasn't really been contemplated carefully by the patch authors.  There
> are no added TAP tests for the scenario where the values differ
> between the two processes, no code comments which explain why it's OK
> if that happens, really no mention of it in the patch at all.  And on
> that basis I'm inclined to think that Andres is really quite correct
> to be worried about this.  The problem he's talking about here is very
> low-probability because the race condition is narrow, but it's real,
> and it surely needs to be handled somehow.

primary_conninfo and primary_slot_name are PGC_POSTMASTER now, so
adding such tests now doesn't really make sense.
--
Michael




declaration-after-statement (was Re: Ryu floating point output patch)

2019-01-11 Thread Andrew Gierth
> "Andres" == Andres Freund  writes:

 >> 4. Can we allow declaration-after-statement please? That would allow
 >> keeping this code significantly closer to its upstream.

 Andres> As I said on IRC: I'm personally on-board with changing this
 Andres> stylistic requirement, but I also think that it'd be OK to
 Andres> just disable the warning for the individual object files when
 Andres> they're imported (likely using target-specific makefile
 Andres> variable assignments) if we cannot get agreement on the global
 Andres> policy change.

So are there any strong disagreements with the idea of disabling the
warning at least for these specific files? If I don't hear any
objections, I'll go for that in the next version of the patch.

-- 
Andrew (irc:RhodiumToad)



How to decode the output from pgoutput

2019-01-11 Thread Ami Ganguli
Hi,

I'm writing a tool to process a logical replication stream.  The intent is
to use publications and subscriptions as an initial filter, and then use
the replication stream to trigger external events.  So my tool should
connect to the master in the same manner as a replication slave, but it
does different things with the data.

So far I've used pg_recvlogical.c as a guide and I'm successfully
connecting to the master, creating a replication slot, and subscribing to a
couple of publications.

But now I'm stuck at further interpreting the data.  Can anybody point me
to further documentation or the right code to look at to figure out the
format of the WAL data stream?
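
For reference, the stream such a slot produces is a sequence of typed
messages dispatched on a leading type byte; a rough sketch of that
dispatch (illustration only, the authoritative layout lives in
pgoutput.c and the protocol documentation's logical replication message
formats) looks like:

#include <stddef.h>
#include <stdio.h>

/*
 * Sketch of dispatching on pgoutput's message-type byte.  Each payload
 * has its own binary layout, which still needs full parsing; this only
 * shows the top-level framing.
 */
static void
handle_pgoutput_message(const char *buf, size_t len)
{
	switch (buf[0])
	{
		case 'B':				/* Begin: final LSN, commit timestamp, xid */
		case 'C':				/* Commit */
		case 'R':				/* Relation: describes a table and its columns */
		case 'I':				/* Insert: new tuple data */
		case 'U':				/* Update: optional old tuple, new tuple */
		case 'D':				/* Delete: old tuple or replica identity key */
			printf("pgoutput message '%c', %zu bytes\n", buf[0], len);
			break;
		default:
			printf("unhandled message type '%c'\n", buf[0]);
			break;
	}
}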

Cheers,
Ami.


Re: Making WAL receiver startup rely on GUC context for primary_conninfo and primary_slot_name

2019-01-11 Thread Andres Freund
Hi,

On 2019-01-12 01:55:23 +0300, Sergei Kornilov wrote:
> > pg_ctl reload is executed (or the equivalent),
> > different backends reload the file at different times.
> > There are no added TAP tests for the scenario where the values differ
> > between the two processes, no code comments which explain why it's OK
> 
> But wait, the connection string cannot be changed via reload, only during
> cluster start. How can it differ between processes?

One of the major reasons for moving recovery parameters from
recovery.conf to GUCs was to make them changable at runtime in the
future.

Greetings,

Andres Freund



Re: Making WAL receiver startup rely on GUC context for primary_conninfo and primary_slot_name

2019-01-11 Thread Sergei Kornilov
Hello

> pg_ctl reload is executed (or the equivalent),
> different backends reload the file at different times.
> There are no added TAP tests for the scenario where the values differ
> between the two processes, no code comments which explain why it's OK

But wait, the connection string cannot be changed via reload, only during
cluster start. How can it differ between processes?

regards, Sergei



Re: problems with foreign keys on partitioned tables

2019-01-11 Thread Alvaro Herrera
Hi Amit

On 2019-Jan-09, Amit Langote wrote:

> I noticed a couple of problems with foreign keys on partitioned tables.

Ouch, thanks for reporting.  I think 0001 needs a bit of a tweak in pg11
to avoid an ABI break -- I intend to study this one and try to push
early next week.  I'm going to see about pushing 0002 shortly,
adding some of your tests.

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Three animals fail test-decoding-check on REL_10_STABLE

2019-01-11 Thread Tom Lane
I wrote:
> There's no obvious difference between bedda9fbb and 6dd690be3,
> so I surmise that that patch depended somehow on some previous
> work that only went into v11 not v10.  Haven't found what, yet.

Ah, looks like it was 42e61c774.  I'll push a fix shortly.

regards, tom lane



Re: minor fix in CancelVirtualTransaction

2019-01-11 Thread Alvaro Herrera
On 2019-Jan-04, Peter Eisentraut wrote:

> Your reasoning seems correct to me.
> 
> Maybe add a code comment along the lines of "once we have found the
> right ... we don't need to check the remaining ...".
> 
> Or, you can make this even more clear by comparing the backendId
> directly with the proc entry:

I did both (the second idea was a non-obvious very nice cleanup --
thanks).  Patch attached.

However, now I realize that this code is not covered at all, so I'll put
this patch to sleep until I write some test for it.

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
diff --git a/src/backend/storage/ipc/procarray.c b/src/backend/storage/ipc/procarray.c
index faa46742bf..984ab7258a 100644
--- a/src/backend/storage/ipc/procarray.c
+++ b/src/backend/storage/ipc/procarray.c
@@ -2679,13 +2679,10 @@ CancelVirtualTransaction(VirtualTransactionId vxid, ProcSignalReason sigmode)
 	{
 		int			pgprocno = arrayP->pgprocnos[index];
 		PGPROC	   *proc = &allProcs[pgprocno];
-		VirtualTransactionId procvxid;
 
-		GET_VXID_FROM_PGPROC(procvxid, *proc);
-
-		if (procvxid.backendId == vxid.backendId)
+		if (vxid.backendId == proc->backendId)
 		{
-			if (procvxid.localTransactionId == vxid.localTransactionId)
+			if (vxid.localTransactionId == proc->lxid)
 			{
 proc->recoveryConflictPending = true;
 pid = proc->pid;
@@ -2695,9 +2692,14 @@ CancelVirtualTransaction(VirtualTransactionId vxid, ProcSignalReason sigmode)
 	 * Kill the pid if it's still here. If not, that's what we
 	 * wanted so ignore any errors.
 	 */
-	(void) SendProcSignal(pid, sigmode, vxid.backendId);
+	(void) SendProcSignal(pid, sigmode, proc->backendId);
 }
 			}
+
+			/*
+			 * As soon as we find the right proc entry, we don't need to check
+			 * the remaining ones.
+			 */
 			break;
 		}
 	}


Re: Three animals fail test-decoding-check on REL_10_STABLE

2019-01-11 Thread Tom Lane
I wrote:
> Thomas Munro  writes:
>> Does this mean it didn't build the test_decoding module?

> I'm wondering if it built it but didn't install it, as a result of
> some problem with
>> bedda9fbb7 Mon Dec 31 21:57:57 2018 UTC  Process EXTRA_INSTALL
>> serially, during the first temp-install.

So it appears that in v10,

./configure ... --enable-tap-tests ...
make
make install
cd contrib/test_decoding
make check

fails due to failure to install test_decoding into the tmp_install
tree, while it works in v11.  Moreover, that's not specific to
gaur: it happens on my Linux box too.  I'm not very sure why only
three buildfarm animals are unhappy --- maybe in the buildfarm
context it requires a specific combination of options to show the
problem.

There's no obvious difference between bedda9fbb and 6dd690be3,
so I surmise that that patch depended somehow on some previous
work that only went into v11 not v10.  Haven't found what, yet.

regards, tom lane



Re: Early WIP/PoC for inlining CTEs

2019-01-11 Thread Merlin Moncure
On Tue, Jul 24, 2018 at 6:10 PM Andres Freund  wrote:
> My read of the concensus (in which I am in the majority, so I might be
> biased) is that we do want inlining to be the default. We were thinking
> that it'd be necessary to provide a way to force inlining on the SQL
> level for individual CTEs.

This is correct.  I'm suggesting that we need syntax to disable inlining
at the CTE level, and/or a GUC to control the behavior (which I agree
should default to inlining), something like enable_cte_inline=true.
I'm not very enthusiastic about explicitly breaking intentionally
introduced optimization fences and then forcing people to inject our
OFFSET 0 hack.  That is just too unpleasant to contemplate... what
happens if we come up with a better implementation of OFFSET?  Yuck.

Thanks for providing this, CTE plan problems are a real bugaboo.

merlin



Re: [HACKERS] PATCH: multivariate histograms and MCV lists

2019-01-11 Thread Tomas Vondra


On 1/10/19 4:20 PM, Dean Rasheed wrote:
> ...
>
> So perhaps what we should do for multivariate stats is simply use the
> relative standard error approach (i.e., reuse the patch in [2] with a
> 20% RSE cutoff). That had a lot of testing at the time, against a wide
> range of data distributions, and proved to be very good, not to
> mention being very simple.
> 
> That approach would encompass both groups more and less common than
> the base frequency, because it relies entirely on the group appearing
> enough times in the sample to infer that any errors on the resulting
> estimates will be reasonably well controlled. It wouldn't actually
> look at the base frequency at all in deciding which items to keep.
> 

I've been looking at this approach today, and I'm a bit puzzled. That
patch essentially uses the RSE to compute mincount like this:

mincount = n*(N-n) / (N-n+0.04*n*(N-1))

and then includes all items more common than this threshold. How could
that handle items significantly less common than the base frequency?

Or did you mean to use the RSE, but in some different way?

regards

-- 
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: port of INSTALL file generation to XSLT

2019-01-11 Thread Tom Lane
I wrote:
> 1. No pandoc on borka, where we build tarballs:
> pgsql@borka:~$ which pandoc
> pgsql@borka:~$ 

That part's sorted, anyway.

pgsql@borka:~$ pandoc --version
pandoc 1.17.2
Compiled with texmath 0.8.6.7, highlighting-kate 0.6.3.
Syntax highlighting is supported for the following languages:
...


> 2. If there's no pandoc, this coding silently produces a zero-size
> INSTALL file.  I do not find that acceptable.

Seems like it might be sufficient for the rule to be

$(PANDOC) $< -t plain > $@.tmp
$(ICONV) -f utf8 -t us-ascii//TRANSLIT < $@.tmp > $@
rm -f $@.tmp

Failure would leave a .tmp file behind, but I doubt we care enough
about that to work harder than this.

regards, tom lane



Re: Checksum errors in pg_stat_database

2019-01-11 Thread Magnus Hagander
On Fri, Jan 11, 2019 at 9:20 PM Tomas Vondra 
wrote:

>
>
>
> On 1/11/19 7:40 PM, Robert Haas wrote:
> > On Fri, Jan 11, 2019 at 5:21 AM Magnus Hagander 
> wrote:
> >> Would it make sense to add a column to pg_stat_database showing
> >> the total number of checksum errors that have occurred in a database?
> >>
> >> It's really a ">1 means it's bad", but it's a lot easier to monitor
> >> that in the statistics views, and given how much a lot of people
> >> set their systems out to log, it's far too easy to miss individual
> >> checksum failures in the logs.
> >>
> >> If we track it at the database level, I don't think the overhead
> >> of adding one more counter would be very high either.
> >
> > It's probably not the ideal way to track it.  If you have a terabyte or
> > fifty of data, and you see that you have some checksum failures, good
> > luck finding the offending blocks.
> >
>
> Isn't that somewhat similar to deadlocks, which we also track in
> pg_stat_database? The number of deadlocks is rather useless on its own,
> you need to dive into the server log to find the details. Same for
> checksum errors.
>

It is a bit similar, yeah. Though a checksum counter really is a "you need
to look at fixing this right away" indicator, a bit more so than deadlocks.
But yes, the fact that we already track deadlocks there is a good example.
(Of course, I believe I added that one at some point as well, so I'm
clearly biased there.)


> But I'm tentatively in favor of your proposal anyway, because it's
> > pretty simple and cheap and might help people, and doing something
> > noticeably better is probably annoyingly complicated.
> >
>
> +1
>

Yeah, that's the idea behind it -- it's cheap, and an
early-warning-indicator.

-- 
 Magnus Hagander
 Me: https://www.hagander.net/ 
 Work: https://www.redpill-linpro.com/ 


Re: Checksum errors in pg_stat_database

2019-01-11 Thread Tomas Vondra




On 1/11/19 7:40 PM, Robert Haas wrote:
> On Fri, Jan 11, 2019 at 5:21 AM Magnus Hagander  wrote:
>> Would it make sense to add a column to pg_stat_database showing
>> the total number of checksum errors that have occurred in a database?
>> 
>> It's really a ">1 means it's bad", but it's a lot easier to monitor
>> that in the statistics views, and given how much a lot of people
>> set their systems out to log, it's far too easy to miss individual
>> checksum failures in the logs.
>>
>> If we track it at the database level, I don't think the overhead 
>> of adding one more counter would be very high either.
> 
> It's probably not the ideal way to track it.  If you have a terabyte or
> fifty of data, and you see that you have some checksum failures, good
> luck finding the offending blocks.
> 

Isn't that somewhat similar to deadlocks, which we also track in
pg_stat_database? The number of deadlocks is rather useless on its own,
you need to dive into the server log to find the details. Same for
checksum errors.

> But I'm tentatively in favor of your proposal anyway, because it's
> pretty simple and cheap and might help people, and doing something
> noticeably better is probably annoyingly complicated.
> 

+1

-- 
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Ryu floating point output patch

2019-01-11 Thread Andres Freund
Hi,

On 2019-01-11 14:54:55 -0500, Tom Lane wrote:
> Andres Freund  writes:
> > And of course it'd change the dump's text contents between ryu and
> > non-ryu backends even with extra_float_digits = 3, but the resulting
> > floats ought to be the same. It's just that ryu is better at figuring
> > out what the minimal text representation is than the current code.
> 
> I'm fairly concerned about this blithe assertion that the ryu code is
> smarter than the rest of the world.

The paper on which ryu is based seems to have quite some careful
analysis behind it, explaining why it's correct (i.e. why the algorithm
generates the minimal output without ever producing an imprecise one).
Although I certainly didn't invest substantial amounts of time reviewing
the correctness of the logic therein, and would quite possibly fail due
to insufficient maths chops if I tried.


> In particular, how does it know how every strtod() on the planet will
> react to specific input?

strtod()'s job ought to be computationally significantly easier than the
other way round, no? And if there are buggy strtod() implementations out
there, why would they be guaranteed to do the correct thing with our
current output, but not with ryu's?
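
To make the roundtrip property concrete: the invariant both the current
code and ryu must satisfy is that strtod() maps the printed text back to
identical bits.  A standalone sketch (17 significant digits being the
classic bound that uniquely identifies any IEEE 754 double):

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * A float8 output routine is roundtrip-safe if strtod() recovers the
 * exact same bits from its output.  Ryu's claim is that it emits the
 * *shortest* string with this property; %.17g is always sufficient but
 * often longer than necessary.
 */
static int
roundtrips(double d)
{
	char		buf[64];
	double		back;

	snprintf(buf, sizeof(buf), "%.17g", d);
	back = strtod(buf, NULL);
	return memcmp(&back, &d, sizeof(double)) == 0;
}

int
main(void)
{
	printf("0.1 roundtrips: %d\n", roundtrips(0.1));
	printf("1.0/3.0 roundtrips: %d\n", roundtrips(1.0 / 3.0));
	return 0;
}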

Greetings,

Andres Freund



Re: Ryu floating point output patch

2019-01-11 Thread Tom Lane
Andres Freund  writes:
> And of course it'd change the dump's text contents between ryu and
> non-ryu backends even with extra_float_digits = 3, but the resulting
> floats ought to be the same. It's just that ryu is better at figuring
> out what the minimal text representation is than the current code.

I'm fairly concerned about this blithe assertion that the ryu code is
smarter than the rest of the world.  In particular, how does it know
how every strtod() on the planet will react to specific input?

regards, tom lane



Re: Feature: temporary materialized views

2019-01-11 Thread Mitar
Hi!

On Fri, Jan 11, 2019 at 8:51 AM Andreas Karlsson  wrote:
> Her is quick initial review. I will do more testing later.

Thanks for doing the review!

> In create_ctas_internal() why do you copy the relation even when you do
> not modify it?

I was modelling this after the code in view.c [1]. I can move the copy
into the "if".

> Is it really ok to just remove SECURITY_RESTRICTED_OPERATION from
> ExecCreateTableAs()? I feel it is there for a good reason and that we
> preferably want to reduce the duration of SECURITY_RESTRICTED_OPERATION
> to only include when we actually execute the query.

The comment there said that this is not really necessary for security:

"This is not necessary for security, but this keeps the behavior
similar to REFRESH MATERIALIZED VIEW.  Otherwise, one could create a
materialized view not possible to refresh."

Based on my experimentation, this is required to be able to use
temporary materialized views, but it does mean one has to pay
attention to where one can refresh from. For example, you cannot
refresh from outside of the current session, because the temporary
object is not available there. I have not seen any other example
where a refresh would not be possible.

This is why I felt comfortable removing this. Also, no test failed
after removing this.

[1] 
https://github.com/postgres/postgres/blob/master/src/backend/commands/view.c#L554


Mitar

-- 
http://mitar.tnode.com/
https://twitter.com/mitar_m



Re: Early WIP/PoC for inlining CTEs

2019-01-11 Thread Mike Rylander
On Fri, Jan 11, 2019 at 2:10 PM Robert Haas  wrote:
>
> On Fri, Jan 11, 2019 at 2:04 PM Andres Freund  wrote:
> > > Maybe we could consider a more extensible syntax that is attached to
> > > the contained SELECT rather than the containing WITH.  Then CTEs would
> > > be less special; there'd be a place to put hints controlling top-level
> > > queries, subselects, views etc too (perhaps eventually join hints,
> > > parallelism hints etc, but "materialize this" would be just another
> > > one of those things).  That'd be all-in.
> >
> > I think you have some purity arguments here, but the likelihood of us
> > developing a full-blown solution is not that high, and the lack of
> > inlinable CTEs is *really* hurting us. As long as the design doesn't
> > block a full solution, if we go there, I think it's a very acceptable
> > blemish in comparison to the benefits we'd get.
>
> Also, it seems to me that this is properly a property of the
> individual WITH clause, not the query as a whole.
>
> I mean I suppose we could do
>
> WITH or_with_out_you OPTIONS (materialized false) AS (SELECT 'mariah
> carey') SELECT ...
>
> That'd allow for extensibility, have the right scope, and look like
> what we do elsewhere.  It looks a little less elegant than
>
> WITH cte_name [[NOT] MATERIALIZED] AS (query) main_query...
>
> ...but maybe elegance for extensibility is a good trade.
>

Here, have $0.02 from the peanut gallery...

I mildly prefer the latter, elegant spelling, but if CTE inlining does
become a thing then I would /really/ want some way, any way, of
telling Postgres that I want it to materialize a particular CTE.

I use that currently-documented property of CTEs to structure large,
complicated OLAP queries on a regular basis, for performance.
Sometimes, such as when you have dozens of tables in a complex join
tree, breaking the query into logically related chunks (which I know
about, but the planner does not) via CTE is the only way to give the
planner a fighting chance of finding a good plan.  Otherwise you get
stuck in the GEQO ghetto, or planning time is some non-trivial
multiple of execution time.

Thanks,

--
Mike Rylander
 | Executive Director
 | Equinox Open Library Initiative
 | phone:  1-877-OPEN-ILS (673-6457)
 | email:  mi...@equinoxinitiative.org
 | web:  http://equinoxinitiative.org



Re: Prevent extension creation in temporary schemas

2019-01-11 Thread Robert Haas
On Sun, Jan 6, 2019 at 10:26 PM Michael Paquier  wrote:
> This combination makes no actual sense, so wouldn't it be better to
> restrict the case?

Hmm.  What exactly doesn't make sense about it?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Early WIP/PoC for inlining CTEs

2019-01-11 Thread Robert Haas
On Fri, Jan 11, 2019 at 2:04 PM Andres Freund  wrote:
> > Maybe we could consider a more extensible syntax that is attached to
> > the contained SELECT rather than the containing WITH.  Then CTEs would
> > be less special; there'd be a place to put hints controlling top-level
> > queries, subselects, views etc too (perhaps eventually join hints,
> > parallelism hints etc, but "materialize this" would be just another
> > one of those things).  That'd be all-in.
>
> I think you have some purity arguments here, but the likelihood of us
> developing a full-blown solution is not that high, and the lack of
> inlinable CTEs is *really* hurting us. As long as the design doesn't
> block a full solution, if we go there, I think it's a very acceptable
> blemish in comparison to the benefits we'd get.

Also, it seems to me that this is properly a property of the
individual WITH clause, not the query as a whole.

I mean I suppose we could do

WITH or_with_out_you OPTIONS (materialized false) AS (SELECT 'mariah
carey') SELECT ...

That'd allow for extensibility, have the right scope, and look like
what we do elsewhere.  It looks a little less elegant than

WITH cte_name [[NOT] MATERIALIZED] AS (query) main_query...

...but maybe elegance for extensibility is a good trade.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Early WIP/PoC for inlining CTEs

2019-01-11 Thread Andres Freund
Hi,

On 2019-01-12 07:58:39 +1300, Thomas Munro wrote:
> On Sat, Jan 12, 2019 at 7:19 AM Andres Freund  wrote:
> > On 2019-01-11 11:12:25 -0500, Robert Haas wrote:
> > > I actually think that we should go "all in" here and allow the user to
> > > specify either that they want materialization or that they don't want
> > > materialization.  If they specify neither, then we make some decision
> > > which we may change in a future release.  If they do specify
> > > something, we do that.
> >
> > +many
> 
> I think the syntax as proposed is almost OK if we only want to
> grandfather in our historically hintful CTEs but discourage the
> development of any future kinds of hints.  Even then I don't love the
> way it formalises a semi-procedural step at the same language level as
> a glorious declarative relational query.
> 
> Maybe we could consider a more extensible syntax that is attached to
> the contained SELECT rather than the containing WITH.  Then CTEs would
> be less special; there'd be a place to put hints controlling top-level
> queries, subselects, views etc too (perhaps eventually join hints,
> parallelism hints etc, but "materialize this" would be just another
> one of those things).  That'd be all-in.

I think you have some purity arguments here, but the likelihood of us
developing a full-blown solution is not that high, and the lack of
inlinable CTEs is *really* hurting us. As long as the design doesn't
block a full solution, if we go there, I think it's a very acceptable
blemish in comparison to the benefits we'd get.

Greetings,

Andres Freund



Re: Early WIP/PoC for inlining CTEs

2019-01-11 Thread Thomas Munro
On Sat, Jan 12, 2019 at 7:19 AM Andres Freund  wrote:
> On 2019-01-11 11:12:25 -0500, Robert Haas wrote:
> > I actually think that we should go "all in" here and allow the user to
> > specify either that they want materialization or that they don't want
> > materialization.  If they specify neither, then we make some decision
> > which we may change in a future release.  If they do specify
> > something, we do that.
>
> +many

I think the syntax as proposed is almost OK if we only want to
grandfather in our historically hintful CTEs but discourage the
development of any future kinds of hints.  Even then I don't love the
way it formalises a semi-procedural step at the same language level as
a glorious declarative relational query.

Maybe we could consider a more extensible syntax that is attached to
the contained SELECT rather than the containing WITH.  Then CTEs would
be less special; there'd be a place to put hints controlling top-level
queries, subselects, views etc too (perhaps eventually join hints,
parallelism hints etc, but "materialize this" would be just another
one of those things).  That'd be all-in.

-- 
Thomas Munro
http://www.enterprisedb.com



Re: Early WIP/PoC for inlining CTEs

2019-01-11 Thread Robert Haas
On Fri, Jan 11, 2019 at 1:49 PM David Fetter  wrote:
> I don't see those as the same thing even slightly. Functions are
> Turing complete, generally speaking, which means that unless we send
> along those descriptors, we're asking the planner to solve the Halting
> Problem.

So... your argument is that functions are Turing-complete, but the
queries which call those functions somehow aren't?  Actually, there's
probably a decent argument that WITH RECURSIVE is Turing-complete even
without any fancy functions.  See
https://wiki.postgresql.org/wiki/Turing_Machine_(with_recursive)

> > OK, I know that's a bit of a straw man -- you're talking about hints
> > within a query, not DDL.  Still, I think our theory about not having
> > hints is that we should have the optimizer try to figure it out
> > instead of making the user specify the behavior that they want -- and
> > I think sometimes that's setting the bar at an impossible level.
>
> There is a worked example that's open source.
> https://github.com/ossc-db/pg_hint_plan
>
> Have we looked over it seriously for inclusion in PostgreSQL?

That really has very little to do with what's under discussion here,
unless you're proposing that the right strategy for determining
whether to materialize the CTE or not is to execute the query both
ways and then use that to construct the plan we use to execute the
query.

> When they're specifying it, are they specifying it globally, or
> per WITH clause, or...?

Per WITH clause.  That's the proposal which is under discussion here,
not anything else.  Did you read the thread?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Early WIP/PoC for inlining CTEs

2019-01-11 Thread David Fetter
On Fri, Jan 11, 2019 at 11:12:25AM -0500, Robert Haas wrote:
> > I know this is a thorny topic, but I have to say that I am uneasy
> > about the MATERIALIZED syntax.  Here's how you write that in some
> > other RDBMS that loves hints:
> >
> > WITH foo AS (SELECT /*+ MATERIALIZE */ ...)
> >
> > I understood that it was a long standing project policy that we don't
> > want planner hints, but now we have a proposal to support one with a
> > top-level non-standard syntax.  If we take this syntax, should we not
> > also accept MATERIALIZED in front of subselects?
> >
> > -1
> 
> I think this is unduly negative.  Do you want to also remove the
> IMMUTABLE, STABLE, and VOLATILE keywords from functions because those
> are hints to the planner as to how those things should be treated?

I don't see those as the same thing even slightly. Functions are
Turing complete, generally speaking, which means that unless we send
along those descriptors, we're asking the planner to solve the Halting
Problem.

> Should we remove CREATE INDEX -- which is, after all, a non-standard
> extension to SQL syntax -- because it presumes that the user is in a
> better position than we are to know which columns ought to be
> indexed?

Not yet.  Maybe in a decade or two.

> OK, I know that's a bit of a straw man -- you're talking about hints
> within a query, not DDL.  Still, I think our theory about not having
> hints is that we should have the optimizer try to figure it out
> instead of making the user specify the behavior that they want -- and
> I think sometimes that's setting the bar at an impossible level.

There is a worked example that's open source.
https://github.com/ossc-db/pg_hint_plan

Have we looked over it seriously for inclusion in PostgreSQL?

> It's not like this is a thing where we can get this right 90% of the
> time and with some more planner work we can drive it up to near 100%.
> We're going to be wrong a lot, even if we do expensive things like try
> planning it both ways and see which one comes out cheaper on cost, and
> we don't like driving up planner CPU consumption, either.  So it seems
> to me that letting the user say what they want is a very pragmatic
> approach.  Sometimes, like with things that have side effects, it's
> the only way to know that we're even delivering the right answer; and
> even when it's just an optimization problem, it's nice to give users a
> way of fixing our planning failures that is better than asking them to
> wait until we invent a way of improving the optimization decision in
> some future release - which we may never even do.
> 
> I actually think that we should go "all in" here and allow the user to
> specify either that they want materialization or that they don't want
> materialization.  If they specify neither, then we make some decision
> which we may change in a future release.  If they do specify
> something, we do that.

When they're specifying it, are they specifying it globally, or
per WITH clause, or...?

Best,
David.
-- 
David Fetter  http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate



Re: Ryu floating point output patch

2019-01-11 Thread Andres Freund
Hi,

On 2019-01-11 13:33:54 -0500, Robert Haas wrote:
> On Fri, Jan 11, 2019 at 1:30 PM Andres Freund  wrote:
> > On 2019-01-11 12:55:54 -0500, Robert Haas wrote:
> > > About as much as I can say is that if you commit this patch, and as a
> > > result, a dump-and-restore from v11 to v12 causes a change in the bits
> > > on disk that would not have occurred without this patch, that's very
> > > bad.  What ramifications there will be if the output changes in ways
> > > that don't affect the bits that get written to the platter -- I don't
> > > know.
> >
> > Are you trying to highlight the potential dangers of this patch, or are
> > you seeing a concrete risk of that in the discussion? Both ryu's and the
> > current output (as used by pg_dump, with extra_float_digits = 3) ought
> > to be roundtrip safe.  There's possibly some chance for differences in
> > insignificant bits, but that ought to be pretty much it?
> >
> > The current discussion would probably change the data that's stored on disk
> > between ryu/non-ryu for queries like
> > CREATE TABLE AS SELECT floatcol::text::float FROM ...
> > or
> > CREATE TABLE AS SELECT floatcol::text FROM ...
> >
> > when not specifyiung extra_float_digits = 3, but the ryu case would be
> > more precise, so that ought to be good?
> 
> I'm not complaining about the patch.  I'm saying that the patch can't
> change people's existing data when they dump and restore.  If you're
> saying it doesn't, cool.  I don't mind if the answers to queries get
> more precise -- that's good, as you say -- just that we don't change
> their data.

It'd potentially change data between non-ryu->{ryu,non-ryu} and
ryu->ryu versions, for dumps that didn't specify extra_float_digits =
3. But that'd not be pg_dump, and it was already broken previously, so I
don't have particularly much sympathy?

And of course it'd change the dump's text contents between ryu and
non-ryu backends even with extra_float_digits = 3, but the resulting
floats ought to be the same. It's just that ryu is better at figuring
out what the minimal text representation is than the current code.

Greetings,

Andres Freund



Re: Checksum errors in pg_stat_database

2019-01-11 Thread Robert Haas
On Fri, Jan 11, 2019 at 5:21 AM Magnus Hagander  wrote:
> Would it make sense to add a column to pg_stat_database showing the total 
> number of checksum errors that have occurred in a database?
>
> It's really a ">1 means it's bad", but it's a lot easier to monitor that in 
> the statistics views, and given how much a lot of people set their systems 
> out to log, it's far too easy to miss individual checksum failures in the logs.
>
> If we track it at the database level, I don't think the overhead of adding 
> one more counter would be very high either.

It's probably not the ideal way to track it.  If you have a terabyte or
fifty of data, and you see that you have some checksum failures, good
luck finding the offending blocks.

But I'm tentatively in favor of your proposal anyway, because it's
pretty simple and cheap and might help people, and doing something
noticeably better is probably annoyingly complicated.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Ryu floating point output patch

2019-01-11 Thread Robert Haas
On Fri, Jan 11, 2019 at 1:30 PM Andres Freund  wrote:
> On 2019-01-11 12:55:54 -0500, Robert Haas wrote:
> > About as much as I can say is that if you commit this patch, and as a
> > result, a dump-and-restore from v11 to v12 causes a change in the bits
> > on disk that would not have occurred without this patch, that's very
> > bad.  What ramifications there will be if the output changes in ways
> > that don't affect the bits that get written to the platter -- I don't
> > know.
>
> Are you trying to highlight the potential dangers of this patch, or are
> you seeing a concrete risk of that in the discussion? Both ryu's and the
> current output (as used by pg_dump, with extra_float_digits = 3) ought
> to be roundtrip safe.  There's possibly some chance for differences in
> insignificant bits, but that ought to be pretty much it?
>
> The current discussion would probably change the data that's stored on disk
> between ryu/non-ryu for queries like
> CREATE TABLE AS SELECT floatcol::text::float FROM ...
> or
> CREATE TABLE AS SELECT floatcol::text FROM ...
>
> when not specifyiung extra_float_digits = 3, but the ryu case would be
> more precise, so that ought to be good?

I'm not complaining about the patch.  I'm saying that the patch can't
change people's existing data when they dump and restore.  If you're
saying it doesn't, cool.  I don't mind if the answers to queries get
more precise -- that's good, as you say -- just that we don't change
their data.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Ryu floating point output patch

2019-01-11 Thread Andres Freund
Hi,

On 2019-01-11 12:55:54 -0500, Robert Haas wrote:
> About as much as I can say is that if you commit this patch, and as a
> result, a dump-and-restore from v11 to v12 causes a change in the bits
> on disk that would not have occurred without this patch, that's very
> bad.  What ramifications there will be if the output changes in ways
> that don't affect the bits that get written to the platter -- I don't
> know.

Are you trying to highlight the potential dangers of this patch, or are
you seeing a concrete risk of that in the discussion? Both ryu's and the
current output (as used by pg_dump, with extra_float_digits = 3) ought
to be roundtrip safe.  There's possibly some chance for differences in
insignificant bits, but that ought to be pretty much it?

The current discussion would probably change the data that's stored on disk
between ryu/non-ryu for queries like
CREATE TABLE AS SELECT floatcol::text::float FROM ...
or
CREATE TABLE AS SELECT floatcol::text FROM ...

when not specifying extra_float_digits = 3, but the ryu case would be
more precise, so that ought to be good?
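To make the round-trip property concrete, here is a hedged illustration (the
exact digits printed are illustrative; what matters is whether text-then-parse
returns the identical bits):

    -- 0.1 + 0.2 is the classic value whose 15-digit form is lossy:
    SET extra_float_digits = 0;
    SELECT (0.1::float8 + 0.2::float8)::text;          -- '0.3' (lossy)
    SELECT (0.1::float8 + 0.2::float8)::text::float8
         = (0.1::float8 + 0.2::float8);                -- false: bits changed

    SET extra_float_digits = 3;
    SELECT (0.1::float8 + 0.2::float8)::text;          -- '0.30000000000000004'
    SELECT (0.1::float8 + 0.2::float8)::text::float8
         = (0.1::float8 + 0.2::float8);                -- true: round-trip safe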

Greetings,

Andres Freund



Re: New vacuum option to do only freezing

2019-01-11 Thread Robert Haas
On Fri, Jan 11, 2019 at 1:14 AM Masahiko Sawada  wrote:
> Attached the updated patch. Please review it.

I'm quite confused by this patch.  It seems to me that the easiest way
to implement this patch would be to (1) make lazy_space_alloc take the
maxtuples = MaxHeapTuplesPerPage branch when the new option is
specified, and then (2) forget about them after each page i.e.

    if (nindexes == 0 &&
        vacrelstats->num_dead_tuples > 0)
    {
        ...
    }
    else if (skipping index cleanup)
        vacrelstats->num_dead_tuples = 0;

I don't see why it should touch the logic inside lazy_vacuum_page() or
the decision about whether to truncate.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Early WIP/PoC for inlining CTEs

2019-01-11 Thread Andres Freund
On 2019-01-11 11:12:25 -0500, Robert Haas wrote:
> I actually think that we should go "all in" here and allow the user to
> specify either that they want materialization or that they don't want
> materialization.  If they specify neither, then we make some decision
> which we may change in a future release.  If they do specify
> something, we do that.

+many



Re: Acceptable/Best formatting of callbacks (for pluggable storage)

2019-01-11 Thread Andres Freund
Hi,

On 2019-01-11 09:42:19 -0500, Tom Lane wrote:
> Andres Freund  writes:
> > The pluggable storage patchset has a large struct full of callbacks, and
> > a bunch of wrapper functions for calling those callbacks. While
> > starting to polish the patchset, I tried to make the formatting nice.
> > ...
> > So, putting the parameter list, both in use and declaration, entirely
> > into a new line yields decent formatting with pgindent. But it's kinda
> > weird.  I can't really come up with a better alternative, and after a
> > few minutes it looks pretty reasonable.
> 
> > Comments? Better alternatives?
> 
> Use shorter method names?  This sounds like an ugly workaround for
> a carpal-tunnel-syndrome-inducing design.

I'm confused. What did I write about that has unreasonably long names?
And if you're referring to the wider design, that all seems fairly
fundamental to something needing callbacks - not exactly a first in
postgres.

Greetings,

Andres Freund



Re: Ryu floating point output patch

2019-01-11 Thread Robert Haas
On Fri, Jan 11, 2019 at 12:33 PM Andrew Gierth
 wrote:
> > "Robert" == Robert Haas  writes:
>  Robert> I don't have an opinion on the code that is part of this patch
>
> At this point, frankly, I'm less interested in opinions on the code
> itself (which can be addressed later if need be) than on answers to (or
> discussion of) the questions asked upthread.

Unfortunately, I can't help you very much there.  My knowledge is
insufficient to give intelligent answers to those questions, and I
don't think giving you unintelligent answers is a good idea.  About as
much as I can say is that if you commit this patch, and as a result, a
dump-and-restore from v11 to v12 causes a change in the bits on disk
that would not have occurred without this patch, that's very bad.
What ramifications there will be if the output changes in ways that
don't affect the bits that get written to the platter -- I don't know.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Acceptable/Best formatting of callbacks (for pluggable storage)

2019-01-11 Thread Andres Freund
Hi,

On 2019-01-11 10:32:03 -0500, Robert Haas wrote:
> On Thu, Jan 10, 2019 at 11:45 PM Andres Freund  wrote:
> > void        (*relation_set_new_filenode) (Relation relation,
> >                                           char persistence,
> >                                           TransactionId *freezeXid,
> >                                           MultiXactId *minmulti);
> 
> Honestly, I don't see the problem with that, really.

It's just hard to read if there are a lot of callbacks defined: the more
accurate the name, the more deeply indented the parameter list becomes.
Obviously that's always a concern and a thing to balance, but the added
indentation due to the whitespace, and the parens, * and whitespace
between ) ( make it worse.


> But you could
> also use the style that is used in fdwapi.h, where we have a typedef
> for each callback first, and then the actual structure just declares a
> function pointer of each type.  That saves a bit of horizontal space
> and might look a little nicer.

It's what the patch did at first.  It doesn't save much space, because
the indentation due to the typedef at the start of the line is about as
much as defining in the struct adds, and we often add a _function
suffix.  It additionally adds a fair bit of mental overhead: there's
another set of names that one needs to keep track of, and figuring out
what a callback means requires looking in an additional place.  I found that
removing that indirection made for a significantly more pleasant
experience working on the patchset.


Just as an example of why I think this isn't great:

typedef Size (*EstimateDSMForeignScan_function) (ForeignScanState *node,
 ParallelContext *pcxt);
typedef void (*InitializeDSMForeignScan_function) (ForeignScanState *node,
   ParallelContext *pcxt,
   void *coordinate);
typedef void (*ReInitializeDSMForeignScan_function) (ForeignScanState *node,
 ParallelContext *pcxt,
 void *coordinate);
typedef void (*InitializeWorkerForeignScan_function) (ForeignScanState *node,
  shm_toc *toc,
  void *coordinate);
typedef void (*ShutdownForeignScan_function) (ForeignScanState *node);
typedef bool (*IsForeignScanParallelSafe_function) (PlannerInfo *root,
RelOptInfo *rel,
RangeTblEntry *rte);
typedef List *(*ReparameterizeForeignPathByChild_function) (PlannerInfo *root,
                                                            List *fdw_private,
                                                            RelOptInfo *child_rel);

that's a lot of indentation variability in a small amount of space - I
find it noticeably slower to mentally parse due to that. Compare that
with:

typedef Size (*EstimateDSMForeignScan_function)
(ForeignScanState *node,
 ParallelContext *pcxt);

typedef void (*InitializeDSMForeignScan_function)
(ParallelContext *pcxt,
 void *coordinate);

typedef void (*ReInitializeDSMForeignScan_function)
(ForeignScanState *node,
 ParallelContext *pcxt,
 void *coordinate);

typedef void (*InitializeWorkerForeignScan_function)
(ForeignScanState *node,
 shm_toc *toc,
 void *coordinate);

typedef void (*ShutdownForeignScan_function)
(ForeignScanState *node);

typedef bool (*IsForeignScanParallelSafe_function)
(PlannerInfo *root,
 RelOptInfo *rel,
 RangeTblEntry *rte);

typedef List *(*ReparameterizeForeignPathByChild_function)
(PlannerInfo *root,
 List *fdw_private,
 RelOptInfo *child_rel);

I find the second formatting considerably easier to read, albeit not for
the first few seconds.


Greetings,

Andres Freund



Re: Making WAL receiver startup rely on GUC context for primary_conninfo and primary_slot_name

2019-01-11 Thread Robert Haas
On Thu, Jan 10, 2019 at 7:42 PM Andres Freund  wrote:
> I still think this whole direction of accessing the GUC in walreceiver
> is a bad idea and shouldn't be pursued further.  There's definite
> potential for startup process and WAL receiver having different states
> of GUCs, the code doesn't get meaningfully simpler, the GUC value checks
> in walreceiver make for horrible reporting up the chain.

I'm trying to assess the validity of this complaint.  It seems to me
that it might advance the discussion to be more clear about what the
problem is here.  When pg_ctl reload is executed (or the equivalent),
different backends reload the file at different times.  Currently,
because the values are explicitly passed down from the startup process
to the walreceiver, it's guaranteed that they will both use the same
value.  With this change, one might see an older value and the other
the current value.  So there's room to be nervous about code like
this:

    if (PrimaryConnInfo && strcmp(PrimaryConnInfo, "") != 0)
    {
        ... assorted other code ...
        RequestXLogStreaming(tli, ptr, PrimaryConnInfo,
                             PrimarySlotName);
        receivedUpto = 0;
    }

With the patch, the PrimaryConnInfo and PrimarySlotName arguments are
removed from RequestXLogStreaming.  That means that the new
walreceiver could come to a different conclusion about the values of
those arguments than the startup process.  In particular, it could end
up thinking that primary_conninfo is empty when, if the startup
process had thought that, the walreceiver would never have been
launched in the first place.  But it's not obvious that you've added
any logic in WALReceiverMain or elsewhere to compensate for this
possibility -- what would happen in that case?  Would we crash?
Connect to the wrong server?

I might be wrong here, but I'm inclined to think that this scenario
hasn't really been contemplated carefully by the patch authors.  There
are no added TAP tests for the scenario where the values differ
between the two processes, no code comments which explain why it's OK
if that happens, really no mention of it in the patch at all.  And on
that basis I'm inclined to think that Andres is really quite correct
to be worried about this.  The problem he's talking about here is very
low-probability because the race condition is narrow, but it's real,
and it surely needs to be handled somehow.
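To make that concrete, the kind of defensive check that seems to be missing
would look roughly like this (a hypothetical sketch, not code from the patch):

    /* Hypothetically, near the top of WALReceiverMain(): the walreceiver
     * may have reloaded the config after the startup process decided to
     * launch it, so its view of primary_conninfo can differ -- including
     * being empty. */
    if (PrimaryConnInfo == NULL || PrimaryConnInfo[0] == '\0')
        ereport(FATAL,
                (errmsg("WAL receiver exiting because primary_conninfo is empty")));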

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Ryu floating point output patch

2019-01-11 Thread Andrew Gierth
> "Robert" == Robert Haas  writes:

 Robert> I don't have an opinion on the code that is part of this patch

At this point, frankly, I'm less interested in opinions on the code
itself (which can be addressed later if need be) than on answers to (or
discussion of) the questions asked upthread.

-- 
Andrew (irc:RhodiumToad)



Re: Ryu floating point output patch

2019-01-11 Thread Robert Haas
On Thu, Jan 10, 2019 at 11:10 PM Tom Lane  wrote:
> Andres Freund  writes:
> > On 2019-01-10 23:02:01 -0500, Chapman Flack wrote:
> >> Does the project have an established view on non-ASCII names in
> >> sources or docs?
> >> AFAICS [1], the name of the algorithm may be Ryū.
>
> > I think it'd be a really bad idea to start having non-ascii
> > filenames, and I quite doubt that I'm alone in that.
>
> Non-ASCII filenames seem right out.  I thought the question was
> about whether we even want to have non-ASCII characters within
> source code (my view: avoid if possible) or in docs (we do,
> but it's better if you can make it into html entities).

+1 to all that.

I don't have an opinion on the code that is part of this patch, but
the benchmark results are impressive.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Remove all "INTERFACE ROUTINES" style comments

2019-01-11 Thread Robert Haas
On Thu, Jan 10, 2019 at 7:05 PM Thomas Munro
 wrote:
> +1

+1

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: What to name the current heap after pluggable storage / what to rename?

2019-01-11 Thread Robert Haas
On Thu, Jan 10, 2019 at 7:05 PM Andres Freund  wrote:
> ISTM that the first block would best belong into new files like
> access/rel[ation].h and access/common/rel[ation].h.

+1.

> I think the second
> set should be renamed to be table_open() (with backward compat
> redirects, there's way way too many references) and should go into
> access/table.h access/table/table.c alongside tableam.[ch],

Sounds reasonable.

> but I could
> also see just putting them into relation.[ch].

I would view that as a less-preferred alternative, but not crazy.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Feature: temporary materialized views

2019-01-11 Thread Andreas Karlsson
On 12/28/18 8:48 AM, Mitar wrote:
> One more version of the patch with more deterministic tests.


Here is a quick initial review. I will do more testing later.

It applies, builds, and passes the tests.

The feature seems useful and also improves consistency: if we have
temporary tables and temporary views, there should logically also be
temporary materialized views.


As for leaving out ON COMMIT, I feel that is OK, since of the
existing options only DROP really makes any sense (you cannot truncate
materialized views) and since temporary views do not have any ON COMMIT
support either.


= Comments on the code

The changes to the code are small and look mostly correct.

In create_ctas_internal() why do you copy the relation even when you do 
not modify it?


Is it really ok to just remove SECURITY_RESTRICTED_OPERATION from
ExecCreateTableAs()? I feel it is there for a good reason, and that we
would preferably reduce the duration of SECURITY_RESTRICTED_OPERATION
so that it covers only the actual execution of the query.


Andreas



Re: add_partial_path() may remove dominated path but still in use

2019-01-11 Thread Robert Haas
On Thu, Jan 10, 2019 at 9:10 PM Kohei KaiGai  wrote:
> On Fri, Jan 11, 2019 at 5:52, Robert Haas  wrote:
> > On Wed, Jan 9, 2019 at 12:44 AM Kohei KaiGai  wrote:
> > > So, is it sufficient if set_rel_pathlist_hook is just relocated in
> > > front of the generate_gather_paths?
> > > If we have no use case for the second hook, here is little necessity
> > > to have the post_rel_pathlist_hook() here.
> > > (At least, PG-Strom will use the first hook only.)
> >
> > +1.  That seems like the best way to be consistent with the principle
> > that we need to have all the partial paths before generating any
> > Gather paths.
> >
> Patch was updated, just for relocation of the set_rel_pathlist_hook.
> Please check it.

Seems reasonable to me.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Misleading panic message in backend/access/transam/xlog.c

2019-01-11 Thread Robert Haas
On Wed, Jan 9, 2019 at 8:38 PM Michael Paquier  wrote:
> On Wed, Jan 09, 2019 at 05:09:19PM -0800, Andres Freund wrote:
> > IIRC we have other such errors including offset and length (and if
> > not we'll grow some). It should be formatted as a generic write
> > error with the file name, no reference to log file, etc, even if
> > there's no precedent.
>
> Yeah, there are a couple of them:
> access/transam/xlog.c:
> errmsg("could not read from log segment %s, offset %u: %m",

In smgr.c, we have:

"could not read block %u in file \"%s\": %m"

That seems to be the closest thing we have to a generic message
template right now, but it's not entirely generic because it talks
about blocks.  Maybe we should go with something like:

"could not read %u bytes in file \"%s\" at offset %u: %m"

...and use that for both WAL and smgr.
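A call site using that template might then look like this (a sketch only; the
variable names are illustrative, not taken from the tree):

    /* unified short-read report, usable by both WAL and smgr callers */
    ereport(ERROR,
            (errcode_for_file_access(),
             errmsg("could not read %u bytes in file \"%s\" at offset %u: %m",
                    nbytes, filepath, offset)));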

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Early WIP/PoC for inlining CTEs

2019-01-11 Thread Robert Haas
> I know this is a thorny topic, but I have to say that I am uneasy
> about the MATERIALIZED syntax.  Here's how you write that in some
> other RDBMS that loves hints:
>
> WITH foo AS (SELECT /*+ MATERIALIZE */ ...)
>
> I understood that it was a long standing project policy that we don't
> want planner hints, but now we have a proposal to support one with a
> top-level non-standard syntax.  If we take this syntax, should we not
> also accept MATERIALIZED in front of subselects?
>
> -1

I think this is unduly negative.  Do you want to also remove the
IMMUTABLE, STABLE, and VOLATILE keywords from functions because those
are hints to the planner as to how those things should be treated?
Should we remove CREATE INDEX -- which is, after all, a non-standard
extension to SQL syntax -- because it presumes that the user is in a
better position than we are to know which columns ought to be indexed?

OK, I know that's a bit of a straw man -- you're talking about hints
within a query, not DDL.  Still, I think our theory about not having
hints is that we should have the optimizer try to figure it out
instead of making the user specify the behavior that they want -- and
I think sometimes that's setting the bar at an impossible level.  In
the case of things that have side effects, like FOR UPDATE/SHARE or
volatile functions, we really can't know the user's intention unless
the user tells us.  But even in cases where there are no side effects
and it's just a question of finding the most efficient plan, it
doesn't seem crazy to me to allow the user to give us a clue about
what they have in mind.

It's not like this is a thing where we can get this right 90% of the
time and with some more planner work we can drive it up to near 100%.
We're going to be wrong a lot, even if we do expensive things like try
planning it both ways and see which one comes out cheaper on cost, and
we don't like driving up planner CPU consumption, either.  So it seems
to me that letting the user say what they want is a very pragmatic
approach.  Sometimes, like with things that have side effects, it's
the only way to know that we're even delivering the right answer; and
even when it's just an optimization problem, it's nice to give users a
way of fixing our planning failures that is better than asking them to
wait until we invent a way of improving the optimization decision in
some future release - which we may never even do.

I actually think that we should go "all in" here and allow the user to
specify either that they want materialization or that they don't want
materialization.  If they specify neither, then we make some decision
which we may change in a future release.  If they do specify
something, we do that.
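Spelled out as queries, that amounts to something like this (a sketch of
syntax still under discussion, not a committed grammar):

    -- force materialization (an optimization fence, as today):
    WITH w AS MATERIALIZED (SELECT generate_series(1, 10) AS g)
    SELECT * FROM w WHERE g > 5;

    -- force inlining into the outer query:
    WITH w AS NOT MATERIALIZED (SELECT generate_series(1, 10) AS g)
    SELECT * FROM w WHERE g > 5;

    -- unspecified: the planner decides, and the decision may change
    -- in a future release:
    WITH w AS (SELECT generate_series(1, 10) AS g)
    SELECT * FROM w WHERE g > 5;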

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Acceptable/Best formatting of callbacks (for pluggable storage)

2019-01-11 Thread Robert Haas
On Fri, Jan 11, 2019 at 9:42 AM Tom Lane  wrote:
> Use shorter method names?  This sounds like an ugly workaround for
> a carpal-tunnel-syndrome-inducing design.

What would you suggest instead of something like
"relation_set_new_filenode"?  I agree that shorter names have some
merit, but it's not always easy to figure out how to shorten them
without making the result unclear.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Acceptable/Best formatting of callbacks (for pluggable storage)

2019-01-11 Thread Robert Haas
On Thu, Jan 10, 2019 at 11:45 PM Andres Freund  wrote:
> void        (*relation_set_new_filenode) (Relation relation,
>                                           char persistence,
>                                           TransactionId *freezeXid,
>                                           MultiXactId *minmulti);

Honestly, I don't see the problem with that, really.  But you could
also use the style that is used in fdwapi.h, where we have a typedef
for each callback first, and then the actual structure just declares a
function pointer of each type.  That saves a bit of horizontal space
and might look a little nicer.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] generated columns

2019-01-11 Thread Pavel Stehule
Hi

On Fri, Jan 11, 2019 at 9:31 AM Peter Eisentraut <
peter.eisentr...@2ndquadrant.com> wrote:

> On 06/11/2018 13:27, Peter Eisentraut wrote:
> > This is a small bug that I will fix in my next update.
>
> Time for another update.  Lot's of rebasing, many things fixed,
> including the ADD COLUMN bug you found, replication, foreign tables,
> better caching, some corner cases in trigger behavior, more
> documentation.  This addresses everything I've had in my notes, so it's
> functionally and logically complete from my perspective.
>

I am looking on this patch - it is great feature.

The documentation contains paragraph

+  The generation expression can only use immutable functions and cannot
+  use subqueries or reference anything other than the current row in any
+  way.

Is it necessary for stored columns?

I tested it with pseudo-constants - current_timestamp, session_user. But
current_database() is disallowed.

On the other hand, this is strange:

postgres=# create table foo3 (inserted text generated always as
(current_timestamp) virtual);
CREATE TABLE

Regards

Pavel

>
> --
> Peter Eisentraut  http://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>


Re: port of INSTALL file generation to XSLT

2019-01-11 Thread Tom Lane
Peter Eisentraut  writes:
> On 09/01/2019 10:05, Mi Tar wrote:
>> I tested this. Patch applied cleanly and INSTALL file was produced. 
>> Formatting looks differently from before, but I think that it looks better. 
>> We lost central alignment of some headings, but many code/command snippets 
>> are now better/correctly indented. So I think this is overall better.

> Committed with a fix for that, thanks.

So this has got a couple of problems:

1. No pandoc on borka, where we build tarballs:

pgsql@borka:~$ which pandoc
pgsql@borka:~$ 

Probably we can get the sysadmin team to fix that, but ...

2. If there's no pandoc, this coding silently produces a zero-size
INSTALL file.  I do not find that acceptable.

pgsql@borka:~$ mk-release 96b8b8b6f9d8de4af01a77797273ad88c7a8e32e
...
Commit 96b8b8b6f9d8de4af01a77797273ad88c7a8e32e packaged as  
postgresql-12devel.tar.bz2 postgresql-12devel.tar.bz2.md5 
postgresql-12devel.tar.bz2.sha256 postgresql-12devel.tar.gz 
postgresql-12devel.tar.gz.md5 postgresql-12devel.tar.gz.sha256
pgsql@borka:~$ tar tvfj output/postgresql-12devel.tar.bz2 | grep INSTALL
-rw-r--r-- pgsql/pgsql   0 2019-01-11 15:09 postgresql-12devel/INSTALL
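
One way to harden this would be a guard of roughly this shape in the tarball
tooling (a sketch under the assumption that the INSTALL step is driven from a
POSIX shell; mk-release itself may differ):

    # refuse to build a tarball without pandoc rather than silently
    # producing a zero-size INSTALL
    command -v pandoc >/dev/null 2>&1 || {
        echo "error: pandoc is required to generate INSTALL" >&2
        exit 1
    }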


regards, tom lane



Re: Three animals fail test-decoding-check on REL_10_STABLE

2019-01-11 Thread Tom Lane
Thomas Munro  writes:
> Only gaur shows useful logs:

>   SELECT 'init' FROM
> pg_create_logical_replication_slot('regression_slot',
> 'test_decoding');
> ! ERROR:  could not access file "test_decoding": No such file or directory

> Does this mean it didn't build the test_decoding module?

I'm wondering if it built it but didn't install it, as a result of
some problem with

> bedda9fbb7 Mon Dec 31 21:57:57 2018 UTC  Process EXTRA_INSTALL
> serially, during the first temp-install.

Will take a look later, but since gaur is so slow, it may be awhile
before I have any answers.

regards, tom lane



Re: Acceptable/Best formatting of callbacks (for pluggable storage)

2019-01-11 Thread Tom Lane
Andres Freund  writes:
> The pluggable storage patchset has a large struct full of callbacks, and
> a bunch of wrapper functions for calling those callbacks. While
> starting to polish the patchset, I tried to make the formatting nice.
> ...
> So, putting the parameter list, both in use and declaration, entirely
> into a new line yields decent formatting with pgindent. But it's kinda
> weird.  I can't really come up with a better alternative, and after a
> few minutes it looks pretty reasonable.

> Comments? Better alternatives?

Use shorter method names?  This sounds like an ugly workaround for
a carpal-tunnel-syndrome-inducing design.

regards, tom lane



Re: unnecessary creation of FSM files during bootstrap mode

2019-01-11 Thread Tom Lane
Amit Kapila  writes:
> On Fri, Jan 11, 2019 at 5:00 AM Tom Lane  wrote:
>> It's also possible that you just aren't exercising the cases where
>> trouble occurs.  In particular, noting this bit in InsertOneValue():
>> OidOutputFunctionCall(typoutput, values[i])));

> I have tried initdb with --debug option (If I am not wrong, it runs
> initdb under DEBUG5 mode) and didn't hit any problem after applying
> the patch.  Are you expecting that we might try to open pg_proc at
> that place which can lead to the problem?

Yes, I was concerned about regprocout in particular.  It might be
okay as long as we don't try to add any regproc columns to the
BKI_BOOTSTRAP catalogs.

regards, tom lane



Re: port of INSTALL file generation to XSLT

2019-01-11 Thread Peter Eisentraut
On 09/01/2019 10:05, Mi Tar wrote:
> I tested this. Patch applied cleanly and INSTALL file was produced. 
> Formatting looks differently from before, but I think that it looks better. 
> We lost central alignment of some headings, but many code/command snippets 
> are now better/correctly indented. So I think this is overall better.
> 
> Documentation about documentation generation should be updated though. Please 
> add pandoc to requirements here [1].
> 
> [1] https://www.postgresql.org/docs/devel/docguide-toolsets.html

Committed with a fix for that, thanks.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Query with high planning time at version 11.1 compared versions 10.5 and 11.0

2019-01-11 Thread Etsuro Fujita

(2019/01/11 13:46), Amit Langote wrote:

On 2019/01/10 15:07, Etsuro Fujita wrote:

(2019/01/10 10:41), Amit Langote wrote:

That's a loaded meaning and abusing it to mean something else can be
challenged, but we can live with that if properly documented.  Speaking of
which:

    /* used by partitionwise joins: */
    bool        consider_partitionwise_join;    /* consider partitionwise
                                                 * join paths? (if
                                                 * partitioned rel) */

Maybe, mention here how it will be abused in back-branches for
non-partitioned relations?


Will do.


Thank you.


I know we haven't yet reached a consensus on the details of how to
address this issue, but for the above, how about adding comments like
this to set_append_rel_size(), instead of the header file:


        /*
         * If we consider partitionwise joins with the parent rel, do the same
         * for partitioned child rels.
         *
         * Note: here we abuse the consider_partitionwise_join flag for child
         * rels that are not partitioned, to tell try_partitionwise_join()
         * that their targetlists and EC entries have been generated.
         */
        if (rel->consider_partitionwise_join)
            childrel->consider_partitionwise_join = true;

ISTM that that would be clearer than putting it in the header file.

Updated patch attached, which also updates other comments a little bit.

Best regards,
Etsuro Fujita
*** a/src/backend/optimizer/path/allpaths.c
--- b/src/backend/optimizer/path/allpaths.c
***
*** 1018,1059  set_append_rel_size(PlannerInfo *root, RelOptInfo *rel,
  		Assert(childrel->reloptkind == RELOPT_OTHER_MEMBER_REL);
  
  		/*
! 		 * Copy/Modify targetlist. Even if this child is deemed empty, we need
! 		 * its targetlist in case it falls on nullable side in a child-join
! 		 * because of partitionwise join.
! 		 *
! 		 * NB: the resulting childrel->reltarget->exprs may contain arbitrary
! 		 * expressions, which otherwise would not occur in a rel's targetlist.
! 		 * Code that might be looking at an appendrel child must cope with
! 		 * such.  (Normally, a rel's targetlist would only include Vars and
! 		 * PlaceHolderVars.)  XXX we do not bother to update the cost or width
! 		 * fields of childrel->reltarget; not clear if that would be useful.
! 		 */
! 		childrel->reltarget->exprs = (List *)
! 			adjust_appendrel_attrs(root,
!                                    (Node *) rel->reltarget->exprs,
!                                    1, &appinfo);
! 
! 		/*
! 		 * We have to make child entries in the EquivalenceClass data
! 		 * structures as well.  This is needed either if the parent
! 		 * participates in some eclass joins (because we will want to consider
! 		 * inner-indexscan joins on the individual children) or if the parent
! 		 * has useful pathkeys (because we should try to build MergeAppend
! 		 * paths that produce those sort orderings). Even if this child is
! 		 * deemed dummy, it may fall on nullable side in a child-join, which
! 		 * in turn may participate in a MergeAppend, where we will need the
! 		 * EquivalenceClass data structures.
! 		 */
! 		if (rel->has_eclass_joins || has_useful_pathkeys(root, rel))
! 			add_child_rel_equivalences(root, appinfo, rel, childrel);
! 		childrel->has_eclass_joins = rel->has_eclass_joins;
! 
! 		/*
! 		 * We have to copy the parent's quals to the child, with appropriate
! 		 * substitution of variables.  However, only the baserestrictinfo
! 		 * quals are needed before we can check for constraint exclusion; so
! 		 * do that first and then check to see if we can disregard this child.
  		 *
  		 * The child rel's targetlist might contain non-Var expressions, which
  		 * means that substitution into the quals could produce opportunities
--- 1018,1028 
  		Assert(childrel->reloptkind == RELOPT_OTHER_MEMBER_REL);
  
  		/*
! 		 * We have to copy the parent's targetlist and quals to the child,
! 		 * with appropriate substitution of variables.  However, only the
! 		 * baserestrictinfo quals are needed before we can check for
! 		 * constraint exclusion; so do that first and then check to see if we
! 		 * can disregard this child.
  		 *
  		 * The child rel's targetlist might contain non-Var expressions, which
  		 * means that substitution into the quals could produce opportunities
***
*** 1187,1197  set_append_rel_size(PlannerInfo *root, RelOptInfo *rel,
  			continue;
  		}
  
! 		/* CE failed, so finish copying/modifying join quals. */
  		childrel->joininfo = (List *)
  			adjust_appendrel_attrs(root,
                                   (Node *) rel->joininfo,
                                   1, &appinfo);
  
  		/*
  		 * Note: we could compute appropriate attr_needed data for the child's
--- 1156,1191 
  			continue;
  		}
  
! 		/*
! 		 * CE failed, so finish copying/modifying targetlist and join quals.
! 		 *
! 		 * NB: the resulting childrel->reltarget->exprs may contain arbitrary
! 		 * expressions, which otherwise would not occur in a rel's targetlist.
! 		 * Code that might be looking 

Re: unnecessary creation of FSM files during bootstrap mode

2019-01-11 Thread Amit Kapila
On Fri, Jan 11, 2019 at 5:00 AM Tom Lane  wrote:
> It's also possible that you just aren't exercising the cases where
> trouble occurs.  In particular, noting this bit in InsertOneValue():
>
> /*
>  * We use ereport not elog here so that parameters aren't evaluated unless
>  * the message is going to be printed, which generally it isn't
>  */
> ereport(DEBUG4,
> (errmsg_internal("inserted -> %s",
>  OidOutputFunctionCall(typoutput, values[i])));
>
> I'd counsel running initdb under DEBUG4 or higher before deciding
> you're out of the woods.
>

I have tried initdb with --debug option (If I am not wrong, it runs
initdb under DEBUG5 mode) and didn't hit any problem after applying
the patch.  Are you expecting that we might try to open pg_proc at
that place which can lead to the problem?
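
Concretely, the check amounted to something like this (a sketch; the data
directory path is illustrative):

    # run bootstrap with maximum debug output and look for the
    # InsertOneValue() messages, which exercise the datatype output
    # functions during bootstrap
    initdb --debug -D /tmp/pgdata-debug > initdb-debug.log 2>&1
    grep 'inserted ->' initdb-debug.log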

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



Re: Displaying and dumping of table access methods

2019-01-11 Thread Amit Khandekar
On Fri, 11 Jan 2019 at 14:47, Dmitry Dolgov <9erthali...@gmail.com> wrote:
>
> > On Fri, Jan 11, 2019 at 6:02 AM Amit Khandekar  
> > wrote:
> >
> > > Yes, sounds like a reasonable approach, I can proceed with it.
> >
> > Dmitry, I believe you have taken the pg_dump part only. If that's
> > right, I can proceed with the psql part. Does that sound right ?
>
> Well, actually I meant that I'm going to proceed with both, since I've
> posted both psql and pg_dump patches. But of course you're welcome to submit
> any new version or improvements you want.

Ok, I will review the patches that you send, and we can work on
improvements if needed. Thanks.

-- 
Thanks,
-Amit Khandekar
EnterpriseDB Corporation
The Postgres Database Company



Checksum errors in pg_stat_database

2019-01-11 Thread Magnus Hagander
Would it make sense to add a column to pg_stat_database showing the total
number of checksum errors that have occurred in a database?

It's really a ">0 means it's bad" counter, but it's a lot easier to monitor that in
the statistics views, and given how much a lot of people set their systems
up to log, it's far too easy to miss individual checksum failures in the
logs.

If we track it at the database level, I don't think the overhead of adding
one more counter would be very high either.
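
If the counter existed, monitoring it would be a one-line query (a sketch; it
assumes the new column would be called checksum_failures, which is not decided
here):

    -- hypothetical column name; alert whenever any database has seen
    -- a checksum failure since the stats were last reset
    SELECT datname, checksum_failures
      FROM pg_stat_database
     WHERE checksum_failures > 0;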

-- 
 Magnus Hagander
 Me: https://www.hagander.net/ 
 Work: https://www.redpill-linpro.com/ 


Re: could recovery_target_timeline=latest be the default in standby mode?

2019-01-11 Thread Peter Eisentraut
On 08/01/2019 04:37, Michael Paquier wrote:
> On Mon, Dec 31, 2018 at 01:21:00PM +0200, David Steele wrote:
>> This patch looks good to me.
> 
> 0001 looks fine to me.

committed that one

> -to the latest timeline found in the archive, which is useful in
> -a standby server. Other than that you only need to set this parameter
> +to the latest timeline found in the archive.  That is the default.
> +   
> Isn't it useful to still mention that the default is useful especially
> for standby servers?
> 
> -the WAL archive. If you plan to have multiple standby servers for high
> -availability purposes, set recovery_target_timeline to
> -latest, to make the standby server follow the 
> timeline change
> -that occurs at failover to another standby.
> +the WAL archive.
> I think that we should still keep this recommendation as well, as well
> as the one below.

Attached revised 0002 with those changes.

>> There don't seem to be any tests for recovery_target_timeline=current. This
>> is an preexisting condition but it probably wouldn't hurt to add one.
> 
> Yes, I got to wonder while looking at this patch why we don't have one
> yet in 003_recovery_targets.pl.  That's easy enough to do thanks to
> the extra rows inserted after doing the stuff for the LSN-based
> restart point, and attached is a patch to add the test.  Peter, could
> you merge it with 0001?  I am fine to take care of that myself if
> necessary.

In that test, if I change the 'current' to 'latest', the test doesn't
fail, so it's probably not a good test.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
From b8b4e27b322765b9592698a15946a626498365b5 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut 
Date: Fri, 11 Jan 2019 10:36:10 +0100
Subject: [PATCH v3 2/2] Change default of recovery_target_timeline to 'latest'

This is what one usually wants for recovery and almost always wants
for a standby.

Discussion: 
https://www.postgresql.org/message-id/flat/6dd2c23a-4162-8469-410f-bfe146e28...@2ndquadrant.com/
Reviewed-by: David Steele 
Reviewed-by: Michael Paquier 
---
 doc/src/sgml/config.sgml  |  8 ++--
 doc/src/sgml/high-availability.sgml   |  6 +++---
 src/backend/access/transam/xlog.c |  2 +-
 src/backend/utils/misc/guc.c  |  7 ---
 src/backend/utils/misc/postgresql.conf.sample |  2 +-
 src/bin/pg_rewind/RewindTest.pm   |  2 --
 src/test/recovery/t/004_timeline_switch.pl|  1 -
 src/test/recovery/t/009_twophase.pl   | 12 
 src/test/recovery/t/012_subtransactions.pl| 12 
 9 files changed, 15 insertions(+), 37 deletions(-)

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index f64402adbe..b6f5822b84 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -3353,10 +3353,14 @@ Recovery Target
 Specifies recovering into a particular timeline.  The value can be a
 numeric timeline ID or a special value.  The value
 current recovers along the same timeline that was
-current when the base backup was taken.  That is the default.  The
+current when the base backup was taken.  The
 value latest recovers
 to the latest timeline found in the archive, which is useful in
-a standby server. Other than that you only need to set this parameter
+a standby server.  latest is the default.
+   
+
+   
+You usually only need to set this parameter
 in complex re-recovery situations, where you need to return to
 a state that itself was reached after a point-in-time recovery.
 See  for discussion.
diff --git a/doc/src/sgml/high-availability.sgml 
b/doc/src/sgml/high-availability.sgml
index d8fd195da0..4882b20828 100644
--- a/doc/src/sgml/high-availability.sgml
+++ b/doc/src/sgml/high-availability.sgml
@@ -690,8 +690,8 @@ Setting Up a Standby Server
 standby.signal in the standby's cluster data
 directory. Set  to a simple command 
to copy files from
 the WAL archive. If you plan to have multiple standby servers for high
-availability purposes, set recovery_target_timeline to
-latest, to make the standby server follow the timeline 
change
+availability purposes, make sure that 
recovery_target_timeline is set to
+latest (the default), to make the standby server follow 
the timeline change
 that occurs at failover to another standby.

 
@@ -1024,7 +1024,7 @@ Cascading Replication

 If an upstream standby server is promoted to become new master, downstream
 servers will continue to stream from the new master if
-recovery_target_timeline is set to 
'latest'.
+recovery_target_timeline is set to 
'latest' (the default).

 

diff --git a/src/backend/access/transam/xlog.c 
b/src/backend/access/transam/xlog.c
index 9823b75767..2ab7d804f0 

Re: Statement-level Triggers For Uniqueness Checks

2019-01-11 Thread Peter Eisentraut
On 08/01/2019 23:26, Corey Huinker wrote:
> On Fri, Jan 4, 2019 at 7:49 AM Peter Eisentraut
>  wrote:
>>
>> On 25/12/2018 00:56, Corey Huinker wrote:
>>> The regression diff (attached) seems to imply that the triggers simply
>>> are not firing, though.
>>
>> The reason for this was explained by Dean.  If you take out the check
>> that he mentioned, then your trigger fires but crashes.  In your changed
>> unique_key_recheck(), "slot" is not initialized before use (or ever).
> 
> Thanks. I'll be revisiting this shortly. Dean's information made me
> think the potential for a gain is smaller than initially imagined.

I think those are independent considerations.  The "recheckIndexes"
logic just tracks what indexes have potential conflicts to check later.
Whether that later checking happens in a row or statement trigger should
not matter.  What you need to do is adapt the "recheckIndexes" logic
from row triggers to statement triggers, e.g., expand
ExecASInsertTriggers() to look more like ExecARInsertTriggers().

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Displaying and dumping of table access methods

2019-01-11 Thread Dmitry Dolgov
> On Fri, Jan 11, 2019 at 6:02 AM Amit Khandekar  wrote:
>
> > Yes, sounds like a reasonable approach, I can proceed with it.
>
> Dmitry, I believe you have taken the pg_dump part only. If that's
> right, I can proceed with the psql part. Does that sound right ?

Well, actually I meant that I'm going to proceed with both, since I've
posted both psql and pg_dump patches. But of course you're welcome to submit
any new version or improvements you want.



Re: Displaying and dumping of table access methods

2019-01-11 Thread Peter Eisentraut
On 10/01/2019 04:58, Amit Kapila wrote:
>> One flag that covers all things that make psql output less useful for
>> regression test output, or one flag that just controls the table access
>> method display.
>>
> +1 for the latter (one flag that just controls the table access method
> display) as that looks clean.

Yeah, I'd prefer a specific flag.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [Proposal] Add accumulated statistics

2019-01-11 Thread Pavel Stehule
On Fri, Jan 11, 2019 at 2:10 AM Tsunakawa, Takayuki <
tsunakawa.ta...@jp.fujitsu.com> wrote:

> From: Robert Haas [mailto:robertmh...@gmail.com]
> > My theory is that the number of wait events is NOT useful information,
> > or at least not nearly as useful the results of a sampling approach.
> > The data that LWLOCK_STATS produce are downright misleading -- they
> > lead you to think that the bottlenecks are in different places than
> > they really are, because the locks that produce the most waiting can
> > be 5th or 10th in terms of the number of wait events.
>
> I understood you're saying that the number of waits alone does not
> necessarily indicate the bottleneck, because a wait with fewer counts but
> longer time can take a large portion of the entire SQL execution time.  So,
> wait time is also useful.  I think that's why Oracle describes and MySQL
> provides precise count and time without sampling.
>

The cumulative lock statistics may not help much with debugging, but they are
a very good indicator of database health in production usage.

Regards

Pavel


>
>