Re: Cleanup - Removal of apply_typmod function in #if 0

2020-03-01 Thread Peter Eisentraut

On 2019-10-26 10:36, vignesh C wrote:

One of the functions, apply_typmod, in the numeric.c file is present within #if 0.
It has been like this for many years.
I felt this can be removed.
The attached patch contains the changes to handle removal of apply_typmod
present in #if 0.


committed

--
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] WAL logging problem in 9.4.3?

2020-03-01 Thread Kyotaro Horiguchi
At Sun, 1 Mar 2020 11:56:32 -0800, Noah Misch  wrote in 
> On Thu, Feb 27, 2020 at 04:00:24PM +0900, Kyotaro Horiguchi wrote:
> > It sounds somewhat obscure.
> 
> I see.  I won't use that.

Thanks.

> > Couldn't we enumerate examples? And if we
> > could use pg_relation_filenode, I think we can use just
> > "filenode". (Thuogh the word is used in the documentation, it is not
> > defined anywhere..)
> 
> func.sgml does define the term.  Nonetheless, I'm not using it.

Ah, "The filenode is the base component oif the file name(s) used for
the relation".. So it's very similar to "on-disk file" in a sense.

> > 
> > In minimal level, no information is logged for
> > permanent relations for the remainder of a transaction that creates
> > them or changes their filenode. For example, CREATE
> > TABLE, CLUSTER or REFRESH MATERIALIZED VIEW are commands of that
> > category.
> > 
> > 
> > # sorry for bothering you..
> 
> Including examples is fine.  Attached v36nm has just comment and doc changes.
> Would you translate this into back-patch versions for v9.5 through v12?

The explicit list of commands that initiate the WAL-skipping mode
works for me. I'm going to work on the translation right now.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: [BUG?] postgres_fdw incorrectly updates remote table if it has inherited children.

2020-03-01 Thread Etsuro Fujita
Hi KaiGai-san,

On Sun, Mar 1, 2020 at 1:47 PM Kohei KaiGai  wrote:
> It looks to me like the latest patch was submitted by Fujita-san in Oct-2018.
> Then, Tom pointed out that this simple approach produces an inefficient remote
> query plan because it has no intelligence about the structure of the remote tables mapped
> by postgres_fdw. After that, the patch has been left for a year.

Unfortunately, I didn't have time to work on that (and won't in the
development cycle for PG13).

> Indeed, it is not an ideal query plan to execute for each updated row...
>
> postgres=# explain select * from rtable_parent where tableoid = 126397
> and ctid = '(0,11)'::tid;
>QUERY PLAN
> -
>  Append  (cost=0.00..5.18 rows=2 width=50)
>->  Seq Scan on rtable_parent  (cost=0.00..1.15 rows=1 width=31)
>  Filter: ((tableoid = '126397'::oid) AND (ctid = '(0,11)'::tid))
>->  Tid Scan on rtable_child  (cost=0.00..4.02 rows=1 width=68)
>  TID Cond: (ctid = '(0,11)'::tid)
>  Filter: (tableoid = '126397'::oid)
> (6 rows)

IIRC, I think one of Tom's concerns about the solution I proposed was
that it added the tableoid restriction clause to the remote
UPDATE/DELETE query even if the remote table is not an inheritance
set.  To add the clause only if the remote table is an inheritance
set, what I have in mind is to 1) introduce a new postgres_fdw table
option to indicate whether the remote table is an inheritance set or
not, and 2) determine whether to add the clause or not, using the
option.

Best regards,
Etsuro Fujita




Re: more ALTER .. DEPENDS ON EXTENSION fixes

2020-03-01 Thread Ahsan Hadi
On Sat, Feb 29, 2020 at 2:38 AM Alvaro Herrera 
wrote:

> On 2020-Feb-28, ahsan hadi wrote:
>
>
> > Tested the pg_dump patch for dumping "ALTER .. DEPENDS ON EXTENSION" in
> the case of indexes, functions, triggers etc. The "ALTER .. DEPENDS ON
> EXTENSION" is included in the dump. However, in some cases I am not sure why
> "ALTER INDEX .. DEPENDS ON EXTENSION" is repeated several times in the
> dump?
>
> Hi, thanks for testing.
>
> Are the repeated commands for the same index, same extension?


Yes same index and same extension...


>   Did you
> apply the same command multiple times before running pg_dump?
>

Yes, but in some cases I applied the command once and it appeared multiple
times in the dump.


>
> There was an off-list complaint that if you repeat the ALTER .. DEPENDS
> for the same object on the same extension, then the same dependency is
> registered multiple times.  (You can search pg_depend for "deptype = 'x'"
> to see that).  I suppose that would lead to the line being output
> multiple times by pg_dump, also.  Is that what you did?
>

I checked pg_depend for "deptype = 'x'"; the same dependency is registered
multiple times...

>
> If so: Patch 0002 is supposed to fix that problem, by raising an error
> if the dependency is already registered ... though it occurs to me now
> that it would be more in line with custom to make the command a silent
> no-op.  In fact, doing that would cause old dumps (generated with
> databases containing duplicated entries) to correctly restore a single
> entry, without error.  Therefore my inclination now is to change 0002
> that way and push and backpatch it ahead of 0001.
>

Makes sense, will also try out Patch 0002.

>
> I realize just now that I have failed to verify what happens with
> partitioned indexes.
>

Yes I also missed this one..


>
> --
> Álvaro Herrera    https://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>


-- 
Highgo Software (Canada/China/Pakistan)
URL : http://www.highgo.ca
ADDR: 10318 WHALLEY BLVD, Surrey, BC
EMAIL: mailto: ahsan.h...@highgo.ca


Re: BUG #16108: Colorization to the output of command-line has unproperly behaviors at Windows platform

2020-03-01 Thread Michael Paquier
On Mon, Feb 24, 2020 at 06:56:05PM +0100, Juan José Santamaría Flecha wrote:
> The issue was originally reported for Windows, but looking into Peter's
> patch, I think it affects other systems unless we use stricter
> logic to detect a colorable terminal when using the "auto" option.
> Probably, the way to go is leaving this patch as WIN32-only and thinking
> about a future patch.

It is better not to mix issues.  You can actually run into similar
coloring issues depending on your configuration, with OSX or even
Linux.

> The logic I have seen on new terminals is that VT100 is supported but
> disabled. Would you find this clearer? "Attempt to enable VT100 sequence
> processing. If it is not possible, consider it as unsupported."
> 
> Please find attached a patch addressing these comments.

I was reading the thread for the first time, and was at first surprised
by the argument about "always", which makes it possible to print
incorrect characters even if the environment does not allow coloring.
However, after looking at logging.c, it is pretty clear what
"always" is about, as it enforces colorization, so this patch looks
correct to me.
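
For readers following along, the distinction above maps to roughly the
following logic (a simplified sketch, not a verbatim copy of the code in
common/logging.c):

    #include <stdbool.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <unistd.h>

    /* Simplified sketch of the PG_COLOR handling */
    static bool
    want_color(void)
    {
        const char *pg_color_env = getenv("PG_COLOR");

        if (pg_color_env == NULL)
            return false;

        /* "always" enforces colorization; "auto" requires stderr to be a TTY */
        return strcmp(pg_color_env, "always") == 0 ||
            (strcmp(pg_color_env, "auto") == 0 && isatty(fileno(stderr)));
    }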

On top of that, and that's a separate issue, I have noticed that we
have exactly zero documentation about PG_COLORS (the plural flavor,
not the singular), but we have code for it in common/logging.c..

Anyway, committed down to 12, after tweaking a few things.
--
Michael


signature.asc
Description: PGP signature


Re: WIP: WAL prefetch (another approach)

2020-03-01 Thread Thomas Munro
On Wed, Feb 12, 2020 at 7:52 PM Thomas Munro  wrote:
> 1.  It now uses effective_io_concurrency to control how many
> concurrent prefetches to allow.  It's possible that we should have a
> different GUC to control "maintenance" users of concurrency I/O as
> discussed elsewhere[1], but I'm staying out of that for now; if we
> agree to do that for VACUUM etc, we can change it easily here.  Note
> that the value is percolated through the ComputeIoConcurrency()
> function which I think we should discuss, but again that's off topic,
> I just want to use the standard infrastructure here.

I started a separate thread[1] to discuss that GUC, because it's
basically an independent question.  Meanwhile, here's a new version of
the WAL prefetch patch, with the following changes:

1.  A monitoring view:

  postgres=# select * from pg_stat_wal_prefetcher ;
   prefetch | skip_hit | skip_new | skip_fpw | skip_seq | distance | queue_depth
  ----------+----------+----------+----------+----------+----------+------------
      95854 |   291458 |      435 |        0 |    26245 |   261800 |          10
  (1 row)

That shows a bunch of counters for blocks prefetched and skipped for
various reasons.  It also shows the current read-ahead distance (in
bytes of WAL) and queue depth (an approximation of how many I/Os might
be in flight, used for rate limiting; I'm struggling to come up with a
better short name for this).  This can be used to see the effects of
experiments with different settings, eg:

  alter system set effective_io_concurrency = 20;
  alter system set wal_prefetch_distance = '256kB';
  select pg_reload_conf();

2.  A log message when WAL prefetching begins and ends, so you can see
what it did during crash recovery:

  LOG:  WAL prefetch finished at 0/C5E98758; prefetch = 1112628,
        skip_hit = 3607540, skip_new = 45592, skip_fpw = 0, skip_seq = 177049,
        avg_distance = 247907.942532, avg_queue_depth = 22.261352

3.  A bit of general user documentation.

[1] 
https://www.postgresql.org/message-id/flat/CA%2BhUKGJUw08dPs_3EUcdO6M90GnjofPYrWp4YSLaBkgYwS-AqA%40mail.gmail.com
From a61b4e00c42ace5db1608e02165f89094bf86391 Mon Sep 17 00:00:00 2001
From: Thomas Munro 
Date: Tue, 3 Dec 2019 17:13:40 +1300
Subject: [PATCH 1/5] Allow PrefetchBuffer() to be called with a SMgrRelation.

Previously a Relation was required, but it's annoying to have
to create a "fake" one in recovery.
---
 src/backend/storage/buffer/bufmgr.c | 77 -
 src/include/storage/bufmgr.h|  3 ++
 2 files changed, 47 insertions(+), 33 deletions(-)

diff --git a/src/backend/storage/buffer/bufmgr.c b/src/backend/storage/buffer/bufmgr.c
index 5880054245..6e0875022c 100644
--- a/src/backend/storage/buffer/bufmgr.c
+++ b/src/backend/storage/buffer/bufmgr.c
@@ -519,6 +519,48 @@ ComputeIoConcurrency(int io_concurrency, double *target)
 	return (new_prefetch_pages >= 0.0 && new_prefetch_pages < (double) INT_MAX);
 }
 
+void
+SharedPrefetchBuffer(SMgrRelation smgr_reln, ForkNumber forkNum, BlockNumber blockNum)
+{
+#ifdef USE_PREFETCH
+	BufferTag	newTag;		/* identity of requested block */
+	uint32		newHash;	/* hash value for newTag */
+	LWLock	   *newPartitionLock;	/* buffer partition lock for it */
+	int			buf_id;
+
+	Assert(BlockNumberIsValid(blockNum));
+
+	/* create a tag so we can lookup the buffer */
+	INIT_BUFFERTAG(newTag, smgr_reln->smgr_rnode.node,
+   forkNum, blockNum);
+
+	/* determine its hash code and partition lock ID */
+	newHash = BufTableHashCode(&newTag);
+	newPartitionLock = BufMappingPartitionLock(newHash);
+
+	/* see if the block is in the buffer pool already */
+	LWLockAcquire(newPartitionLock, LW_SHARED);
+	buf_id = BufTableLookup(&newTag, newHash);
+	LWLockRelease(newPartitionLock);
+
+	/* If not in buffers, initiate prefetch */
+	if (buf_id < 0)
+		smgrprefetch(smgr_reln, forkNum, blockNum);
+
+	/*
+	 * If the block *is* in buffers, we do nothing.  This is not really ideal:
+	 * the block might be just about to be evicted, which would be stupid
+	 * since we know we are going to need it soon.  But the only easy answer
+	 * is to bump the usage_count, which does not seem like a great solution:
+	 * when the caller does ultimately touch the block, usage_count would get
+	 * bumped again, resulting in too much favoritism for blocks that are
+	 * involved in a prefetch sequence. A real fix would involve some
+	 * additional per-buffer state, and it's not clear that there's enough of
+	 * a problem to justify that.
+	 */
+#endif
+}
+
 /*
  * PrefetchBuffer -- initiate asynchronous read of a block of a relation
  *
@@ -550,39 +592,8 @@ PrefetchBuffer(Relation reln, ForkNumber forkNum, BlockNumber blockNum)
 	}
 	else
 	{
-		BufferTag	newTag;		/* identity of requested block */
-		uint32		newHash;	/* hash value for newTag */
-		LWLock	   *newPartitionLock;	/* buffer partition lock for it */
-		int			buf_id;
-
-		/* create a tag so we can lookup the buffer */
-		INIT_BUFFERTAG(newTag, reln->rd_smgr->smgr_rnode.node,
-			

Re: [PATCH v1] Allow COPY "text" to output a header and add header matching mode to COPY FROM

2020-03-01 Thread Surafel Temesgen
On Mon, Mar 2, 2020 at 2:45 AM Rémi Lapeyre  wrote:

>
> I created an entry for this patch in the new CommitFest but it seems that
> it is not finding it. Is there anything that I need to do?
>
>
It is added to the next open commitfest, which is
https://commitfest.postgresql.org/28/ now

regards
Surafel


effective_io_concurrency's steampunk spindle maths

2020-03-01 Thread Thomas Munro
Hello,

I was reading through some old threads[1][2][3] while trying to figure
out how to add a new GUC to control I/O prefetching for new kinds of
things[4][5], and enjoyed Simon Riggs' reference to Jules Verne in the
context of RAID spindles.

On 2 Sep 2015 14:54, "Andres Freund"  wrote:
> > On 2015-09-02 18:06:54 +0200, Tomas Vondra wrote:
> > Maybe the best thing we can do is just completely abandon the "number of
> > spindles" idea, and just say "number of I/O requests to prefetch". Possibly
> > with an explanation of how to estimate it (devices * queue length).
>
> I think that'd be a lot better.

+many, though I doubt I could describe how to estimate it myself,
considering cloud storage, SANs, multi-lane NVMe etc.  You basically
have to experiment, and like most of our resource consumption limits,
it's a per-backend limit anyway, so it's pretty complicated, but I
don't see how the harmonic series helps anyone.
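
(For anyone who has not looked at it recently, the spindle translation in
question, ComputeIoConcurrency() in bufmgr.c, boils down to roughly the
following; this is a paraphrase for illustration, not the exact code, and
the real function differs in details:)

    /*
     * Sketch: the prefetch-page target grows roughly like n * H(n) for
     * n "spindles", where H(n) is the nth harmonic number
     * (1 -> 1.0, 2 -> 3.0, 4 -> ~8.33).
     */
    static double
    spindles_to_prefetch_target(int io_concurrency)
    {
        double      target = 0.0;

        for (int i = 1; i <= io_concurrency; i++)
            target += (double) io_concurrency / (double) i;

        return target;
    }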

Should we rename it?  Here are my first suggestions:

random_page_prefetch_degree
maintenance_random_page_prefetch_degree

Rationale for this naming pattern:
* "random_page" from "random_page_cost"
* leaves room for a different setting for sequential prefetching
* "degree" conveys the idea without using loaded words like "queue"
that might imply we know something about the I/O subsystem or that
it's system-wide like kernel and device queues
* "maintenance_" prefix is like other GUCs that establish (presumably
larger) limits for processes working on behalf of many user sessions

Whatever we call it, I don't think it makes sense to try to model the
details of any particular storage system.  Let's use a simple counter
of I/Os initiated but not yet known to have completed (for now: it has
definitely completed when the associated pread() completes; perhaps
something involving real async I/O completion notification in later
releases).
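
Concretely, the accounting described above is nothing more than a per-backend
counter, roughly like the sketch below (all names are invented for
illustration; this is not code from any patch):

    #include "postgres.h"
    #include "storage/smgr.h"

    static int  ios_in_flight = 0;  /* initiated but not yet known complete */

    static void
    prefetch_if_allowed(SMgrRelation reln, ForkNumber forknum,
                        BlockNumber blkno, int limit)
    {
        if (ios_in_flight >= limit)
            return;             /* stay within the configured degree */
        smgrprefetch(reln, forknum, blkno);
        ios_in_flight++;
    }

    static void
    block_read_completed(void)
    {
        /* the prefetch has certainly completed once the pread() has returned */
        if (ios_in_flight > 0)
            ios_in_flight--;
    }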

[1] 
https://www.postgresql.org/message-id/flat/CAHyXU0yaUG9R_E5%3D1gdXhD-MpWR%3DGr%3D4%3DEHFD_fRid2%2BSCQrqA%40mail.gmail.com
[2] 
https://www.postgresql.org/message-id/flat/Pine.GSO.4.64.0809220317320.20434%40westnet.com
[3] 
https://www.postgresql.org/message-id/flat/FDDBA24E-FF4D-4654-BA75-692B3BA71B97%40enterprisedb.com
[4] 
https://www.postgresql.org/message-id/flat/CA%2BhUKGJ4VJN8ttxScUFM8dOKX0BrBiboo5uz1cq%3DAovOddfHpA%40mail.gmail.com
[5] 
https://www.postgresql.org/message-id/CA%2BTgmoZP-CTmEPZdmqEOb%2B6t_Tts2nuF7eoqxxvXEHaUoBDmsw%40mail.gmail.com




Re: partition routing layering in nodeModifyTable.c

2020-03-01 Thread Amit Langote
On Mon, Mar 2, 2020 at 4:43 AM Tom Lane  wrote:
> Amit Langote  writes:
> > Rebased again.
>
> Seems to need that again, according to cfbot :-(

Thank you, done.

Regards,
Amit


v10-0002-Remove-es_result_relation_info.patch
Description: Binary data


v10-0003-Rearrange-partition-update-row-movement-code-a-b.patch
Description: Binary data


v10-0004-Refactor-transition-tuple-capture-code-a-bit.patch
Description: Binary data


v10-0001-Remove-dependency-on-estate-es_result_relation_i.patch
Description: Binary data


Re: [Patch] pg_rewind: options to use restore_command from recovery.conf or command line

2020-03-01 Thread Michael Paquier
On Fri, Feb 28, 2020 at 03:37:47PM +0300, Alexey Kondratov wrote:
> I would prefer to keep it, since there are plenty of similar comments near
> Asserts and elogs all over Postgres. Otherwise it may look like a valid
> error state. It may be obvious now, but for someone who is not aware of the
> BuildRestoreCommand refactoring it may not be. So from my perspective there
> is nothing bad in this extra one-line comment.

elog() is called in the event of errors which should never happen by
definition, so your comment is just a duplication of this state.  I'd
still remove that.

> I have added explicit exit(1) calls, since pg_fatal was used only twice in
> archive.c. Probably, pg_log_fatal from common/logging should obey the
> same logic as the FATAL log level in the backend and exit the process, but
> for now including pg_rewind.h inside archive.c or vice versa does not look
> like a solution.

archive.c should not depend on anything from src/bin/.

> + * For fixed-size files, the caller may pass the expected size as an
> + * additional crosscheck on successful recovery.  If the file size is not
> + * known, set expectedSize = 0.
> + */
> +int
> +RestoreArchivedWALFile(const char *path, const char *xlogfname,
> +off_t expectedSize, const char 
> *restoreCommand)

Actually, expectedSize is IMO a bad idea, because any caller of this
routine passing down zero could be trapped with an incorrect file
size.  So let's remove the behavior where it is possible to bypass
this sanity check.  We don't need it in pg_rewind either.

> + if ((output_fp = popen(postgres_cmd, "r")) == NULL ||
> + fgets(cmd_output, sizeof(cmd_output), output_fp) == 
> NULL)
> + pg_fatal("could not get restore_command using %s: %s",
> +  postgres_cmd, strerror(errno));

Still missing one %m here.

> + /* Remove trailing newline */
> + if (strchr(cmd_output, '\n') != NULL)
> + *strchr(cmd_output, '\n') = '\0';

It seems to me that what you are looking for here is to use
pg_strip_crlf().  Thinking harder, we have pipe_read_line() in
src/common/exec.c which does the exact same job..
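
For illustration, a minimal sketch of the pg_strip_crlf() variant, reusing
the variables from the quoted hunk and assuming common/string.h is included
(pipe_read_line() would replace the popen/fgets part entirely):

    if ((output_fp = popen(postgres_cmd, "r")) == NULL ||
        fgets(cmd_output, sizeof(cmd_output), output_fp) == NULL)
        pg_fatal("could not get restore_command using %s: %m",
                 postgres_cmd);

    /* trims a trailing \r/\n in place, replacing the manual strchr() dance */
    (void) pg_strip_crlf(cmd_output);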

> - /*
> -  * construct the command to be executed
> -  */

Perhaps you meant "build" here.

> + if (restore_wal)
> + {
> + int rc;
> + charpostgres_exec_path[MAXPGPATH],
> + postgres_cmd[MAXPGPATH],
> + cmd_output[MAXPGPATH];
> + FILE   *output_fp;
> +
> + /* Find postgres executable. */
> + rc = find_other_exec(argv[0], "postgres",
> +  PG_BACKEND_VERSIONSTR,
> +  postgres_exec_path);

For readability, I would move that into its own separate routine.
--
Michael


signature.asc
Description: PGP signature


Re: [PATCH] Add schema and table names to partition error

2020-03-01 Thread Amit Langote
Hi Chris,

On Mon, Mar 2, 2020 at 8:51 AM Chris Bandy  wrote:
> On 3/1/20 5:14 AM, Amit Kapila wrote:
> > On Sun, Mar 1, 2020 at 10:10 AM Amit Langote  
> > wrote:
> >> This makes sense to me.  Btree code which implements unique
> >> constraints also does this; see _bt_check_unique() function in
> >> src/backend/access/nbtree/nbtinsert.c:
> >>
> >>  ereport(ERROR,
> >>  (errcode(ERRCODE_UNIQUE_VIOLATION),
> >>   errmsg("duplicate key value violates
> >> unique constraint \"%s\"",
> >>  RelationGetRelationName(rel)),
> >>   key_desc ? errdetail("Key %s already 
> >> exists.",
> >>key_desc) : 0,
> >>   errtableconstraint(heapRel,
> >>
> >> RelationGetRelationName(rel))));
> >>
> >>> I've attached a patch that adds the schema and table name fields to
> >>> errors for my use case:
> >>
> >> Instead of using errtable(), use errtableconstraint() like the btree
> >> code does, if only just for consistency.
>
> There are two relations in the example you give: the index, rel, and the
> table, heapRel. It makes sense to me that two error fields be filled in
> with those two names.

That's a good point.  Index constraints are actually named after the
index and vice versa, so it's a totally valid usage of
errtableconstraint().

create table foo (a int unique);
\d foo
                Table "public.foo"
 Column |  Type   | Collation | Nullable | Default
--------+---------+-----------+----------+---------
 a      | integer |           |          |
Indexes:
    "foo_a_key" UNIQUE CONSTRAINT, btree (a)

select conname from pg_constraint where conrelid = 'foo'::regclass;
  conname
-----------
 foo_a_key
(1 row)

create table bar (a int, constraint a_uniq unique (a));
\d bar
                Table "public.bar"
 Column |  Type   | Collation | Nullable | Default
--------+---------+-----------+----------+---------
 a      | integer |           |          |
Indexes:
    "a_uniq" UNIQUE CONSTRAINT, btree (a)

select conname from pg_constraint where conrelid = 'bar'::regclass;
 conname
---------
 a_uniq
(1 row)

> With partitions, there is no second name because there is no index nor
> constraint object.

It's right to say that the partition case cannot really be equated with
unique indexes.

> My (very limited) understanding is that partition
> "constraints" are entirely contained within pg_class.relpartbound of the
> partition.

That is correct.

> Are you suggesting that the table name go into the constraint name field
> of the error?

Yes, that's what I was thinking, at least for "partition constraint
violated" errors, but given the above that would be a misleading use
of ErrorData.constraint_name.

Maybe it's not too late to invent a new error code like
ERRCODE_PARTITION_VIOLATION or such, then maybe we can use a
hard-coded name, say, just the string "partition constraint".
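
To make that concrete, the report could end up looking something like the
sketch below. Note that ERRCODE_PARTITION_VIOLATION does not exist today and
the constraint name is just a hard-coded string, so this is purely
illustrative (rel is assumed to be the partition's Relation):

    /* hypothetical: ERRCODE_PARTITION_VIOLATION is not a real errcode yet */
    ereport(ERROR,
            (errcode(ERRCODE_PARTITION_VIOLATION),
             errmsg("new row for relation \"%s\" violates partition constraint",
                    RelationGetRelationName(rel)),
             errtableconstraint(rel, "partition constraint")));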

> >> There are couple more instances in src/backend/command/tablecmds.c
> >> where partition constraint is checked:
> >>
> >> Maybe, better fix these too for completeness.
>
> Done. As there is no named constraint here, I used errtable again.
>
> > Right, if we want to make a change for this, then I think we can once
> > check all the places where we use error code ERRCODE_CHECK_VIOLATION.
>
> I've looked at every instance of this. It is used for 1) check
> constraints, 2) domain constraints, and 3) partition constraints.
>
> 1. errtableconstraint is used with the name of the constraint.
> 2. errdomainconstraint is used with the name of the constraint except in
> one instance which deliberately uses errtablecol.
> 3. With the attached patch, errtable is used except for one instance in
> src/backend/partitioning/partbounds.c described below.
>
> In check_default_partition_contents of
> src/backend/partitioning/partbounds.c, the default partition is checked
> for any rows that should belong in the partition being added _unless_
> the leaf being checked is a foreign table. There are two tables
> mentioned in this warning, and I couldn't decide which, if any, deserves
> to be in the error fields:
>
>  /*
>   * Only RELKIND_RELATION relations (i.e. leaf
> partitions) need to be
>   * scanned.
>   */
>  if (part_rel->rd_rel->relkind != RELKIND_RELATION)
>  {
>  if (part_rel->rd_rel->relkind ==
> RELKIND_FOREIGN_TABLE)
>  ereport(WARNING,
>
> (errcode(ERRCODE_CHECK_VIOLATION),
>   errmsg("skipped
> scanning foreign table \"%s\" which is a partition of default partition
> \"%s\"",
>
> RelationGetRelationName(part_rel),
>
> RelationGetRelationName(default_rel))));

It seems strange to see that errcode here or any errcode for 

Re: logical replication empty transactions

2020-03-01 Thread Dilip Kumar
On Sat, Nov 9, 2019 at 7:29 AM Euler Taveira  wrote:
>
> Em seg., 21 de out. de 2019 às 21:20, Jeff Janes
>  escreveu:
> >
> > After setting up logical replication of a slowly changing table using the 
> > built in pub/sub facility, I noticed way more network traffic than made 
> > sense.  Looking into I see that every transaction in that database on the 
> > master gets sent to the replica.  99.999+% of them are empty transactions 
> > ('B' message and 'C' message with nothing in between) because the 
> > transactions don't touch any tables in the publication, only non-replicated 
> > tables.  Is doing it this way necessary for some reason?  Couldn't we hold 
> > the transmission of 'B' until something else comes along, and then if that 
> > next thing is 'C' drop both of them?
> >
> That is not optimal. Those empty transactions are a waste of bandwidth.
> We can suppress them if no changes will be sent. test_decoding
> implements "skip empty transaction" as you described above and I did
> something similar to it. Patch is attached.

I think this significantly reduces the network bandwidth for empty
transactions.  I have briefly reviewed the patch and it looks good to
me.
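
For readers who have not looked at test_decoding, the "skip empty
transaction" idea quoted above reduces to an output-plugin pattern roughly
like the sketch below. The send_* helpers are hypothetical and the flag
handling is simplified (the real code keeps it in per-plugin state); the
callback signatures follow replication/output_plugin.h:

static bool xact_wrote_changes = false;

static void
skip_empty_begin_cb(LogicalDecodingContext *ctx, ReorderBufferTXN *txn)
{
	xact_wrote_changes = false;		/* hold the BEGIN message for now */
}

static void
skip_empty_change_cb(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
					 Relation relation, ReorderBufferChange *change)
{
	if (!xact_wrote_changes)
	{
		send_begin_message(ctx, txn);	/* hypothetical helper */
		xact_wrote_changes = true;
	}
	/* ... send the change itself ... */
}

static void
skip_empty_commit_cb(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
					 XLogRecPtr commit_lsn)
{
	if (xact_wrote_changes)
		send_commit_message(ctx, txn);	/* otherwise both B and C are dropped */
}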

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com




Re: ALTER tbl rewrite loses CLUSTER ON index

2020-03-01 Thread Michael Paquier
On Sat, Feb 29, 2020 at 10:52:58AM -0600, Justin Pryzby wrote:
> Rebased Amit's patch and included my own 0002 to fix the REINDEX CONCURRENTLY
> issue.

I have looked at 0002 as that concerns me.

> +SELECT indexrelid::regclass FROM pg_index WHERE 
> indrelid='concur_clustered'::regclass;
> +   indexrelid   
> +
> + concur_clustered_i_idx
> +(1 row)

This test should also check indisclustered.  Apart from that, the patch
is fine, so I'll apply it if there are no objections.
--
Michael


signature.asc
Description: PGP signature


Re: Ought to use heap_multi_insert() for pg_attribute/depend insertions?

2020-03-01 Thread Michael Paquier
On Fri, Feb 28, 2020 at 05:24:29PM +0900, Michael Paquier wrote:
> +   /*
> +* CONTAINS_NEW_TUPLE will always be set unless the multi_insert was
> +* performed for a catalog.  If it is a catalog, return immediately as
> +* there is nothing to logically decode.
> +*/
> +   if (!(xlrec->flags & XLH_INSERT_CONTAINS_NEW_TUPLE))
> +   return;
> Hmm, OK.  Consistency with DecodeInsert() is a good idea here, so
> count me in regarding the way your patch handles the problem.  I would
> be fine to apply that part but, Andres, perhaps you would prefer
> taking care of it yourself?

Okay, I have applied this part.

One comment removal is missed in heapam.c (my fault, then).

+ * TODO this is defined in copy.c, if we want to use this to limit the number
+ * of slots in this patch, we need to figure out where to put it.
+ */
+#define MAX_BUFFERED_BYTES 65535
The use-case is different than copy.c, so aren't you looking at a
separate variable to handle that?

+reset_object_addresses(ObjectAddresses *addrs)
+{
+   if (addrs->numrefs == 0)
+   return;
Or just use an assert?

src/backend/commands/tablecmds.c:   /* attribute.attacl is handled by
InsertPgAttributeTuple */
src/backend/catalog/heap.c: * This is a variant of
InsertPgAttributeTuple() which dynamically allocates
Those two references are not true anymore as your patch removes
InsertPgAttributeTuple() and replaces it by the plural flavor.

+/*
+ * InsertPgAttributeTuples
+ * Construct and insert multiple tuples in pg_attribute.
  *
- * Caller has already opened and locked pg_attribute. new_attribute is the
- * attribute to insert.  attcacheoff is always initialized to -1, attacl and
- * attoptions are always initialized to NULL.
- *
- * indstate is the index state for CatalogTupleInsertWithInfo.  It can be
- * passed as NULL, in which case we'll fetch the necessary info.  (Don't do
- * this when inserting multiple attributes, because it's a tad more
- * expensive.)
+ * This is a variant of InsertPgAttributeTuple() which dynamically allocates
+ * space for multiple tuples. Having two so similar functions is a kludge, but
+ * for now it's a TODO to make it less terrible.
And those comments largely need to remain around.

-   attr = TupleDescAttr(tupdesc, i);
-   /* Fill in the correct relation OID */
-   attr->attrelid = new_rel_oid;
-   /* Make sure this is OK, too */
-   attr->attstattarget = -1;
-
-   InsertPgAttributeTuple(rel, attr, indstate);
attstattarget handling is inconsistent here, no?
InsertPgAttributeTuples() does not touch this part, though your code
updates the attribute's attrelid.

-referenced.classId = RelationRelationId;
-referenced.objectId = heapRelationId;
-referenced.objectSubId = indexInfo->ii_IndexAttrNumbers[i];
-
-recordDependencyOn(&myself, &referenced, DEPENDENCY_AUTO);
-
+add_object_address(OCLASS_CLASS, heapRelationId,
+   indexInfo->ii_IndexAttrNumbers[i],
+   refobjs);
Not convinced that we have to publish add_object_address() in the
headers, because we already have add_exact_object_address which is
able to do the same job.  All code paths touched by your patch involve
already ObjectAddress objects, so switching to _exact_ creates less
code diffs.

+   if (ntuples == DEPEND_TUPLE_BUF)
+   {
+   CatalogTuplesMultiInsertWithInfo(sdepRel, slot, ntuples, indstate);
+   ntuples = 0;
+   }
Maybe this is a sufficient argument that we had better consider the
template copy as part of a separate patch, handled once the basics are
in place.  It is not really obvious if 32 is a good thing, or not.

+   /* TODO is nreferenced a reasonable allocation of slots? */
+   slot = palloc(sizeof(TupleTableSlot *) * nreferenced);
I guess so.

/* construct new attribute's pg_attribute entry */
+   oldcontext = MemoryContextSwitchTo(CurTransactionContext);
This needs a comment as this change is not obvious for the reader.
And actually, why?  
--
Michael


signature.asc
Description: PGP signature


Re[4]: bool_plperl transform

2020-03-01 Thread Ivan Panchenko

Thanks, Tom.
 
I think it should build now; please find the fixed patch attached.
I had no way to check it on Windows right now, but the relevant changes in 
Mkvcbuild.pm are done, so I hope it should work.
The documentation changes are also included in the same patch.
 
Regards,
Ivan
  
>Monday, March 2, 2020, 0:14 +03:00 from Tom Lane :
> 
>Wao < w...@mail.ru > writes:
>> Please find the full patch attached.
>The cfbot shows this failing to build on Windows:
>
>https://ci.appveyor.com/project/postgresql-cfbot/postgresql/build/1.0.81889
>
>I believe that's a build without plperl, so what it's probably telling
>you is that Mkvcbuild.pm needs to be taught to build this module
>conditionally, as it already does for hstore_plperl and jsonb_plperl.
>
>Also, while the Linux build is passing, I can't find that it is actually
>compiling or testing bool_plperl anywhere:
>
>https://travis-ci.org/postgresql-cfbot/postgresql/builds/656909114
>
>This is likely because you didn't add it to contrib/Makefile.
>
>In general, I'd suggest grepping for references to hstore_plperl
>or jsonb_plperl, and making sure that bool_plperl gets added where
>appropriate.
>
>I rather imagine you need a .gitignore file, as well.
>
>You're also going to have to provide some documentation, because
>I don't see any in the patch.
>
>regards, tom lane 
 
 
 
 

bool_plperl_transform_v3.patch
Description: Binary data


Re: [PATCH] Add schema and table names to partition error

2020-03-01 Thread Chris Bandy

Thank you both for looking at this!

On 3/1/20 5:14 AM, Amit Kapila wrote:

On Sun, Mar 1, 2020 at 10:10 AM Amit Langote  wrote:


Hi Chris,

On Sun, Mar 1, 2020 at 4:34 AM Chris Bandy  wrote:

Hello,

I'm writing telemetry data into a table partitioned by time. When there
is no partition for a particular date, my app notices the constraint
violation, creates the partition, and retries the insert.

I'm used to handling constraint violations by observing the constraint
name in the error fields. However, this error had none. I set out to add
the name to the error field, but after a bit of reading my impression is
that partition constraints are more like a property of a table.


This makes sense to me.  Btree code which implements unique
constraints also does this; see _bt_check_unique() function in
src/backend/access/nbtree/nbtinsert.c:

 ereport(ERROR,
 (errcode(ERRCODE_UNIQUE_VIOLATION),
  errmsg("duplicate key value violates
unique constraint \"%s\"",
 RelationGetRelationName(rel)),
  key_desc ? errdetail("Key %s already exists.",
   key_desc) : 0,
  errtableconstraint(heapRel,

RelationGetRelationName(rel))));


I've attached a patch that adds the schema and table name fields to
errors for my use case:


Instead of using errtable(), use errtableconstraint() like the btree
code does, if only just for consistency.


There are two relations in the example you give: the index, rel, and the 
table, heapRel. It makes sense to me that two error fields be filled in 
with those two names.


With partitions, there is no second name because there is no index nor 
constraint object. My (very limited) understanding is that partition 
"constraints" are entirely contained within pg_class.relpartbound of the 
partition.


Are you suggesting that the table name go into the constraint name field 
of the error?



+1.  We use errtableconstraint at other places where we use error code
ERRCODE_CHECK_VIOLATION.


Yes, I see this function used when it is a CHECK constraint that is 
being violated. In every case the constraint name is passed as the 
second argument.



There are couple more instances in src/backend/command/tablecmds.c
where partition constraint is checked:

Maybe, better fix these too for completeness.


Done. As there is no named constraint here, I used errtable again.


Right, if we want to make a change for this, then I think we can once
check all the places where we use error code ERRCODE_CHECK_VIOLATION.


I've looked at every instance of this. It is used for 1) check 
constraints, 2) domain constraints, and 3) partition constraints.


1. errtableconstraint is used with the name of the constraint.
2. errdomainconstraint is used with the name of the constraint except in 
one instance which deliberately uses errtablecol.
3. With the attached patch, errtable is used except for one instance in 
src/backend/partitioning/partbounds.c described below.


In check_default_partition_contents of 
src/backend/partitioning/partbounds.c, the default partition is checked 
for any rows that should belong in the partition being added _unless_ 
the leaf being checked is a foreign table. There are two tables 
mentioned in this warning, and I couldn't decide which, if any, deserves 
to be in the error fields:


/*
 * Only RELKIND_RELATION relations (i.e. leaf 
partitions) need to be

 * scanned.
 */
if (part_rel->rd_rel->relkind != RELKIND_RELATION)
{
if (part_rel->rd_rel->relkind == 
RELKIND_FOREIGN_TABLE)

ereport(WARNING,

(errcode(ERRCODE_CHECK_VIOLATION),
 errmsg("skipped 
scanning foreign table \"%s\" which is a partition of default partition 
\"%s\"",


RelationGetRelationName(part_rel),

RelationGetRelationName(default_rel))));

if (RelationGetRelid(default_rel) != 
RelationGetRelid(part_rel))

table_close(part_rel, NoLock);

continue;
}


Another thing we might need to see is which of these can be
back-patched.  We should also try to write the tests for cases we are
changing even if we don't want to commit those.


I don't have any opinion on back-patching. Existing tests pass. I wasn't 
able to find another test that checks the constraint field of errors. 
There's a little bit in the tests for psql, but that is about the 
\errverbose functionality rather than specific errors and their fields.


Here's what I tested:

# CREATE TABLE t1 (i int PRIMARY KEY); INSERT INTO t1 VALUES (1), (1);
# \errverbose
...
CONSTRAINT NAME:  t1_pkey

# CREATE TABLE pt1 (x int, y int, PRIMARY KEY (x,y)) 

Re: [PATCH v1] Allow COPY "text" to output a header and add header matching mode to COPY FROM

2020-03-01 Thread Rémi Lapeyre


I created an entry for this patch in the new CommitFest but it seems that it is 
not finding it. Is there anything that I need to do?



Re[2]: bool_plperl transform

2020-03-01 Thread Ivan Panchenko



  
>Monday, March 2, 2020, 1:09 +03:00 from ilm...@ilmari.org:
> 
>Wao < w...@mail.ru > writes:
> 
>> +Datum
>> +bool_to_plperl(PG_FUNCTION_ARGS)
>> +{
>> + dTHX;
>> + bool in = PG_GETARG_BOOL(0);
>> + SV *sv = newSVnv(SvNV(in ? &PL_sv_yes : &PL_sv_no));
>> + return PointerGetDatum(sv);
>> +}
>Why is this only copying the floating point part of the built-in
>booleans before returning them? I think this should just return
>&PL_sv_yes or &PL_sv_no directly, like boolean expressions in Perl do,
>and like what happens for NULL (&PL_sv_undef).
Thanks, I will fix this in the next version of the patch.
 
Regards,
Ivan
>
>- ilmari
>--
>"A disappointingly low fraction of the human race is,
> at any given time, on fire." - Stig Sandbeck Mathisen
>
>  
 
 
 
 

Re: Remove win32ver.rc from version_stamp.pl

2020-03-01 Thread Tom Lane
Peter Eisentraut  writes:
> This removes another relic from the old nmake-based Windows build. 
> version_stamp.pl put version number information into win32ver.rc.  But 
> win32ver.rc already gets other version number information from the 
preprocessor at build time, so it would make more sense if all version 
number information were handled in the same way and we didn't have 
two places that do it.

This has a minor conflict in Solution.pm according to the cfbot.

In general, while I'm on board with the idea, I wonder whether it
wouldn't be smarter to keep on defining PG_MAJORVERSION as a string,
and just add PG_MAJORVERSION_NUM alongside of it.  This would
eliminate some hunks from the patch in places where it's more
convenient to have the version as a string, and it would avoid
what could otherwise be a pretty painful cross-version incompatibility
for extensions.  We already provide PG_VERSION in both forms, so
I don't see any inconsistency in doing likewise for PG_MAJORVERSION.
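
Concretely, the idea is that the generated headers would provide both forms,
analogous to PG_VERSION / PG_VERSION_NUM (the values below are examples
only):

    #define PG_MAJORVERSION      "13"
    #define PG_MAJORVERSION_NUM  13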

regards, tom lane




Re: Allow to_date() and to_timestamp() to accept localized names

2020-03-01 Thread Tom Lane
I wrote:
> * I don't think it's appropriate to hard-wire DEFAULT_COLLATION_OID
> as the collation to do case-folding with.  For to_date/to_timestamp,
> we have collatable text input so we can rely on the function's input
> collation instead, at the cost of having to pass down the collation
> OID through a few layers of subroutines :-(.  For parse_datetime,
> I punted for now and let it use DEFAULT_COLLATION_OID, because that's
> currently only called by JSONB code that probably hasn't got a usable
> input collation anyway (since jsonb isn't considered collatable).

On closer look, it's probably a wise idea to change the signature
of parse_datetime() to include a collation argument, because that
function is new in v13 so there's no API-break argument against it.
It will never be cheaper to change it than today.  So v11 below
does that, pushing the use of DEFAULT_COLLATION_OID into the
json-specific code.  Perhaps somebody else would like to look at
whether there's something brighter for that code to do, but I still
suspect there isn't, so I didn't chase it further.

> Barring objections, this seems
> committable to me.

Going once, going twice ...

regards, tom lane

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 28035f1..8b73e05 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -5968,7 +5968,7 @@ SELECT regexp_match('abc01234xyz', '(?:(.*?)(\d+)(.*)){1,1}');


 TM prefix
-translation mode (print localized day and month names based on
+translation mode (use localized day and month names based on
  )
 TMMonth

@@ -5999,9 +5999,20 @@ SELECT regexp_match('abc01234xyz', '(?:(.*?)(\d+)(.*)){1,1}');
 
  
   
-   TM does not include trailing blanks.
-   to_timestamp and to_date ignore
-   the TM modifier.
+   TM suppresses trailing blanks whether or
+   not FM is specified.
+  
+ 
+
+ 
+  
+   to_timestamp and to_date
+   ignore letter case in the input; so for
+   example MON, Mon,
+   and mon all accept the same strings.  When using
+   the TM modifier, case-folding is done according to
+   the rules of the function's input collation (see
+   ).
   
  
 
diff --git a/src/backend/utils/adt/formatting.c b/src/backend/utils/adt/formatting.c
index d029468..95f7d05 100644
--- a/src/backend/utils/adt/formatting.c
+++ b/src/backend/utils/adt/formatting.c
@@ -1038,7 +1038,7 @@ static void parse_format(FormatNode *node, const char *str, const KeyWord *kw,
 static void DCH_to_char(FormatNode *node, bool is_interval,
 		TmToChar *in, char *out, Oid collid);
 static void DCH_from_char(FormatNode *node, const char *in, TmFromChar *out,
-		  bool std, bool *have_error);
+		  Oid collid, bool std, bool *have_error);
 
 #ifdef DEBUG_TO_FROM_CHAR
 static void dump_index(const KeyWord *k, const int *index);
@@ -1057,11 +1057,14 @@ static int	from_char_parse_int_len(int *dest, const char **src, const int len,
 	FormatNode *node, bool *have_error);
 static int	from_char_parse_int(int *dest, const char **src, FormatNode *node,
 bool *have_error);
-static int	seq_search(const char *name, const char *const *array, int *len);
+static int	seq_search_ascii(const char *name, const char *const *array, int *len);
+static int	seq_search_localized(const char *name, char **array, int *len,
+ Oid collid);
 static int	from_char_seq_search(int *dest, const char **src,
  const char *const *array,
+ char **localized_array, Oid collid,
  FormatNode *node, bool *have_error);
-static void do_to_timestamp(text *date_txt, text *fmt, bool std,
+static void do_to_timestamp(text *date_txt, text *fmt, Oid collid, bool std,
 			struct pg_tm *tm, fsec_t *fsec, int *fprec,
 			uint32 *flags, bool *have_error);
 static char *fill_str(char *str, int c, int max);
@@ -2457,7 +2460,7 @@ from_char_parse_int(int *dest, const char **src, FormatNode *node, bool *have_er
  * suitable for comparisons to ASCII strings.
  */
 static int
-seq_search(const char *name, const char *const *array, int *len)
+seq_search_ascii(const char *name, const char *const *array, int *len)
 {
 	unsigned char firstc;
 	const char *const *a;
@@ -2503,8 +2506,89 @@ seq_search(const char *name, const char *const *array, int *len)
 }
 
 /*
- * Perform a sequential search in 'array' for an entry matching the first
- * character(s) of the 'src' string case-insensitively.
+ * Sequentially search an array of possibly non-English words for
+ * a case-insensitive match to the initial character(s) of "name".
+ *
+ * This has the same API as seq_search_ascii(), but we use a more general
+ * case-folding transformation to achieve case-insensitivity.  Case folding
+ * is done per the rules of the collation identified by "collid".
+ *
+ * The array is treated as const, but we don't declare it that way because
+ * the arrays exported by 

Re: bool_plperl transform

2020-03-01 Thread Dagfinn Ilmari Mannsåker
Wao  writes:

> +Datum
> +bool_to_plperl(PG_FUNCTION_ARGS)
> +{
> + dTHX;
> + bool in = PG_GETARG_BOOL(0);
> + SV  *sv = newSVnv(SvNV(in ? &PL_sv_yes : &PL_sv_no));
> + return PointerGetDatum(sv);
> +}

Why is this only copying the floating point part of the built-in
booleans before returning them?  I think this should just return
&PL_sv_yes or &PL_sv_no directly, like boolean expressions in Perl do,
and like what happens for NULL (&PL_sv_undef).
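
In other words, a minimal (untested) sketch of the suggested fix would be:

    Datum
    bool_to_plperl(PG_FUNCTION_ARGS)
    {
        dTHX;
        bool        in = PG_GETARG_BOOL(0);

        return PointerGetDatum(in ? &PL_sv_yes : &PL_sv_no);
    }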

- ilmari
-- 
"A disappointingly low fraction of the human race is,
 at any given time, on fire." - Stig Sandbeck Mathisen




Re: Commitfest 2020-03 Now in Progress

2020-03-01 Thread Tom Lane
David Steele  writes:
> The last Commitfest for v13 is now in progress!
> The number of patches waiting on author seems lower (as a percentage) 
> than usual which I take to be a good sign.  I'll be assessing the WoA 
> patches over the next day or two, so if your patch is in this state get 
> a new version in soon.

Another pointer is to check the state of your patch in the cfbot:

http://commitfest.cputube.org

If it isn't passing, please send in a new version that fixes whatever
the problem is.

regards, tom lane




Re: pgbench: option delaying queries till connections establishment?

2020-03-01 Thread Fabien COELHO


Hello Andres,


FWIW, leaving windows, error handling, and other annoyances aside, this
can be implemented fairly simply. See below.


Attached is an attempt at improving things.

I've put 2 barriers: one so that all threads are up, one when all 
connections are set up and the bench is ready to go.
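
Stripped of the pgbench specifics, the synchronization pattern is just the
standard POSIX barrier usage sketched below (error handling omitted; the
Windows and no-thread fallbacks are in the attached patch):

    #include <pthread.h>

    static pthread_barrier_t start_barrier; /* all threads are started */
    static pthread_barrier_t bench_barrier; /* all connections are set up */

    static void *
    thread_main(void *arg)
    {
        pthread_barrier_wait(&start_barrier); /* wait until every thread is up */
        /* ... open this thread's connections ... */
        pthread_barrier_wait(&bench_barrier); /* everyone starts benchmarking together */
        /* ... run the benchmark ... */
        return NULL;
    }

    /* in the main thread, before creating workers:
     *   pthread_barrier_init(&start_barrier, NULL, nthreads);
     *   pthread_barrier_init(&bench_barrier, NULL, nthreads);
     */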


I've done a blind attempt at implementing the barrier stuff on Windows.

I've changed the performance calculations depending on -C or not. Ramp-up 
effects are smoothed.


I've merged all time-related stuff (time_t, instr_time, int64) to use a 
unique type (pg_time_usec_t) and set of functions/macros, which simplifies 
the code somewhat.


I've tried to do some variable renaming to distinguish timestamps and 
intervals.


This is work in progress.

--
Fabien.
diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index b8864c6ae5..a16d9d49e1 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -113,18 +113,29 @@ typedef struct socket_set
  */
 
 #ifdef WIN32
+#define PTHREAD_BARRIER_SERIAL_THREAD (-1)
+
 /* Use native win32 threads on Windows */
 typedef struct win32_pthread *pthread_t;
 typedef int pthread_attr_t;
+typedef SYNCHRONIZATION_BARRIER pthread_barrier_t;
 
 static int	pthread_create(pthread_t *thread, pthread_attr_t *attr, void *(*start_routine) (void *), void *arg);
 static int	pthread_join(pthread_t th, void **thread_return);
+
+static int	pthread_barrier_init(pthread_barrier_t *barrier, void *unused, int nthreads);
+static int	pthread_barrier_wait(pthread_barrier_t *barrier);
+static int	pthread_barrier_destroy(pthread_barrier_t *barrier);
 #elif defined(ENABLE_THREAD_SAFETY)
 /* Use platform-dependent pthread capability */
 #include 
 #else
 /* No threads implementation, use none (-j 1) */
 #define pthread_t void *
+#define pthread_barrier_t void *
+#define pthread_barrier_init(a, b, c) /* ignore */
+#define pthread_barrier_wait(a) /* ignore */
+#define pthread_barrier_destroy(a) /* ignore */
 #endif
 
 
@@ -291,7 +302,7 @@ typedef struct SimpleStats
  */
 typedef struct StatsData
 {
-	time_t		start_time;		/* interval start time, for aggregates */
+	pg_time_usec_t	start_time;		/* interval start time, for aggregates */
 	int64		cnt;			/* number of transactions, including skipped */
 	int64		skipped;		/* number of transactions skipped under --rate
  * and --latency-limit */
@@ -310,6 +321,11 @@ typedef struct RandomState
 /* Various random sequences are initialized from this one. */
 static RandomState base_random_sequence;
 
+/* Thread synchronization barriers */
+static pthread_barrier_t
+	start_barrier,		/* all threads are started */
+	bench_barrier;		/* benchmarking ready to start */
+
 /*
  * Connection state machine states.
  */
@@ -417,10 +433,10 @@ typedef struct
 	bool		vars_sorted;	/* are variables sorted by name? */
 
 	/* various times about current transaction */
-	int64		txn_scheduled;	/* scheduled start time of transaction (usec) */
-	int64		sleep_until;	/* scheduled start time of next cmd (usec) */
-	instr_time	txn_begin;		/* used for measuring schedule lag times */
-	instr_time	stmt_begin;		/* used for measuring statement latencies */
+	pg_time_usec_t	txn_scheduled;	/* scheduled start time of transaction (usec) */
+	pg_time_usec_t	sleep_until;	/* scheduled start time of next cmd (usec) */
+	pg_time_usec_t	txn_begin;		/* used for measuring schedule lag times (usec) */
+	pg_time_usec_t	stmt_begin;		/* used for measuring statement latencies (usec) */
 
 	bool		prepared[MAX_SCRIPTS];	/* whether client prepared the script */
 
@@ -452,10 +468,13 @@ typedef struct
 	FILE	   *logfile;		/* where to log, or NULL */
 
 	/* per thread collected stats */
-	instr_time	start_time;		/* thread start time */
-	instr_time	conn_time;
+	pg_time_usec_t	start_time;		/* thread start (creation) time */
+	pg_time_usec_t	started_time;	/* thread is running after start barrier */
+	pg_time_usec_t	bench_start_time; 	/* thread is really benchmarking */
+	pg_time_usec_t	conn_delay;		/* cumulated connection and deconnection delays (usec) */
+
 	StatsData	stats;
-	int64		latency_late;	/* executed but late transactions */
+	int64		latency_late;	/* count executed but late transactions */
 } TState;
 
 #define INVALID_THREAD		((pthread_t) 0)
@@ -594,10 +613,10 @@ static void setIntValue(PgBenchValue *pv, int64 ival);
 static void setDoubleValue(PgBenchValue *pv, double dval);
 static bool evaluateExpr(CState *st, PgBenchExpr *expr,
 		 PgBenchValue *retval);
-static ConnectionStateEnum executeMetaCommand(CState *st, instr_time *now);
+static ConnectionStateEnum executeMetaCommand(CState *st, pg_time_usec_t *now);
 static void doLog(TState *thread, CState *st,
   StatsData *agg, bool skipped, double latency, double lag);
-static void processXactStats(TState *thread, CState *st, instr_time *now,
+static void processXactStats(TState *thread, CState *st, pg_time_usec_t *now,
 			 bool skipped, StatsData *agg);
 static void append_fillfactor(char *opts, int len);
 static 

Re: Re[2]: bool_plperl transform

2020-03-01 Thread Tom Lane
Wao  writes:
> Please find the full patch attached.

The cfbot shows this failing to build on Windows:

https://ci.appveyor.com/project/postgresql-cfbot/postgresql/build/1.0.81889

I believe that's a build without plperl, so what it's probably telling
you is that Mkvcbuild.pm needs to be taught to build this module
conditionally, as it already does for hstore_plperl and jsonb_plperl.

Also, while the Linux build is passing, I can't find that it is actually
compiling or testing bool_plperl anywhere:

https://travis-ci.org/postgresql-cfbot/postgresql/builds/656909114

This is likely because you didn't add it to contrib/Makefile.

In general, I'd suggest grepping for references to hstore_plperl
or jsonb_plperl, and making sure that bool_plperl gets added where
appropriate.

I rather imagine you need a .gitignore file, as well.

You're also going to have to provide some documentation, because
I don't see any in the patch.

regards, tom lane




Commitfest 2020-03 Now in Progress

2020-03-01 Thread David Steele

Hackers,

The last Commitfest for v13 is now in progress!

Current stats for the Commitfest are:

Needs review: 192
Waiting on Author: 19
Ready for Committer: 4
Total: 215

Note that this time I'll be ignoring work done prior to the actual CF 
when reporting progress.  Arbitrary, perhaps, but I'm most interested in 
tracking the ongoing progress during the month.


The number of patches waiting on author seems lower (as a percentage) 
than usual which I take to be a good sign.  I'll be assessing the WoA 
patches over the next day or two, so if your patch is in this state get 
a new version in soon.


Please, if you have submitted patches in this CF make sure that you are 
also reviewing patches of a similar number and complexity.  The CF 
cannot move forward without patch review.


Happy Hacking!
--
-David
da...@pgmasters.net




Re: [PATCH] Erase the distinctClause if the result is unique by definition

2020-03-01 Thread Tom Lane
Andy Fan  writes:
> Please see if you have any comments.   Thanks

The cfbot isn't at all happy with this.  Its linux build is complaining
about a possibly-uninitialized variable, and then giving up:

https://travis-ci.org/postgresql-cfbot/postgresql/builds/656722993

The Windows build isn't using -Werror, but it is crashing in at least
two different spots in the regression tests:

https://ci.appveyor.com/project/postgresql-cfbot/postgresql/build/1.0.81778

I've not attempted to identify the cause of that.

At a high level, I'm a bit disturbed that this focuses only on DISTINCT
and doesn't (appear to) have any equivalent intelligence for GROUP BY,
though surely that offers much the same opportunities for optimization.
It seems like it'd be worthwhile to take a couple steps back and see
if we couldn't recast the logic to work for both.

Some other random comments:

* Don't especially like the way you broke query_is_distinct_for()
into two functions, especially when you then introduced a whole
lot of other code in between.  That just makes it harder for the reviewer
to see what changed.  It makes the comments a bit disjointed
too, that is, where you even had any.  (Zero introductory comment
for query_is_distinct_agg is *not* up to project coding standards.
There are a lot of other undercommented places in this patch, too.)

* Definitely don't like having query_distinct_through_join re-open
all the relations.  The data needed for this should get absorbed
while plancat.c has the relations open at the beginning.  (Non-nullness
of columns, in particular, seems like it'll be useful for other
purposes; I'm a bit surprised the planner isn't using that already.)

* In general, query_distinct_through_join seems hugely messy, expensive,
and not clearly correct.  If it is correct, the existing comments sure
aren't enough to explain what it is doing or why.

* Not entirely convinced that a new GUC is needed for this, but if
it is, you have to document it.

* I wouldn't bother with bms_array_free(), nor with any of the other
cleanup you've got at the bottom of query_distinct_through_join.
The planner leaks *lots* of memory, and this function isn't going to
be called so many times that it'll move the needle.

* There seem to be some pointless #include additions, eg in planner.c
the added code doesn't look to justify any of them.  Please also
avoid unnecessary whitespace changes, those also slow down reviewing.

* I see you decided to add a new regression test file select_distinct_2.
That's a poor choice of name because it conflicts with our rules for the
naming of alternative output files.  Besides which, you forgot to plug
it into the test schedule files, so it isn't actually getting run.
Is there a reason not to just add the new test cases to select_distinct?

* There are some changes in existing regression cases that aren't
visibly related to the stated purpose of the patch, eg it now
notices that "select distinct max(unique2) from tenk1" doesn't
require an explicit DISTINCT step.  That's not wrong, but I wonder
if maybe you should subdivide this patch into more than one patch,
because that must be coming from some separate change.  I'm also
wondering what caused the plan change in expected/join.out.

regards, tom lane




RE: Delaying/avoiding BTreeTupleGetNAtts() call within _bt_compare()

2020-03-01 Thread Floris Van Nee
> The cfbot thinks it doesn't even apply anymore --- conflict with the dedup
> patch, no doubt?

Minor conflict with that patch indeed. Attached is a rebased version. I'm running 
some tests now - will post the results here when finished.

-Floris



v4-0001-Avoid-pipeline-stall-in-_bt_compare.patch
Description: v4-0001-Avoid-pipeline-stall-in-_bt_compare.patch


v4-0002-Inline-_bt_compare.patch
Description: v4-0002-Inline-_bt_compare.patch


Re: [HACKERS] WAL logging problem in 9.4.3?

2020-03-01 Thread Noah Misch
On Thu, Feb 27, 2020 at 04:00:24PM +0900, Kyotaro Horiguchi wrote:
> At Tue, 25 Feb 2020 21:36:12 -0800, Noah Misch  wrote in 
> > On Tue, Feb 25, 2020 at 10:01:51AM +0900, Kyotaro Horiguchi wrote:
> > > At Sat, 22 Feb 2020 21:12:20 -0800, Noah Misch  wrote 
> > > in 
> > > > On Fri, Feb 21, 2020 at 04:49:59PM +0900, Kyotaro Horiguchi wrote:
> > > If we decide to keep the consistency there, I would like to describe
> > > the code is there for consistency, not for the benefit of a specific
> > > assertion.
> > > 
> > > (cluster.c:1116)
> > > -* new. The next step for rel2 is deletion, but copy rd_*Subid for the
> > > -* benefit of AssertPendingSyncs_RelationCache().
> > > +* new. The next step for rel2 is deletion, but copy rd_*Subid for the
> > > +* consistency of the fields. It is checked later by
> > > +* AssertPendingSyncs_RelationCache().
> > 
> > I think the word "consistency" is too vague for "consistency of the fields" 
> > to
> > convey information.  May I just remove the last sentence of the comment
> > (everything after "* new.")?
> 
> I'm fine with that:)
> 
> > > I agree that relation works as the generic name of table-like
> > > objects. Addition to that, doesn't using the word "storage file" make
> > > it more clearly?  I'm not confident on the wording itself, but it will
> > > look like the following.
> > 
> > The docs rarely use "storage file" or "on-disk file" as terms.  I hesitate 
> > to
> > put more emphasis on files, because they are part of the implementation, not
> > part of the user interface.  The term "rewrites"/"rewriting" has the same
> > problem, though.  Yet another alternative would be to talk about operations
> > that change the pg_relation_filenode() return value:
> > 
> >   In minimal level, no information is logged for 
> > permanent
> >   relations for the remainder of a transaction that creates them or changes
> >   what pg_relation_filenode returns for them.
> > 
> > What do you think?
> 
> It sounds somewhat obscure.

I see.  I won't use that.

> Couldn't we enumerate examples? And if we
> could use pg_relation_filenode, I think we can use just
> "filenode". (Thuogh the word is used in the documentation, it is not
> defined anywhere..)

func.sgml does define the term.  Nonetheless, I'm not using it.

> 
> In minimal level, no information is logged for
> permanent relations for the remainder of a transaction that creates
> them or changes their filenode. For example, CREATE
> TABLE, CLUSTER or REFRESH MATERIALIZED VIEW are commands of that
> category.
> 
> 
> # sorry for bothering you..

Including examples is fine.  Attached v36nm has just comment and doc changes.
Would you translate this into back-patch versions for v9.5 through v12?
Author: Noah Misch 
Commit: Noah Misch 

Fix cosmetic blemishes involving rd_createSubid.

Remove an obsolete comment from AtEOXact_cleanup().  Restore formatting
of a comment in struct RelationData, mangled by the pgindent run in
commit 9af4159fce6654aa0e081b00d02bca40b978745c.  Back-patch to 9.5 (all
supported versions), because another fix stacks on this.

diff --git a/src/backend/utils/cache/relcache.c 
b/src/backend/utils/cache/relcache.c
index 50f8912..44fa843 100644
--- a/src/backend/utils/cache/relcache.c
+++ b/src/backend/utils/cache/relcache.c
@@ -3029,10 +3029,7 @@ AtEOXact_cleanup(Relation relation, bool isCommit)
 *
 * During commit, reset the flag to zero, since we are now out of the
 * creating transaction.  During abort, simply delete the relcache entry
-* --- it isn't interesting any longer.  (NOTE: if we have forgotten the
-* new-ness of a new relation due to a forced cache flush, the entry 
will
-* get deleted anyway by shared-cache-inval processing of the aborted
-* pg_class insertion.)
+* --- it isn't interesting any longer.
 */
if (relation->rd_createSubid != InvalidSubTransactionId)
{
diff --git a/src/include/utils/rel.h b/src/include/utils/rel.h
index 31d8a1a..bf7a7df 100644
--- a/src/include/utils/rel.h
+++ b/src/include/utils/rel.h
@@ -63,7 +63,7 @@ typedef struct RelationData
 * 
rd_replidindex) */
boolrd_statvalid;   /* is rd_statlist valid? */
 
-   /*
+   /*--
 * rd_createSubid is the ID of the highest subtransaction the rel has
 * survived into; or zero if the rel was not created in the current top
 * transaction.  This can be now be relied on, whereas previously it 
could
@@ -73,8 +73,13 @@ typedef struct RelationData
 * have forgotten changing it). rd_newRelfilenodeSubid can be forgotten
 * when a relation has multiple new relfilenodes within a single
 * transaction, with one of them occurring in a subsequently aborted
-* subtransaction, e.g. BEGIN; TRUNCATE t; SAVEPOINT save; TRUNCATE t;
-* 

Re: Some improvements to numeric sqrt() and ln()

2020-03-01 Thread Dean Rasheed
On Fri, 28 Feb 2020 at 08:15, Dean Rasheed  wrote:
>
> It's possible that there are further gains to be had in the sqrt()
> algorithm on platforms that support 128-bit integers, but I haven't
> had a chance to investigate that yet.
>

Rebased patch attached, now using 128-bit integers for part of
sqrt_var() on platforms that support them. This turned out to be well
worth it (1.5 to 2 times faster than the previous version if the
result has less than 30 or 40 digits).
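
As a rough illustration of why native 128-bit arithmetic helps when the result
is small, an integer square root over a 128-bit value needs only a handful of
machine operations.  This is just a sketch of the idea, assuming compiler
support for __int128; the patch's actual algorithm and its handling of
NumericVar digits differ:

    #include <stdint.h>

    typedef unsigned __int128 uint128;

    /* Newton's method; returns floor(sqrt(n)). */
    static uint64_t
    isqrt_u128(uint128 n)
    {
        uint128     x = n;
        uint128     y = (x + 1) / 2;

        while (y < x)
        {
            x = y;
            y = (x + n / x) / 2;
        }
        return (uint64_t) x;
    }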

Regards,
Dean
diff --git a/src/backend/utils/adt/numeric.c b/src/backend/utils/adt/numeric.c
new file mode 100644
index 10229eb..9e6bb80
--- a/src/backend/utils/adt/numeric.c
+++ b/src/backend/utils/adt/numeric.c
@@ -393,16 +393,6 @@ static const NumericVar const_ten =
 #endif
 
 #if DEC_DIGITS == 4
-static const NumericDigit const_zero_point_five_data[1] = {5000};
-#elif DEC_DIGITS == 2
-static const NumericDigit const_zero_point_five_data[1] = {50};
-#elif DEC_DIGITS == 1
-static const NumericDigit const_zero_point_five_data[1] = {5};
-#endif
-static const NumericVar const_zero_point_five =
-{1, -1, NUMERIC_POS, 1, NULL, (NumericDigit *) const_zero_point_five_data};
-
-#if DEC_DIGITS == 4
 static const NumericDigit const_zero_point_nine_data[1] = {9000};
 #elif DEC_DIGITS == 2
 static const NumericDigit const_zero_point_nine_data[1] = {90};
@@ -518,6 +508,8 @@ static void div_var_fast(const NumericVa
 static int	select_div_scale(const NumericVar *var1, const NumericVar *var2);
 static void mod_var(const NumericVar *var1, const NumericVar *var2,
 	NumericVar *result);
+static void div_mod_var(const NumericVar *var1, const NumericVar *var2,
+		NumericVar *quot, NumericVar *rem);
 static void ceil_var(const NumericVar *var, NumericVar *result);
 static void floor_var(const NumericVar *var, NumericVar *result);
 
@@ -7712,6 +7704,7 @@ div_var_fast(const NumericVar *var1, con
 			 NumericVar *result, int rscale, bool round)
 {
 	int			div_ndigits;
+	int			load_ndigits;
 	int			res_sign;
 	int			res_weight;
 	int		   *div;
@@ -7766,9 +7759,6 @@ div_var_fast(const NumericVar *var1, con
 	div_ndigits += DIV_GUARD_DIGITS;
 	if (div_ndigits < DIV_GUARD_DIGITS)
 		div_ndigits = DIV_GUARD_DIGITS;
-	/* Must be at least var1ndigits, too, to simplify data-loading loop */
-	if (div_ndigits < var1ndigits)
-		div_ndigits = var1ndigits;
 
 	/*
 	 * We do the arithmetic in an array "div[]" of signed int's.  Since
@@ -7781,9 +7771,16 @@ div_var_fast(const NumericVar *var1, con
 	 * (approximate) quotient digit and stores it into div[], removing one
 	 * position of dividend space.  A final pass of carry propagation takes
 	 * care of any mistaken quotient digits.
+	 *
+	 * Note that div[] doesn't necessarily contain all of the digits from the
+	 * dividend --- the desired precision plus guard digits might be less than
+	 * the dividend's precision.  This happens, for example, in the square
+	 * root algorithm, where we typically divide a 2N-digit number by an
+	 * N-digit number, and only require a result with N digits of precision.
 	 */
 	div = (int *) palloc0((div_ndigits + 1) * sizeof(int));
-	for (i = 0; i < var1ndigits; i++)
+	load_ndigits = Min(div_ndigits, var1ndigits);
+	for (i = 0; i < load_ndigits; i++)
 		div[i + 1] = var1digits[i];
 
 	/*
@@ -7844,9 +7841,15 @@ div_var_fast(const NumericVar *var1, con
 			maxdiv += Abs(qdigit);
 			if (maxdiv > (INT_MAX - INT_MAX / NBASE - 1) / (NBASE - 1))
 			{
-/* Yes, do it */
+/*
+ * Yes, do it.  Note that if var2ndigits is much smaller than
+ * div_ndigits, we can save a significant amount of effort
+ * here by noting that we only need to normalise those div[]
+ * entries touched where prior iterations subtracted multiples
+ * of the divisor.
+ */
 carry = 0;
-for (i = div_ndigits; i > qi; i--)
+for (i = Min(qi + var2ndigits - 2, div_ndigits); i > qi; i--)
 {
 	newdig = div[i] + carry;
 	if (newdig < 0)
@@ -8095,6 +8098,74 @@ mod_var(const NumericVar *var1, const Nu
 
 
 /*
+ * div_mod_var() -
+ *
+ *	Calculate the truncated integer quotient and numeric remainder of two
+ *	numeric variables.
+ */
+static void
+div_mod_var(const NumericVar *var1, const NumericVar *var2,
+			NumericVar *quot, NumericVar *rem)
+{
+	NumericVar	q;
+	NumericVar	r;
+
+	init_var(&q);
+	init_var(&r);
+
+	/*
+	 * Use div_var_fast() to get an initial estimate for the integer quotient.
+	 * In practice, this is almost always correct, but it is occasionally off
+	 * by one, which we can easily correct.
+	 */
+	div_var_fast(var1, var2, &q, 0, false);
+	mul_var(var2, &q, &r, var2->dscale);
+	sub_var(var1, &r, &r);
+
+	/*
+	 * Adjust the results if necessary --- the remainder should have the same
+	 * sign as var1, and its absolute value should be less than the absolute
+	 * value of var2.
+	 */
+	while (r.ndigits != 0 && r.sign != var1->sign)
+	{
+		/* The absolute value of the quotient is too large */
+		if (var1->sign == var2->sign)
+		{
+			sub_var(&q, &const_one, &q);
+			add_var(&r, var2, &r);
+		}

Re: partition routing layering in nodeModifyTable.c

2020-03-01 Thread Tom Lane
Amit Langote  writes:
> Rebased again.

Seems to need that again, according to cfbot :-(

regards, tom lane




Re: Delaying/avoiding BTreeTupleGetNAtts() call within _bt_compare()

2020-03-01 Thread Tom Lane
Peter Geoghegan  writes:
> Attached is v3, which no longer includes the third patch/optimization.
> It also inlines (in the second patch) by marking the _bt_compare
> definition as inline, while not changing anything in nbtree.h. I
> believe that this is portable C99 -- let's see what CF Tester thinks
> of it.

The cfbot thinks it doesn't even apply anymore --- conflict with the dedup
patch, no doubt?

regards, tom lane




Re: [HACKERS] Cached plans and statement generalization

2020-03-01 Thread Tom Lane
Konstantin Knizhnik  writes:
> [ autoprepare-extended-4.patch ]

The cfbot is showing that this doesn't apply anymore; there's
some doubtless-trivial conflict in prepare.c.

However ... TBH I've been skeptical of this whole proposal from the
beginning, on the grounds that it would probably hurt more use-cases
than it helps.  The latest approach doesn't really do anything to
assuage that fear, because now that you've restricted it to extended
query mode, the feature amounts to nothing more than deliberately
overriding the application's choice to use unnamed rather than named
(prepared) statements.  How often is that going to be a good idea?
And when it is, wouldn't it be better to fix the app?  The client is
likely to have a far better idea of which statements would benefit
from this treatment than the server will; and in this context,
the client-side changes needed would really be trivial.  The original
proposal, scary as it was, at least supported the argument that you
could use it to improve applications that were too dumb/hoary to
parameterize commands for themselves.

I'm also unimpressed by benchmark testing that relies entirely on
pgbench's default scenario, because that scenario consists entirely
of queries that are perfectly adapted to plan-only-once treatment.
In the real world, we constantly see complaints about cases where the
plancache mistakenly decides that a generic plan is acceptable.  I think
that extending that behavior to more cases is going to be a net loss,
until we find some way to make it smarter and more reliable.  At the
very least, you need to show some worst-case numbers alongside these
best-case numbers --- what happens with queries where we conclude we
must replan every time, so that the plancache becomes pure overhead?
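
For instance (a made-up table, but the usual shape of the problem):

    -- The parameter's selectivity varies so much that one cached generic plan
    -- cannot fit both cases; if the plancache ends up replanning every time,
    -- its bookkeeping is pure overhead.
    PREPARE order_count(text) AS
        SELECT count(*) FROM orders WHERE status = $1;

    EXECUTE order_count('shipped');   -- matches ~99% of rows: seq scan wins
    EXECUTE order_count('on_hold');   -- matches ~0.01% of rows: index scan wins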

The pgbench test case is laughably unrealistic in another way, in that
there are so few distinct queries it issues, so that there's no
stress at all on the cache-management aspect of this problem.

In short, I really think we ought to reject this patch and move on.
Maybe it could be resurrected sometime in the future when we have a
better handle on when to cache plans and when not.

If you want to press forward with it anyway, certainly the lack of
any tests in this patch is another big objection.  Perhaps you
could create a pgbench TAP script that exercises the logic.

regards, tom lane




Re: [PATCH] Add support for ON UPDATE/DELETE actions on ALTER CONSTRAINT

2020-03-01 Thread Tom Lane
Matheus de Oliveira  writes:
> [ postgresql-alter-constraint.v7.patch ]

cfbot says this isn't applying --- looks like a minor conflict in
the regression test file.  Please rebase.

regards, tom lane




Re: proposal - reglanguage type

2020-03-01 Thread Pavel Stehule
On Sun, Mar 1, 2020 at 7:31 PM, Tom Lane wrote:

> Pavel Stehule  writes:
> > I miss a reglanguage type from our set of reg* types.
>
> I'm skeptical about this.  I don't think we want to wind up with a reg*
> type for every system catalog, so there needs to be some rule about which
> ones it's worth the trouble for.  The original idea was to provide a reg*
> type if the lookup rule would be anything more complicated than "select
> oid from  where name = 'foo'".  We went beyond that with
> regnamespace and regrole, but I think there was a sufficient argument of
> usefulness for those two.  I don't see that reglanguage has enough of
> a use-case.
>

there is probably only one use-case - filtering pg_proc. Probably the most
common filter is

prolang = (SELECT oid
             FROM pg_language
            WHERE lanname = 'plpgsql')

It's a little inconvenient; for namespaces we can write pronamespace <>
'pg_catalog'::regnamespace, but there is nothing equivalent for languages.

This feature is interesting for people who write code in plpgsql, or who
migrate from PL/SQL (and for people who use plpgsql_check).

Any mass check (mass usage of plpgsql_check) has to use a filter on prolang.

Regards

Pavel




>
> regards, tom lane
>


Re: proposal - reglanguage type

2020-03-01 Thread Tom Lane
Pavel Stehule  writes:
> I miss a reglanguage type from our set of reg* types.

I'm skeptical about this.  I don't think we want to wind up with a reg*
type for every system catalog, so there needs to be some rule about which
ones it's worth the trouble for.  The original idea was to provide a reg*
type if the lookup rule would be anything more complicated than "select
oid from  where name = 'foo'".  We went beyond that with
regnamespace and regrole, but I think there was a sufficient argument of
usefulness for those two.  I don't see that reglanguage has enough of
a use-case.

regards, tom lane




Re: explain HashAggregate to report bucket and memory stats

2020-03-01 Thread Justin Pryzby
Updated for new tests in 58c47ccfff20b8c125903482725c1dbfd30beade
and rebased.
>From 3e1904c6c36ee3ff4f56a2808c2400a3b2d2a0e5 Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Tue, 31 Dec 2019 18:49:41 -0600
Subject: [PATCH v6 1/7] explain to show tuplehash bucket and memory stats..

Note that hashed SubPlan and recursiveUnion aren't affected in explain output,
probably since hashtables aren't allocated at that point.

Discussion: https://www.postgresql.org/message-id/flat/20200103161925.gm12...@telsasoft.com
---
 .../postgres_fdw/expected/postgres_fdw.out|  56 +--
 src/backend/commands/explain.c| 137 +++--
 src/backend/executor/execGrouping.c   |  27 
 src/backend/executor/nodeAgg.c|  11 ++
 src/backend/executor/nodeRecursiveunion.c |   3 +
 src/backend/executor/nodeSetOp.c  |   1 +
 src/backend/executor/nodeSubplan.c|   3 +
 src/include/executor/executor.h   |   1 +
 src/include/nodes/execnodes.h |   9 ++
 src/test/regress/expected/aggregates.out  |  36 +++--
 src/test/regress/expected/groupingsets.out|  64 ++--
 src/test/regress/expected/join.out|   3 +-
 src/test/regress/expected/matview.out |   9 +-
 .../regress/expected/partition_aggregate.out  | 145 ++
 src/test/regress/expected/partition_join.out  |  13 +-
 src/test/regress/expected/pg_lsn.out  |   3 +-
 src/test/regress/expected/select_distinct.out |   3 +-
 src/test/regress/expected/select_parallel.out |  11 +-
 src/test/regress/expected/subselect.out   |   9 +-
 src/test/regress/expected/tablesample.out |   3 +-
 src/test/regress/expected/union.out   |  15 +-
 src/test/regress/expected/window.out  |   3 +-
 src/test/regress/expected/write_parallel.out  |  16 +-
 23 files changed, 473 insertions(+), 108 deletions(-)

diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out
index 62c2697920..2ddae83178 100644
--- a/contrib/postgres_fdw/expected/postgres_fdw.out
+++ b/contrib/postgres_fdw/expected/postgres_fdw.out
@@ -2086,9 +2086,11 @@ SELECT t1c1, avg(t1c1 + t2c1) FROM (SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2
  ->  HashAggregate
Output: t1.c1, avg((t1.c1 + t2.c1))
Group Key: t1.c1
+   Buckets: 256
->  HashAggregate
  Output: t1.c1, t2.c1
  Group Key: t1.c1, t2.c1
+ Buckets: 4096
  ->  Append
->  Foreign Scan
  Output: t1.c1, t2.c1
@@ -2098,7 +2100,7 @@ SELECT t1c1, avg(t1c1 + t2c1) FROM (SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2
  Output: t1_1.c1, t2_1.c1
  Relations: (public.ft1 t1_1) INNER JOIN (public.ft2 t2_1)
  Remote SQL: SELECT r1."C 1", r2."C 1" FROM ("S 1"."T 1" r1 INNER JOIN "S 1"."T 1" r2 ON (((r1."C 1" = r2."C 1"
-(20 rows)
+(22 rows)
 
 SELECT t1c1, avg(t1c1 + t2c1) FROM (SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) UNION SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1)) AS t (t1c1, t2c1) GROUP BY t1c1 ORDER BY t1c1 OFFSET 100 LIMIT 10;
  t1c1 | avg  
@@ -2129,11 +2131,12 @@ SELECT t1."C 1" FROM "S 1"."T 1" t1, LATERAL (SELECT DISTINCT t2.c1, t3.c1 FROM
  ->  HashAggregate
Output: t2.c1, t3.c1
Group Key: t2.c1, t3.c1
+   Buckets: 2
->  Foreign Scan
  Output: t2.c1, t3.c1
  Relations: (public.ft1 t2) INNER JOIN (public.ft2 t3)
  Remote SQL: SELECT r1."C 1", r2."C 1" FROM ("S 1"."T 1" r1 INNER JOIN "S 1"."T 1" r2 ON (((r1."C 1" = r2."C 1")) AND ((r1.c2 = $1::integer
-(13 rows)
+(14 rows)
 
 SELECT t1."C 1" FROM "S 1"."T 1" t1, LATERAL (SELECT DISTINCT t2.c1, t3.c1 FROM ft1 t2, ft2 t3 WHERE t2.c1 = t3.c1 AND t2.c2 = t1.c2) q ORDER BY t1."C 1" OFFSET 10 LIMIT 10;
  C 1 
@@ -2610,10 +2613,11 @@ select c2 * (random() <= 1)::int as c2 from ft2 group by c2 * (random() <= 1)::i
->  HashAggregate
  Output: ((c2 * ((random() <= '1'::double precision))::integer))
  Group Key: (ft2.c2 * ((random() <= '1'::double precision))::integer)
+ Buckets: 2
  ->  Foreign Scan on public.ft2
Output: (c2 * ((random() <= '1'::double precision))::integer)
Remote SQL: SELECT c2 FROM "S 1"."T 1"
-(9 rows)
+(10 rows)
 
 -- GROUP BY clause in various forms, cardinal, alias and constant expression
 explain (verbose, costs off)
@@ -2713,11 +2717,12 @@ select sum(c1) from ft1 group by c2 having avg(c1 * (random() <= 1)::int) > 100
->  HashAggregate
  Output: sum(c1), c2
  Group Key: ft1.c2
+ Buckets: 16
  Filter: (avg((ft1.c1 * 

Re: Planning counters in pg_stat_statements (using pgss_store)

2020-03-01 Thread legrand legrand
>> I like the idea of adding a check for a non-zero queryId in the new
>> pgss_planner_hook() (zero queryid shouldn't be reserved for
>> utility_statements ?).

> Some assert hit later, I can say that it's not always true.  For
> instance a CREATE TABLE AS won't run parse analysis for the underlying
> query, as this has already been done for the original statement, but
> will still call the planner.  I'll change pgss_planner_hook to ignore
> such cases, as pgss_store would otherwise think that it's a utility
> statement.  That'll probably incidentally fix the IVM incompatibility. 

Today, with or without a test on parse->queryId != UINT64CONST(0),
CTAS is collected as a utility statement without planning counters.
This seems to me to respect the rule; I'm not sure this needs any
new (risky) change to the current (quite stable) patch.





--
Sent from: https://www.postgresql-archive.org/PostgreSQL-hackers-f1928748.html




Re: proposal \gcsv

2020-03-01 Thread Vik Fearing
On 01/03/2020 13:29, Pavel Stehule wrote:
> On Sat, Feb 29, 2020 at 11:34 AM, Vik Fearing wrote:
> 
>> On 29/02/2020 06:43, Pavel Stehule wrote:
>>> Hi
>>>
>>> I would to enhance \g command about variant \gcsv
>>>
>>> proposed command has same behave like \g, only the result will be every
>>> time in csv format.
>>>
>>> It can helps with writing psql macros wrapping \g command.
>>>
>>> Options, notes?
>>
>> But then we would need \ghtml and \glatex etc.  If we want a shortcut
>> for setting a one-off format, I would go for \gf or something.
>>
>> \gf csv
>> \gf html
>> \gf latex
>>
> 
> ok. I implemented \gf. See the attached patch.

I snuck this into the commitfest that starts today while no one was
looking. https://commitfest.postgresql.org/27/2503/

And I added myself as reviewer.
-- 
Vik Fearing




Re: d25ea01275 and partitionwise join

2020-03-01 Thread Amit Langote
On Sat, Feb 29, 2020 at 8:18 AM Tom Lane  wrote:
> Amit Langote  writes:
> > On Wed, Nov 6, 2019 at 2:00 AM Tom Lane  wrote:
> >> Just to leave a breadcrumb in this thread --- the planner failure
> >> induced by d25ea01275 has been fixed in 529ebb20a.  The difficulty
> >> with multiway full joins that Amit started this thread with remains
> >> open, but I imagine the posted patches will need rebasing over
> >> 529ebb20a.
>
> > Here are the rebased patches.
>
> The cfbot shows these patches as failing regression tests.  I think
> it is just cosmetic fallout from 6ef77cf46 having changed EXPLAIN's
> choices of table alias names; but please look closer to confirm,
> and post updated patches.

Thanks for notifying.

Checked and indeed fallout from 6ef77cf46 seems to be the reason a
test is failing.

Updated patches attached.

Thanks,
Amit


v5-0002-Move-some-code-from-joinrel.c-to-relnode.c.patch
Description: Binary data


v5-0001-Some-cosmetic-improvements-to-partitionwise-join-.patch
Description: Binary data


v5-0003-Fix-partitionwise-join-to-handle-FULL-JOINs-corre.patch
Description: Binary data


Re: Planning counters in pg_stat_statements (using pgss_store)

2020-03-01 Thread Julien Rouhaud
Hi,

On Fri, Feb 28, 2020 at 4:06 PM legrand legrand
 wrote:
>
> It seems IVM team does not consider this point as a priority ...

Well, IVM is a big project and I agree that fixing this issue isn't
the most urgent one, especially since there's no guarantee that this
pgss planning patch will be committed, or with the current behavior.

> We should not wait for them, if we want to keep a chance to be
> included in PG13.
>
> So we have to make this feature more robust, an assert failure being
> considered as a severe regression (even if this is not coming from pgss).

I'm still not convinced that handling NULL query string, as in
sometimes ignoring planning counters, is the right solution here. For
now all code is able to provide it (or at least all the code that goes
through make installcheck).   I'm wondering if it'd be better to
instead add a similar assert in pg_plan_query, to make sure that this
requirement is always met even without using pg_stat_statements, or
any other extension that would also rely on that.

I also realized that the last version of the patch I sent was a rebase
of the wrong version, I'll send the correct version soon.

> I like the idea of adding a check for a non-zero queryId in the new
> pgss_planner_hook() (zero queryid shouldn't be reserved for
> utility_statements ?).

Some assert hit later, I can say that it's not always true.  For
instance a CREATE TABLE AS won't run parse analysis for the underlying
query, as this has already been done for the original statement, but
will still call the planner.  I'll change pgss_planner_hook to ignore
such cases, as pgss_store would otherwise think that it's a utility
statement.  That'll probably incidentally fix the IVM incompatibility.
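
A minimal sketch of that check, assuming the planner-hook signature this patch
series proposes (which passes the query string down) and abbreviating the
instrumented path:

    static PlannedStmt *
    pgss_planner(Query *parse, const char *query_string,
                 int cursorOptions, ParamListInfo boundParams)
    {
        /*
         * Queries that never went through parse analysis on their own, such
         * as the rewritten query of CREATE TABLE ... AS, reach the planner
         * with queryId 0; treat them as untracked rather than letting
         * pgss_store() classify them as utility statements.
         */
        if (parse->queryId == UINT64CONST(0))
            return standard_planner(parse, query_string,
                                    cursorOptions, boundParams);

        /* real hook: time this call and hand the duration to pgss_store() */
        return standard_planner(parse, query_string,
                                cursorOptions, boundParams);
    }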




Re: Broken resetting of subplan hash tables

2020-03-01 Thread Ranier Vilela
On Sat, Feb 29, 2020 at 6:44 PM, Tom Lane wrote:

> "throw it away" sure looks like it means the entire hashtable, not just
> its tuple contents.
>
I don't know if I can comment clearly enough to help, but in my experience,
destroying and rebuilding the hashtable is a waste when it can simply be
reset.
By analogy, I have code with arrays where I reuse them with only one
line instead of reconstructing them:
a->nelts = 0; /* reset array */
If possible, doing the same for hashtables would be great.
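
For what it's worth, the executor already has such a primitive for its tuple
hash tables; the pattern would look roughly like this (placement and field
names here are illustrative, not a concrete proposal):

    /* on rescan: empty the table but keep its memory and metadata */
    if (node->hashtable != NULL)
        ResetTupleHashTable(node->hashtable);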

regards,
Ranier Vilela


Re: proposal \gcsv

2020-03-01 Thread Pavel Stehule
On Sat, Feb 29, 2020 at 6:06 PM, David Fetter wrote:

> On Sat, Feb 29, 2020 at 11:59:22AM +0100, Pavel Stehule wrote:
> > On Sat, Feb 29, 2020 at 11:34 AM, Vik Fearing wrote:
> >
> > > On 29/02/2020 06:43, Pavel Stehule wrote:
> > > > Hi
> > > >
> > > > I would to enhance \g command about variant \gcsv
> > > >
> > > > proposed command has same behave like \g, only the result will be
> every
> > > > time in csv format.
> > > >
> > > > It can helps with writing psql macros wrapping \g command.
> > > >
> > > > Options, notes?
> > >
> > > But then we would need \ghtml and \glatex etc.  If we want a shortcut
> > > for setting a one-off format, I would go for \gf or something.
> > >
> > > \gf csv
> > > \gf html
> > > \gf latex
> > >
> >
> > usability of html or latex format in psql is significantly lower than csv
> > format. There is only one generic format for data - csv.
>
> Not exactly.  There's a lot of uses for things along the lines of
>
> \gf json
> \gf yaml
>
> I'd rather add a new \gf that takes arguments, as it seems more
> extensible. For example, there are uses for
>

I implemented \gf by Vik's proposal


> \gf csv header
>
> if no header is the default, or
>
> \gf csv noheader
>

It is a little bit hard (although it looks simple).

The second option of this command can be a file - and it reads everything to
the end of the line. So in this case an implementation of variadic parameters
is difficult.

Motivation for this patch is a possibility to write macros like

postgres=# \set gnuplot '\\g | gnuplot -p -e "set datafile separator
\',\'; set key autotitle columnhead; set terminal dumb enhanced; plot
\'-\'with boxes"'

postgres=# \pset format csv

postgres=# select i, sin(i) from generate_series(0, 6.3, 0.05) g(i) :gnuplot


with \gf csv I can do almost what I need.

\set gnuplot '\\gf csv | gnuplot -p -e "set datafile separator \',\'; set
key autotitle columnhead; set terminal dumb enhanced; plot \'-\'with
boxes"'


> if header is the default.
>
> Best,
> David.
> --
> David Fetter  http://fetter.org/
> Phone: +1 415 235 3778
>
> Remember to vote!
> Consider donating to Postgres: http://www.postgresql.org/about/donate
>


Re: proposal \gcsv

2020-03-01 Thread Pavel Stehule
On Sat, Feb 29, 2020 at 11:34 AM, Vik Fearing wrote:

> On 29/02/2020 06:43, Pavel Stehule wrote:
> > Hi
> >
> > I would to enhance \g command about variant \gcsv
> >
> > proposed command has same behave like \g, only the result will be every
> > time in csv format.
> >
> > It can helps with writing psql macros wrapping \g command.
> >
> > Options, notes?
>
> But then we would need \ghtml and \glatex etc.  If we want a shortcut
> for setting a one-off format, I would go for \gf or something.
>
> \gf csv
> \gf html
> \gf latex
>

ok. I implemented \gf. See the attached patch.

Regards

Pavel


> -1 on \gcsv
> --
> Vik Fearing
>
diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml
index 20ba105160..734b3e94e0 100644
--- a/doc/src/sgml/ref/psql-ref.sgml
+++ b/doc/src/sgml/ref/psql-ref.sgml
@@ -2108,6 +2108,16 @@ CREATE INDEX
 
   
 
+  
+\gf format [ filename ]
+\gf format [ |command ]
+
+
+\gf is equivalent to \g, but
+forces specified format.  See \pset format.
+
+
+  
 
   
 \gset [ prefix ]
diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c
index e111cee556..fe93313964 100644
--- a/src/bin/psql/command.c
+++ b/src/bin/psql/command.c
@@ -163,6 +163,8 @@ static void printGSSInfo(void);
 static bool printPsetInfo(const char *param, struct printQueryOpt *popt);
 static char *pset_value_string(const char *param, struct printQueryOpt *popt);
 
+static bool format_number(const char *value, int vallen, enum printFormat *numformat);
+
 #ifdef WIN32
 static void checkWin32Codepage(void);
 #endif
@@ -333,7 +335,8 @@ exec_command(const char *cmd,
 		status = exec_command_errverbose(scan_state, active_branch);
 	else if (strcmp(cmd, "f") == 0)
 		status = exec_command_f(scan_state, active_branch);
-	else if (strcmp(cmd, "g") == 0 || strcmp(cmd, "gx") == 0)
+	else if (strcmp(cmd, "g") == 0 || strcmp(cmd, "gx") == 0 ||
+			 strcmp(cmd, "gf") == 0)
 		status = exec_command_g(scan_state, active_branch, cmd);
 	else if (strcmp(cmd, "gdesc") == 0)
 		status = exec_command_gdesc(scan_state, active_branch);
@@ -1259,7 +1262,42 @@ exec_command_g(PsqlScanState scan_state, bool active_branch, const char *cmd)
 
 	if (active_branch)
 	{
-		char	   *fname = psql_scan_slash_option(scan_state,
+		char	   *fname;
+
+		if (strcmp(cmd, "gf") == 0)
+		{
+			char	   *fmtname = psql_scan_slash_option(scan_state,
+		 OT_NORMAL, NULL, false);
+
+			if (!fmtname)
+			{
+pg_log_error("no format name");
+
+return PSQL_CMD_ERROR;
+			}
+			else
+			{
+enum printFormat	format;
+bool		result;
+
+result = format_number(fmtname, strlen(fmtname), &format);
+
+free(fmtname);
+
+if (result)
+{
+	pset.gf_format = format;
+}
+else
+{
+	pg_log_error("\\pset: allowed formats are aligned, asciidoc, csv, html, latex, latex-longtable, troff-ms, unaligned, wrapped");
+
+	return PSQL_CMD_ERROR;
+}
+			}
+		}
+
+		fname  = psql_scan_slash_option(scan_state,
    OT_FILEPIPE, NULL, false);
 
 		if (!fname)
@@ -3746,6 +3784,68 @@ _unicode_linestyle2string(int linestyle)
 	return "unknown";
 }
 
+/*
+ * Returns true if format name was recognized.
+ */
+static bool
+format_number(const char *value, int vallen, enum printFormat *numformat)
+{
+	static const struct fmt
+	{
+		const char *name;
+		enum printFormat number;
+	}			formats[] =
+	{
+		/* remember to update error message below when adding more */
+		{"aligned", PRINT_ALIGNED},
+		{"asciidoc", PRINT_ASCIIDOC},
+		{"csv", PRINT_CSV},
+		{"html", PRINT_HTML},
+		{"latex", PRINT_LATEX},
+		{"troff-ms", PRINT_TROFF_MS},
+		{"unaligned", PRINT_UNALIGNED},
+		{"wrapped", PRINT_WRAPPED}
+	};
+
+	int			match_pos = -1;
+
+	for (int i = 0; i < lengthof(formats); i++)
+	{
+		if (pg_strncasecmp(formats[i].name, value, vallen) == 0)
+		{
+			if (match_pos < 0)
+match_pos = i;
+			else
+			{
+pg_log_error("\\pset: ambiguous abbreviation \"%s\" matches both \"%s\" and \"%s\"",
+			 value,
+			 formats[match_pos].name, formats[i].name);
+return false;
+			}
+		}
+	}
+	if (match_pos >= 0)
+	{
+		*numformat = formats[match_pos].number;
+
+		return true;
+	}
+	else if (pg_strncasecmp("latex-longtable", value, vallen) == 0)
+	{
+		/*
+		 * We must treat latex-longtable specially because latex is a
+		 * prefix of it; if both were in the table above, we'd think
+		 * "latex" is ambiguous.
+		 */
+		*numformat = PRINT_LATEX_LONGTABLE;
+
+		return true;
+	}
+
+	return false;
+}
+
+
 /*
  * do_pset
  *
@@ -3763,55 +3863,14 @@ do_pset(const char *param, const char *value, printQueryOpt *popt, bool quiet)
 	/* set format */
 	if (strcmp(param, "format") == 0)
 	{
-		static const struct fmt
-		{
-			const char *name;
-			enum printFormat number;
-		}			formats[] =
-		{
-			/* remember to update error message below when adding more */
-			{"aligned", PRINT_ALIGNED},
-			{"asciidoc", PRINT_ASCIIDOC},
-		

Re: [PATCH] Add schema and table names to partition error

2020-03-01 Thread Amit Kapila
On Sun, Mar 1, 2020 at 10:10 AM Amit Langote  wrote:
>
> Hi Chris,
>
> On Sun, Mar 1, 2020 at 4:34 AM Chris Bandy  wrote:
> > Hello,
> >
> > I'm writing telemetry data into a table partitioned by time. When there
> > is no partition for a particular date, my app notices the constraint
> > violation, creates the partition, and retries the insert.
> >
> > I'm used to handling constraint violations by observing the constraint
> > name in the error fields. However, this error had none. I set out to add
> > the name to the error field, but after a bit of reading my impression is
> > that partition constraints are more like a property of a table.
>
> This makes sense to me.  Btree code which implements unique
> constraints also does this; see _bt_check_unique() function in
> src/backend/access/nbtree/nbtinsert.c:
>
> ereport(ERROR,
> (errcode(ERRCODE_UNIQUE_VIOLATION),
>  errmsg("duplicate key value violates
> unique constraint \"%s\"",
> RelationGetRelationName(rel)),
>  key_desc ? errdetail("Key %s already 
> exists.",
>   key_desc) : 0,
>  errtableconstraint(heapRel,
>
> RelationGetRelationName(rel))));
>
> > I've attached a patch that adds the schema and table name fields to
> > errors for my use case:
>
> Instead of using errtable(), use errtableconstraint() like the btree
> code does, if only just for consistency.
>

+1.  We use errtableconstraint at other places where we use error code
ERRCODE_CHECK_VIOLATION.
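
To make that concrete, the report in ExecPartitionCheckEmitError() would gain
an errtableconstraint() call along these lines (only a sketch, reusing the
existing message and assuming the local variable names in that function):

    ereport(ERROR,
            (errcode(ERRCODE_CHECK_VIOLATION),
             errmsg("new row for relation \"%s\" violates partition constraint",
                    RelationGetRelationName(resultRelInfo->ri_RelationDesc)),
             val_desc ? errdetail("Failing row contains %s.", val_desc) : 0,
             errtableconstraint(resultRelInfo->ri_RelationDesc,
                                RelationGetRelationName(resultRelInfo->ri_RelationDesc))));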

> > - Insert data into a partitioned table for which there is no partition.
> > - Insert data directly into an incorrect partition.
>
> There are couple more instances in src/backend/command/tablecmds.c
> where partition constraint is checked:
>
> In ATRewriteTable():
>
> if (partqualstate && !ExecCheck(partqualstate, econtext))
> {
> if (tab->validate_default)
> ereport(ERROR,
> (errcode(ERRCODE_CHECK_VIOLATION),
>  errmsg("updated partition constraint for
> default partition \"%s\" would be violated by some row",
> RelationGetRelationName(oldrel))));
> else
> ereport(ERROR,
> (errcode(ERRCODE_CHECK_VIOLATION),
>  errmsg("partition constraint of relation
> \"%s\" is violated by some row",
> RelationGetRelationName(oldrel))));
> }
>
> Maybe, better fix these too for completeness.
>

Right, if we want to make a change for this, then I think we can once
check all the places where we use error code ERRCODE_CHECK_VIOLATION.
Another thing we might need to see is which of these can be
back-patched.  We should also try to write the tests for cases we are
changing even if we don't want to commit those.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re[2]: bool_plperl transform

2020-03-01 Thread Ivan Panchenko



  
>Sunday, February 1, 2020, 1:15 +03:00 from Tom Lane:
> 
>=?UTF-8?B?SXZhbiBQYW5jaGVua28=?= < w...@mail.ru > writes:
>> While using PL/Perl I have found that it obtains boolean arguments from 
>> Postgres as ‘t’ and ‘f’, which is extremely inconvenient because ‘f’ is not 
>> false from the perl viewpoint.
>> ...
>> * make a transform which transforms bool, like it is done with jsonb. This 
>> does not break compatibility and is rather straightforward.
>Please register this patch in the commitfest app, so we don't lose track
>of it.
>
>https://commitfest.postgresql.org/27/
Done:
https://commitfest.postgresql.org/27/2502/
 
Regards,
Ivan
 
>
>regards, tom lane 
 
 
 
 

Re: Improving connection scalability: GetSnapshotData()

2020-03-01 Thread Amit Kapila
On Sun, Mar 1, 2020 at 2:17 PM Andres Freund  wrote:
>
> Hi,
>
> On 2020-03-01 00:36:01 -0800, Andres Freund wrote:
> > conns   tps mastertps pgxact-split
> >
> > 1   26842.49284526524.194821
> > 10  246923.158682   249224.782661
> > 50  695956.539704   709833.746374
> > 100 1054727.043139  1903616.306028
> > 200 964795.282957   1949200.338012
> > 300 906029.377539   1927881.231478
> > 400 845696.690912   1911065.369776
> > 500 812295.222497   1926237.255856
> > 600 888030.104213   1903047.236273
> > 700 866896.532490   1886537.202142
> > 800 863407.341506   1883768.592610
> > 900 871386.608563   1874638.012128
> > 1000887668.277133   1876402.391502
> > 1500860051.361395   1815103.564241
> > 2000890900.098657   1775435.271018
> > 3000874184.980039   1653953.817997
> > 4000845023.080703   1582582.316043
> > 5000817100.195728   1512260.802371
> >
> > I think these are pretty nice results.
>

Nice improvement.  +1 for improving the scalability for higher connection count.


-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: Yet another fast GiST build

2020-03-01 Thread Komяpa
Hello,

Thanks for the patch and for working on GiST infrastructure; it's really
valuable for PostGIS use cases and I look forward to seeing this improvement
in PG13.

On Sat, Feb 29, 2020 at 3:13 PM Andrey M. Borodin  wrote:

> Thomas, I've used your wording almost exactly as the explanation of how
> point_zorder_internal() works. It has more explanatory power than my attempts
> to compose a good comment.

PostGIS uses this trick to ensure locality. In PostGIS 3 we enhanced
that trick to have the Hilbert curve instead of Z Order curve.

For visual representation have a look at these links:
 - http://blog.cleverelephant.ca/2019/08/postgis-3-sorting.html - as
it's implemented in PostGIS btree sorting opclass
 - https://observablehq.com/@mourner/hilbert-curve-packing - to
explore general approach.
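
For readers unfamiliar with the trick, its heart is plain bit interleaving; a
self-contained sketch (not the patch's code) of a Z-order (Morton) key for a
2-D point:

    #include <stdint.h>

    /*
     * Interleave the bits of x and y: bit i of x becomes bit 2*i of the key
     * and bit i of y becomes bit 2*i + 1, so points that are close in space
     * tend to get close keys, which is what gives the sorted build locality.
     */
    static uint64_t
    zorder_key(uint32_t x, uint32_t y)
    {
        uint64_t    key = 0;

        for (int i = 0; i < 32; i++)
        {
            key |= ((uint64_t) ((x >> i) & 1)) << (2 * i);
            key |= ((uint64_t) ((y >> i) & 1)) << (2 * i + 1);
        }
        return key;
    }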

Indeed, if it feels insecure to work with bit magic, that implementation
can be left to extensions.

> There is one design decision that worries me most:
> should we use opclass function or index option to provide this sorting 
> information?
> It is needed only during index creation, actually. And having extra i-class 
> only for fast build
> seems excessive.
> I think we can provide both ways and let opclass developers decide?

The reloption variant looks dirty. It won't cover an index on (id uuid,
geom geometry) where id is duplicated (say, a tracked car identifier)
but present in every query - there is no way to pass such a thing as a
reloption. I'm also concerned about the security of passing a sortsupport
function manually during index creation (what if it's not from the same
extension, or is even something a (wrong) user defined).

We know for sure it's a good idea for all btree_gist types and
geometry, and I don't see a case where a user would want to disable it.
Just make it part of the operator class, and that would also allow fast
creation of multi-column indexes.

Thanks.

-- 
Darafei Praliaskouski
Support me: http://patreon.com/komzpa




proposal - reglanguage type

2020-03-01 Thread Pavel Stehule
Hi

I miss a reglanguage type from our set of reg* types.

It reduces the mental overhead for queries over the pg_proc table.

With this type I can easily filter only plpgsql functions:

select *
  from pg_proc
where prolang = 'plpgsql'::reglanguage
   and pronamespace <> 'pg_catalog'::regnamespace;

Regards

Pavel
diff --git a/doc/src/sgml/datatype.sgml b/doc/src/sgml/datatype.sgml
index d1d033178f..85bcd43d3d 100644
--- a/doc/src/sgml/datatype.sgml
+++ b/doc/src/sgml/datatype.sgml
@@ -4525,8 +4525,8 @@ INSERT INTO mytable VALUES(-1);  -- fails
 several alias types for oid: regproc,
 regprocedure, regoper, regoperator,
 regclass, regtype, regrole,
-regnamespace, regconfig, and
-regdictionary.   shows an
+regnamespace, regconfig, reglanguage
+and regdictionary.   shows an
 overview.

 
@@ -4660,6 +4660,14 @@ SELECT * FROM pg_attribute
 text search dictionary
 simple

+
+   
+reglanguage
+pg_language
+language name
+plpgsql
+   
+
   
  
 
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 28035f1635..2bb1b2fd8e 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -18413,6 +18413,11 @@ SELECT pg_type_is_visible('myschema.widget'::regtype);
regrole
get the OID of the named role
   
+  
+   to_reglanguage(language_name)
+   reglanguage
+   get the OID of the named language
+  
  
 

@@ -18717,7 +18722,7 @@ SELECT collation for ('foo' COLLATE "de_DE");
names (given as text) to objects of
type regclass, regproc, regprocedure,
regoper, regoperator, regtype,
-   regnamespace, and regrole
+   regnamespace, regrole and reglanguage
respectively.  These functions differ from a cast from
text in that they don't accept a numeric OID, and that they return null
rather than throwing an error if the name is not found (or, for
diff --git a/src/backend/utils/adt/regproc.c b/src/backend/utils/adt/regproc.c
index f0fa52bc27..da412ef01b 100644
--- a/src/backend/utils/adt/regproc.c
+++ b/src/backend/utils/adt/regproc.c
@@ -29,6 +29,7 @@
 #include "catalog/pg_ts_config.h"
 #include "catalog/pg_ts_dict.h"
 #include "catalog/pg_type.h"
+#include "commands/proclang.h"
 #include "lib/stringinfo.h"
 #include "miscadmin.h"
 #include "parser/parse_type.h"
@@ -1717,6 +1718,118 @@ stringToQualifiedNameList(const char *string)
 	return result;
 }
 
+/*
+ * reglanguagein		- converts "language" to type OID
+ *
+ * We also accept a numeric OID, for symmetry with the output routine,
+ * and for possible use in bootstrap mode.
+ *
+ * '-' signifies unknown (OID 0).  In all other cases, the input must
+ * match an existing pg_language entry.
+ */
+Datum
+reglanguagein(PG_FUNCTION_ARGS)
+{
+	char	   *language_name_or_oid = PG_GETARG_CSTRING(0);
+	Oid			result = InvalidOid;
+
+	/* '-' ? */
+	if (strcmp(language_name_or_oid, "-") == 0)
+		PG_RETURN_OID(InvalidOid);
+
+	/* Numeric OID? */
+	if (language_name_or_oid[0] >= '0' &&
+		language_name_or_oid[0] <= '9' &&
+		strspn(language_name_or_oid, "0123456789") == strlen(language_name_or_oid))
+	{
+		result = DatumGetObjectId(DirectFunctionCall1(oidin,
+	  CStringGetDatum(language_name_or_oid)));
+		PG_RETURN_OID(result);
+	}
+
+	/* Else it's a type name, possibly schema-qualified or decorated */
+
+	/* The rest of this wouldn't work in bootstrap mode */
+	if (IsBootstrapProcessingMode())
+		elog(ERROR, "reglanguage values must be OIDs in bootstrap mode");
+
+	result = get_language_oid(language_name_or_oid, false);
+
+	PG_RETURN_OID(result);
+}
+
+/*
+ * to_reglanguage	- converts "language name" to language OID
+ *
+ * If the name is not found, we return NULL.
+ */
+Datum
+to_reglanguage(PG_FUNCTION_ARGS)
+{
+	char	   *language_name = text_to_cstring(PG_GETARG_TEXT_PP(0));
+	Oid			result;
+
+	result = get_language_oid(language_name, true);
+
+	if (OidIsValid(result))
+		PG_RETURN_OID(result);
+	else
+		PG_RETURN_NULL();
+}
+
+/*
+ * reglanguageout		- converts language OID to "language_name"
+ */
+Datum
+reglanguageout(PG_FUNCTION_ARGS)
+{
+	Oid			langid = PG_GETARG_OID(0);
+	char	   *result;
+
+	if (langid == InvalidOid)
+	{
+		result = pstrdup("-");
+		PG_RETURN_CSTRING(result);
+	}
+
+	result = get_language_name(langid, true);
+
+	if (result)
+	{
+		/* pstrdup is not really necessary, but it avoids a compiler warning */
+		result = pstrdup(quote_identifier(result));
+	}
+	else
+	{
+		/* If OID doesn't match any language, return it numerically */
+		result = (char *) palloc(NAMEDATALEN);
+		snprintf(result, NAMEDATALEN, "%u", langid);
+	}
+
+	PG_RETURN_CSTRING(result);
+}
+
+/*
+ *		reglanguagerecv			- converts external binary format to reglanguage
+ */
+Datum
+reglanguagerecv(PG_FUNCTION_ARGS)
+{
+	/* Exactly the same as oidrecv, so share code */
+	return oidrecv(fcinfo);
+}
+
+/*
+ *		reglanguagesend			- converts reglanguage to binary format
+ */
+Datum
+reglanguagesend(PG_FUNCTION_ARGS)
+{
+	

Re: Improving connection scalability: GetSnapshotData()

2020-03-01 Thread Andres Freund
Hi,

On 2020-03-01 00:36:01 -0800, Andres Freund wrote:
> conns   tps mastertps pgxact-split
> 
> 1   26842.49284526524.194821
> 10  246923.158682   249224.782661
> 50  695956.539704   709833.746374
> 100 1054727.043139  1903616.306028
> 200 964795.282957   1949200.338012
> 300 906029.377539   1927881.231478
> 400 845696.690912   1911065.369776
> 500 812295.222497   1926237.255856
> 600 888030.104213   1903047.236273
> 700 866896.532490   1886537.202142
> 800 863407.341506   1883768.592610
> 900 871386.608563   1874638.012128
> 1000887668.277133   1876402.391502
> 1500860051.361395   1815103.564241
> 2000890900.098657   1775435.271018
> 3000874184.980039   1653953.817997
> 4000845023.080703   1582582.316043
> 5000817100.195728   1512260.802371
> 
> I think these are pretty nice results.

Attached as a graph as well.


Improving connection scalability: GetSnapshotData()

2020-03-01 Thread Andres Freund
Hi,

I think postgres' issues with scaling to larger numbers of connections
is a serious problem in the field. While poolers can address some of
that, given the issues around prepared statements, transaction state,
etc, I don't think that's sufficient in many cases. It also adds
latency.

Nor do I think the argument that one shouldn't have more than a few
dozen connections holds particularly much water. As clients have think
time, and database results have to be sent/received (most clients don't
use pipelining), and as many applications have many application servers
with individual connection pools, it's very common to need more
connections than postgres can easily deal with.


The largest reason for that is GetSnapshotData(). It scales poorly to
larger connection counts. Part of that is obviously it's O(connections)
nature, but I always thought it had to be more.  I've seen production
workloads spending > 98% of the cpu time in GetSnapshotData().


After a lot of analysis and experimentation I figured out that the
primary reason for this is PGXACT->xmin. Even the simplest transaction
modifies MyPgXact->xmin several times during its lifetime (IIRC twice
(snapshot & release) for exec_bind_message(), same for
exec_exec_message(), then again as part of EOXact processing). Which
means that a backend doing GetSnapshotData() on a system with a number
of other connections active, is very likely to hit PGXACT cachelines
that are owned by another cpu / set of cpus / socket. The larger the
system is, the worse the consequences of this are.

This problem is most prominent (and harder to fix) for xmin, but also
exists for the other fields in PGXACT. We rarely have xid, nxids,
overflow, or vacuumFlags set, yet constantly set them, leading to
cross-node traffic.

The second biggest problem is that the indirection through pgprocnos
that GetSnapshotData() has to do to go through to get each backend's
xmin is very unfriendly for a pipelined CPU (i.e. all that postgres runs
on). There's basically a stall at the end of every loop iteration -
which is exacerbated by there being so many cache misses.


It's fairly easy to avoid unnecessarily dirtying cachelines for all the
PGXACT fields except xmin. Because that actually needs to be visible to
other backends.


While it sounds almost trivial in hindsight, it took me a long while to
grasp a solution to a big part of this problem: We don't actually need
to look at PGXACT->xmin to compute a snapshot. The only reason that
GetSnapshotData() does so, is because it also computes
RecentGlobal[Data]Xmin.

But we don't actually need them all that frequently. They're primarily
used as horizons for heap_page_prune_opt() etc. For one, while
pruning is really important, it doesn't happen *all* the time. But more
importantly, a RecentGlobalXmin from an earlier transaction is actually
sufficient for most pruning requests, especially when there is a larger
percentage of reading than updating transactions (very common).

By having GetSnapshotData() compute an accurate upper bound after which
we are certain not to be able to prune (basically the transaction's
xmin, slots horizons, etc), and a conservative lower bound below which
we are definitely able to prune, we can allow some pruning actions to
happen. If a pruning request (or something similar) encounters an xid
between those, an accurate lower bound can be computed.

That allows to avoid looking at PGXACT->xmin.


To address the second big problem (the indirection), we can instead pack
the contents of PGXACT tightly, just like we do for pgprocnos. In the
attached series, I introduced separate arrays for xids, vacuumFlags,
nsubxids.

The reason for splitting them is that they change at different rates,
and different sizes.  In a read-mostly workload, most backends are not
going to have an xid, therefore making the xids array almost
constant. As long as all xids are unassigned, GetSnapshotData() doesn't
need to look at anything else, therefore making it sensible to check the
xid first.
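
A cartoon of what the snapshot loop becomes with the dense arrays (identifier
names here are illustrative rather than the patch's; own-xid and subxid
handling is elided):

    /* one contiguous, shared array of assigned xids */
    TransactionId *xids = ProcGlobal->xids;

    for (int i = 0; i < numProcs; i++)
    {
        TransactionId xid = xids[i];

        /* the common read-mostly case: no xid assigned, nothing more to check */
        if (!TransactionIdIsValid(xid))
            continue;
        if (TransactionIdFollowsOrEquals(xid, xmax))
            continue;           /* too new to matter for this snapshot */

        snapshot->xip[count++] = xid;
    }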


Here are some numbers for the submitted patch series. I'd to cull some
further improvements to make it more manageable, but I think the numbers
still are quite convincing.

The workload is a pgbench readonly, with pgbench -M prepared -c $conns
-j $conns -S -n for each client count.  This is on a machine with 2
Intel(R) Xeon(R) Platinum 8168, but virtualized.

conns   tps master  tps pgxact-split

1   26842.49284526524.194821
10  246923.158682   249224.782661
50  695956.539704   709833.746374
100 1054727.043139  1903616.306028
200 964795.282957   1949200.338012
300 906029.377539   1927881.231478
400 845696.690912   1911065.369776
500 812295.222497   1926237.255856
600 888030.104213   1903047.236273
700 866896.532490   1886537.202142
800 863407.341506   1883768.592610
900 871386.608563   

Re[2]: bool_plperl transform

2020-03-01 Thread Wao

Sorry,
 
Please find the full patch attached.
 
Ivan
  
>Sunday, March 1, 2020, 7:57 +03:00 from Andrew Dunstan:
> 
>
>On 2/29/20 4:55 PM, Ivan Panchenko wrote:
>> Hi,
>> While using PL/Perl I have found that it obtains boolean arguments
>> from Postgres as ‘t’ and ‘f’, which is extremely inconvenient because
>> ‘f’ is not false from the perl viewpoint.
>> So the problem is how to convert the SQL booleans into Perl style.
>>  
>> There are 3 ways to do this:
>>
>> 1. make plperl automatically convert bools into something acceptable
>> for perl. This looks simple, but probably is not acceptable as it
>> breaks compatibility.
>> 2. try to make some trick like it is done with arrays, i.e. convert
>> bools into special Perl objects which look like ‘t’ and ‘f’ when
>> treated as text, but are true and false for boolean operations. I
>> am not sure that it is possible and reliable.
>> 3. make a transform which transforms bool, like it is done with
>> jsonb. This does not break compatibility and is rather
>> straightforward.
>>
>> So I propose to take the third way and make such transform. This is
>> very simple, a patch is attached.
>> Also this patch improves the plperl documentation page, which now has
>> nothing said about the transforms.
>>  
>>
>
>Patch appears to be missing all the new files.
>
>
>cheers
>
>
>andrew
>
>
>
>--
>Andrew Dunstan  https://www.2ndQuadrant.com
>PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>  
 
 
--
Ivan Panchenko
 

bool_plperl_transform_v2.patch
Description: Binary data