Re: Is it worth accepting multiple CRLs?

2021-01-19 Thread Peter Eisentraut

On 2021-01-19 01:17, Kyotaro Horiguchi wrote:

Thank you for the information. The only reason for sharing the same
variable for both the file and the directory was to avoid adding another
variable just for this purpose. I'll post a new version where a new GUC
ssl_crl_path is added.


Okay, I look forward to that patch.


By the way, we could do the same thing for the CA file/dir, but I personally
think that the benefit of specify-by-directory for CA files is
far smaller than for CRL files. So I'm not going to do this for CA files for
now.


Yeah, that seems not so commonly used.




Re: popcount

2021-01-19 Thread Peter Eisentraut

On 2021-01-18 16:34, Tom Lane wrote:

Peter Eisentraut  writes:

[ assorted nits ]


At the level of bikeshedding ... I quite dislike using the name "popcount"
for these functions.  I'm aware that some C compilers provide primitives
of that name, but I wouldn't expect a SQL programmer to know that;
without that context the name seems pretty random and unintuitive.
Moreover, it invites confusion with SQL's use of "pop" to abbreviate
"population" in the statistical aggregates, such as var_pop().


I was thinking about that too, but according to 
, popcount is an accepted 
high-level term, with "pop" also standing for "population".





Re: Change default of checkpoint_completion_target

2021-01-19 Thread Peter Eisentraut

On 2021-01-13 23:10, Stephen Frost wrote:

Yes, I agree, and am involved in that thread as well - currently waiting
for feedback from others about the proposed approach.

I've tried to push that forward.  I'm happy to update this patch once
we've got agreement to move forward on that, to wit, adding information
about this particular GUC to an 'obsolete' section in the documentation,
explaining how it's been removed due to no longer being sensible or
necessary to have.


Some discussion a few days ago was arguing that it was still necessary 
in some cases as a way to counteract the possible lack of tuning in the 
kernel flushing behavior.  I think in light of that we should go with 
your first patch that just changes the default, possibly with the 
documentation updated a bit.





Re: Paint some PG_USED_FOR_ASSERTS_ONLY in inline functions of ilist.h and bufpage.h

2021-01-19 Thread Julien Rouhaud
On Tue, Jan 19, 2021 at 9:53 AM Michael Paquier  wrote:
>
> Hi all,
>
> The following functions in ilist.h and bufpage.h use some arguments
> only in assertions:
> - dlist_next_node
> - dlist_prev_node
> - slist_has_next
> - slist_next_node
> - PageValidateSpecialPointer
>
> Without PG_USED_FOR_ASSERTS_ONLY, this can lead to compilation
> warnings when not using assertions, and one example of that is
> plpgsql_check.

For the record, that's due to this extra flag in the Makefile:

override CFLAGS += -I$(top_builddir)/src/pl/plpgsql/src -Wall

I think that we're still far from being able to get a clean output
using -Wall on postgres itself, so I don't know how much we can
promise to external code, but fixing those may be a good step.

>  We don't have examples on HEAD where
> PG_USED_FOR_ASSERTS_ONLY is used on arguments, but that looks to work
> properly with gcc.

Yeah, I don't see any explicit mention of that in the gcc manual.  For the
record, it also works as expected using clang, and the attached patch
removes all warnings when compiling plpgsql_check.
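For readers unfamiliar with the macro, here is a self-contained sketch of the
pattern being discussed. In PostgreSQL the marker is PG_USED_FOR_ASSERTS_ONLY
from c.h; the example below defines a local stand-in and a simplified list type
so it compiles on its own, so it is an illustration rather than the actual
ilist.h code.

#include <assert.h>

#if defined(__GNUC__) || defined(__clang__)
#define USED_FOR_ASSERTS_ONLY __attribute__((unused))   /* stand-in for PG_USED_FOR_ASSERTS_ONLY */
#else
#define USED_FOR_ASSERTS_ONLY
#endif

typedef struct list_node
{
    struct list_node *prev;
    struct list_node *next;
} list_node;

typedef struct list_head
{
    list_node   head;           /* circular list: head.next is the first element */
} list_head;

/* Returns the node following "node".  "head" is consulted only by the
 * assertion, so without the marker an assertion-free build may warn that
 * the argument is unused. */
static inline list_node *
list_next_node(list_head *head USED_FOR_ASSERTS_ONLY, list_node *node)
{
    assert(node->next != &head->head);  /* caller must know a next node exists */
    return node->next;
}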




Re: Is it worth accepting multiple CRLs?

2021-01-19 Thread Kyotaro Horiguchi
At Tue, 19 Jan 2021 09:17:34 +0900 (JST), Kyotaro Horiguchi 
 wrote in 
> By the way, we could do the same thing for the CA file/dir, but I personally
> think that the benefit of specify-by-directory for CA files is
> far smaller than for CRL files. So I'm not going to do this for CA files for
> now.

Here it is. A new GUC ssl_crl_dir and a connection option crldir are
added.

One problem raised upthread is that the test footprint is quite large
because this patch replaces all certificate and key files. I
think we can shrink the footprint by generating those files on demand,
but that requires the openssl frontend to be installed in the
development environment.

If we agree on that requirement, I'm going to go in that direction.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
From 898ab229a54373dc4794af8fa7eebffbf2849c13 Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi 
Date: Tue, 21 Jul 2020 23:01:27 +0900
Subject: [PATCH v3] Allow to specify CRL directory

We have the ssl_crl_file GUC variable and the sslcrl connection option
to specify a CRL file.  X509_STORE_load_locations also accepts a directory,
which enables an on-demand loading method in which only the relevant
CRLs are loaded. Allow server and client to use the hashed
directory method. We could reuse the existing variable and option to
specify the directory name, but allowing both methods at the
same time gives users operational flexibility.
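For context, a minimal sketch of the OpenSSL mechanism the commit message
refers to is below. It is an illustration, not the patch itself:
X509_STORE_load_locations() loads a CRL file eagerly but only registers a
directory for hashed, on-demand lookup (such a directory is typically
populated with "openssl rehash" or c_rehash). The function name
load_crl_locations is made up for the example.

#include <openssl/ssl.h>
#include <openssl/x509_vfy.h>

/* Point the SSL context's certificate store at a CRL file and/or a hashed
 * CRL directory; either argument may be NULL.  Returns 1 on success. */
static int
load_crl_locations(SSL_CTX *ssl_ctx, const char *crl_file, const char *crl_dir)
{
    X509_STORE *store = SSL_CTX_get_cert_store(ssl_ctx);

    if (X509_STORE_load_locations(store, crl_file, crl_dir) != 1)
        return 0;               /* caller reports the OpenSSL error queue */

    /* Make certificate verification actually consult the loaded CRLs. */
    X509_STORE_set_flags(store, X509_V_FLAG_CRL_CHECK | X509_V_FLAG_CRL_CHECK_ALL);
    return 1;
}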
---
 doc/src/sgml/config.sgml  | 21 -
 doc/src/sgml/libpq.sgml   | 20 +++-
 doc/src/sgml/runtime.sgml | 33 +
 src/backend/libpq/be-secure-openssl.c | 27 +--
 src/backend/libpq/be-secure.c |  1 +
 src/backend/utils/misc/guc.c  | 10 
 src/include/libpq/libpq.h |  1 +
 src/interfaces/libpq/fe-connect.c |  6 +++
 src/interfaces/libpq/fe-secure-openssl.c  | 24 +++---
 src/interfaces/libpq/libpq-int.h  |  1 +
 src/test/ssl/Makefile | 20 +++-
 src/test/ssl/ssl/both-cas-1.crt   | 46 +--
 src/test/ssl/ssl/both-cas-2.crt   | 46 +--
 src/test/ssl/ssl/client+client_ca.crt | 28 +--
 src/test/ssl/ssl/client-crldir/9bb9e3c3.r0| 11 +
 src/test/ssl/ssl/client-revoked.crt   | 14 +++---
 src/test/ssl/ssl/client.crl   | 16 +++
 src/test/ssl/ssl/client.crt   | 14 +++---
 src/test/ssl/ssl/client_ca.crt| 14 +++---
 .../ssl/ssl/root+client-crldir/9bb9e3c3.r0| 11 +
 .../ssl/ssl/root+client-crldir/a3d11bff.r0| 11 +
 src/test/ssl/ssl/root+client.crl  | 32 ++---
 src/test/ssl/ssl/root+client_ca.crt   | 32 ++---
 .../ssl/ssl/root+server-crldir/a3d11bff.r0| 11 +
 .../ssl/ssl/root+server-crldir/a836cc2d.r0| 11 +
 src/test/ssl/ssl/root+server.crl  | 32 ++---
 src/test/ssl/ssl/root+server_ca.crt   | 32 ++---
 src/test/ssl/ssl/root.crl | 16 +++
 src/test/ssl/ssl/root_ca.crt  | 18 
 src/test/ssl/ssl/server-cn-and-alt-names.crt  | 14 +++---
 src/test/ssl/ssl/server-cn-only.crt   | 14 +++---
 src/test/ssl/ssl/server-crldir/a836cc2d.r0| 11 +
 .../ssl/ssl/server-multiple-alt-names.crt | 14 +++---
 src/test/ssl/ssl/server-no-names.crt  | 16 +++
 src/test/ssl/ssl/server-revoked.crt   | 14 +++---
 src/test/ssl/ssl/server-single-alt-name.crt   | 16 +++
 src/test/ssl/ssl/server-ss.crt| 18 
 src/test/ssl/ssl/server.crl   | 16 +++
 src/test/ssl/ssl/server_ca.crt| 14 +++---
 src/test/ssl/t/001_ssltests.pl| 31 -
 src/test/ssl/t/SSLServer.pm   | 14 +-
 41 files changed, 496 insertions(+), 255 deletions(-)
 create mode 100644 src/test/ssl/ssl/client-crldir/9bb9e3c3.r0
 create mode 100644 src/test/ssl/ssl/root+client-crldir/9bb9e3c3.r0
 create mode 100644 src/test/ssl/ssl/root+client-crldir/a3d11bff.r0
 create mode 100644 src/test/ssl/ssl/root+server-crldir/a3d11bff.r0
 create mode 100644 src/test/ssl/ssl/root+server-crldir/a836cc2d.r0
 create mode 100644 src/test/ssl/ssl/server-crldir/a836cc2d.r0

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 82864bbb24..85d4402745 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -1214,7 +1214,26 @@ include_dir 'conf.d'
 Relative paths are relative to the data directory.
 This parameter can only be set in the postgresql.conf
 file or on the server command line.
-The default is empty, meaning no CRL file is loaded.
+The default is empty, meaning no CRL file is loaded unless
+ is set.
+   
+  
+ 
+
+ 
+  ssl_crl_dir (string)
+  
+   ssl_crl_dir configuration

Re: pg_stat_statements oddity with track = all

2021-01-19 Thread Masahiro Ikeda

Hi,

Thanks for making the patch to add a "toplevel" column to
pg_stat_statements.

This is a review comment.

There doesn't seem to be any discussion left, at least none that I've been
able to find, so +1 for changing the status to "Ready for Committer".


1. submission/feature review
=

This patch applies cleanly to the current master branch (ed4367).
I ran `make check-world` and confirmed that there are no failures.

The patch has reasonable documentation and tests.
The "toplevel" column of the pg_stat_statements view is documented and
the following tests are added:
- pg_stat_statements.track option: 'top' and 'all'
- query type: normal query and nested query (PL/pgSQL)

I tested that the extension update command works:
postgres=# ALTER EXTENSION pg_stat_statements UPDATE TO '1.10';

The "toplevel" column of all previously stored queries becomes 'false';
we have to decide on that default, and I think 'false' is OK.


2. usability review


This patch solves the problem that we cannot tell which queries are
top-level when pg_stat_statements.track = 'all'.

This lets us analyze the aggregated statistics.

There are some use cases.
For example, we can get the sum of total_exec_time across queries even
when nested queries are executed.

We can see how efficiently a database uses CPU resources for queries by
combining the sum of total_exec_time with the exec_user_time and
exec_system_time in pg_stat_kcache, the extension that gathers OS
resource usage.

One concern is whether a top-level flag alone is enough, but I couldn't
come up with any use case for the nesting level itself, so I think
it's OK.



3. coding review
=

Although I had two concerns, I don't think they are problems,
so this patch looks good to me.

My concerns were the following.

A. The risk of storing duplicate entries for the same query.

Since this patch adds a "top" member to the hash key,
the same query can be stored twice, once with "top" false and once with it true.

This concern was already discussed, and I agree with the consensus that
it seems unlikely for the same queries to be executed both
at the top level and as nested statements.
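To make point A concrete, here is a rough sketch of what adding such a flag
to the hash key looks like; the type and field names below are illustrative
stand-ins, not the actual pg_stat_statements structures.

#include <stdbool.h>
#include <stdint.h>

/* Entries are keyed by (user, database, query id, toplevel), so the same
 * query id can now appear twice: once executed directly and once executed
 * from inside another statement (e.g. a PL/pgSQL function). */
typedef struct StatementKey
{
    uint32_t    userid;         /* user OID */
    uint32_t    dbid;           /* database OID */
    uint64_t    queryid;        /* query identifier */
    bool        toplevel;       /* true if executed directly by a client */
} StatementKey;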

B. Adding an argument to pg_stat_statements_reset().

Now, pg_stat_statements supports a flexible reset feature:

pg_stat_statements_reset(userid Oid, dbid Oid, queryid bigint)


Although I wondered whether we need to add a top-level flag to the
arguments, I couldn't come up with any use case for resetting only
top-level queries or only nested queries.



4. others
==

These comments are not related to this patch.

A. about the topic of commitfests

Since I think this feature is for monitoring,
it's better to change the topic from "System Administration"
to "Monitoring & Control".


B. The logic for checking whether a query is top-level in pg_stat_kcache

This patch uses only exec_nested_level to check whether a query is
top-level.

The reason given is that the plan nesting level is less useful, and I agree.

But pg_stat_kcache uses plan_nested_level too.
Although the check result is the same, it would be better to change it
to match this patch after it's committed.


Regards,
--
Masahiro Ikeda
NTT DATA CORPORATION




Re: Single transaction in the tablesync worker?

2021-01-19 Thread Peter Smith
Hi Amit.

PSA the v17 patch for the Tablesync Solution1.

Main differences from v16:
+ Small refactor for DropSubscription to correct the "make check" deadlock
+ Added test case
+ Some comment wording



Features:

* The tablesync slot is now permanent instead of temporary.

* The tablesync slot name is no longer tied to the Subscription slot name.

* The tablesync worker now allows multiple transactions instead of a single transaction.

* A new state (SUBREL_STATE_FINISHEDCOPY) is persisted after a
successful copy_table in tablesync's LogicalRepSyncTableStart.

* If a re-launched tablesync finds state SUBREL_STATE_FINISHEDCOPY
then it will bypass the initial copy_table phase.

* Now tablesync sets up replication origin tracking in
LogicalRepSyncTableStart (similar to what is done for the apply worker). The
origin is advanced when first created.

* Cleanup of tablesync resources:
- The tablesync slot cleanup (drop) code is added to the
process_syncing_tables_for_sync function.
- The tablesync replication origin tracking is cleaned up in
process_syncing_tables_for_apply.
- A tablesync function to clean up its own slot/origin is called from
ProcessInterrupts. This is indirectly invoked by
DropSubscription/AlterSubscription when they signal the tablesync
worker to stop.

* Updates to PG docs.

* New TAP test case

TODO / Known Issues:

* None known.

---

Kind Regards,
Peter Smith.
Fujitsu Australia


v17-0001-Tablesync-Solution1.patch
Description: Binary data


v17-0002-Tablesync-extra-logging.patch
Description: Binary data


Re: ResourceOwner refactoring

2021-01-19 Thread Heikki Linnakangas

On 18/01/2021 18:11, Robert Haas wrote:

On Mon, Jan 18, 2021 at 11:11 AM Robert Haas  wrote:

On Mon, Jan 18, 2021 at 10:19 AM Heikki Linnakangas  wrote:

On 18/01/2021 16:34, Alvaro Herrera wrote:

So according to your performance benchmark, we're willing to accept a
30% performance loss on an allegedly common operation -- numkeep=0
numsnaps=10 becomes 49.8ns from 37.6ns.  That seems a bit shocking.
Maybe you can claim that these operations aren't exactly hot spots, and
so the fact that we remain in the same power-of-ten is sufficient.  Is
that the argument?


That's right. The fast path is fast, and that's important. The slow path
becomes 30% slower, but that's acceptable.


I don't know whether a 30% slowdown will hurt anybody, but it seems
like kind of a lot, and I'm not sure I understand what corresponding
benefit we're getting.


The benefit is to make it easy for extensions to use resource owners to 
track whatever resources they need to track. And arguably, the new 
mechanism is nicer for built-in code, too.


I'll see if I can optimize the slow paths, to make it more palatable.

- Heikki




Make gaps array static

2021-01-19 Thread Mark G
Looking over the recently committed work for btree tuple deletion (d168b66),
should this variable not be declared static, as in the attached patch?

Thanks,
Mark.
diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index faffbb1..bbdc1ce 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -7409,7 +7409,7 @@ index_delete_sort(TM_IndexDeleteOp *delstate)
 	 * This implementation is fast with array sizes up to ~4500.  This covers
 	 * all supported BLCKSZ values.
 	 */
-	const int	gaps[9] = {1968, 861, 336, 112, 48, 21, 7, 3, 1};
+	static const int	gaps[9] = {1968, 861, 336, 112, 48, 21, 7, 3, 1};
 
 	/* Think carefully before changing anything here -- keep swaps cheap */
 	StaticAssertStmt(sizeof(TM_IndexDelete) <= 8,


Re: Wrong usage of RelationNeedsWAL

2021-01-19 Thread Noah Misch
On Tue, Jan 19, 2021 at 01:48:31PM +0900, Kyotaro Horiguchi wrote:
> At Mon, 18 Jan 2021 17:30:22 +0900 (JST), Kyotaro Horiguchi 
>  wrote in 
> > At Sun, 17 Jan 2021 23:02:18 -0800, Noah Misch  wrote in 
> > > On Sun, Jan 17, 2021 at 10:36:31PM -0800, Noah Misch wrote:
> > > > I wrote the above based on the "PageGetLSN(page) > (snapshot)->lsn" 
> > > > check in
> > > > TestForOldSnapshot().  If the LSN isn't important, what else explains
> > > > RelationAllowsEarlyPruning() checking RelationNeedsWAL()?
> > > 
> > > Thinking about it more, some RelationAllowsEarlyPruning() callers need
> > > different treatment.  Above, I was writing about the case of deciding 
> > > whether
> > > to do early pruning.  The other RelationAllowsEarlyPruning() call sites 
> > > are
> > > deciding whether the relation might be lacking old data.  For that case, 
> > > we
> > > should check RELPERSISTENCE_PERMANENT, not RelationNeedsWAL().  
> > > Otherwise, we
> > > could fail to report an old-snapshot error in a case like this:
> > > 
> A> > setup: CREATE TABLE t AS SELECT ...;
> B> > xact1: BEGIN ISOLATION LEVEL REPEATABLE READ; SELECT 1;  -- take snapshot
> C> > xact2: DELETE FROM t;
> D> > (plenty of time passes)
> E> > xact3: SELECT count(*) FROM t;  -- early pruning
> F> > xact1: SAVEPOINT q; SELECT count(*) FROM t; ROLLBACK TO q;  -- "snapshot 
> too old"
> G> > xact1: ALTER TABLE t SET TABLESPACE something;  -- start skipping WAL
> H> > xact1: SELECT count(*) FROM t;  -- no error, wanted "snapshot too old"
> > > 
> > > Is that plausible?
> > 
> > Thank you for the consideration and yes. But I get "snapshot too old"
> > from the last query with the patched version. maybe I'm missing
> > something. I'm going to investigate the case.
> 
> Ah, I took that the reverse way (I was caught by the discussion on page
> LSNs). OK, the "current" code works that way, so we need to perform the
> check the new way in RelationAllowsEarlyPruning. So, in short, the answer
> to the last question regarding the reference side is yes.

Right.  I expect the above procedure shows a bug in git master, but your patch
versions suffice to fix that bug.

> I understand that you are suggesting that at least
> TransactionIdLimitedForOldSnapshots should follow not only relation
> persistence but RelationNeedsWAL, based on the discussion on the
> check on LSN of TestForOldSnapshot().
> 
> I still don't think that the LSN in a WAL-skipping-created relfilenode
> does any harm.  ALTER TABLE SET TABLESPACE always makes a dead copy of every
> block (except the checksum), including the page LSN, regardless of wal_level. In
> the scenario above, the last SELECT at H correctly reads the page LSN set
> by E and then copied by G, which is larger than the snapshot LSN at B. So
> it doesn't take the fast-return path before the actual check performed by
> RelationAllowsEarlyPruning.

I agree the above procedure doesn't have a problem under your patch versions.
See https://postgr.es/m/20210116043816.ga1644...@rfd.leadboat.com for a
different test case.  In more detail:

setup: CREATE TABLE t AS SELECT ...;
xact1: BEGIN; DELETE FROM t;
xact2: BEGIN ISOLATION LEVEL REPEATABLE READ; SELECT 1;  -- take snapshot
xact1: COMMIT;
(plenty of time passes, rows of t are now eligible for early pruning)
xact3: BEGIN;
xact3: ALTER TABLE t SET TABLESPACE something;  -- start skipping WAL
xact3: SELECT count(*) FROM t;  -- early pruning w/o WAL, w/o LSN changes
xact3: COMMIT;
xact2: SELECT count(*) FROM t;  -- no error, wanted "snapshot too old"

I expect that shows no bug for git master, but I expect it does show a bug
with your patch versions.  Could you try implementing both test procedures in
src/test/modules/snapshot_too_old?  There's no need to make the test use
wal_level=minimal explicitly; it's sufficient to catch these bugs when
wal_level=minimal is in the TEMP_CONFIG file.




Re: Parallel INSERT (INTO ... SELECT ...)

2021-01-19 Thread Greg Nancarrow
On Fri, Jan 8, 2021 at 8:25 PM vignesh C  wrote:
>
>
> Few includes are not required:
>  #include "executor/nodeGather.h"
> +#include "executor/nodeModifyTable.h"
>  #include "executor/nodeSubplan.h"
>  #include "executor/tqueue.h"
>  #include "miscadmin.h"
> @@ -60,6 +61,7 @@ ExecInitGather(Gather *node, EState *estate, int eflags)
> GatherState *gatherstate;
> Plan   *outerNode;
> TupleDesc   tupDesc;
> +   Index   varno;
>
> This include is not required in nodeModifyTable.c
>

I think you meant nodeGather.c (not nodeModifyTable.c).
However, the include file (executor/nodeModifyTable.h) is actually
required here, otherwise there are build warnings.

Regards,
Greg Nancarrow
Fujitsu Australia




Re: Improper use about DatumGetInt32

2021-01-19 Thread Peter Eisentraut

On 2021-01-14 09:00, Michael Paquier wrote:

I don't have any more comments from reading the code, and my tests
passed after applying the patch on top of df10ac62.  I would have also
added some tests that check for blkno < 0 and > MaxBlockNumber in
all the functions where it can be triggered, as that's cheap for 1.8
and 1.9, but that's a minor point.


Committed, with some additional tests.




Re: configurable the threshold for warning due to run out of transaction ID

2021-01-19 Thread Laurenz Albe
On Tue, 2021-01-19 at 11:44 +0900, Masahiro Ikeda wrote:
> PostgreSQL has the feature to warn about running out of transaction ID.
> The following message is an example.
> 
> 2021-01-19 10:59:27 JST [client backend] WARNING:  database "postgres" must 
> be vacuumed within xxx transactions
> 2021-01-19 10:59:27 JST [client backend] HINT:  To avoid a database shutdown, 
> execute a database-wide VACUUM in that database.
>  You might also need to commit or roll back old prepared 
> transactions, or drop stale replication slots.
> 
> But, the threshold for the warning is not configurable.
> The value is hard-coded to 40M.
> 
> varsup.c
> 
> /*
>  * We'll start complaining loudly when we get within 40M transactions 
> of
>  * data loss.  This is kind of arbitrary, but if you let your gas 
> gauge
>  * get down to 2% of full, would you be looking for the next gas 
> station?
>  * We need to be fairly liberal about this number because there are 
> lots
>  * of scenarios where most transactions are done by automatic clients 
> that
>  * won't pay attention to warnings.  (No, we're not gonna make this
>  * configurable.  If you know enough to configure it, you know enough 
> to
>  * not get in this kind of trouble in the first place.)
>  */
> 
> I think it's useful to configure the threshold for warning due to  run out of
> transaction ID like "checkpoint_warning" parameter.

I think the argument in the comment is a good one: people who know enough to
increase the number because they consume lots of transactions and want to be
warned earlier are probably people who care enough about their database to have
some monitoring in place that warns them about approaching transaction 
wraparound
("datfrozenxid" and "datminmxid" in "pg_database").

People who lower the limit to get rid of the warning are hopeless, and there is
no need to support such activity.

So I don't see much point in making this configurable.
We have enough parameters as it is.

Yours,
Laurenz Albe





Re: TOAST condition for column size

2021-01-19 Thread Amit Kapila
On Mon, Jan 18, 2021 at 7:53 PM torikoshia  wrote:
>
> Hi,

> I confirmed these sizes in my environment but AFAIU they would be
> the same size in any environment.
>
> So, as a result of adjusting the alignment, 20~23 bytes seems to
> fail.
>
> I wonder if it might be better not to adjust the alignment here
> as an attached patch because it succeeded in inserting 20~23
> bytes records.
> Or is there reasons to add the alignment here?
>

Because no benefit is to be expected from compressing it; the size will
be mostly the same. Also, even if we somehow try to fit this data via
toast, I think reading speed will be slower because for all such
columns an extra fetch from toast would be required. Another thing is
that you or others can still face the same problem with 17-byte column
data. I don't think this is the right way to fix it. I don't have many
good ideas, but you can try (a) increasing the block size at configure
time, (b) reducing the number of columns, or (c) creating char columns
of a somewhat bigger size, say greater than 24 bytes, to accommodate
your case.

I know none of these are good workarounds but at this moment I can't
think of better alternatives.

-- 
With Regards,
Amit Kapila.




Re: Add primary keys to system catalogs

2021-01-19 Thread Laurenz Albe
On Sun, 2021-01-17 at 17:07 -0500, Tom Lane wrote:
> Peter Eisentraut  writes:
> > [ v2-0001-Add-primary-keys-and-unique-constraints-to-system.patch ]
> 
> [...] do we really want to prefer
> using the OID indexes as the primary keys?  In most cases there's some
> other index that seems to me to be what a user would think of as the
> pkey, for example pg_class_relname_nsp_index for pg_class or
> pg_proc_proname_args_nsp_index for pg_proc.  Preferring OID where it
> exists is a nice simple rule, which has some attractiveness, but the
> OID indexes seem to me like a lookup aid rather than the "real" object
> identity.

I disagree.  The OID is the real, immutable identity of an object.
The "relname" of a "pg_class" catalog entry can change at any time.
Since there are no foreign keys that reference catalogs, that won't cause
problems, but I still think that primary keys should change as little as
possible.

Yours,
Laurenz Albe





Re: [patch] Help information for pg_dump

2021-01-19 Thread Dilip Kumar
On Tue, Jan 19, 2021 at 11:24 AM Bharath Rupireddy
 wrote:
>
> On Tue, Jan 19, 2021 at 9:07 AM Zhang, Jie  wrote:
> >
> > Hi all
> >
> > After executing command [pg_dump -?], some help information is as follows.
> >
> > pg_dump -?
> > -
> >   -N, --exclude-schema=PATTERN do NOT dump the specified schema(s)  
> > ※
> >   -T, --exclude-table=PATTERN  do NOT dump the specified table(s)   
> > ※
> >   -x, --no-privileges  do not dump privileges (grant/revoke)
> >   --exclude-table-data=PATTERN do NOT dump data for the specified table(s)  
> > ※
> >   --no-commentsdo not dump comments
> >   --no-publicationsdo not dump publications
> >   --no-security-labels do not dump security label assignments
> >   --no-subscriptions   do not dump subscriptions
> >   --no-synchronized-snapshots  do not use synchronized snapshots in 
> > parallel jobs
> >   --no-tablespaces do not dump tablespace assignments
> >   --no-unlogged-table-data do not dump unlogged table data
> > 
> >
> > I think it would be better to change [do NOT dump] to [do not dump].
> >
> > Here is a patch.
>
> +1. Looks like SQL keywords are mentioned in capital letters in both
> pg_dump and pg_dumpall cases, so changing "do NOT" to "do not" seems
> okay to me.
>
> Patch LGTM.

Also, "do NOT" is inconsistent with the other messages where we say
"do not", so +1.

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com




[PATCH 1/1] Fix detection of pwritev support for OSX.

2021-01-19 Thread James Hilliard
Fixes:
gcc -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement 
-Werror=vla -Wendif-labels -Wmissing-format-attribute -Wformat-security 
-fno-strict-aliasing -fwrapv -Wno-unused-command-line-argument -O2 
-I../../../../src/include  -isysroot 
/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX11.1.sdk
-c -o fd.o fd.c
fd.c:3661:10: warning: 'pwritev' is only available on macOS 11.0 or newer 
[-Wunguarded-availability-new]
part = pg_pwritev(fd, iov, iovcnt, offset);
   ^~
../../../../src/include/port/pg_iovec.h:49:20: note: expanded from macro 
'pg_pwritev'
   ^~~
/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX11.1.sdk/usr/include/sys/uio.h:104:9:
 note: 'pwritev' has been marked as being introduced in macOS 11.0
  here, but the deployment target is macOS 10.15.0
ssize_t pwritev(int, const struct iovec *, int, off_t) 
__DARWIN_NOCANCEL(pwritev) __API_AVAILABLE(macos(11.0), ios(14.0), 
watchos(7.0), tvos(14.0));
^
fd.c:3661:10: note: enclose 'pwritev' in a __builtin_available check to silence 
this warning
part = pg_pwritev(fd, iov, iovcnt, offset);
   ^~
../../../../src/include/port/pg_iovec.h:49:20: note: expanded from macro 
'pg_pwritev'
   ^~~
1 warning generated.

This results in a runtime error:
running bootstrap script ... dyld: lazy symbol binding failed: Symbol not 
found: _pwritev
  Referenced from: /usr/local/pgsql/bin/postgres
  Expected in: /usr/lib/libSystem.B.dylib

dyld: Symbol not found: _pwritev
  Referenced from: /usr/local/pgsql/bin/postgres
  Expected in: /usr/lib/libSystem.B.dylib

child process was terminated by signal 6: Abort trap: 6

To fix this we set -Werror=unguarded-availability-new so that a compile
test for pwritev will fail if the symbol is unavailable on the requested
SDK version.
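For reference, the runtime guard that clang's note suggests looks roughly like
the sketch below. This is not what the patch does (the patch makes the
configure probe fail so the pwritev() replacement gets used instead); it only
illustrates the availability-check pattern, guarded_pwritev is a made-up name,
and it requires clang targeting an Apple platform.

#include <errno.h>
#include <sys/types.h>
#include <sys/uio.h>

ssize_t
guarded_pwritev(int fd, const struct iovec *iov, int iovcnt, off_t offset)
{
    /* Only call pwritev() when the running OS is new enough to have it. */
    if (__builtin_available(macOS 11.0, *))
        return pwritev(fd, iov, iovcnt, offset);

    /* On older macOS, fall back to something else, e.g. a pwrite() loop (not shown). */
    errno = ENOSYS;
    return -1;
}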
---
 configure| 88 
 configure.ac | 19 +++-
 2 files changed, 93 insertions(+), 14 deletions(-)

diff --git a/configure b/configure
index 8af4b99021..503b0d27e6 100755
--- a/configure
+++ b/configure
@@ -5373,6 +5373,47 @@ if test x"$pgac_cv_prog_CC_cflags__Werror_vla" = x"yes"; 
then
 fi
 
 
+  # Prevent usage of symbols marked as newer than our target.
+
+{ $as_echo "$as_me:${as_lineno-$LINENO}: checking whether ${CC} supports 
-Werror=unguarded-availability-new, for CFLAGS" >&5
+$as_echo_n "checking whether ${CC} supports 
-Werror=unguarded-availability-new, for CFLAGS... " >&6; }
+if ${pgac_cv_prog_CC_cflags__Werror_unguarded_availability_new+:} false; then :
+  $as_echo_n "(cached) " >&6
+else
+  pgac_save_CFLAGS=$CFLAGS
+pgac_save_CC=$CC
+CC=${CC}
+CFLAGS="${CFLAGS} -Werror=unguarded-availability-new"
+ac_save_c_werror_flag=$ac_c_werror_flag
+ac_c_werror_flag=yes
+cat confdefs.h - <<_ACEOF >conftest.$ac_ext
+/* end confdefs.h.  */
+
+int
+main ()
+{
+
+  ;
+  return 0;
+}
+_ACEOF
+if ac_fn_c_try_compile "$LINENO"; then :
+  pgac_cv_prog_CC_cflags__Werror_unguarded_availability_new=yes
+else
+  pgac_cv_prog_CC_cflags__Werror_unguarded_availability_new=no
+fi
+rm -f core conftest.err conftest.$ac_objext conftest.$ac_ext
+ac_c_werror_flag=$ac_save_c_werror_flag
+CFLAGS="$pgac_save_CFLAGS"
+CC="$pgac_save_CC"
+fi
+{ $as_echo "$as_me:${as_lineno-$LINENO}: result: 
$pgac_cv_prog_CC_cflags__Werror_unguarded_availability_new" >&5
+$as_echo "$pgac_cv_prog_CC_cflags__Werror_unguarded_availability_new" >&6; }
+if test x"$pgac_cv_prog_CC_cflags__Werror_unguarded_availability_new" = 
x"yes"; then
+  CFLAGS="${CFLAGS} -Werror=unguarded-availability-new"
+fi
+
+
   # -Wvla is not applicable for C++
 
 { $as_echo "$as_me:${as_lineno-$LINENO}: checking whether ${CC} supports 
-Wendif-labels, for CFLAGS" >&5
@@ -15715,6 +15756,40 @@ $as_echo "#define HAVE_PS_STRINGS 1" >>confdefs.h
 
 fi
 
+{ $as_echo "$as_me:${as_lineno-$LINENO}: checking for pwritev" >&5
+$as_echo_n "checking for pwritev... " >&6; }
+cat confdefs.h - <<_ACEOF >conftest.$ac_ext
+/* end confdefs.h.  */
+#ifdef HAVE_SYS_TYPES_H
+#include 
+#endif
+#ifdef HAVE_SYS_UIO_H
+#include 
+#endif
+int
+main ()
+{
+struct iovec *iov;
+off_t offset;
+offset = 0;
+pwritev(0, iov, 0, offset);
+
+  ;
+  return 0;
+}
+_ACEOF
+if ac_fn_c_try_compile "$LINENO"; then :
+  { $as_echo "$as_me:${as_lineno-$LINENO}: result: yes" >&5
+$as_echo "yes" >&6; }
+
+$as_echo "#define HAVE_PWRITEV 1" >>confdefs.h
+
+else
+  { $as_echo "$as_me:${as_lineno-$LINENO}: result: no" >&5
+$as_echo "no" >&6; }
+fi
+rm -f core conftest.err conftest.$ac_objext conftest.$ac_ext
+
 ac_fn_c_check_func "$LINENO" "dlopen" "ac_cv_func_dlopen"
 if test "x$ac_cv_func_dlopen" = xyes; then :
   $as_echo "#define HAVE_DLOPEN 1" >>confdefs.h
@@ -15871,19 +15946,6 @@ esac
 
 fi
 
-ac_fn_c_check_func "$LINENO" "pwritev" "ac_cv_func_pwritev"
-if test "x$ac_cv_func_pwritev" = xyes; then :
-  $as_e

Re: Added schema level support for publication.

2021-01-19 Thread vignesh C
On Sat, Jan 9, 2021 at 8:08 PM Bharath Rupireddy <
bharath.rupireddyforpostg...@gmail.com> wrote:
>
>
> As Amit pointed out earlier, the behaviour when schema dropped, I
> think we should also consider when schema is altered, say altered to a
> different name, maybe we should change that in the publication too.
>

This scenario is handled now in the patch posted at [1].

>
> Say a user has a schema with 121 tables in it, and wants to replicate
> only 120 or 199 or even lesser tables out of it, so can we have some
> skip option to the new syntax, something like below?
> CREATE PUBLICATION production_publication FOR ALL TABLES SCHEMA
> production WITH skip = marketing, accounts, sales;  --> meaning is,
> replicate all the tables in the schema production except marketing,
> accounts, sales tables.

I have not yet handled this; I'm working on it and will try to post a patch
for it in the next version.

[1] -
https://www.postgresql.org/message-id/CALDaNm02%3Dk8K_ZSN7_dyVHyMTW4B5hOaeo2PzdWG%3Da7GtLH0oA%40mail.gmail.com

Regards,
Vignesh
EnterpriseDB: http://www.enterprisedb.com


Re: Printing backtrace of postgres processes

2021-01-19 Thread vignesh C
On Sun, Jan 17, 2021 at 10:26 PM vignesh C  wrote:
>
> On Sat, Jan 16, 2021 at 11:10 PM Andres Freund  wrote:
> >
> > Hi,
> >
> > On Sat, Jan 16, 2021, at 09:34, vignesh C wrote:
> > > On Sat, Jan 16, 2021 at 1:40 AM Andres Freund  wrote:
> > > >
> > > > On 2021-01-15 09:53:05 +0100, Peter Eisentraut wrote:
> > > > > On 2020-12-08 10:38, vignesh C wrote:
> > > > > > I have implemented printing of backtrace based on handling it in
> > > > > > CHECK_FOR_INTERRUPTS. This patch also includes the change to allow
> > > > > > getting backtrace of any particular process based on the 
> > > > > > suggestions.
> > > > > > Attached patch has the implementation for the same.
> > > > > > Thoughts?
> > > > >
> > > > > Are we willing to use up a signal for this?
> > > >
> > > > Why is a full signal needed? Seems the procsignal infrastructure should
> > > > suffice?
> > >
> > > Most of the processes have access to ProcSignal, for these processes
> > > printing of callstack signal was handled by using ProcSignal. Pgstat
> > > process & syslogger process do not have access to ProcSignal,
> > > multiplexing with SIGUSR1 is not possible for these processes. So I
> > > handled the printing of callstack for pgstat process & syslogger using
> > > the SIGUSR2 signal.
> > > This is because shared memory is detached before pgstat & syslogger
> > > process is started by using the below:
> > > /* Drop our connection to postmaster's shared memory, as well */
> > > dsm_detach_all();
> > > PGSharedMemoryDetach();
> >
> > Sure. But why is it important enough to support those that we are willing 
> > to dedicate a signal to the task? Their backtraces aren't often 
> > interesting, so I think we should just ignore them here.
>
> Thanks for your comments Andres, I will ignore it for the processes
> which do not have access to ProcSignal. I will make the changes and
> post a patch for this soon.
>

The attached patch has the fix for this.

Regards,
Vignesh
EnterpriseDB: http://www.enterprisedb.com
From e173fac2bef56b165cac96398f8b38e91233c59f Mon Sep 17 00:00:00 2001
From: Vignesh C 
Date: Tue, 19 Jan 2021 16:44:37 +0530
Subject: [PATCH v2] Print backtrace of postgres processes that are part of this instance.

The idea here is to implement and expose a pg_print_callstack function.
Internally, the connected backend sends a SIGUSR1 signal to the postmaster
process with PMSIGNAL_BACKTRACE_EMIT set. The postmaster then sends SIGUSR1
to the other processes with PROCSIG_BACKTRACE_PRINT set. Once a process
receives this signal, it logs its backtrace. As the syslogger and stats
collector processes don't have access to ProcSignal, it was discussed and
concluded to skip these processes.
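For readers who want to see the underlying mechanism, below is a minimal,
self-contained sketch of the libc facility such a handler can build on:
backtrace() and backtrace_symbols_fd() from <execinfo.h> (available on glibc
and macOS). It is only an illustration; the patch itself routes the request
through ProcSignal and writes the stack through the server's logging
machinery, and none of the names used here come from the patch.

#include <execinfo.h>
#include <signal.h>
#include <unistd.h>

#define MAX_FRAMES 64

static void
print_backtrace_handler(int signo)
{
    void   *frames[MAX_FRAMES];
    int     nframes;

    (void) signo;
    nframes = backtrace(frames, MAX_FRAMES);

    /* backtrace_symbols_fd() writes straight to the fd and avoids malloc,
     * unlike backtrace_symbols(), so it is the safer choice in a handler. */
    backtrace_symbols_fd(frames, nframes, STDERR_FILENO);
}

int
main(void)
{
    signal(SIGUSR1, print_backtrace_handler);
    for (;;)
        pause();                /* send SIGUSR1 to this process to get a stack dump */
    return 0;
}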
---
 doc/src/sgml/func.sgml| 23 +++
 src/backend/postmaster/autovacuum.c   |  4 ++
 src/backend/postmaster/checkpointer.c |  5 +++
 src/backend/postmaster/interrupt.c|  5 +++
 src/backend/postmaster/postmaster.c   | 26 
 src/backend/storage/ipc/procsignal.c  | 50 +++
 src/backend/storage/ipc/signalfuncs.c | 74 +++
 src/backend/tcop/postgres.c   | 38 ++
 src/backend/utils/init/globals.c  |  1 +
 src/bin/pg_ctl/t/005_backtrace.pl | 73 ++
 src/include/catalog/pg_proc.dat   |  8 
 src/include/miscadmin.h   |  2 +
 src/include/storage/pmsignal.h|  2 +
 src/include/storage/procsignal.h  |  5 +++
 src/include/tcop/tcopprot.h   |  1 +
 15 files changed, 317 insertions(+)
 create mode 100644 src/bin/pg_ctl/t/005_backtrace.pl

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index fd0370a..8bc2b46 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -24668,6 +24668,29 @@ SELECT collation for ('foo' COLLATE "de_DE");
 however only superusers can terminate superuser backends.

   
+
+  
+   
+
+ pg_print_callstack
+
+pg_print_callstack ( [pid integer] )
+boolean
+   
+   
+Signals the server processes to print the callstack.
+ pg_print_callstack without an argument prints the
+callstack of all the server processes in this instance. To request
+callstack of a particular process from this instance, the corresponding
+process id should be passed as an argument. Callstack will be printed
+based on the log configuration set. See
+ for more information. This
+feature is not supported for stats collector and syslogger processes.
+Only superusers can use pg_print_callstack  to print
+the callstack.
+   
+  
+
  
 

diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 47e60ca..aa87eda 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -836,6 +836,10 @@ HandleAutoVacLauncherInterru

Re: TOAST condition for column size

2021-01-19 Thread Dilip Kumar
On Mon, Jan 18, 2021 at 7:53 PM torikoshia  wrote:
>
> Hi,
>
> When I created a table consisting of 400 VARCHAR columns and tried
> to INSERT a record which rows were all the same size, there were
> cases where I got an error due to exceeding the size limit per
> row.
>
>=# -- create a table consisting of 400 VARCHAR columns
>=# CREATE TABLE t1 (c1 VARCHAR(100),
>c2 VARCHAR(100),
>...
>c400 VARCHAR(100));
>
>=# -- insert one record which rows are all 20 bytes
>=# INSERT INTO t1 VALUES (repeat('a', 20),
>  repeat('a', 20),
>  ...
>  repeat('a', 20));
>  ERROR:  row is too big: size 8424, maximum size 8160
>
> What is interesting is that it failed only when the size of each
> column was 20~23 bytes, as shown below.
>
>size of each column  |  result
>---
>18 bytes |  success
>19 bytes |  success
>20 bytes |  failure
>21 bytes |  failure
>22 bytes |  failure
>23 bytes |  failure
>24 bytes |  success
>25 bytes |  success
>
>
> When the size of each column was 19 bytes or less, it succeeds
> because the row size is within a page size.
> When the size of each column was 24 bytes or more, it also
> succeeds because columns are TOASTed and the row size is reduced
> to less than one page size.
> OTOH, when it's more than 19 bytes and less than 24 bytes,
> columns aren't TOASTed because it doesn't meet the condition of
> the following if statement.
>
>   --src/backend/access/table/toast_helper.c
>
> toast_tuple_find_biggest_attribute(ToastTupleContext *ttc,
>   bool for_compression, bool check_main)
> ...(snip)...
> int32biggest_size = MAXALIGN(TOAST_POINTER_SIZE);
> ...(snip)...
> if (ttc->ttc_attr[i].tai_size > biggest_size) // <- here
> {
> biggest_attno = i;
> biggest_size = ttc->ttc_attr[i].tai_size;
> }
>
>
> Since TOAST_POINTER_SIZE is 18 bytes but
> MAXALIGN(TOAST_POINTER_SIZE) is 24 bytes, columns are not TOASTed
> until its size becomes larger than 24 bytes.
>
> I confirmed these sizes in my environment but AFAIU they would be
> the same size in any environment.
>
> So, as a result of adjusting the alignment, 20~23 bytes seems to
> fail.
>
> I wonder if it might be better not to adjust the alignment here
> as an attached patch because it succeeded in inserting 20~23
> bytes records.
> Or is there reasons to add the alignment here?
>
> I understand that TOAST is not effective for small data and it's
> not recommended to create a table containing hundreds of columns,
> but I think cases that can be successful should be successful.
>
> Any thoughts?

How can this be correct?  While forming the tuple you might
need the alignment.  So basically, while computing the size we are not
considering alignment, but later, while actually forming the tuple, you
might have to align it, so it seems like it could create corruption while
forming the tuple.

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com




Re: [PATCH 1/1] Initial mach based shared memory support.

2021-01-19 Thread James Hilliard
On Mon, Jan 18, 2021 at 11:12 PM Tom Lane  wrote:
>
> James Hilliard  writes:
> > from my understanding due to mach semantics a number of the sanity checks
> > we are doing for sysv shmem are probably unnecessary when using mach
> > API's due to the mach port task bindings(mach_task_self()) effectively
> > ensuring our maps are already task bound and won't conflict with other 
> > tasks.
>
> Really?  If so, this whole patch is pretty much dead in the water,
> because the fact that sysv shmem is globally accessible is exactly
> why we use it (well, that and the fact that you can find out how many
> processes are attached to it).  It's been a long time since we cared
> about sysv shmem as a source of shared storage.  What we really use
> it for is as a form of mutual exclusion, i.e. being able to detect
> whether any live children remain from a dead postmaster.  That is,
> PGSharedMemoryIsInUse is not some minor ancillary check, it's the main
> reason this module exists at all.  If POSIX-style mmap'd shmem had the
> same capability we'd likely have dropped sysv altogether long ago.
Oh, I had figured the mutual exclusion check was just to ensure that
one didn't accidentally overlap an existing shared memory map.

If it's just an additional sanity check, maybe we should make it non-fatal
for ENOSPC returns, as from my understanding it is impossible to increase
the sysv shm segment limits on OSX.

But there should be a way to do this with mach. I see a few options: the task
bound map behavior, from my understanding, is what you get when using
mach_task_self(), but you can also look up a task using pid_for_task(),
again from my understanding (there may be security checks here that could
cause issues). It also may make sense to write the postmaster task ID and
use that instead of the PID. I'll try to rework this.

By the way, is PGSharedMemoryIsInUse only using the pid (id2) for looking up
the shmem segment key, or is something else needed? I noticed there is an
unused id1 variable as well that gets passed to it.
>
> I've occasionally wondered if we should take another look at file locking
> as a mechanism for ensuring only one postmaster+children process group
> can access a data directory.  Back in the day it was too untrustworthy,
> but maybe that has changed.
Yeah, there are probably still some ugly edge cases there.
>
> regards, tom lane
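As an aside for readers following the sysv discussion above: the property Tom
describes (being able to tell how many processes are still attached to a
segment) comes from shmctl(IPC_STAT). Below is a minimal sketch of just that
mechanism, not PostgreSQL's actual PGSharedMemoryIsInUse() logic.

#include <sys/ipc.h>
#include <sys/shm.h>

/* Return the number of processes attached to the System V segment "shmid",
 * or -1 if the segment cannot be inspected (e.g. it no longer exists). */
static long
shm_attach_count(int shmid)
{
    struct shmid_ds buf;

    if (shmctl(shmid, IPC_STAT, &buf) < 0)
        return -1;
    return (long) buf.shm_nattch;
}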




Re: pg_preadv() and pg_pwritev()

2021-01-19 Thread James Hilliard
On Thu, Jan 14, 2021 at 6:45 PM Tom Lane  wrote:
>
> Sergey Shinderuk  writes:
> > On 15.01.2021 01:13, Tom Lane wrote:
> >> Also, after re-reading [1] I am not at all excited about trying to
> >> remove the -isysroot switches from our *FLAGS.  What I propose to do
> >> is keep that, but improve our mechanism for choosing a default value
> >> for PG_SYSROOT.  It looks like first trying "xcrun --show-sdk-path",
> >> and falling back to "xcodebuild -version -sdk macosx Path" if that
> >> doesn't yield a valid path, is more likely to give a working build
> >> than relying entirely on xcodebuild.  Maybe there's a case for trying
> >> "xcrun --sdk macosx --show-sdk-path" in between; in my tests that
> >> seemed noticeably faster than invoking xcodebuild, and I've not yet
> >> seen a case where it gave a different answer.
>
> > I spent quite some time trying to understand / reverse engineer the
> > logic behind xcrun's default SDK selection.
>
> Yeah, I wasted a fair amount of time on that too, going so far as
> to ktrace xcrun (as I gather you did too).  I'm not any more
> enlightened than you are about exactly how it's making the choice.
>
> > Oh, that's weird! Nevertheless I like you suggestion to call "xcrun"
> > from "configure".
>
> Anyway, after re-reading the previous thread, something I like about
> the current behavior is that it tends to produce a version-numbered
> sysroot path, ie something ending in "MacOSX11.1.sdk" or whatever.
> One of the hazards we're trying to avoid is some parts of a PG
> installation being built against one SDK version while other parts are
> built against another.  The typical behavior of "xcrun --show-sdk-path"
> seems to be to produce a path ending in "MacOSX.sdk", which defeats that.
> So I think we should accept the path only if it contains a version number,
> and otherwise move on to the other probe commands.
I don't think enforcing a specific naming scheme makes sense; the minimum
OSX runtime version is effectively entirely separate from the SDK version.

The pwritev issue just seems to be caused by a broken configure check;
I've fixed that here:
https://postgr.es/m/20210119111625.20435-1-james.hilliard1%40gmail.com
>
> Hence, I propose the attached.  This works as far as I can tell
> to fix the problem you're seeing.
>
> regards, tom lane
>




Re: Add primary keys to system catalogs

2021-01-19 Thread Joel Jacobson
On Mon, Jan 18, 2021, at 18:23, Tom Lane wrote:
> I realized that there's a stronger roadblock for
> treating catalog interrelationships as SQL foreign keys.  Namely,
> that we always represent no-reference situations with a zero OID,
> whereas it'd have to be NULL to look like a valid foreign-key case.

Another roadblock is perhaps the lack of foreign keys on arrays,
which would be needed by the following references:

SELECT * FROM oidjoins WHERE column_type ~ '(vector|\[\])$' ORDER BY 1,2,3;
     table_name      |  column_name   | column_type | ref_table_name | ref_column_name
---------------------+----------------+-------------+----------------+-----------------
pg_constraint| conexclop  | oid[]   | pg_operator| oid
pg_constraint| conffeqop  | oid[]   | pg_operator| oid
pg_constraint| confkey| int2[]  | pg_attribute   | attnum
pg_constraint| conkey | int2[]  | pg_attribute   | attnum
pg_constraint| conpfeqop  | oid[]   | pg_operator| oid
pg_constraint| conppeqop  | oid[]   | pg_operator| oid
pg_extension | extconfig  | oid[]   | pg_class   | oid
pg_index | indclass   | oidvector   | pg_opclass | oid
pg_index | indcollation   | oidvector   | pg_collation   | oid
pg_index | indkey | int2vector  | pg_attribute   | attnum
pg_partitioned_table | partattrs  | int2vector  | pg_attribute   | attnum
pg_partitioned_table | partclass  | oidvector   | pg_opclass | oid
pg_partitioned_table | partcollation  | oidvector   | pg_collation   | oid
pg_policy| polroles   | oid[]   | pg_authid  | oid
pg_proc  | proallargtypes | oid[]   | pg_type| oid
pg_proc  | proargtypes| oidvector   | pg_type| oid
pg_statistic_ext | stxkeys| int2vector  | pg_attribute   | attnum
pg_trigger   | tgattr | int2vector  | pg_attribute   | attnum
(18 rows)

I see there is an old thread with work in this area,  "Re: GSoC 2017: Foreign 
Key Arrays":

https://www.postgresql.org/message-id/flat/76a8d3d8-a22e-3299-7c4e-6e115dbf3762%40proxel.se#a3ddc34863465ef83adbd26022cdbcc0

The last message in the thread is from 2018-10-02:
>On Tue, 2 Oct, 2018 at 05:13:26AM +0200, Michael Paquier wrote:
>>On Sat, Aug 11, 2018 at 05:20:57AM +0200, Mark Rofail wrote:
>> I am still having problems rebasing this patch. I can not figure it out on
>> my own.
>Okay, it's been a couple of months since this last email, and nothing
>has happened, so I am marking it as returned with feedback.
>--
>Michael

Personally, I would absolutely *love* FKs on array columns.
I always feel a bit ashamed when adding array columns to tables
when the elements are PKs in some other table.
It's convenient and often the best design,
but it feels dirty knowing there are no FKs to help detect application bugs.

Let's hope the current FKs-on-catalogs discussion can reignite the old Foreign Key
Arrays thread.

/Joel

Some coverage for DROP OWNED BY with pg_default_acl

2021-01-19 Thread Michael Paquier
Hi all,

I was looking again at the thread that reported a problem when using
ALTER DEFAULT PRIVILEGES with duplicated object names:
https://www.postgresql.org/message-id/ae2a7dc1-9d71-8cba-3bb9-e4cb7eb1f...@hot.ee

And while reviewing the thing, I have spotted that there is a specific
path for pg_default_acl in RemoveRoleFromObjectACL() that has zero
coverage.  This can be triggered with DROP OWNED BY, and it is
actually safe to run as long as this is done in a separate transaction
to avoid any interactions with parallel regression sessions.
privileges.sql already has similar tests, so I'd like to add some
coverage as per the attached (the duplicated role name is wanted).

Thoughts?
--
Michael
diff --git a/src/test/regress/expected/privileges.out b/src/test/regress/expected/privileges.out
index 7754c20db4..873df85e84 100644
--- a/src/test/regress/expected/privileges.out
+++ b/src/test/regress/expected/privileges.out
@@ -1759,6 +1759,35 @@ SELECT has_schema_privilege('regress_priv_user2', 'testns4', 'CREATE'); -- yes
 
 ALTER DEFAULT PRIVILEGES REVOKE ALL ON SCHEMAS FROM regress_priv_user2;
 COMMIT;
+-- Test for DROP OWNED BY with shared dependencies.  This is done in a
+-- separate, rollbacked, transaction to avoid any trouble with other
+-- regression sessions.
+BEGIN;
+ALTER DEFAULT PRIVILEGES GRANT ALL ON FUNCTIONS TO regress_priv_user2;
+ALTER DEFAULT PRIVILEGES GRANT ALL ON SCHEMAS TO regress_priv_user2;
+ALTER DEFAULT PRIVILEGES GRANT ALL ON SEQUENCES TO regress_priv_user2;
+ALTER DEFAULT PRIVILEGES GRANT ALL ON TABLES TO regress_priv_user2;
+ALTER DEFAULT PRIVILEGES GRANT ALL ON TYPES TO regress_priv_user2;
+SELECT count(*) FROM pg_shdepend
+  WHERE deptype = 'a' AND
+refobjid = 'regress_priv_user2'::regrole AND
+	classid = 'pg_default_acl'::regclass;
+ count 
+---
+ 5
+(1 row)
+
+DROP OWNED BY regress_priv_user2, regress_priv_user2;
+SELECT count(*) FROM pg_shdepend
+  WHERE deptype = 'a' AND
+refobjid = 'regress_priv_user2'::regrole AND
+	classid = 'pg_default_acl'::regclass;
+ count 
+---
+ 0
+(1 row)
+
+ROLLBACK;
 CREATE SCHEMA testns5;
 SELECT has_schema_privilege('regress_priv_user2', 'testns5', 'USAGE'); -- no
  has_schema_privilege 
diff --git a/src/test/regress/sql/privileges.sql b/src/test/regress/sql/privileges.sql
index 4911ad4add..3a20e93ada 100644
--- a/src/test/regress/sql/privileges.sql
+++ b/src/test/regress/sql/privileges.sql
@@ -1048,6 +1048,26 @@ ALTER DEFAULT PRIVILEGES REVOKE ALL ON SCHEMAS FROM regress_priv_user2;
 
 COMMIT;
 
+-- Test for DROP OWNED BY with shared dependencies.  This is done in a
+-- separate, rollbacked, transaction to avoid any trouble with other
+-- regression sessions.
+BEGIN;
+ALTER DEFAULT PRIVILEGES GRANT ALL ON FUNCTIONS TO regress_priv_user2;
+ALTER DEFAULT PRIVILEGES GRANT ALL ON SCHEMAS TO regress_priv_user2;
+ALTER DEFAULT PRIVILEGES GRANT ALL ON SEQUENCES TO regress_priv_user2;
+ALTER DEFAULT PRIVILEGES GRANT ALL ON TABLES TO regress_priv_user2;
+ALTER DEFAULT PRIVILEGES GRANT ALL ON TYPES TO regress_priv_user2;
+SELECT count(*) FROM pg_shdepend
+  WHERE deptype = 'a' AND
+refobjid = 'regress_priv_user2'::regrole AND
+	classid = 'pg_default_acl'::regclass;
+DROP OWNED BY regress_priv_user2, regress_priv_user2;
+SELECT count(*) FROM pg_shdepend
+  WHERE deptype = 'a' AND
+refobjid = 'regress_priv_user2'::regrole AND
+	classid = 'pg_default_acl'::regclass;
+ROLLBACK;
+
 CREATE SCHEMA testns5;
 
 SELECT has_schema_privilege('regress_priv_user2', 'testns5', 'USAGE'); -- no


signature.asc
Description: PGP signature


Re: Add primary keys to system catalogs

2021-01-19 Thread Vik Fearing
On 1/19/21 11:46 AM, Laurenz Albe wrote:
> On Sun, 2021-01-17 at 17:07 -0500, Tom Lane wrote:
>> Peter Eisentraut  writes:
>>> [ v2-0001-Add-primary-keys-and-unique-constraints-to-system.patch ]
>>
>> [...] do we really want to prefer
>> using the OID indexes as the primary keys?  In most cases there's some
>> other index that seems to me to be what a user would think of as the
>> pkey, for example pg_class_relname_nsp_index for pg_class or
>> pg_proc_proname_args_nsp_index for pg_proc.  Preferring OID where it
>> exists is a nice simple rule, which has some attractiveness, but the
>> OID indexes seem to me like a lookup aid rather than the "real" object
>> identity.
> 
> I disagree.  The OID is the real, immutable identity of an object.
> The "relname" of a "pg_class" catalog entry can change at any time.
> Since there are no foreign keys that reference catalogs, that won't cause
> problems, but I still think that primary keys should change as little as
> possible.

This might be good advice for systems where the primary key defines the
physical layout of the table, but for postgres's heap there is no reason
not to prefer the natural key of the table.

My vote is with Tom on this one.
-- 
Vik Fearing




Re: Paint some PG_USED_FOR_ASSERTS_ONLY in inline functions of ilist.h and bufpage.h

2021-01-19 Thread Michael Paquier
On Tue, Jan 19, 2021 at 04:27:43PM +0800, Julien Rouhaud wrote:
> Yeah I don't see any explicit mention on that on gcc manual.  For the
> record it also work as expected using clang, and the attached patch
> remove all warnings when compiling plpgsql_check.

FWIW, the part of the GCC docs that I looked at is here:
https://gcc.gnu.org/onlinedocs/gcc/Common-Variable-Attributes.html#Common-Variable-Attributes

And what I have done does not seem completely legal for function
arguments either, even though I am not getting any complaints when
compiling it.
--
Michael


signature.asc
Description: PGP signature


Re: popcount

2021-01-19 Thread Robert Haas
On Tue, Jan 19, 2021 at 3:06 AM Peter Eisentraut
 wrote:
> On 2021-01-18 16:34, Tom Lane wrote:
> > Peter Eisentraut  writes:
> >> [ assorted nits ]
> >
> > At the level of bikeshedding ... I quite dislike using the name "popcount"
> > for these functions.  I'm aware that some C compilers provide primitives
> > of that name, but I wouldn't expect a SQL programmer to know that;
> > without that context the name seems pretty random and unintuitive.
> > Moreover, it invites confusion with SQL's use of "pop" to abbreviate
> > "population" in the statistical aggregates, such as var_pop().
>
> I was thinking about that too, but according to
> , popcount is an accepted
> high-level term, with "pop" also standing for "population".

Yeah, I am not sure that it's going to be good to invent our own name
for this, although maybe. But at least I think we should make sure
there are some good comments in an easily discoverable place. Some
people seem to think every programmer in the universe should know what
things like popcount() and fls() and ffs() and stuff like that are,
but it's far from obvious and I often have to refresh my memory. Let's
make it easy for someone to figure out, if they don't know already.
Like just a comment that says "this returns the number of 1 bits in
the integer supplied as an argument" or something can save somebody a
lot of trouble.
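As a concrete version of the kind of comment being asked for, here is a
standalone illustration of what "popcount" means; real implementations
normally use a compiler builtin or a lookup table, and the bit-by-bit loop
below is only meant to make the definition obvious.

#include <stdint.h>
#include <stdio.h>

/* Return the number of bits set to 1 in "word" (its "population count"). */
static int
popcount64_slow(uint64_t word)
{
    int     count = 0;

    while (word != 0)
    {
        count += word & 1;
        word >>= 1;
    }
    return count;
}

int
main(void)
{
    printf("%d\n", popcount64_slow(0xF0));      /* prints 4 */
    return 0;
}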

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: TOAST condition for column size

2021-01-19 Thread Amit Kapila
On Tue, Jan 19, 2021 at 5:18 PM Dilip Kumar  wrote:
>
> On Mon, Jan 18, 2021 at 7:53 PM torikoshia  wrote:
> >
> > Hi,
> >
> > When I created a table consisting of 400 VARCHAR columns and tried
> > to INSERT a record which rows were all the same size, there were
> > cases where I got an error due to exceeding the size limit per
> > row.
> >
> >=# -- create a table consisting of 400 VARCHAR columns
> >=# CREATE TABLE t1 (c1 VARCHAR(100),
> >c2 VARCHAR(100),
> >...
> >c400 VARCHAR(100));
> >
> >=# -- insert one record which rows are all 20 bytes
> >=# INSERT INTO t1 VALUES (repeat('a', 20),
> >  repeat('a', 20),
> >  ...
> >  repeat('a', 20));
> >  ERROR:  row is too big: size 8424, maximum size 8160
> >
> > What is interesting is that it failed only when the size of each
> > column was 20~23 bytes, as shown below.
> >
> >size of each column  |  result
> >---
> >18 bytes |  success
> >19 bytes |  success
> >20 bytes |  failure
> >21 bytes |  failure
> >22 bytes |  failure
> >23 bytes |  failure
> >24 bytes |  success
> >25 bytes |  success
> >
> >
> > When the size of each column was 19 bytes or less, it succeeds
> > because the row size is within a page size.
> > When the size of each column was 24 bytes or more, it also
> > succeeds because columns are TOASTed and the row size is reduced
> > to less than one page size.
> > OTOH, when it's more than 19 bytes and less than 24 bytes,
> > columns aren't TOASTed because it doesn't meet the condition of
> > the following if statement.
> >
> >   --src/backend/access/table/toast_helper.c
> >
> > toast_tuple_find_biggest_attribute(ToastTupleContext *ttc,
> >   bool for_compression, bool check_main)
> > ...(snip)...
> > int32biggest_size = MAXALIGN(TOAST_POINTER_SIZE);
> > ...(snip)...
> > if (ttc->ttc_attr[i].tai_size > biggest_size) // <- here
> > {
> > biggest_attno = i;
> > biggest_size = ttc->ttc_attr[i].tai_size;
> > }
> >
> >
> > Since TOAST_POINTER_SIZE is 18 bytes but
> > MAXALIGN(TOAST_POINTER_SIZE) is 24 bytes, columns are not TOASTed
> > until its size becomes larger than 24 bytes.
> >
> > I confirmed these sizes in my environment but AFAIU they would be
> > the same size in any environment.
> >
> > So, as a result of adjusting the alignment, 20~23 bytes seems to
> > fail.
> >
> > I wonder if it might be better not to adjust the alignment here
> > as an attached patch because it succeeded in inserting 20~23
> > bytes records.
> > Or is there reasons to add the alignment here?
> >
> > I understand that TOAST is not effective for small data and it's
> > not recommended to create a table containing hundreds of columns,
> > but I think cases that can be successful should be successful.
> >
> > Any thoughts?
>
> How this can be correct? because while forming the tuple you might
> need the alignment.
>

Won't it be safe because we don't align individual attrs of type
varchar where the length is less than or equal to 127?

-- 
With Regards,
Amit Kapila.




Re: TOAST condition for column size

2021-01-19 Thread Dilip Kumar
On Tue, 19 Jan 2021 at 6:28 PM, Amit Kapila  wrote:

> On Tue, Jan 19, 2021 at 5:18 PM Dilip Kumar  wrote:
> >
> > On Mon, Jan 18, 2021 at 7:53 PM torikoshia 
> wrote:
> > >
> > > Hi,
> > >
> > > When I created a table consisting of 400 VARCHAR columns and tried
> > > to INSERT a record which rows were all the same size, there were
> > > cases where I got an error due to exceeding the size limit per
> > > row.
> > >
> > >=# -- create a table consisting of 400 VARCHAR columns
> > >=# CREATE TABLE t1 (c1 VARCHAR(100),
> > >c2 VARCHAR(100),
> > >...
> > >c400 VARCHAR(100));
> > >
> > >=# -- insert one record which rows are all 20 bytes
> > >=# INSERT INTO t1 VALUES (repeat('a', 20),
> > >  repeat('a', 20),
> > >  ...
> > >  repeat('a', 20));
> > >  ERROR:  row is too big: size 8424, maximum size 8160
> > >
> > > What is interesting is that it failed only when the size of each
> > > column was 20~23 bytes, as shown below.
> > >
> > >size of each column  |  result
> > >---
> > >18 bytes |  success
> > >19 bytes |  success
> > >20 bytes |  failure
> > >21 bytes |  failure
> > >22 bytes |  failure
> > >23 bytes |  failure
> > >24 bytes |  success
> > >25 bytes |  success
> > >
> > >
> > > When the size of each column was 19 bytes or less, it succeeds
> > > because the row size is within a page size.
> > > When the size of each column was 24 bytes or more, it also
> > > succeeds because columns are TOASTed and the row size is reduced
> > > to less than one page size.
> > > OTOH, when it's more than 19 bytes and less than 24 bytes,
> > > columns aren't TOASTed because it doesn't meet the condition of
> > > the following if statement.
> > >
> > >   --src/backend/access/table/toast_helper.c
> > >
> > > toast_tuple_find_biggest_attribute(ToastTupleContext *ttc,
> > >   bool for_compression, bool check_main)
> > > ...(snip)...
> > > int32biggest_size = MAXALIGN(TOAST_POINTER_SIZE);
> > > ...(snip)...
> > > if (ttc->ttc_attr[i].tai_size > biggest_size) // <- here
> > > {
> > > biggest_attno = i;
> > > biggest_size = ttc->ttc_attr[i].tai_size;
> > > }
> > >
> > >
> > > Since TOAST_POINTER_SIZE is 18 bytes but
> > > MAXALIGN(TOAST_POINTER_SIZE) is 24 bytes, columns are not TOASTed
> > > until its size becomes larger than 24 bytes.
> > >
> > > I confirmed these sizes in my environment but AFAIU they would be
> > > the same size in any environment.
> > >
> > > So, as a result of adjusting the alignment, 20~23 bytes seems to
> > > fail.
> > >
> > > I wonder if it might be better not to adjust the alignment here
> > > as an attached patch because it succeeded in inserting 20~23
> > > bytes records.
> > > Or is there reasons to add the alignment here?
> > >
> > > I understand that TOAST is not effective for small data and it's
> > > not recommended to create a table containing hundreds of columns,
> > > but I think cases that can be successful should be successful.
> > >
> > > Any thoughts?
> >
> > How this can be correct? because while forming the tuple you might
> > need the alignment.
> >
>
> Won't it be safe because we don't align individual attrs of type
> varchar where length is less than equal to 127?


Yeah right,  I just missed that point.

> --
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com


Use boolean array for nulls parameters

2021-01-19 Thread japin

Hi,

While reviewing [1], I found that the tuple's nulls array uses the char type.
However, many other places use a boolean array to represent the nulls array,
so I think we can replace the char-type nulls array with a boolean one.  This
change will break the SPI_xxx API, and I'm not sure whether it causes other
problems or not.  Any thoughts?

[1] - 
https://www.postgresql.org/message-id/flat/ca+hiwqgkfjfydeq5vhph6eqpkjsbfpddy+j-kxyfepqedts...@mail.gmail.com
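
For illustration, a call site would change roughly like this (sketch only;
"plan" is assumed to be an already-prepared SPIPlanPtr, and the proposed
call is shown commented out since it does not exist yet):

#include "postgres.h"

#include "executor/spi.h"

static void
run_plan_with_one_null(SPIPlanPtr plan, Datum first_arg)
{
    Datum   values[2] = {first_arg, (Datum) 0};
    char    nulls_chr[2] = {' ', 'n'};      /* current: ' ' = not null, 'n' = null */
    bool    nulls_bool[2] = {false, true};  /* proposed: false = not null, true = null */

    /* today's API takes the char array */
    SPI_execute_plan(plan, values, nulls_chr, false, 0);

    /* under the proposal, the same call would take the bool array instead: */
    /* SPI_execute_plan(plan, values, nulls_bool, false, 0); */
    (void) nulls_bool;
}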

-- 
Regrads,
Japin Li.
ChengDu WenWu Information Technology Co.,Ltd.

>From 7496f20ccfda7687333db9e5c43227ee30e4eda9 Mon Sep 17 00:00:00 2001
From: Japin Li 
Date: Tue, 19 Jan 2021 15:51:44 +0800
Subject: [PATCH v1] Replace the char to bool array for null parameters

---
 doc/src/sgml/spi.sgml   | 60 -
 src/backend/executor/spi.c  | 18 -
 src/backend/utils/adt/ri_triggers.c |  8 ++--
 src/backend/utils/adt/ruleutils.c   | 10 ++---
 src/include/executor/spi.h  | 12 +++---
 src/pl/plperl/plperl.c  |  6 +--
 src/pl/plpython/plpy_cursorobject.c |  6 +--
 src/test/regress/regress.c  | 12 +++---
 8 files changed, 61 insertions(+), 71 deletions(-)

diff --git a/doc/src/sgml/spi.sgml b/doc/src/sgml/spi.sgml
index f5e0a35da0..d3cc405c42 100644
--- a/doc/src/sgml/spi.sgml
+++ b/doc/src/sgml/spi.sgml
@@ -649,7 +649,7 @@ int SPI_exec(const char * command, long count<
 
 int SPI_execute_with_args(const char *command,
   int nargs, Oid *argtypes,
-  Datum *values, const char *nulls,
+  Datum *values, const bool *nulls,
   bool read_only, long count)
 
  
@@ -728,7 +728,7 @@ int SPI_execute_with_args(const char *command,

 

-const char * nulls
+const bool * nulls
 
  
   an array of length nargs, describing which
@@ -739,12 +739,10 @@ int SPI_execute_with_args(const char *command,
   If nulls is NULL then
   SPI_execute_with_args assumes that no parameters
   are null.  Otherwise, each entry of the nulls
-  array should be ' ' if the corresponding parameter
-  value is non-null, or 'n' if the corresponding parameter
+  array should be false if the corresponding parameter
+  value is non-null, or true if the corresponding parameter
   value is null.  (In the latter case, the actual value in the
-  corresponding values entry doesn't matter.)  Note
-  that nulls is not a text string, just an array:
-  it does not need a '\0' terminator.
+  corresponding values entry doesn't matter.)
  
 

@@ -1601,7 +1599,7 @@ bool SPI_is_cursor_plan(SPIPlanPtr plan)
 
  
 
-int SPI_execute_plan(SPIPlanPtr plan, Datum * values, const char * nulls,
+int SPI_execute_plan(SPIPlanPtr plan, Datum * values, const bool * nulls,
  bool read_only, long count)
 
  
@@ -1642,7 +1640,7 @@ int SPI_execute_plan(SPIPlanPtr plan, Datum * 

 

-const char * nulls
+const bool * nulls
 
  
   An array describing which parameters are null.  Must have same length as
@@ -1653,12 +1651,10 @@ int SPI_execute_plan(SPIPlanPtr plan, Datum * 
   If nulls is NULL then
   SPI_execute_plan assumes that no parameters
   are null.  Otherwise, each entry of the nulls
-  array should be ' ' if the corresponding parameter
-  value is non-null, or 'n' if the corresponding parameter
+  array should be false if the corresponding parameter
+  value is non-null, or true if the corresponding parameter
   value is null.  (In the latter case, the actual value in the
-  corresponding values entry doesn't matter.)  Note
-  that nulls is not a text string, just an array:
-  it does not need a '\0' terminator.
+  corresponding values entry doesn't matter.)
  
 

@@ -1946,7 +1942,7 @@ int SPI_execute_plan_with_receiver(SPIPlanPtr plan,
 
  
 
-int SPI_execp(SPIPlanPtr plan, Datum * values, const char * nulls, long count)
+int SPI_execp(SPIPlanPtr plan, Datum * values, const bool * nulls, long count)
 
  
 
@@ -1985,7 +1981,7 @@ int SPI_execp(SPIPlanPtr plan, Datum * values<

 

-const char * nulls
+const bool * nulls
 
  
   An array describing which parameters are null.  Must have same length as
@@ -1996,12 +1992,10 @@ int SPI_execp(SPIPlanPtr plan, Datum * values<
   If nulls is NULL then
   SPI_execp assumes that no parameters
   are null.  Otherwise, each entry of the nulls
-  array should be ' ' if the corresponding parameter
-  value is non-null, or 'n' if the corresponding parameter
+  array should be false if the corresponding parameter
+  value is non-null, or true if the corresponding parameter
   value is null.  (In the latter case, the actual value in the
-  corresponding values entry doesn't matter.)  Note
-  that nulls is not a text string, just an array:
-  it does not need a '\0

Re: [HACKERS] [PATCH] Generic type subscripting

2021-01-19 Thread Dmitry Dolgov
> On Thu, Jan 14, 2021 at 12:02:42PM -0500, Dian M Fay wrote:
> > Other than that, since I've already posted the patch for returning an
> > error option, it seems that the only thing left is to decide with which
> > version to go.
>
> The trigger issue (which I did verify) makes the "no update" option
> unworkable imo, JavaScript's behavior notwithstanding. But it should be
> called out very clearly in the documentation, since it does depart from
> what people more familiar with that behavior may expect. Here's a quick
> draft, based on your v44 patch:
>
> 
>  jsonb data type supports array-style subscripting expressions
>  to extract or update particular elements. It's possible to use multiple
>  subscripting expressions to extract nested values. In this case, a chain of
>  subscripting expressions follows the same rules as the
>  path argument in jsonb_set function,
>  e.g. in case of arrays it is a 0-based operation or that negative integers
>  that appear in path count from the end of JSON arrays.
>  The result of subscripting expressions is always of the jsonb data type.
> 
> 
>  UPDATE statements may use subscripting in the
>  SET clause to modify jsonb values. Every
>  affected value must conform to the path defined by the subscript(s). If the
>  path cannot be followed to its end for any individual value (e.g.
>  val['a']['b']['c'] where val['a'] or
>  val['b'] is null, a string, or a number), an error is
>  raised even if other values do conform.
> 
> 
>  An example of subscripting syntax:

Yes, makes sense. I've incorporated your suggestion into the last patch,
thanks.
>From c9143a620497dac5615c4de1d9349684e9af95b5 Mon Sep 17 00:00:00 2001
From: Dmitrii Dolgov <9erthali...@gmail.com>
Date: Fri, 18 Dec 2020 17:19:51 +0100
Subject: [PATCH v45 1/3] Subscripting for jsonb

Subscripting implementation for jsonb. It does not support slices, does
not have a limit on the number of subscripts, and for assignment expects
the replacement value to be of jsonb type. There is also one functional
difference in assignment via subscripting from jsonb_set: when the
original jsonb container is NULL, subscripting replaces it with an empty
jsonb and proceeds with the assignment.

For the sake of code reuse, some parts of the jsonb functionality were
rearranged to allow using the same functions for jsonb_set and the
subscripting assignment operation.

The original idea belongs to Oleg Bartunov.

Reviewed-by: Tom Lane, Arthur Zakirov, Pavel Stehule
---
 doc/src/sgml/json.sgml  |  48 
 src/backend/utils/adt/Makefile  |   1 +
 src/backend/utils/adt/jsonb_util.c  |  76 -
 src/backend/utils/adt/jsonbsubs.c   | 413 
 src/backend/utils/adt/jsonfuncs.c   | 180 ++--
 src/include/catalog/pg_proc.dat |   4 +
 src/include/catalog/pg_type.dat |   3 +-
 src/include/utils/jsonb.h   |   6 +-
 src/test/regress/expected/jsonb.out | 272 +-
 src/test/regress/sql/jsonb.sql  |  84 +-
 10 files changed, 982 insertions(+), 105 deletions(-)
 create mode 100644 src/backend/utils/adt/jsonbsubs.c

diff --git a/doc/src/sgml/json.sgml b/doc/src/sgml/json.sgml
index 5b9a5557a4..100d1a60f4 100644
--- a/doc/src/sgml/json.sgml
+++ b/doc/src/sgml/json.sgml
@@ -602,6 +602,54 @@ SELECT jdoc->'guid', jdoc->'name' FROM api WHERE jdoc @> '{"tags": ["qu
   
  
 
+ 
+  jsonb Subscripting
+  
+   jsonb data type supports array-style subscripting expressions
+   to extract or update particular elements. It's possible to use multiple
+   subscripting expressions to extract nested values. In this case, a chain of
+   subscripting expressions follows the same rules as the
+   path argument in jsonb_set function,
+   e.g. in case of arrays it is a 0-based operation or that negative integers
+   that appear in path count from the end of JSON arrays.
+   The result of subscripting expressions is always jsonb data type. An
+   example of subscripting syntax:
+
+-- Extract value by key
+SELECT ('{"a": 1}'::jsonb)['a'];
+
+-- Extract nested value by key path
+SELECT ('{"a": {"b": {"c": 1}}}'::jsonb)['a']['b']['c'];
+
+-- Extract element by index
+SELECT ('[1, "2", null]'::jsonb)[1];
+
+-- Update value by key, note the single quotes - the assigned value
+-- needs to be of jsonb type as well
+UPDATE table_name SET jsonb_field['key'] = '1';
+
+-- Select records using where clause with subscripting. Since the result of
+-- subscripting is jsonb and we basically want to compare two jsonb objects, we
+-- need to put the value in double quotes to be able to convert it to jsonb.
+SELECT * FROM table_name WHERE jsonb_field['key'] = '"value"';
+
+
+  Subscripting for jsonb does not support slice expressions,
+  even if it contains an array.
+
+  In case if source jsonb is NULL, assignment
+  via subscripting will proceed as if it was an empty JSON object:
+
+-- If jsonb_field here is NULL, the result is {"a": 1}
+UPDATE table_name SET jsonb_field['a'] = '1';
+
+-- If jsonb_field here is NULL, the result is [1]

Re: [PATCH] More docs on what to do and not do in extension code

2021-01-19 Thread Bharath Rupireddy
On Mon, Jan 18, 2021 at 1:27 PM Craig Ringer
 wrote:
> Hi folks
>
> The attached patch expands the xfunc docs and bgworker docs a little, 
> providing a starting point for developers to learn how to do some common 
> tasks the Postgres Way.
>
> It mentions in brief these topics:
>
> * longjmp() based exception handling with elog(ERROR), PG_CATCH() and 
> PG_RE_THROW() etc
> * Latches, spinlocks, LWLocks, heavyweight locks, condition variables
> * shm, DSM, DSA, shm_mq
> * syscache, relcache, relation_open(), invalidations
> * deferred signal handling, CHECK_FOR_INTERRUPTS()
> * Resource cleanup hooks and callbacks like on_exit, before_shmem_exit, the 
> resowner callbacks, etc
> * signal handling in bgworkers
>
> All very superficial, but all things I really wish I'd known a little about, 
> or even that I needed to learn about, when I started working on postgres.
>
> I'm not sure it's in quite the right place. I wonder if there should be a 
> separate part of xfunc.sgml that covers the slightly more advanced bits of 
> postgres backend and function coding like this, lists relevant README files 
> in the source tree, etc.
>
> I avoided going into details like how resource owners work. I don't want the 
> docs to have to cover all that in detail; what I hope to do is start 
> providing people with clear references to the right place in the code, 
> READMEs, etc to look when they need to understand specific topics.

Thanks for the patch.

Here are some comments:

[1]
   background worker's main function, and must be unblocked by it; this is to
allow the process to customize its signal handlers, if necessary.
-   Signals can be unblocked in the new process by calling
-   BackgroundWorkerUnblockSignals and blocked by calling
-   BackgroundWorkerBlockSignals.
+   It is important that all background workers set up and unblock signal
+   handling before they enter their main loops. Signal handling in background
+   workers is discussed separately in .
   

IMO, we can retain the statement about BackgroundWorkerUnblockSignals
and BackgroundWorkerBlockSignals, but mention the link to
"bgworker-signals" for more details and move the statement "it's
important to unblock signals before enter their main loop" to
"bgworker-signals" section and we can also reason there the
consequences if not done.

[2]
+   interupt-aware APIs for the purpose. Do not
usleep(),
+   system(), make blocking system calls, etc.
+  

Is it "Do not use usleep(),
system() or make blocking system calls etc." ?

[3] IMO, we can remove following from "bgworker-signals" if we retain
it where currently it is, as discussed in [1].
+Signals can be unblocked in the new process by calling
+BackgroundWorkerUnblockSignals and blocked by calling
+BackgroundWorkerBlockSignals.

[4] Can we say
+The default signal handlers set up for background workers do
+default background worker signal handlers, it should call

instead of
+The default signal handlers installed for background workers do
+default background worker signal handling it should call

[5] IMO, we can have something like below
+request, etc. Set up these handlers before unblocking signals as
+shown below:

instead of
+request, etc. To install these handlers, before unblocking interrupts
+run:
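
(A sketch of the conventional setup order, for reference, follows at the
end of these comments.)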

[6] I think either elog() or ereport() can be used for both logging and
errors, so how about
+Use elog() or ereport() for
+logging output or raising errors instead of any direct stdio calls.

instead of
+Use elog() and ereport() for
+logging output and raising errors instead of any direct stdio calls.

[7] Can we use "child processes" instead of "subprocesses"? If okay,
change it in other places in the patch as well.
+and should only use the main thread. PostgreSQL generally
uses child processes
+that coordinate over shared memory instead of threads - for
instance, see
+.

instead of
+and should only use the main thread. PostgreSQL generally
uses subprocesses
+that coordinate over shared memory instead of threads - see
+.

[8] Why should file descriptor manager API be used to execute
subprocesses/child processes?
+To execute subprocesses and open files, use the routines provided by
+the file descriptor manager like BasicOpenFile
+and OpenPipeStream instead of a direct

[9] "should always be"? even if it's a blocking extesion, does it
work? If our intention is to recommend the developers, maybe we should
avoid using the term "should" in the patch in other places as well.
+Extension code should always be structured as a non-blocking

[10] I think it is
+you should avoid using sleep() or
usleep()

instead of
+you should sleep() or
usleep()


[11] I think it is
+block if this happens. So cleanup of resources is not
entirely managed by PostgreSQL, it
+   must be handled using appropriate callbacks provided by PostgreSQL

instead of
+block if this happe
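
Coming back to comment [5]: the conventional order being documented is
roughly the worker_spi-style setup below (a sketch only; the handler
choices and the loop body are illustrative, not taken from the patch):

#include "postgres.h"

#include <signal.h>

#include "miscadmin.h"
#include "pgstat.h"
#include "postmaster/bgworker.h"
#include "postmaster/interrupt.h"
#include "storage/latch.h"

void
my_worker_main(Datum main_arg)
{
    /* Establish signal handlers before unblocking signals ... */
    pqsignal(SIGHUP, SignalHandlerForConfigReload);
    pqsignal(SIGTERM, SignalHandlerForShutdownRequest);

    /* ... and only then start accepting them. */
    BackgroundWorkerUnblockSignals();

    /* Main loop: stay interruptible (config reload handling omitted). */
    for (;;)
    {
        CHECK_FOR_INTERRUPTS();

        if (ShutdownRequestPending)
            break;

        (void) WaitLatch(MyLatch,
                         WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH,
                         1000L, PG_WAIT_EXTENSION);
        ResetLatch(MyLatch);
    }
}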

Re: ResourceOwner refactoring

2021-01-19 Thread Heikki Linnakangas

On 18/01/2021 16:34, Alvaro Herrera wrote:

On 2021-Jan-18, Heikki Linnakangas wrote:


+static ResourceOwnerFuncs jit_funcs =
+{
+   /* relcache references */
+   .name = "LLVM JIT context",
+   .phase = RESOURCE_RELEASE_BEFORE_LOCKS,
+   .ReleaseResource = ResOwnerReleaseJitContext,
+   .PrintLeakWarning = ResOwnerPrintJitContextLeakWarning
+};


I think you mean jit_resowner_funcs here; "jit_funcs" is a bit
excessively vague.  Also, why did you choose not to define
ResourceOwnerRememberJIT?  You do that in other modules and it seems
better.


I did it in modules that had more than one ResourceOwnerRemember/Forget 
call. Didn't seem worth it in functions like IncrTupleDescRefCount(), 
for example.


Hayato Kuroda also pointed that out, though. So perhaps it's better to 
be consistent, to avoid the confusion. I'll add the missing wrappers.
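
Roughly like this, I think (a sketch against the patch's generic
remember/forget API and the jit_resowner_funcs name suggested above; these
would live in llvmjit.c, and the final names may differ):

static inline void
ResourceOwnerRememberJIT(ResourceOwner owner, LLVMJitContext *context)
{
    ResourceOwnerRemember(owner, PointerGetDatum(context), &jit_resowner_funcs);
}

static inline void
ResourceOwnerForgetJIT(ResourceOwner owner, LLVMJitContext *context)
{
    ResourceOwnerForget(owner, PointerGetDatum(context), &jit_resowner_funcs);
}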


- Heikki




Re: Use boolean array for nulls parameters

2021-01-19 Thread Hamid Akhtar
I personally don't see any benefit in this change. The focus shouldn't be
on fixing things that aren't broken. Perhaps there is more value in using
a bitmap data type to keep track of NULL values, which is the typical
storage-vs-performance debate, and IMHO it's better to err on the side of
using slightly more storage for much better performance. IIRC, the bitmap
idea has previously been discussed and rejected too.

On Tue, Jan 19, 2021 at 7:07 PM japin  wrote:

>
> Hi,
>
> When I review the [1], I find that the tuple's nulls array use char type.
> However there are many places use boolean array to repsent the nulls array,
> so I think we can replace the char type nulls array to boolean type.  This
> change will break the SPI_xxx API, I'm not sure whether this chagnges cause
> other problems or not.  Any thought?
>
> [1] -
> https://www.postgresql.org/message-id/flat/ca+hiwqgkfjfydeq5vhph6eqpkjsbfpddy+j-kxyfepqedts...@mail.gmail.com
>
> --
> Regrads,
> Japin Li.
> ChengDu WenWu Information Technology Co.,Ltd.
>
>

-- 
Highgo Software (Canada/China/Pakistan)
URL : www.highgo.ca
ADDR: 10318 WHALLEY BLVD, Surrey, BC
CELL:+923335449950  EMAIL: mailto:hamid.akh...@highgo.ca
SKYPE: engineeredvirus


Re: search_plan_tree(): handling of non-leaf CustomScanState nodes causes segfault

2021-01-19 Thread Tom Lane
David Geier  writes:
> On 18.01.21 23:42, Tom Lane wrote:
>> OK, cool.  I was afraid you'd argue that you really needed your CustomScan
>> node to be transparent in such cases.  We could imagine inventing an
>> additional custom-scan-provider callback to embed the necessary knowledge,
>> but I'd rather not add the complexity until someone has a use-case.

> I was thinking about that. Generally, having such possibility would be 
> very useful. Unfortunately, I believe that in my specific case even 
> having such functionality wouldn't allow for the query to work with 
> CURRENT OF, because my CSP behaves similarly to a materialize node.
> My understanding is only such plans are supported which consist of nodes 
> that guarantee that the tuple returned by the plan is the unmodified 
> tuple a scan leaf node is currently positioned on?

Doesn't have to be *unmodified* --- a projection is fine, for example.
But we have to be sure that the current output tuple of the plan tree
is based on the current output tuple of the bottom-level table scan
node.  As an example of the hazards here, it's currently safe for
search_plan_tree to descend through a Limit node, but it did not use to
be, because the old implementation of Limit was such that it could return
a different tuple from the one the underlying scan node thinks it is
positioned on.

As another example, descending through Append is OK because only one
of the child scans will be positioned-on-a-tuple at all; the rest
will be at EOF or not yet started, so they can't produce a match
to whatever tuple ID the WHERE CURRENT OF is asking about.

Now that I look at this, I strongly wonder whether whoever added
MergeAppend support here understood what they were doing.  That
looks broken, because child nodes will typically be positioned on
tuples, whether or not the current top-level output came from them.
So I fear we could get a false-positive confirmation that some
tuple matches WHERE CURRENT OF.

Anyway, it seems clearly possible that some nonleaf CustomScans
would operate in a manner that would allow descending through them
while others wouldn't.  But I don't really want to write the docs
explaining what a callback for this should do ;-)

regards, tom lane




Re: [PATCH 1/1] Fix detection of pwritev support for OSX.

2021-01-19 Thread Tom Lane
James Hilliard  writes:
> Fixes:
> gcc -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement 
> -Werror=vla -Wendif-labels -Wmissing-format-attribute -Wformat-security 
> -fno-strict-aliasing -fwrapv -Wno-unused-command-line-argument -O2 
> -I../../../../src/include  -isysroot 
> /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX11.1.sdk
> -c -o fd.o fd.c
> fd.c:3661:10: warning: 'pwritev' is only available on macOS 11.0 or newer 
> [-Wunguarded-availability-new]

We already dealt with that by not selecting an SDK newer than the
underlying OS (see 4823621db).  I do not believe that your proposal
is more reliable than that approach, and it's surely uglier.  Are
we really going to abandon Autoconf's built-in checking method every
time Apple adds an API they should have had ten years ago?  If so,
you forgot preadv ...

regards, tom lane




Re: [PATCH 1/1] Fix detection of pwritev support for OSX.

2021-01-19 Thread James Hilliard
On Tue, Jan 19, 2021 at 8:27 AM Tom Lane  wrote:
>
> James Hilliard  writes:
> > Fixes:
> > gcc -Wall -Wmissing-prototypes -Wpointer-arith 
> > -Wdeclaration-after-statement -Werror=vla -Wendif-labels 
> > -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv 
> > -Wno-unused-command-line-argument -O2 -I../../../../src/include  -isysroot 
> > /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX11.1.sdk
> > -c -o fd.o fd.c
> > fd.c:3661:10: warning: 'pwritev' is only available on macOS 11.0 or newer 
> > [-Wunguarded-availability-new]
>
> We already dealt with that by not selecting an SDK newer than the
> underlying OS (see 4823621db).
Tried that; it doesn't work, and I'm not even sure how it could possibly
fix this issue at all. This cannot be fixed properly by selecting a
specific SDK version alone; it's the symbols valid for a specific target
deployment version that matter here.
> I do not believe that your proposal
> is more reliable than that approach, and it's surely uglier.  Are
> we really going to abandon Autoconf's built-in checking method every
> time Apple adds an API they should have had ten years ago?  If so,
> you forgot preadv ...
I didn't run into an issue there for some reason...but this was the cleanest fix
I could come up with that seemed to work.
>
> regards, tom lane




Re: TOAST condition for column size

2021-01-19 Thread Tom Lane
Dilip Kumar  writes:
> On Tue, 19 Jan 2021 at 6:28 PM, Amit Kapila  wrote:
>> Won't it be safe because we don't align individual attrs of type
>> varchar where length is less than equal to 127?

> Yeah right,  I just missed that point.

Yeah, the minimum on biggest_size has nothing to do with alignment
decisions.  It's just a filter to decide whether it's worth trying
to toast anything.

Having said that, I'm pretty skeptical of this patch: I think its
most likely real-world effect is going to be to waste cycles (and
create TOAST-table bloat) on the way to failing anyway.  I do not
think that toasting a 20-byte field down to 18 bytes is likely to be
a productive thing to do in typical situations.  The given example
looks like a cherry-picked edge case rather than a useful case to
worry about.

IOW, if I were asked to review whether the current minimum is
well-chosen, I'd be wondering if we should increase it not
decrease it.

regards, tom lane




Re: Use boolean array for nulls parameters

2021-01-19 Thread Tom Lane
japin  writes:
> When I review the [1], I find that the tuple's nulls array use char type.
> However there are many places use boolean array to repsent the nulls array,
> so I think we can replace the char type nulls array to boolean type.  This
> change will break the SPI_xxx API, I'm not sure whether this chagnges cause
> other problems or not.  Any thought?

We have always considered that changing the APIs of published SPI
interfaces is a non-starter.  The entire reason those calls still
exist at all is for the benefit of third-party extensions.

regards, tom lane




Re: [PATCH 1/1] Fix detection of pwritev support for OSX.

2021-01-19 Thread Tom Lane
James Hilliard  writes:
> On Tue, Jan 19, 2021 at 8:27 AM Tom Lane  wrote:
>> We already dealt with that by not selecting an SDK newer than the
>> underlying OS (see 4823621db).

> Tried that, doesn't work, not even sure how it could possibly fix this
> issue at all,

It worked for me and for Sergey, so we need to figure out what's different
about your setup.  What do you get from "xcrun --show-sdk-path" and
"xcrun --sdk macosx --show-sdk-path"?  What have you got under
/Library/Developer/CommandLineTools/SDKs ?

> this can not be fixed properly by selecting a specific SDK version
> alone, it's the symbols valid for a specific target deployment version
> that matters here.

I don't think I believe that argument.  As a counterexample, supposing
that somebody were intentionally cross-compiling on an older OSX platform
but using a newer SDK, shouldn't they get an executable suited to the
SDK's target version?

(I realize that Apple thinks we ought to handle that through run-time
not compile-time adaptation, but I have no interest in going there.)

regards, tom lane




Re: POC: postgres_fdw insert batching

2021-01-19 Thread Tomas Vondra




On 1/19/21 7:23 AM, Amit Langote wrote:

On Tue, Jan 19, 2021 at 2:06 PM tsunakawa.ta...@fujitsu.com
 wrote:

From: Amit Langote 

I apologize in advance for being maybe overly pedantic, but I noticed
that, in ExecInitModifyTable(), you decided to place the call outside
the loop that goes over resultRelations (shown below), although my
intent was to ask to place it next to the BeginForeignModify() in that
loop.


Actually, I tried to do it (adding the GetModifyBatchSize() call after 
BeginForeignModify()), but it failed.  Because postgresfdwGetModifyBatchSize() 
wants to know if RETURNING is specified, and ResultRelInfo->projectReturning is 
created after the above part.  Considering the context where GetModifyBatchSize() 
implementations may want to know the environment, I placed the call as late as 
possible in the initialization phase.  As for the future(?) multi-target DML 
statements, I think we can change this together with other many(?) parts that 
assume a single target table.


Okay, sometime later then.

I wasn't sure if bringing it up here would be appropriate, but there's
a patch by me to refactor ModfiyTable result relation allocation that
will have to remember to move this code along to an appropriate place
[1].  Thanks for the tip about the dependency on how RETURNING is
handled.  I will remember it when rebasing my patch over this.



Thanks. The last version (v12) should be addressing all the comments and 
seems fine to me, so barring objections I'll get that pushed shortly.


One thing that seems a bit annoying is that with the partitioned table 
the explain (verbose) looks like this:


 QUERY PLAN
-
 Insert on public.batch_table
   ->  Function Scan on pg_catalog.generate_series i
 Output: i.i
 Function Call: generate_series(1, 66)
(4 rows)

That is, there's no information about the batch size :-( But AFAICS
that's due to how explain shows (or rather does not show) partitions in
this type of plan.


regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: Is Recovery actually paused?

2021-01-19 Thread Dilip Kumar
On Tue, Jan 19, 2021 at 8:34 AM Dilip Kumar  wrote:
>
> On Tue, 19 Jan 2021 at 8:12 AM, Yugo NAGATA  wrote:
>>
>> On Sun, 17 Jan 2021 11:33:52 +0530
>> Dilip Kumar  wrote:
>>
>> > On Thu, Jan 14, 2021 at 6:49 PM Yugo NAGATA  wrote:
>> > >
>> > > On Wed, 13 Jan 2021 17:49:43 +0530
>> > > Dilip Kumar  wrote:
>> > >
>> > > > On Wed, Jan 13, 2021 at 3:35 PM Dilip Kumar  
>> > > > wrote:
>> > > > >
>> > > > > On Wed, Jan 13, 2021 at 3:27 PM Yugo NAGATA  
>> > > > > wrote:
>> > > > > >
>> > > > > > On Thu, 10 Dec 2020 11:25:23 +0530
>> > > > > > Dilip Kumar  wrote:
>> > > > > >
>> > > > > > > > > However, I wonder users don't expect pg_is_wal_replay_paused 
>> > > > > > > > > to wait.
>> > > > > > > > > Especially, if max_standby_streaming_delay is -1, this will 
>> > > > > > > > > be blocked forever,
>> > > > > > > > > although this setting may not be usual. In addition, some 
>> > > > > > > > > users may set
>> > > > > > > > > recovery_min_apply_delay for a large.  If such users call 
>> > > > > > > > > pg_is_wal_replay_paused,
>> > > > > > > > > it could wait for a long time.
>> > > > > > > > >
>> > > > > > > > > At least, I think we need some descriptions on document to 
>> > > > > > > > > explain
>> > > > > > > > > pg_is_wal_replay_paused could wait while a time.
>> > > > > > > >
>> > > > > > > > Ok
>> > > > > > >
>> > > > > > > Fixed this, added some comments in .sgml as well as in function 
>> > > > > > > header
>> > > > > >
>> > > > > > Thank you for fixing this.
>> > > > > >
>> > > > > > Also, is it better to fix the description of pg_wal_replay_pause 
>> > > > > > from
>> > > > > > "Pauses recovery." to "Request to pause recovery." in according 
>> > > > > > with
>> > > > > > pg_is_wal_replay_paused?
>> > > > >
>> > > > > Okay
>> > > > >
>> > > > > >
>> > > > > > > > > Also, how about adding a new boolean argument to 
>> > > > > > > > > pg_is_wal_replay_paused to
>> > > > > > > > > control whether this waits for recovery to get paused or 
>> > > > > > > > > not? By setting its
>> > > > > > > > > default value to true or false, users can use the old format 
>> > > > > > > > > for calling this
>> > > > > > > > > and the backward compatibility can be maintained.
>> > > > > > > >
>> > > > > > > > So basically, if the wait_recovery_pause flag is false then we 
>> > > > > > > > will
>> > > > > > > > immediately return true if the pause is requested?  I agree 
>> > > > > > > > that it is
>> > > > > > > > good to have an API to know whether the recovery pause is 
>> > > > > > > > requested or
>> > > > > > > > not but I am not sure is it good idea to make this API serve 
>> > > > > > > > both the
>> > > > > > > > purpose?  Anyone else have any thoughts on this?
>> > > > > > > >
>> > > > > >
>> > > > > > I think the current pg_is_wal_replay_paused() already has another 
>> > > > > > purpose;
>> > > > > > this waits recovery to actually get paused. If we want to limit 
>> > > > > > this API's
>> > > > > > purpose only to return the pause state, it seems better to fix 
>> > > > > > this to return
>> > > > > > the actual state at the cost of lacking the backward 
>> > > > > > compatibility. If we want
>> > > > > > to know whether pause is requested, we may add a new API like
>> > > > > > pg_is_wal_replay_paluse_requeseted(). Also, if we want to wait 
>> > > > > > recovery to actually
>> > > > > > get paused, we may add an option to pg_wal_replay_pause() for this 
>> > > > > > purpose.
>> > > > > >
>> > > > > > However, this might be a bikeshedding. If anyone don't care that
>> > > > > > pg_is_wal_replay_paused() can make user wait for a long time, I 
>> > > > > > don't care either.
>> > > > >
>> > > > > I don't think that it will be blocked ever, because
>> > > > > pg_wal_replay_pause is sending the WakeupRecovery() which means the
>> > > > > recovery process will not be stuck on waiting for the WAL.
>> > >
>> > > Yes, there is no stuck on waiting for the WAL. However, it can be stuck 
>> > > during resolving
>> > > a recovery conflict. The process could wait for 
>> > > max_standby_streaming_delay or
>> > > max_standby_archive_delay at most before recovery get completely paused.
>> >
>> > Okay, I agree that it is possible so for handling this we have a
>> > couple of options
>> > 1. pg_is_wal_replay_paused(), interface will wait for recovery to
>> > actually get paused, but user have an option to cancel that.  So I
>> > agree that there is currently no option to just know that recovery
>> > pause is requested without waiting for its actually get paused if it
>> > is requested.  So one option is we can provide an another interface as
>> > you mentioned pg_is_wal_replay_paluse_requeseted(), which can just
>> > return the request status.  I am not sure how useful it is.
>>
>> If it is acceptable that pg_is_wal_replay_paused() makes users wait,
>> I'm ok for the current interface. I don't feel the need of
>> pg_is_wal_replay_paluse_requeseted().
>>
>> >
>> > 2. Pass an option to pg_is_wal_replay_paused whether to wait

Re: Stronger safeguard for archive recovery not to miss data

2021-01-19 Thread Laurenz Albe
On Mon, 2021-01-18 at 07:34 +, osumi.takami...@fujitsu.com wrote:
> I noticed that this message should cover both archive recovery modes,
> which means in recovery mode and standby mode. Then, I combined your
> suggestion above with this point of view. Have a look at the updated patch.
> I also enriched the new tap tests to show this perspective.

Looks good, thanks.

I'll mark this patch as "ready for committer".

Yours,
Laurenz Albe





pg_class.reltype -> pg_type.oid missing for pg_toast table

2021-01-19 Thread Joel Jacobson
I have a memory of the catalog not being MVCC,
so maybe this is normal and expected,
but I wanted to report it in case it's not.

When copying all tables in pg_catalog, to a separate schema with the purpose
of testing if foreign keys could be added for all oid columns, I got an error 
for a toast table:

ERROR:  insert or update on table "pg_class" violates foreign key constraint 
"pg_class_reltype_fkey"
DETAIL:  Key (reltype)=(86987582) is not present in table "pg_type".
CONTEXT:  SQL statement "
ALTER TABLE catalog_fks.pg_class ADD FOREIGN KEY (reltype) REFERENCES 
catalog_fks.pg_type (oid)
  "

The copies of pg_catalog were executed in one and the same transaction,
but as separate queries in a PL/pgSQL function using EXECUTE.

/Joel

Re: popcount

2021-01-19 Thread David Fetter
On Tue, Jan 19, 2021 at 07:58:12AM -0500, Robert Haas wrote:
> On Tue, Jan 19, 2021 at 3:06 AM Peter Eisentraut
>  wrote:
> > On 2021-01-18 16:34, Tom Lane wrote:
> > > Peter Eisentraut  writes:
> > >> [ assorted nits ]
> > >
> > > At the level of bikeshedding ... I quite dislike using the name "popcount"
> > > for these functions.  I'm aware that some C compilers provide primitives
> > > of that name, but I wouldn't expect a SQL programmer to know that;
> > > without that context the name seems pretty random and unintuitive.
> > > Moreover, it invites confusion with SQL's use of "pop" to abbreviate
> > > "population" in the statistical aggregates, such as var_pop().
> >
> > I was thinking about that too, but according to
> > , popcount is an accepted
> > high-level term, with "pop" also standing for "population".
> 
> Yeah, I am not sure that it's going to be good to invent our own
> name for this, although maybe. But at least I think we should make
> sure there are some good comments in an easily discoverable place.
> Some people seem to think every programmer in the universe should
> know what things like popcount() and fls() and ffs() and stuff like
> that are, but it's far from obvious and I often have to refresh my
> memory.  Let's make it easy for someone to figure out, if they don't
> know already.

I went with count_set_bits() for the next version, and I hope that
helps clarify what it does.

> Like just a comment that says "this returns the number of 1 bits in
> the integer supplied as an argument" or something can save somebody
> a lot of trouble.

You bring up an excellent point, which is that our builtin functions
could use a lot more documentation directly to hand than they now
have.  For example, there's a lot of needless ambiguity created by
function comments which leave it up in the air as to which positional
argument does what in functions like string_to_array, which take
multiple arguments. I'll try to get a patch in for the next CF with a
fix for that, and a separate one that doesn't put it on people to use
\df+ to find the comments we do provide. There have been proposals for
including an optional space for COMMENT ON in DDL, but I suspect that
those won't fly unless and until they make their way into the
standard.

Best,
David.
-- 
David Fetter  http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate




Re: popcount

2021-01-19 Thread Isaac Morland
On Tue, 19 Jan 2021 at 11:38, David Fetter  wrote:

> You bring up an excellent point, which is that our builtin functions
> could use a lot more documentation directly to hand than they now
> have.  For example, there's a lot of needless ambiguity created by
> function comments which leave it up in the air as to which positional
> argument does what in functions like string_to_array, which take
> multiple arguments. I'll try to get a patch in for the next CF with a
> fix for that, and a separate one that doesn't put it on people to use
> \df+ to find the comments we do provide. There have been proposals for
> including an optional space for COMMENT ON in DDL, but I suspect that
> those won't fly unless and until they make their way into the
> standard.
>

Since you mention \df+, I wonder if this is the time to consider removing
the source code from \df+ (since we have \sf) and adding in the proacl
instead?


Re: pg_class.reltype -> pg_type.oid missing for pg_toast table

2021-01-19 Thread Tom Lane
"Joel Jacobson"  writes:
> When copying all tables in pg_catalog, to a separate schema with the purpose
> of testing if foreign keys could be added for all oid columns, I got an error 
> for a toast table:
> ERROR:  insert or update on table "pg_class" violates foreign key constraint 
> "pg_class_reltype_fkey"
> DETAIL:  Key (reltype)=(86987582) is not present in table "pg_type".

I'm too lazy to check the code right now, but my recollection is that we
do not bother to make composite-type entries for toast tables.  However,
they should have reltype = 0 if so, so I'm not quite sure where the
above failure is coming from.

regards, tom lane




Re: Feature improvement: can we add queryId for pg_catalog.pg_stat_activity view?

2021-01-19 Thread Julien Rouhaud
On Fri, Jan 8, 2021 at 1:07 AM Julien Rouhaud  wrote:
>
> v15 that fixes recent conflicts.

Rebase only, thanks to the cfbot!  V16 attached.
From a0388c53d9755cfd706513f7f02a08b31a48aacb Mon Sep 17 00:00:00 2001
From: Julien Rouhaud 
Date: Mon, 18 Mar 2019 18:55:50 +0100
Subject: [PATCH v16 2/3] Expose queryid in pg_stat_activity and
 log_line_prefix

Similarly to other fields in pg_stat_activity, only the queryid from the top
level statements are exposed, and if the backends status isn't active then the
queryid from the last executed statements is displayed.

Also add a %Q placeholder to include the queryid in the log_line_prefix, which
will also only expose top level statements.

Author: Julien Rouhaud
Reviewed-by: Evgeny Efimkin, Michael Paquier, Yamada Tatsuro, Atsushi Torikoshi
Discussion: https://postgr.es/m/CA+8PKvQnMfOE-c3YLRwxOsCYXQDyP8VXs6CDtMZp1V4=d4l...@mail.gmail.com
---
 .../pg_stat_statements/pg_stat_statements.c   | 112 +++---
 doc/src/sgml/config.sgml  |  29 +++--
 doc/src/sgml/monitoring.sgml  |  16 +++
 src/backend/catalog/system_views.sql  |   1 +
 src/backend/executor/execMain.c   |   8 ++
 src/backend/executor/execParallel.c   |  14 ++-
 src/backend/executor/nodeGather.c |   3 +-
 src/backend/executor/nodeGatherMerge.c|   4 +-
 src/backend/parser/analyze.c  |   5 +
 src/backend/postmaster/pgstat.c   |  65 ++
 src/backend/tcop/postgres.c   |   5 +
 src/backend/utils/adt/pgstatfuncs.c   |   7 +-
 src/backend/utils/error/elog.c|   9 +-
 src/backend/utils/misc/postgresql.conf.sample |   1 +
 src/backend/utils/misc/queryjumble.c  |  29 +++--
 src/include/catalog/pg_proc.dat   |   6 +-
 src/include/executor/execParallel.h   |   3 +-
 src/include/pgstat.h  |   5 +
 src/include/utils/queryjumble.h   |   2 +-
 src/test/regress/expected/rules.out   |   9 +-
 20 files changed, 223 insertions(+), 110 deletions(-)

diff --git a/contrib/pg_stat_statements/pg_stat_statements.c b/contrib/pg_stat_statements/pg_stat_statements.c
index 3db4fa2f7a..ce166f417e 100644
--- a/contrib/pg_stat_statements/pg_stat_statements.c
+++ b/contrib/pg_stat_statements/pg_stat_statements.c
@@ -65,6 +65,7 @@
 #include "tcop/utility.h"
 #include "utils/acl.h"
 #include "utils/builtins.h"
+#include "utils/queryjumble.h"
 #include "utils/memutils.h"
 #include "utils/timestamp.h"
 
@@ -99,6 +100,14 @@ static const uint32 PGSS_PG_MAJOR_VERSION = PG_VERSION_NUM / 100;
 #define USAGE_DEALLOC_PERCENT	5	/* free this % of entries at once */
 #define IS_STICKY(c)	((c.calls[PGSS_PLAN] + c.calls[PGSS_EXEC]) == 0)
 
+/*
+ * Utility statements that pgss_ProcessUtility and pgss_post_parse_analyze
+ * ignores.
+ */
+#define PGSS_HANDLED_UTILITY(n)		(!IsA(n, ExecuteStmt) && \
+	!IsA(n, PrepareStmt) && \
+	!IsA(n, DeallocateStmt))
+
 /*
  * Extension version number, for supporting older extension versions' objects
  */
@@ -307,7 +316,6 @@ static void pgss_ProcessUtility(PlannedStmt *pstmt, const char *queryString,
 ProcessUtilityContext context, ParamListInfo params,
 QueryEnvironment *queryEnv,
 DestReceiver *dest, QueryCompletion *qc);
-static uint64 pgss_hash_string(const char *str, int len);
 static void pgss_store(const char *query, uint64 queryId,
 	   int query_location, int query_len,
 	   pgssStoreKind kind,
@@ -804,16 +812,14 @@ pgss_post_parse_analyze(ParseState *pstate, Query *query, JumbleState *jstate)
 		return;
 
 	/*
-	 * Utility statements get queryId zero.  We do this even in cases where
-	 * the statement contains an optimizable statement for which a queryId
-	 * could be derived (such as EXPLAIN or DECLARE CURSOR).  For such cases,
-	 * runtime control will first go through ProcessUtility and then the
-	 * executor, and we don't want the executor hooks to do anything, since we
-	 * are already measuring the statement's costs at the utility level.
+	 * Clear queryId for prepared statements related utility, as those will
+	 * inherit from the underlying statement's one (except DEALLOCATE which is
+	 * entirely untracked).
 	 */
 	if (query->utilityStmt)
 	{
-		query->queryId = UINT64CONST(0);
+		if (pgss_track_utility && !PGSS_HANDLED_UTILITY(query->utilityStmt))
+			query->queryId = UINT64CONST(0);
 		return;
 	}
 
@@ -1055,6 +1061,23 @@ pgss_ProcessUtility(PlannedStmt *pstmt, const char *queryString,
 	DestReceiver *dest, QueryCompletion *qc)
 {
 	Node	   *parsetree = pstmt->utilityStmt;
+	uint64		saved_queryId = pstmt->queryId;
+
+	/*
+	 * Force utility statements to get queryId zero.  We do this even in cases
+	 * where the statement contains an optimizable statement for which a
+	 * queryId could be derived (such as EXPLAIN or DECLARE CURSOR).  For such
+	 * cases, runtime control will first go through ProcessUtility and then t

Re: [PATCH 1/1] Fix detection of pwritev support for OSX.

2021-01-19 Thread James Hilliard
On Tue, Jan 19, 2021 at 8:57 AM Tom Lane  wrote:
>
> James Hilliard  writes:
> > On Tue, Jan 19, 2021 at 8:27 AM Tom Lane  wrote:
> >> We already dealt with that by not selecting an SDK newer than the
> >> underlying OS (see 4823621db).
>
> > Tried that, doesn't work, not even sure how it could possibly fix this
> > issue at all,
>
> It worked for me and for Sergey, so we need to figure out what's different
> about your setup.  What do you get from "xcrun --show-sdk-path" and
> "xcrun --sdk macosx --show-sdk-path"?  What have you got under
> /Library/Developer/CommandLineTools/SDKs ?
$ xcrun --show-sdk-path
/Library/Developer/CommandLineTools/SDKs/MacOSX.sdk
$ xcrun --sdk macosx --show-sdk-path
/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX11.1.sdk
$ ls -laht /Library/Developer/CommandLineTools/SDKs
total 0
drwxr-xr-x  5 root  wheel   160B Jan 14  2020 .
drwxr-xr-x  8 root  wheel   256B Jan 14  2020 MacOSX10.15.sdk
drwxr-xr-x  7 root  wheel   224B Jan 14  2020 MacOSX10.14.sdk
lrwxr-xr-x  1 root  wheel15B Jan 14  2020 MacOSX.sdk -> MacOSX10.15.sdk
>
> > this can not be fixed properly by selecting a specific SDK version
> > alone, it's the symbols valid for a specific target deployment version
> > that matters here.
>
> I don't think I believe that argument.  As a counterexample, supposing
> that somebody were intentionally cross-compiling on an older OSX platform
> but using a newer SDK, shouldn't they get an executable suited to the
> SDK's target version?
Yep, that's exactly what this should fix:

MACOSX_DEPLOYMENT_TARGET=11.0 ./configure
checking for pwritev... yes

Which fails at runtime on 10.15:
dyld: lazy symbol binding failed: Symbol not found: _pwritev
  Referenced from: /usr/local/pgsql/bin/postgres (which was built for
Mac OS X 11.0)
  Expected in: /usr/lib/libSystem.B.dylib

dyld: Symbol not found: _pwritev
  Referenced from: /usr/local/pgsql/bin/postgres (which was built for
Mac OS X 11.0)
  Expected in: /usr/lib/libSystem.B.dylib

child process was terminated by signal 6: Abort trap: 6

MACOSX_DEPLOYMENT_TARGET=10.15 ./configure
checking for pwritev... no

Noticed a couple small issues, I'll send a v2.
>
> (I realize that Apple thinks we ought to handle that through run-time
> not compile-time adaptation, but I have no interest in going there.)
>
> regards, tom lane




Re: search_plan_tree(): handling of non-leaf CustomScanState nodes causes segfault

2021-01-19 Thread Tom Lane
I wrote:
> Now that I look at this, I strongly wonder whether whoever added
> MergeAppend support here understood what they were doing.  That
> looks broken, because child nodes will typically be positioned on
> tuples, whether or not the current top-level output came from them.
> So I fear we could get a false-positive confirmation that some
> tuple matches WHERE CURRENT OF.

Urgh, indeed it's buggy.  With the attached test script I get

...
BEGIN
DECLARE CURSOR
 f1 | f2  
+-
  1 | one
(1 row)

UPDATE 1
UPDATE 1
UPDATE 1
COMMIT
 f1 | f2  
+-
  1 | one updated
(1 row)

 f1 | f2  
+-
  2 | two updated
(1 row)

 f1 |  f2   
+---
  3 | three updated
(1 row)

where clearly only the row with f1=1 should have updated
(and if you leave off ORDER BY, so as to get a Merge not
MergeAppend plan, indeed only that row updates).

I shall go fix this, and try to improve the evidently-inadequate
comments in search_plan_tree.

regards, tom lane

drop table if exists t1,t2,t3;

create table t1 (f1 int unique, f2 text);
insert into t1 values (1, 'one');
create table t2 (f1 int unique, f2 text);
insert into t2 values (2, 'two');
create table t3 (f1 int unique, f2 text);
insert into t3 values (3, 'three');

explain
select * from t1 union all select * from t2 union all select * from t3
order by 1;

begin;

declare c cursor for
select * from t1 union all select * from t2 union all select * from t3
order by 1;

fetch 1 from c;

update t1 set f2 = f2 || ' updated' where current of c;
update t2 set f2 = f2 || ' updated' where current of c;
update t3 set f2 = f2 || ' updated' where current of c;

commit;

table t1;
table t2;
table t3;


[PATCH v2 1/1] Fix detection of pwritev support for OSX.

2021-01-19 Thread James Hilliard
Fixes:
gcc -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement 
-Werror=vla -Wendif-labels -Wmissing-format-attribute -Wformat-security 
-fno-strict-aliasing -fwrapv -Wno-unused-command-line-argument -O2 
-I../../../../src/include  -isysroot 
/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX11.1.sdk
-c -o fd.o fd.c
fd.c:3661:10: warning: 'pwritev' is only available on macOS 11.0 or newer 
[-Wunguarded-availability-new]
part = pg_pwritev(fd, iov, iovcnt, offset);
   ^~
../../../../src/include/port/pg_iovec.h:49:20: note: expanded from macro 
'pg_pwritev'
   ^~~
/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX11.1.sdk/usr/include/sys/uio.h:104:9:
 note: 'pwritev' has been marked as being introduced in macOS 11.0
  here, but the deployment target is macOS 10.15.0
ssize_t pwritev(int, const struct iovec *, int, off_t) 
__DARWIN_NOCANCEL(pwritev) __API_AVAILABLE(macos(11.0), ios(14.0), 
watchos(7.0), tvos(14.0));
^
fd.c:3661:10: note: enclose 'pwritev' in a __builtin_available check to silence 
this warning
part = pg_pwritev(fd, iov, iovcnt, offset);
   ^~
../../../../src/include/port/pg_iovec.h:49:20: note: expanded from macro 
'pg_pwritev'
   ^~~
1 warning generated.

This results in a runtime error:
running bootstrap script ... dyld: lazy symbol binding failed: Symbol not 
found: _pwritev
  Referenced from: /usr/local/pgsql/bin/postgres
  Expected in: /usr/lib/libSystem.B.dylib

dyld: Symbol not found: _pwritev
  Referenced from: /usr/local/pgsql/bin/postgres
  Expected in: /usr/lib/libSystem.B.dylib

child process was terminated by signal 6: Abort trap: 6

To fix this we set -Werror=unguarded-availability-new so that a compile
test for pwritev will fail if the symbol is unavailable on the requested
SDK version.
---
Changes v1 -> v2:
  - Add AC_LIBOBJ(pwritev) when pwritev not available
  - set -Werror=unguarded-availability-new for CXX flags as well
---
 configure| 145 ++-
 configure.ac |  21 +++-
 2 files changed, 152 insertions(+), 14 deletions(-)

diff --git a/configure b/configure
index 8af4b99021..662b0ae9ce 100755
--- a/configure
+++ b/configure
@@ -5373,6 +5373,98 @@ if test x"$pgac_cv_prog_CC_cflags__Werror_vla" = x"yes"; 
then
 fi
 
 
+  # Prevent usage of symbols marked as newer than our target.
+
+{ $as_echo "$as_me:${as_lineno-$LINENO}: checking whether ${CC} supports 
-Werror=unguarded-availability-new, for CFLAGS" >&5
+$as_echo_n "checking whether ${CC} supports 
-Werror=unguarded-availability-new, for CFLAGS... " >&6; }
+if ${pgac_cv_prog_CC_cflags__Werror_unguarded_availability_new+:} false; then :
+  $as_echo_n "(cached) " >&6
+else
+  pgac_save_CFLAGS=$CFLAGS
+pgac_save_CC=$CC
+CC=${CC}
+CFLAGS="${CFLAGS} -Werror=unguarded-availability-new"
+ac_save_c_werror_flag=$ac_c_werror_flag
+ac_c_werror_flag=yes
+cat confdefs.h - <<_ACEOF >conftest.$ac_ext
+/* end confdefs.h.  */
+
+int
+main ()
+{
+
+  ;
+  return 0;
+}
+_ACEOF
+if ac_fn_c_try_compile "$LINENO"; then :
+  pgac_cv_prog_CC_cflags__Werror_unguarded_availability_new=yes
+else
+  pgac_cv_prog_CC_cflags__Werror_unguarded_availability_new=no
+fi
+rm -f core conftest.err conftest.$ac_objext conftest.$ac_ext
+ac_c_werror_flag=$ac_save_c_werror_flag
+CFLAGS="$pgac_save_CFLAGS"
+CC="$pgac_save_CC"
+fi
+{ $as_echo "$as_me:${as_lineno-$LINENO}: result: 
$pgac_cv_prog_CC_cflags__Werror_unguarded_availability_new" >&5
+$as_echo "$pgac_cv_prog_CC_cflags__Werror_unguarded_availability_new" >&6; }
+if test x"$pgac_cv_prog_CC_cflags__Werror_unguarded_availability_new" = 
x"yes"; then
+  CFLAGS="${CFLAGS} -Werror=unguarded-availability-new"
+fi
+
+
+  { $as_echo "$as_me:${as_lineno-$LINENO}: checking whether ${CXX} supports 
-Werror=unguarded-availability-new, for CXXFLAGS" >&5
+$as_echo_n "checking whether ${CXX} supports 
-Werror=unguarded-availability-new, for CXXFLAGS... " >&6; }
+if ${pgac_cv_prog_CXX_cxxflags__Werror_unguarded_availability_new+:} false; 
then :
+  $as_echo_n "(cached) " >&6
+else
+  pgac_save_CXXFLAGS=$CXXFLAGS
+pgac_save_CXX=$CXX
+CXX=${CXX}
+CXXFLAGS="${CXXFLAGS} -Werror=unguarded-availability-new"
+ac_save_cxx_werror_flag=$ac_cxx_werror_flag
+ac_cxx_werror_flag=yes
+ac_ext=cpp
+ac_cpp='$CXXCPP $CPPFLAGS'
+ac_compile='$CXX -c $CXXFLAGS $CPPFLAGS conftest.$ac_ext >&5'
+ac_link='$CXX -o conftest$ac_exeext $CXXFLAGS $CPPFLAGS $LDFLAGS 
conftest.$ac_ext $LIBS >&5'
+ac_compiler_gnu=$ac_cv_cxx_compiler_gnu
+
+cat confdefs.h - <<_ACEOF >conftest.$ac_ext
+/* end confdefs.h.  */
+
+int
+main ()
+{
+
+  ;
+  return 0;
+}
+_ACEOF
+if ac_fn_cxx_try_compile "$LINENO"; then :
+  pgac_cv_prog_CXX_cxxflags__Werror_unguarded_availability_new=yes
+else
+  pgac_cv_prog_CXX_cxxflags__Werror_unguarded_availability_new=no
+fi
+rm -f core conftest.err

Re: psql \df choose functions by their arguments

2021-01-19 Thread Greg Sabino Mullane
Ha ha ha, my bad, I am not sure why I left those out. Here is a new patch
with int2, int4, and int8. Thanks for the email.

Cheers,
Greg


v6-psql-df-pick-function-by-type.patch
Description: Binary data


Re: [PATCH 1/1] Fix detection of pwritev support for OSX.

2021-01-19 Thread Tom Lane
James Hilliard  writes:
> On Tue, Jan 19, 2021 at 8:57 AM Tom Lane  wrote:
>> It worked for me and for Sergey, so we need to figure out what's different
>> about your setup.  What do you get from "xcrun --show-sdk-path" and
>> "xcrun --sdk macosx --show-sdk-path"?  What have you got under
>> /Library/Developer/CommandLineTools/SDKs ?

> $ xcrun --show-sdk-path
> /Library/Developer/CommandLineTools/SDKs/MacOSX.sdk
> $ xcrun --sdk macosx --show-sdk-path
> /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX11.1.sdk
> $ ls -laht /Library/Developer/CommandLineTools/SDKs
> total 0
> drwxr-xr-x  5 root  wheel   160B Jan 14  2020 .
> drwxr-xr-x  8 root  wheel   256B Jan 14  2020 MacOSX10.15.sdk
> drwxr-xr-x  7 root  wheel   224B Jan 14  2020 MacOSX10.14.sdk
> lrwxr-xr-x  1 root  wheel15B Jan 14  2020 MacOSX.sdk -> MacOSX10.15.sdk

Ah, got it.  So "xcrun --show-sdk-path" tells us the right thing (that
is, it *does* give us a symlink to a 10.15 SDK) but by refusing to
believe we've got the right thing, we end up picking MacOSX11.1.sdk.
Drat.  I suppose we could drop the heuristic about wanting a version
number in the SDK path, but I really don't want to do that.  Now I'm
thinking about trying to dereference the symlink after the first step.

BTW, it's curious that you get a reference to the MacOSX.sdk symlink
where both Sergey and I got references to the actual directory.
Do you happen to recall the order in which you installed/upgraded
Xcode and its command line tools?

>> I don't think I believe that argument.  As a counterexample, supposing
>> that somebody were intentionally cross-compiling on an older OSX platform
>> but using a newer SDK, shouldn't they get an executable suited to the
>> SDK's target version?

> Yep, that's exactly what this should fix:
> MACOSX_DEPLOYMENT_TARGET=11.0 ./configure
> checking for pwritev... yes
> Which fails at runtime on 10.15:

Well yeah, exactly.  It should fail at run-time, because you
cross-compiled an executable that's not built for the machine
you're on.  What we need is to prevent configure from setting up
a cross-compile situation by default.

regards, tom lane




Re: [PATCH v2 1/1] Fix detection of pwritev support for OSX.

2021-01-19 Thread Tom Lane
James Hilliard  writes:
> Fixes:
> fd.c:3661:10: warning: 'pwritev' is only available on macOS 11.0 or newer 
> [-Wunguarded-availability-new]

It's still missing preadv, and it still has nonzero chance of breaking
successful detection of pwritev on platforms other than yours, and it's
still really ugly.

But the main reason I don't want to go this way is that I don't think
it'll stop with preadv/pwritev.  If we make it our job to build
successfully even when using the wrong SDK version for the target
platform, we're going to be in for more and more pain with other
kernel APIs.

We could, of course, do what Apple wants us to do and try to build
executables that work across versions.  I do not intend to put up
with the sort of invasive, error-prone source-code-level runtime test
they recommend ... but given that there is weak linking involved here,
I wonder if there is a way to silently sub in src/port/pwritev.c
when executing on a pre-11 macOS, by dint of marking it a weak
symbol?

regards, tom lane




Re: Printing backtrace of postgres processes

2021-01-19 Thread Robert Haas
On Sat, Jan 16, 2021 at 3:21 PM Tom Lane  wrote:
> I'd argue that backtraces for those processes aren't really essential,
> and indeed that trying to make the syslogger report its own backtrace
> is damn dangerous.

I agree. Ideally I'd like to be able to use the same mechanism
everywhere and include those processes too, but surely regular
backends and parallel workers are going to be the things that come up
most often.

> (Personally, I think this whole patch fails the safety-vs-usefulness
> tradeoff, but I expect I'll get shouted down.)

You and I are frequently on opposite sides of these kinds of
questions, but I think this is a closer call than many cases. I'm
convinced that it's useful, but I'm not sure whether it's safe. On the
usefulness side, backtraces are often the only way to troubleshoot
problems that occur on production systems. I wish we had better
logging and tracing tools instead of having to ask for this sort of
thing, but we don't. EDB support today frequently asks customers to
attach gdb and take a backtrace that way, and that has risks which
this implementation does not: for example, suppose you were unlucky
enough to attach during a spinlock protected critical section, and
suppose you didn't continue the stopped process before the 60 second
timeout expired and some other process caused a PANIC. Even if this
implementation were to end up emitting a backtrace with a spinlock
held, it would remove the risk of leaving the process stopped while
holding a critical lock, and would in that sense be safer. However, as
soon as you make something like this accessible via an SQL callable
function, some people are going to start spamming it. And, as soon as
they do that, any risks inherent in the implementation are multiplied.
If it carries an 0.01% chance of crashing the system, we'll have
people taking production systems down with this all the time. At that
point I wouldn't want the feature, even if the gdb approach had the
same risk (which I don't think it does).

What do you see as the main safety risks here?

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: New Table Access Methods for Multi and Single Inserts

2021-01-19 Thread Jeff Davis
On Mon, 2021-01-18 at 08:58 +0100, Luc Vlaming wrote:
> You mean how it could because of that the table modification API
> uses 
> the table_tuple_insert_speculative ? Just wondering if you think if
> it 
> generally cannot work or would like to see that path / more paths 
> integrated in to the patch.

I think the patch should support INSERT INTO ... SELECT, and it will be
easier to tell if we have the right API when that's integrated.

Regards,
Jeff Davis
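
In other words, the statement shape in question is the plain multi-row form
(a minimal sketch; the table names are made up):

    CREATE TABLE src (a int, b text);
    CREATE TABLE dst (a int, b text);
    INSERT INTO dst SELECT * FROM src;   -- the path that should exercise the multi-insert API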






Re: [PATCH 1/1] Fix detection of pwritev support for OSX.

2021-01-19 Thread James Hilliard
On Tue, Jan 19, 2021 at 10:17 AM Tom Lane  wrote:
>
> James Hilliard  writes:
> > On Tue, Jan 19, 2021 at 8:57 AM Tom Lane  wrote:
> >> It worked for me and for Sergey, so we need to figure out what's different
> >> about your setup.  What do you get from "xcrun --show-sdk-path" and
> >> "xcrun --sdk macosx --show-sdk-path"?  What have you got under
> >> /Library/Developer/CommandLineTools/SDKs ?
>
> > $ xcrun --show-sdk-path
> > /Library/Developer/CommandLineTools/SDKs/MacOSX.sdk
> > $ xcrun --sdk macosx --show-sdk-path
> > /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX11.1.sdk
> > $ ls -laht /Library/Developer/CommandLineTools/SDKs
> > total 0
> > drwxr-xr-x  5 root  wheel   160B Jan 14  2020 .
> > drwxr-xr-x  8 root  wheel   256B Jan 14  2020 MacOSX10.15.sdk
> > drwxr-xr-x  7 root  wheel   224B Jan 14  2020 MacOSX10.14.sdk
> > lrwxr-xr-x  1 root  wheel15B Jan 14  2020 MacOSX.sdk -> MacOSX10.15.sdk
>
> Ah, got it.  So "xcrun --show-sdk-path" tells us the right thing (that
> is, it *does* give us a symlink to a 10.15 SDK) but by refusing to
> believe we've got the right thing, we end up picking MacOSX11.1.sdk.
> Drat.  I suppose we could drop the heuristic about wanting a version
> number in the SDK path, but I really don't want to do that.  Now I'm
> thinking about trying to dereference the symlink after the first step.
The MacOSX11.1.sdk can build for a 10.15 target just fine when passed
an appropriate MACOSX_DEPLOYMENT_TARGET, so that SDK should be
fine.
>
> BTW, it's curious that you get a reference to the MacOSX.sdk symlink
> where both Sergey and I got references to the actual directory.
> Do you happen to recall the order in which you installed/upgraded
> Xcode and its command line tools?
I generally just upgrade to the latest as it becomes available.
>
> >> I don't think I believe that argument.  As a counterexample, supposing
> >> that somebody were intentionally cross-compiling on an older OSX platform
> >> but using a newer SDK, shouldn't they get an executable suited to the
> >> SDK's target version?
>
> > Yep, that's exactly what this should fix:
> > MACOSX_DEPLOYMENT_TARGET=11.0 ./configure
> > checking for pwritev... yes
> > Which fails at runtime on 10.15:
>
> Well yeah, exactly.  It should fail at run-time, because you
> cross-compiled an executable that's not built for the machine
> you're on.  What we need is to prevent configure from setting up
> a cross-compile situation by default.
The toolchain already selects the correct deployment target by default; the
issue is just that the configure test for pwritev was being done in a way that
ignored the deployment target version, which is what I fixed.
>
> regards, tom lane




Re: Printing backtrace of postgres processes

2021-01-19 Thread Tom Lane
Robert Haas  writes:
> On Sat, Jan 16, 2021 at 3:21 PM Tom Lane  wrote:
>> (Personally, I think this whole patch fails the safety-vs-usefulness
>> tradeoff, but I expect I'll get shouted down.)

> What do you see as the main safety risks here?

The thing that is scaring me the most is the "broadcast" aspect.
For starters, I think that that is going to be useless in the
field because of the likelihood that different backends' stack
traces will get interleaved in whatever log file the traces are
going to.  Also, having hundreds of processes spitting dozens of
lines to the same place at the same time is going to expose any
little weaknesses in your logging arrangements, such as rsyslog's
tendency to drop messages under load.

I think it's got security hazards as well.  If we restricted the
feature to cause a trace of only one process at a time, and required
that process to be logged in as the same database user as the
requestor (or at least someone with the privs of that user, such
as a superuser), that'd be less of an issue.
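
A sketch of that narrower interface, restricted to a single backend owned by
the requesting role (the function name below is hypothetical; the patch may
spell it differently):

    -- pg_print_backtrace() is a placeholder name; only the gating is the point.
    -- Target one backend at a time, owned by the same database user (or a superuser).
    SELECT pg_print_backtrace(12345);   -- 12345 = pid of the backend of interest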

Beyond that, well, maybe it's all right.  In theory anyplace that
there's a CHECK_FOR_INTERRUPTS should be okay to call elog from;
but it's completely untested whether we can do that and then
continue, as opposed to aborting the transaction or whole session.
I share your estimate that there'll be small-fraction-of-a-percent
hazards that could still add up to dangerous instability if people
go wild with this.

regards, tom lane




Re: simplifying foreign key/RI checks

2021-01-19 Thread Corey Huinker
>
> I decided not to deviate from pk_ terminology so that the new code
> doesn't look too different from the other code in the file.  Although,
> I guess we can at least call the main function
> ri_ReferencedKeyExists() instead of ri_PrimaryKeyExists(), so I've
> changed that.
>

I agree with leaving the existing terminology where it is for this patch.
Changing the function name is probably enough to alert the reader that the
things that are called pks may not be precisely that.


Re: [PATCH v2 1/1] Fix detection of pwritev support for OSX.

2021-01-19 Thread James Hilliard
On Tue, Jan 19, 2021 at 10:29 AM Tom Lane  wrote:
>
> James Hilliard  writes:
> > Fixes:
> > fd.c:3661:10: warning: 'pwritev' is only available on macOS 11.0 or newer 
> > [-Wunguarded-availability-new]
>
> It's still missing preadv, and it still has nonzero chance of breaking
> successful detection of pwritev on platforms other than yours, and it's
> still really ugly.
Setting -Werror=unguarded-availability-new should in theory always
ensure that configure checks fail if the symbol is unavailable or marked
as requiring a target newer than the MACOSX_DEPLOYMENT_TARGET.
>
> But the main reason I don't want to go this way is that I don't think
> it'll stop with preadv/pwritev.  If we make it our job to build
> successfully even when using the wrong SDK version for the target
> platform, we're going to be in for more and more pain with other
> kernel APIs.
This issue really has nothing to do with the SDK version at all; it's the
MACOSX_DEPLOYMENT_TARGET that matters, and it must be taken
into account during configure in some way. That is what my patch does,
by triggering the pwritev compile-test error by setting
-Werror=unguarded-availability-new.

It's expected that MACOSX_DEPLOYMENT_TARGET=10.15 with a
MacOSX11.1.sdk will produce a binary that can run on OSX 10.15.

The MacOSX11.1.sdk is not the wrong SDK for a 10.15 target and
is fully capable of producing 10.15 compatible binaries.
>
> We could, of course, do what Apple wants us to do and try to build
> executables that work across versions.  I do not intend to put up
> with the sort of invasive, error-prone source-code-level runtime test
> they recommend ... but given that there is weak linking involved here,
> I wonder if there is a way to silently sub in src/port/pwritev.c
> when executing on a pre-11 macOS, by dint of marking it a weak
> symbol?
The check I added is strictly a compile time check still, not runtime.

I also don't think this is a weak symbol.

From the header file, it does not have __attribute__((weak_import)):
ssize_t pwritev(int, const struct iovec *, int, off_t)
__DARWIN_NOCANCEL(pwritev) __API_AVAILABLE(macos(11.0), ios(14.0),
watchos(7.0), tvos(14.0));
>
> regards, tom lane




Re: Add primary keys to system catalogs

2021-01-19 Thread Mark Rofail
Dear Joel,

I would love to start working on this again. I tried to revive the patch
again in 2019; however, I faced the same problems as last time (detailed in
the thread) and I didn't get the support I needed to continue with this
patch.

Are you willing to help rebase and finally publish this patch?

Best Regards,
Mark Rofail

On Tue, 19 Jan 2021 at 2:22 PM Joel Jacobson  wrote:

> On Mon, Jan 18, 2021, at 18:23, Tom Lane wrote:
> > I realized that there's a stronger roadblock for
> > treating catalog interrelationships as SQL foreign keys.  Namely,
> > that we always represent no-reference situations with a zero OID,
> > whereas it'd have to be NULL to look like a valid foreign-key case.
>
> Another roadblock is perhaps the lack of foreign keys on arrays,
> which would be needed by the following references:
>
> SELECT * FROM oidjoins WHERE column_type ~ '(vector|\[\])$' ORDER BY 1,2,3;
>      table_name      |  column_name   | column_type | ref_table_name | ref_column_name
> ----------------------+----------------+-------------+----------------+-----------------
>  pg_constraint        | conexclop      | oid[]       | pg_operator    | oid
>  pg_constraint        | conffeqop      | oid[]       | pg_operator    | oid
>  pg_constraint        | confkey        | int2[]      | pg_attribute   | attnum
>  pg_constraint        | conkey         | int2[]      | pg_attribute   | attnum
>  pg_constraint        | conpfeqop      | oid[]       | pg_operator    | oid
>  pg_constraint        | conppeqop      | oid[]       | pg_operator    | oid
>  pg_extension         | extconfig      | oid[]       | pg_class       | oid
>  pg_index             | indclass       | oidvector   | pg_opclass     | oid
>  pg_index             | indcollation   | oidvector   | pg_collation   | oid
>  pg_index             | indkey         | int2vector  | pg_attribute   | attnum
>  pg_partitioned_table | partattrs      | int2vector  | pg_attribute   | attnum
>  pg_partitioned_table | partclass      | oidvector   | pg_opclass     | oid
>  pg_partitioned_table | partcollation  | oidvector   | pg_collation   | oid
>  pg_policy            | polroles       | oid[]       | pg_authid      | oid
>  pg_proc              | proallargtypes | oid[]       | pg_type        | oid
>  pg_proc              | proargtypes    | oidvector   | pg_type        | oid
>  pg_statistic_ext     | stxkeys        | int2vector  | pg_attribute   | attnum
>  pg_trigger           | tgattr         | int2vector  | pg_attribute   | attnum
> (18 rows)
>
> I see there is an old thread with work in this area,  "Re: GSoC 2017:
> Foreign Key Arrays":
>
>
> https://www.postgresql.org/message-id/flat/76a8d3d8-a22e-3299-7c4e-6e115dbf3762%40proxel.se#a3ddc34863465ef83adbd26022cdbcc0
>
> The last message in the thread is from 2018-10-02:
> >On Tue, 2 Oct, 2018 at 05:13:26AM +0200, Michael Paquier wrote:
> >>On Sat, Aug 11, 2018 at 05:20:57AM +0200, Mark Rofail wrote:
> >> I am still having problems rebasing this patch. I can not figure it out
> on
> >> my own.
> >Okay, it's been a couple of months since this last email, and nothing
> >has happened, so I am marking it as returned with feedback.
> >--
> >Michael
>
> Personally, I would absolutely *love* FKs on array columns.
> I always feel shameful when adding array columns to tables,
> when the elements are PKs in some other table.
> It's convenient and often the best design,
> but it feels dirty knowing there are no FKs to help detect application
> bugs.
>
> Let's hope the current FKs-on-catalog-discussion can ignite the old
> Foreign Key Arrays thread.
>
>
> /Joel
>
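
To make the itch concrete (a sketch only; the EACH ELEMENT OF spelling is the
one floated in the old Foreign Key Arrays thread, not committed syntax):

    CREATE TABLE tag  (tag_id int PRIMARY KEY, name text);
    CREATE TABLE post (post_id int PRIMARY KEY,
                       tag_ids int[]);  -- elements are meant to be tag.tag_id,
                                        -- but nothing enforces that today
    -- roughly what the old patch aimed to allow:
    -- ALTER TABLE post ADD FOREIGN KEY (EACH ELEMENT OF tag_ids) REFERENCES tag (tag_id);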


Re: Rethinking plpgsql's assignment implementation

2021-01-19 Thread Pavel Stehule
Hi

Now I am testing the subscripting-on-jsonb feature, and I found one case
that is not supported by the parser.

When the target is a scalar, all is ok. But we can have a plpgsql array
of jsonb values.

postgres=# do $$
declare j jsonb[];
begin
  j[1] = '{"b":"Ahoj"}';
  raise notice '%', j;
  raise notice '%', (j[1])['b'];
end
$$;
NOTICE:  {"{\"b\": \"Ahoj\"}"}
NOTICE:  "Ahoj"
DO

Parentheses work well in expressions, but they are not supported on the left
side of an assignment.

postgres=# do $$
declare j jsonb[];
begin
  (j[1])['b'] = '"Ahoj"';
  raise notice '%', j;
  raise notice '%', j[1]['b'];
end
$$;
ERROR:  syntax error at or near "("
LINE 4:   (j[1])['b'] = '"Ahoj"';
  ^

Regards

Pavel


Re: [HACKERS] [PATCH] Generic type subscripting

2021-01-19 Thread Pavel Stehule
Hi

I found minor issues.

Doc - missing tag

and three whitespace issues

see attached patch

The following sentence is hard to read due to the long nested example:

If the path contradicts structure of modified jsonb for any individual
value (e.g. path val['a']['b']['c'] assumes keys 'a' and 'b' have object
values assigned to them, but if val['a'] or val['b'] is null, a string, or
a number, then the path contradicts with the existing structure), an error
is raised even if other values do conform.

It could be divided into two sentences: first the predicate, then the example.

Regards

Pavel
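
A concrete pair of statements may carry the point better than the nested
prose (table and column names are made up, and the statements only run with
the subscripting patch applied; the behaviour shown is the one described in
the quoted paragraph):

    CREATE TABLE test_jsonb (id int, val jsonb);
    INSERT INTO test_jsonb VALUES (1, '{"a": 5}');
    -- val['a'] already holds a number, so the path val['a']['b'] contradicts
    -- the existing structure and the assignment raises an error
    UPDATE test_jsonb SET val['a']['b'] = '1';
    -- assigning through a path that matches the structure works
    UPDATE test_jsonb SET val['a'] = '{"b": 1}';
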
commit 7ce4fbe2620a5d8efdff963b2368c3d0fd904c59
Author: ok...@github.com 
Date:   Tue Jan 19 19:37:02 2021 +0100

fix whitespaces

diff --git a/doc/src/sgml/json.sgml b/doc/src/sgml/json.sgml
index 924762e128..4e19fe4fb8 100644
--- a/doc/src/sgml/json.sgml
+++ b/doc/src/sgml/json.sgml
@@ -613,6 +613,7 @@ SELECT jdoc->'guid', jdoc->'name' FROM api WHERE jdoc @> '{"tags": ["qu
e.g. in case of arrays it is a 0-based operation or that negative integers
that appear in path count from the end of JSON arrays.
The result of subscripting expressions is always jsonb data type.
+  
 
   
UPDATE statements may use subscripting in the
diff --git a/src/backend/utils/adt/jsonbsubs.c b/src/backend/utils/adt/jsonbsubs.c
index 306c37b5a6..64979f3a5b 100644
--- a/src/backend/utils/adt/jsonbsubs.c
+++ b/src/backend/utils/adt/jsonbsubs.c
@@ -342,8 +342,8 @@ jsonb_subscript_fetch_old(ExprState *state,
 	{
 		Jsonb	*jsonbSource = DatumGetJsonbP(*op->resvalue);
 		sbsrefstate->prevvalue = jsonb_get_element(jsonbSource,
-	  			   sbsrefstate->upperindex,
-	  			   sbsrefstate->numupper,
+   sbsrefstate->upperindex,
+   sbsrefstate->numupper,
    &sbsrefstate->prevnull,
    false);
 	}
@@ -366,7 +366,7 @@ jsonb_exec_setup(const SubscriptingRef *sbsref,
 
 	/* Allocate type-specific workspace with space for per-subscript data */
 	workspace = palloc0(MAXALIGN(sizeof(JsonbSubWorkspace)) +
-	nupper * (sizeof(Datum) + sizeof(Oid)));
+		nupper * (sizeof(Datum) + sizeof(Oid)));
 	workspace->expectArray = false;
 	ptr = ((char *) workspace) + MAXALIGN(sizeof(JsonbSubWorkspace));
 	workspace->indexOid = (Oid *) ptr;
@@ -379,7 +379,7 @@ jsonb_exec_setup(const SubscriptingRef *sbsref,
 	foreach(lc, sbsref->refupperindexpr)
 	{
 		Node   *expr = lfirst(lc);
-		int 	i = foreach_current_index(lc);
+		int		i = foreach_current_index(lc);
 
 		workspace->indexOid[i] = exprType(expr);
 	}


Re: Add docs stub for recovery.conf

2021-01-19 Thread Stephen Frost
Greetings,

* Craig Ringer (craig.rin...@enterprisedb.com) wrote:
> On Thu, 14 Jan 2021 at 03:44, Stephen Frost  wrote:
> > Alright, how does this look?  The new entries are all under the
> > 'obsolete' section to keep it out of the main line, but should work to
> > 'fix' the links that currently 404 and provide a bit of a 'softer'
> > landing for the other cases that currently just forcibly redirect using
> > the website doc alias capability.
> 
> Thanks for expanding the change to other high profile obsoleted or renamed
> features and tools.

Thanks for taking the time to review it and comment on it!

> One minor point. I'm not sure this is quite the best way to spell the index
> entries:
> 
> +   
> + obsolete
> + pg_receivexlog
> +   
> 
> as it will produce an index term "obsolete" with a list of various
> components under it. While that concentrates them nicely, it means people
> won't actually find them if they're using the index alphabetically.

Ah, yeah, that's definitely a good point and one that I hadn't really
spent much time thinking about.

> I'd slightly prefer
> 
> +   
> + pg_receivexlog
> + pg_receivewal
> +   
> 
> even though that bulks the index up a little, because then people are a bit
> more likely to find it.

Yup, makes sense, updated patch attached which makes that change.

> > I ended up not actually doing this for the catalog -> view change of
> > pg_replication_slots simply because I don't really think folks will
> > misunderstand or be confused by that redirect since it's still the same
> > relation.  If others disagree though, we could certainly change that
> > too.
> 
> I agree with you.

Ok, great.

How does the attached look then?

Bruce, did you want to review or comment on this as to if it addresses
your concerns appropriately?  Would be great to get this in as there's
the follow-on for default roles.

Thanks!

Stephen


signature.asc
Description: PGP signature


Re: Add docs stub for recovery.conf

2021-01-19 Thread Stephen Frost
Greetings,

* Stephen Frost (sfr...@snowman.net) wrote:
> * Craig Ringer (craig.rin...@enterprisedb.com) wrote:
> > On Thu, 14 Jan 2021 at 03:44, Stephen Frost  wrote:
> > > Alright, how does this look?  The new entries are all under the
> > > 'obsolete' section to keep it out of the main line, but should work to
> > > 'fix' the links that currently 404 and provide a bit of a 'softer'
> > > landing for the other cases that currently just forcibly redirect using
> > > the website doc alias capability.
> > 
> > Thanks for expanding the change to other high profile obsoleted or renamed
> > features and tools.
> 
> Thanks for taking the time to review it and comment on it!
> 
> > One minor point. I'm not sure this is quite the best way to spell the index
> > entries:
> > 
> > +   
> > + obsolete
> > + pg_receivexlog
> > +   
> > 
> > as it will produce an index term "obsolete" with a list of various
> > components under it. While that concentrates them nicely, it means people
> > won't actually find them if they're using the index alphabetically.
> 
> Ah, yeah, that's definitely a good point and one that I hadn't really
> spent much time thinking about.
> 
> > I'd slightly prefer
> > 
> > +   
> > + pg_receivexlog
> > + pg_receivewal
> > +   
> > 
> > even though that bulks the index up a little, because then people are a bit
> > more likely to find it.
> 
> Yup, makes sense, updated patch attached which makes that change.
> 
> > > I ended up not actually doing this for the catalog -> view change of
> > > pg_replication_slots simply because I don't really think folks will
> > > misunderstand or be confused by that redirect since it's still the same
> > > relation.  If others disagree though, we could certainly change that
> > > too.
> > 
> > I agree with you.
> 
> Ok, great.
> 
> How does the attached look then?
> 
> Bruce, did you want to review or comment on this as to if it addresses
> your concerns appropriately?  Would be great to get this in as there's
> the follow-on for default roles.

... really attached now, sorry about that. :)

Thanks,

Stephen
diff --git a/doc/src/sgml/appendix-obsolete-pgreceivexlog.sgml b/doc/src/sgml/appendix-obsolete-pgreceivexlog.sgml
new file mode 100644
index 00..391eb5dcb2
--- /dev/null
+++ b/doc/src/sgml/appendix-obsolete-pgreceivexlog.sgml
@@ -0,0 +1,26 @@
+
+
+
+
+  The pg_receivexlog command
+
+   
+ pg_receivexlog
+ pg_receivewal
+   
+
+   
+PostgreSQL 9.6 and below provided a command named
+pg_receivexlog
+pg_receivexlog
+to fetch write-ahead-log (WAL) files.  This command was renamed to pg_receivewal, see
+ for documentation of pg_receivewal and see
+the release notes for PostgreSQL 10 for details
+on this change.
+   
+
+
diff --git a/doc/src/sgml/appendix-obsolete-pgresetxlog.sgml b/doc/src/sgml/appendix-obsolete-pgresetxlog.sgml
new file mode 100644
index 00..44452b5627
--- /dev/null
+++ b/doc/src/sgml/appendix-obsolete-pgresetxlog.sgml
@@ -0,0 +1,26 @@
+
+
+
+
+  The pg_resetxlog command
+
+   
+ pg_resetxlog
+ pg_resetwal
+   
+
+   
+PostgreSQL 9.6 and below provided a command named
+pg_resetxlog
+pg_resetxlog
+to reset the write-ahead-log (WAL) files.  This command was renamed to pg_resetwal, see
+ for documentation of pg_resetwal and see
+the release notes for PostgreSQL 10 for details
+on this change.
+   
+
+
diff --git a/doc/src/sgml/appendix-obsolete-pgxlogdump.sgml b/doc/src/sgml/appendix-obsolete-pgxlogdump.sgml
new file mode 100644
index 00..325316b4e6
--- /dev/null
+++ b/doc/src/sgml/appendix-obsolete-pgxlogdump.sgml
@@ -0,0 +1,26 @@
+
+
+
+
+  The pg_xlogdump command
+
+   
+ pg_xlogdump
+ pg_waldump
+   
+
+   
+PostgreSQL 9.6 and below provided a command named
+pg_xlogdump
+pg_xlogdump
+to read write-ahead-log (WAL) files.  This command was renamed to pg_waldump, see
+ for documentation of pg_waldump and see
+the release notes for PostgreSQL 10 for details
+on this change.
+   
+
+
diff --git a/doc/src/sgml/appendix-obsolete-recovery-config.sgml b/doc/src/sgml/appendix-obsolete-recovery-config.sgml
new file mode 100644
index 00..3f858e5cb0
--- /dev/null
+++ b/doc/src/sgml/appendix-obsolete-recovery-config.sgml
@@ -0,0 +1,60 @@
+
+
+
+
+  The recovery.conf file
+
+   
+ recovery.conf
+   
+
+   
+PostgreSQL 11 and below used a configuration file named
+recovery.conf
+recovery.conf
+to manage replicas and standbys. Support for this file was removed in PostgreSQL 12. See
+the release notes for PostgreSQL 12 for details
+on this change.
+   
+
+   
+On PostgreSQL 12 and above,
+archive recovery, streaming replication, and PITR
+are configured using
+normal server configuration parameters.
+These are set in postgresql.conf or via
+ALTER SYSTEM
+like any other parameter.
+   
+
+   
+The server will not start if a recovery.conf exists.
+   

Re: [HACKERS] Custom compression methods

2021-01-19 Thread Justin Pryzby
Thanks for updating the patch.

On Mon, Jan 4, 2021 at 6:52 AM Justin Pryzby  wrote:
> The most recent patch doesn't compile --without-lz4:
On Tue, Jan 05, 2021 at 11:19:33AM +0530, Dilip Kumar wrote:
> On Mon, Jan 4, 2021 at 10:08 PM Justin Pryzby  wrote:
> > I think I first saw it on cfbot and I reproduced it locally, too.
> > http://cfbot.cputube.org/dilip-kumar.html
> >
> > I think you'll have to make --without-lz4 the default until the build
> > environments include it, otherwise the patch checker will show red :(
> 
> Oh ok,  but if we make by default --without-lz4 then the test cases
> will start failing which is using lz4 compression.  Am I missing
> something?

The CIs are failing like this:

http://cfbot.cputube.org/dilip-kumar.html
|checking for LZ4_compress in -llz4... no
|configure: error: lz4 library not found
|If you have lz4 already installed, see config.log for details on the
|failure.  It is possible the compiler isn't looking in the proper directory.
|Use --without-lz4 to disable lz4 support.

I thought that used to work (except for windows).  I don't see that anything
changed in the configure tests...  Is it because the CI moved off travis 2
weeks ago?  I don't know whether the travis environment had liblz4, and I
don't remember if the build was passing or if it was failing for some other
reason.  I'm guessing historic logs from travis are not available, if they ever
were.

I'm not sure how to deal with that, but maybe you'd need:
1) A separate 0001 patch *allowing* LZ4 to be enabled/disabled;
2) Current patchset needs to compile with/without LZ4, and pass tests in both
cases - maybe you can use "alternate test" output [0] to handle the "without"
case.
3) Eventually, the CI and build environments may have LZ4 installed, and then
we can have a separate debate about whether to enable it by default.

[0] cp -iv src/test/regress/results/compression.out 
src/test/regress/expected/compression_1.out

On Tue, Jan 05, 2021 at 02:20:26PM +0530, Dilip Kumar wrote:
> On Tue, Jan 5, 2021 at 11:19 AM Dilip Kumar  wrote:
> > On Mon, Jan 4, 2021 at 10:08 PM Justin Pryzby  wrote:
> > > I see the windows build is failing:
> > > https://ci.appveyor.com/project/postgresql-cfbot/postgresql/build/1.0.123730
> > > |undefined symbol: HAVE_LIBLZ4 at src/include/pg_config.h line 350 at 
> > > src/tools/msvc/Mkvcbuild.pm line 852.
> > > This needs to be patched: src/tools/msvc/Solution.pm
> > > You can see my zstd/pg_dump patch for an example, if needed (actually I'm 
> > > not
> > > 100% sure it's working yet, since the windows build failed for another 
> > > reason).
> >
> > Okay, I will check that.

This still needs help.
perl ./src/tools/msvc/mkvcbuild.pl
...
undefined symbol: HAVE_LIBLZ4 at src/include/pg_config.h line 350 at 
/home/pryzbyj/src/postgres/src/tools/msvc/Mkvcbuild.pm line 852.

Fix like:

+   HAVE_LIBLZ4 => $self->{options}->{zlib} ? 1 : 
undef,

Some more language fixes:

commit 3efafee52414503a87332fa6070541a3311a408c
Author: dilipkumar 
Date:   Tue Sep 8 15:24:33 2020 +0530

Built-in compression method

+  If the compression method is not specified for the compressible type then
+  it will have the default compression method.  The default compression

I think this should say:
If no compression method is specified, then compressible types will have the
default compression method (pglz).
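
For context, the user-visible shape that wording describes is roughly the
following (a sketch as I read the patch; the exact spelling could still
change before commit):

    CREATE TABLE t1 (payload text COMPRESSION lz4);  -- explicit method
    CREATE TABLE t2 (payload text);                  -- falls back to the default, pglz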

+ *
+ * Since version 11 TOAST_COMPRESS_SET_RAWSIZE also marks compressed

Should say v14 ??

diff --git a/src/include/catalog/pg_attribute.h 
b/src/include/catalog/pg_attribute.h
index 059dec3647..e4df6bc5c1 100644
--- a/src/include/catalog/pg_attribute.h
+++ b/src/include/catalog/pg_attribute.h
@@ -156,6 +156,14 @@ CATALOG(pg_attribute,1249,AttributeRelationId) 
BKI_BOOTSTRAP BKI_ROWTYPE_OID(75,
/* attribute's collation */
Oid attcollation;
 
+   /*
+* Oid of the compression method that will be used for compressing the value
+* for this attribute.  For the compressible atttypid this must always be a

say "For compressible types, ..."

+* valid Oid irrespective of what is the current value of the attstorage.
+* And for the incompressible atttypid this must always be an invalid Oid.

say "must be InvalidOid"
 
@@ -685,6 +686,7 @@ typedef enum TableLikeOption
CREATE_TABLE_LIKE_INDEXES = 1 << 5,
CREATE_TABLE_LIKE_STATISTICS = 1 << 6,
CREATE_TABLE_LIKE_STORAGE = 1 << 7,
+   CREATE_TABLE_LIKE_COMPRESSION = 1 << 8,

This is interesting...
I have a patch to implement LIKE .. (INCLUDING ACCESS METHOD).
I guess I should change it to say LIKE .. (TABLE ACCESS METHOD), right ?
https://commitfest.postgresql.org/31/2865/

Your first patch is large due to updating a large number of test cases to
include the "compression" column in \d+ output.  Maybe that column should be
hidden when HIDE_TABLEAM is set by pg_regress ?  I think that would allow
testing with alternate, default com

Re: Change default of checkpoint_completion_target

2021-01-19 Thread Stephen Frost
Greetings,

* Peter Eisentraut (peter.eisentr...@enterprisedb.com) wrote:
> On 2021-01-13 23:10, Stephen Frost wrote:
> >>Yes, I agree, and am involved in that thread as well- currently waiting
> >>feedback from others about the proposed approach.
> >I've tried to push that forward.  I'm happy to update this patch once
> >we've got agreement to move forward on that, to wit, adding to an
> >'obsolete' section in the documentation information about this
> >particular GUC and how it's been removed due to not being sensible or
> >necessary to continue to have.
> 
> Some discussion a few days ago was arguing that it was still necessary in
> some cases as a way to counteract the possible lack of tuning in the kernel
> flushing behavior.  I think in light of that we should go with your first
> patch that just changes the default, possibly with the documentation updated
> a bit.

Rebased and updated patch attached which moves back to just changing the
default instead of removing the option, with a more explicit call-out of
the '90%', as suggested by Michael on the other patch.

Any further comments or thoughts on this one?

Thanks,

Stephen
From 335b8e630fae6c229f27f70f85847e29dfc1b783 Mon Sep 17 00:00:00 2001
From: Stephen Frost 
Date: Tue, 19 Jan 2021 13:53:34 -0500
Subject: [PATCH] Change the default of checkpoint_completion_target to 0.9

Common recommendations are that the checkpoint should be spread out as
much as possible, provided we avoid having it take too long.  This
change updates the default to 0.9 (from 0.5) to match that
recommendation.

There was some debate about possibly removing the option entirely but it
seems there may be some corner-cases where having it set much lower to
try to force the checkpoint to be as fast as possible could result in
fewer periods of time of reduced performance due to kernel flushing.
General agreement is that the "spread more" is the preferred approach
though and those who need to tune away from that value are much less
common.

Reviewed-By: Michael Paquier, Peter Eisentraut
Discussion: https://postgr.es/m/20201207175329.GM16415%40tamriel.snowman.net
---
 doc/src/sgml/config.sgml  |  8 -
 doc/src/sgml/wal.sgml | 29 ---
 src/backend/utils/misc/guc.c  |  2 +-
 src/backend/utils/misc/postgresql.conf.sample |  2 +-
 src/test/recovery/t/015_promotion_pages.pl|  1 -
 5 files changed, 27 insertions(+), 15 deletions(-)

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 82864bbb24..7e06d0febb 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -3266,7 +3266,13 @@ include_dir 'conf.d'
   

 Specifies the target of checkpoint completion, as a fraction of
-total time between checkpoints. The default is 0.5.
+total time between checkpoints. The default is 0.9, which spreads the
+checkpoint across the entire checkpoint timeout period of time,
+providing a consistent amount of I/O during the entire checkpoint.
+Reducing this parameter is not recommended as that causes the I/O from
+the checkpoint to have to complete faster, resulting in a higher I/O
+rate, while then having a period of less I/O between the completion of
+the checkpoint and the start of the next scheduled checkpoint.
 This parameter can only be set in the postgresql.conf
 file or on the server command line.

diff --git a/doc/src/sgml/wal.sgml b/doc/src/sgml/wal.sgml
index 66de1ee2f8..733eba22db 100644
--- a/doc/src/sgml/wal.sgml
+++ b/doc/src/sgml/wal.sgml
@@ -571,22 +571,29 @@
writing dirty buffers during a checkpoint is spread over a period of time.
That period is controlled by
, which is
-   given as a fraction of the checkpoint interval.
+   given as a fraction of the checkpoint interval (configured by using
+   checkpoint_timeout).
The I/O rate is adjusted so that the checkpoint finishes when the
given fraction of
checkpoint_timeout seconds have elapsed, or before
max_wal_size is exceeded, whichever is sooner.
-   With the default value of 0.5,
+   With the default value of 0.9,
PostgreSQL can be expected to complete each checkpoint
-   in about half the time before the next checkpoint starts.  On a system
-   that's very close to maximum I/O throughput during normal operation,
-   you might want to increase checkpoint_completion_target
-   to reduce the I/O load from checkpoints.  The disadvantage of this is that
-   prolonging checkpoints affects recovery time, because more WAL segments
-   will need to be kept around for possible use in recovery.  Although
-   checkpoint_completion_target can be set as high as 1.0,
-   it is best to keep it less than that (perhaps 0.9 at most) since
-   checkpoints include some other activities besides writing dirty buffers.
+   a bit before the next scheduled checkpoint (at around 90% of the last checkpoint's
+   d

Re: Change default of checkpoint_completion_target

2021-01-19 Thread Tom Lane
Stephen Frost  writes:
> Any further comments or thoughts on this one?

This:

+total time between checkpoints. The default is 0.9, which spreads the
+checkpoint across the entire checkpoint timeout period of time,

is confusing because 0.9 is obviously not 1.0; people will wonder
whether the scale is something strange or the text is just wrong.
They will also wonder why not use 1.0 instead.  So perhaps more like

... The default is 0.9, which spreads the checkpoint across almost
all the available interval, providing fairly consistent I/O load
while also leaving some slop for checkpoint completion overhead.

The other chunk of text seems accurate, but there's no reason to let
this one be misleading.

regards, tom lane




Re: Change default of checkpoint_completion_target

2021-01-19 Thread Stephen Frost
Greetings,

* Tom Lane (t...@sss.pgh.pa.us) wrote:
> Stephen Frost  writes:
> > Any further comments or thoughts on this one?
> 
> This:
> 
> +total time between checkpoints. The default is 0.9, which spreads the
> +checkpoint across the entire checkpoint timeout period of time,
> 
> is confusing because 0.9 is obviously not 1.0; people will wonder
> whether the scale is something strange or the text is just wrong.
> They will also wonder why not use 1.0 instead.  So perhaps more like
> 
>   ... The default is 0.9, which spreads the checkpoint across almost
>   all the available interval, providing fairly consistent I/O load
>   while also leaving some slop for checkpoint completion overhead.
> 
> The other chunk of text seems accurate, but there's no reason to let
> this one be misleading.

Good point, updated along those lines.

In passing, I noticed that we have a lot of documentation like:

This parameter can only be set in the postgresql.conf file or on the
server command line.

... which hasn't been true since the introduction of ALTER SYSTEM.  I
don't really think it's this patch's job to clean that up but it doesn't
seem quite right that we don't include ALTER SYSTEM in that list either.
If this was C code, maybe we could get away with just changing such
references as we find them, but I don't think we'd want the
documentation to be in an inconsistent state regarding that.

Anyone want to opine about what to do with that?  Should we consider
changing those to mention ALTER SYSTEM?  Or perhaps have a way of saying
"at server start" that then links to "how to set options at server
start", perhaps..

Thanks,

Stephen
From 97c24d92e4ae470a257aa2ac9501032aba5edd82 Mon Sep 17 00:00:00 2001
From: Stephen Frost 
Date: Tue, 19 Jan 2021 13:53:34 -0500
Subject: [PATCH] Change the default of checkpoint_completion_target to 0.9

Common recommendations are that the checkpoint should be spread out as
much as possible, provided we avoid having it take too long.  This
change updates the default to 0.9 (from 0.5) to match that
recommendation.

There was some debate about possibly removing the option entirely but it
seems there may be some corner-cases where having it set much lower to
try to force the checkpoint to be as fast as possible could result in
fewer periods of time of reduced performance due to kernel flushing.
General agreement is that the "spread more" is the preferred approach
though and those who need to tune away from that value are much less
common.

Reviewed-By: Michael Paquier, Peter Eisentraut, Tom Lane
Discussion: https://postgr.es/m/20201207175329.GM16415%40tamriel.snowman.net
---
 doc/src/sgml/config.sgml  | 12 ++--
 doc/src/sgml/wal.sgml | 29 ---
 src/backend/utils/misc/guc.c  |  2 +-
 src/backend/utils/misc/postgresql.conf.sample |  2 +-
 src/test/recovery/t/015_promotion_pages.pl|  1 -
 5 files changed, 29 insertions(+), 17 deletions(-)

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 82864bbb24..666b467eda 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -3266,9 +3266,15 @@ include_dir 'conf.d'
   

 Specifies the target of checkpoint completion, as a fraction of
-total time between checkpoints. The default is 0.5.
-This parameter can only be set in the postgresql.conf
-file or on the server command line.
+total time between checkpoints. The default is 0.9, which spreads the
+checkpoint across almost all of the available interval, providing fairly
+consistent I/O load while also leaving some slop for checkpoint
+completion overhead.  Reducing this parameter is not recommended as that
+causes the I/O from the checkpoint to have to complete faster, resulting
+in a higher I/O rate, while then having a period of less I/O between the
+completion of the checkpoint and the start of the next scheduled
+checkpoint.  This parameter can only be set in the
+postgresql.conf file or on the server command line.

   
  
diff --git a/doc/src/sgml/wal.sgml b/doc/src/sgml/wal.sgml
index 66de1ee2f8..733eba22db 100644
--- a/doc/src/sgml/wal.sgml
+++ b/doc/src/sgml/wal.sgml
@@ -571,22 +571,29 @@
writing dirty buffers during a checkpoint is spread over a period of time.
That period is controlled by
, which is
-   given as a fraction of the checkpoint interval.
+   given as a fraction of the checkpoint interval (configured by using
+   checkpoint_timeout).
The I/O rate is adjusted so that the checkpoint finishes when the
given fraction of
checkpoint_timeout seconds have elapsed, or before
max_wal_size is exceeded, whichever is sooner.
-   With the default value of 0.5,
+   With the default value of 0.9,
PostgreSQL can be expected to complete each checkpoint
-   in about half the time before the nex

Re: Change default of checkpoint_completion_target

2021-01-19 Thread Tom Lane
Stephen Frost  writes:
> In passing, I noticed that we have a lot of documentation like:

> This parameter can only be set in the postgresql.conf file or on the
> server command line.

> ... which hasn't been true since the introduction of ALTER SYSTEM.

Well, it's still true if you understand "the postgresql.conf file"
to cover whatever's included by postgresql.conf, notably
postgresql.auto.conf (and the include facility existed long before
that, too, so you needed the expanded interpretation even then).
Still, I take your point that it's confusing.

I like your suggestion of shortening all of these to be "can only be set
at server start", or maybe better "cannot be changed after server start".
I'm not sure whether or not we really need new text elsewhere; I think
section 20.1 is pretty long already.

regards, tom lane




Re: Add primary keys to system catalogs

2021-01-19 Thread Tom Lane
Laurenz Albe  writes:
> On Sun, 2021-01-17 at 17:07 -0500, Tom Lane wrote:
>> [...] do we really want to prefer
>> using the OID indexes as the primary keys?  In most cases there's some
>> other index that seems to me to be what a user would think of as the
>> pkey, for example pg_class_relname_nsp_index for pg_class or
>> pg_proc_proname_args_nsp_index for pg_proc.  Preferring OID where it
>> exists is a nice simple rule, which has some attractiveness, but the
>> OID indexes seem to me like a lookup aid rather than the "real" object
>> identity.

> I disagree.  The OID is the real, immutable identity of an object.
> The "relname" of a "pg_class" encatalogtry can change any time.
> Since there are no foreign keys that reference catalogs, that won't cause
> problems, but I still think that primary keys should change as little as
> possible.

Yeah, there's also the point that the OID is what we use for "foreign
key" references from other catalogs.  I don't deny that there are
reasons to think of OID as the pkey.  But as long as these constraints
are basically cosmetic, it seemed to me that we should consider the
other approach.  I'm not dead set on that, just wanted to discuss it.

regards, tom lane




Re: Support for NSS as a libpq TLS backend

2021-01-19 Thread Daniel Gustafsson
> On 4 Dec 2020, at 01:57, Jacob Champion  wrote:
> 
> On Nov 17, 2020, at 7:00 AM, Daniel Gustafsson  wrote:
>> 
>> Nice, thanks for the fix!  I've incorporated your patch into the attached v20
>> which also fixes client side error reporting to be more readable.
> 
> I was testing handshake failure modes and noticed that some FATAL
> messages are being sent through to the client in cleartext. The OpenSSL
> implementation doesn't do this, because it logs handshake problems at
> COMMERROR level. Should we switch all those ereport() calls in the NSS
> be_tls_open_server() to COMMERROR as well (and return explicitly), to
> avoid this? Or was there a reason for logging at FATAL/ERROR level?

The ERROR logging made early development easier but then stuck around; I've
changed them to COMMERROR returning an error instead in the v21 patch just
sent to the list.

> Related note, at the end of be_tls_open_server():
> 
>>...
>>port->ssl_in_use = true;
>>return 0;
>> 
>> error:
>>return 1;
>> }
> 
> This needs to return -1 in the error case; the only caller of
> secure_open_server() does a direct `result == -1` comparison rather than
> checking `result != 0`.

Fixed.

cheers ./daniel



Re: WIP: BRIN multi-range indexes

2021-01-19 Thread John Naylor
On Tue, Jan 12, 2021 at 1:42 PM Tomas Vondra 
wrote:
> I suspect it'd due to minmax having to decide which "ranges" to merge,
> which requires repeated sorting, etc. I certainly don't dare to claim
> the current algorithm is perfect. I wouldn't have expected such big
> difference, though - so definitely worth investigating.

It seems that monotonically increasing (or decreasing) values in a table
are a worst case scenario for multi-minmax indexes, or basically, unique
values within a range. I'm guessing it's because it requires many passes
to fit all the values into a limited number of ranges. I tried using
smaller pages_per_range numbers, 32 and 8, and that didn't help.

Now, with a different data distribution, using only 10 values that repeat
over and over, the results are much more sympathetic to multi-minmax:

insert into iot (num, create_dt)
select random(), '2020-01-01 0:00'::timestamptz + (x % 10 || '
seconds')::interval
from generate_series(1,5*365*24*60*60) x;

create index cd_single on iot using brin(create_dt);
27.2s

create index cd_multi on iot using brin(create_dt
timestamptz_minmax_multi_ops);
30.4s

create index cd_bt on iot using btree(create_dt);
61.8s

Circling back to the monotonic case, I tried running a simple perf record
on a backend creating a multi-minmax index on a timestamptz column and
these were the highest non-kernel calls:
+   21.98%  21.91%  postgres  postgres   [.] FunctionCall2Coll
+    9.31%   9.29%  postgres  postgres   [.] compare_combine_ranges
+    8.60%   8.58%  postgres  postgres   [.] qsort_arg
+    5.68%   5.66%  postgres  postgres   [.] brin_minmax_multi_add_value
+    5.63%   5.60%  postgres  postgres   [.] timestamp_lt
+    4.73%   4.71%  postgres  postgres   [.] reduce_combine_ranges
+    3.80%   0.00%  postgres  [unknown]  [.] 0x032001680004
+    3.51%   3.50%  postgres  postgres   [.] timestamp_eq

There's no one place that's pathological enough to explain the 4x slowness
over traditional BRIN and nearly 3x slowness over btree when using a large
number of unique values per range, so making progress here would have to
involve a more holistic approach.

--
John Naylor
EDB: http://www.enterprisedb.com


Re: [Patch] ALTER SYSTEM READ ONLY

2021-01-19 Thread Robert Haas
On Thu, Jan 14, 2021 at 6:29 AM Amul Sul  wrote:
> To move development, testing, and the review forward, I have commented out the
> code acquiring CheckpointLock from CreateCheckPoint() in the 0003 patch and
> added the changes for the checkpointer so that system read-write state change
> request can be processed as soon as possible, as suggested by Robert[1].

I am extremely doubtful about SetWALProhibitState()'s claim that "The
final state can only be requested by the checkpointer or by the
single-user so that there will be no chance that the server is already
in the desired final state." It seems like there is an obvious race
condition: CompleteWALProhibitChange() is called with a cur_state_gen
argument which embeds the last state we saw, but there's nothing to
keep it from changing between the time we saw it and the time that
function calls SetWALProhibitState(), is there? We aren't holding any
lock. It seems to me that SetWALProhibitState() needs to be rewritten
to avoid this assumption.

On a related note, SetWALProhibitState() has only two callers. One
passes is_final_state as true, and the other as false: it's never a
variable. The two cases are handled mostly differently. This doesn't
seem good. A lot of the logic in this function should probably be
moved to the calling sites, especially because it's almost certainly
wrong for this function to be basing what it does on the *current* WAL
prohibit state rather than the WAL prohibit state that was in effect
at the time we made the decision to call this function in the first
place. As I mentioned in the previous paragraph, that's a built-in
race condition. To put that another way, this function should NOT feel
free to call GetWALProhibitStateGen().

I don't really see why we should have both an SQL callable function
pg_alter_wal_prohibit_state() and also a DDL command for this. If
we're going to go with a functional interface, and I guess the idea of
that is to make it so GRANT EXECUTE works, then why not just get rid
of the DDL?
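
That is, the attraction of the functional interface is presumably that it can
be delegated, schematically (the function only exists with the patch applied,
and ops_role is a made-up role):

    GRANT EXECUTE ON FUNCTION pg_alter_wal_prohibit_state TO ops_role;
    -- whereas ALTER SYSTEM READ ONLY / READ WRITE, being DDL, could not be
    -- handed out with a simple GRANT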

RequestWALProhibitChange() doesn't look very nice. It seems like it's
basically the second half of pg_alter_wal_prohibit_state(), not being
called from anywhere else. It doesn't seem to add anything to separate
it out like this; the interface between the two is not especially
clean.

It seems odd that ProcessWALProhibitStateChangeRequest() returns
without doing anything if !AmCheckpointerProcess(), rather than having
that be an Assert(). Why is it like that?

I think WALProhibitStateShmemInit() would probably look more similar
to other functions if it did if (found) { stuff; } rather than if
(!found) return; stuff; -- but I might be wrong about the existing
precedent.

The SetLastCheckPointSkipped() and LastCheckPointIsSkipped() stuff
seems confusingly-named, because we have other reasons for skipping a
checkpoint that are not what we're talking about here. I think this is
talking about whether we've performed a checkpoint after recovery, and
the naming should reflect that. But I think there's something else
wrong with the design, too: why is this protected by a spinlock? I
have questions in both directions. On the one hand, I wonder why we
need any kind of lock at all. On the other hand, if we do need a lock,
I wonder why a spinlock that protects only the setting and clearing of
the flag and nothing else is sufficient. There are zero comments
explaining what the idea behind this locking regime is, and I can't
understand why it should be correct.

In fact, I think this area needs a broader rethink. Like, the way you
integrated that stuff into StartupXLog(), it sure looks to me like we
might skip the checkpoint but still try to write other WAL records.
Before we reach the offending segment of code, we call
UpdateFullPageWrites(). Afterwards, we call XLogReportParameters().
Both of those are going to potentially write WAL. I guess you could
argue that's OK, on the grounds that neither function is necessarily
going to log anything, but I don't think I believe that. If I make my
server read only, take the OS down, change some GUCs, and then start
it again, I don't expect it to PANIC.

Also, I doubt that it's OK to skip the checkpoint as this code does
and then go ahead and execute recovery_end_command and update the
control file anyway. It sure looks like the existing code is written
with the assumption that the checkpoint happens before those other
things. One idea I just had was: suppose that, if the system is READ
ONLY, we don't actually exit recovery right away, and the startup
process doesn't exit. Instead we just sit there and wait for the
system to be made read-write again before doing anything else. But
then if hot_standby=false, there's no way for someone to execute a
ALTER SYSTEM READ WRITE and/or pg_alter_wal_prohibit_state(), which
seems bad. So perhaps we need to let in regular connections *as if*
the system were read-write while postponing not just the
end-of-recovery checkpoint but also the other associated things

Re: Some coverage for DROP OWNED BY with pg_default_acl

2021-01-19 Thread Alvaro Herrera
On 2021-Jan-19, Michael Paquier wrote:

> And while reviewing the thing, I have spotted that there is a specific
> path for pg_default_acl in RemoveRoleFromObjectACL() that has zero
> coverage.  This can be triggered with DROP OWNED BY, and it is
> actually safe to run as long as this is done in a separate transaction
> to avoid any interactions with parallel regression sessions.
> privileges.sql already has similar tests, so I'd like to add some
> coverage as per the attached (the duplicated role name is wanted).

Heh, interesting case.  Added coverage is good, so +1.
Since the role regress_priv_user2 is "private" to the privileges.sql
script, there's no danger of a concurrent test getting the added lines
in trouble AFAICS.

> +SELECT count(*) FROM pg_shdepend
> +  WHERE deptype = 'a' AND
> +refobjid = 'regress_priv_user2'::regrole AND
> + classid = 'pg_default_acl'::regclass;
> + count 
> +---
> + 5
> +(1 row)

Shrug.  Seems sufficient.

-- 
Álvaro Herrera   Valdivia, Chile
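
For anyone reading along, the sequence that reaches that path is roughly the
following (a sketch; regress_priv_user2 is the role from the quoted test,
regress_priv_user1 stands in for whichever granting role the test uses):

    ALTER DEFAULT PRIVILEGES FOR ROLE regress_priv_user1
        GRANT SELECT ON TABLES TO regress_priv_user2;
    -- DROP OWNED BY strips regress_priv_user2 out of the default ACLs,
    -- which is the RemoveRoleFromObjectACL() path for pg_default_acl
    DROP OWNED BY regress_priv_user2;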




Re: [PATCH 1/1] Fix detection of pwritev support for OSX.

2021-01-19 Thread Tom Lane
James Hilliard  writes:
> On Tue, Jan 19, 2021 at 10:17 AM Tom Lane  wrote:
>> Ah, got it.  So "xcrun --show-sdk-path" tells us the right thing (that
>> is, it *does* give us a symlink to a 10.15 SDK) but by refusing to
>> believe we've got the right thing, we end up picking MacOSX11.1.sdk.
>> Drat.  I suppose we could drop the heuristic about wanting a version
>> number in the SDK path, but I really don't want to do that.  Now I'm
>> thinking about trying to dereference the symlink after the first step.

> The MacOSX11.1.sdk can build for a 10.15 target just fine when passed
> an appropriate MACOSX_DEPLOYMENT_TARGET, so that SDK should be
> fine.

But our out-of-the-box default should be to build for the current
platform; we don't want users to have to set MACOSX_DEPLOYMENT_TARGET
for that case.  Besides, the problem we're having is exactly that Apple's
definition of "builds for a 10.15 target just fine" is different from
ours.  They think you should use a run-time test not a compile-time test
to discover whether preadv is available, and we don't want to do that.

In almost all of the cases I've seen so far, Apple's compiler actually
does default to using an SDK matching the platform.  The problem we
have is that we try to name the SDK explicitly, and the current
method is failing to pick the right one in your case.  There are
several reasons for using an explicit -isysroot rather than just
letting the compiler default:

* We have seen cases in which the compiler acts as though it has
*no* default sysroot, and we have to help it out.

* The explicit root reduces version-skew build hazards for extensions
that are not built at the same time as the core system.

* There are a few tests in configure itself that need to know the
sysroot path to check for files there.

Anyway, the behavior you're seeing shows that 4823621db is still a
bit shy of a load.  I'm thinking about the attached as a further
fix --- can you verify it helps for you?

regards, tom lane

diff --git a/src/template/darwin b/src/template/darwin
index 1868c147cb..e14d53b601 100644
--- a/src/template/darwin
+++ b/src/template/darwin
@@ -7,13 +7,20 @@
 if test x"$PG_SYSROOT" = x"" ; then
   # This is far more complicated than it ought to be.  We first ask
   # "xcrun --show-sdk-path", which seems to match the default -isysroot
-  # setting of Apple's compilers.  However, that may produce no result or
-  # a result that is not version-specific (i.e., just ".../SDKs/MacOSX.sdk").
-  # Using a version-specific sysroot seems desirable, so if there are not
-  # digits in the directory name, try "xcrun --sdk macosx --show-sdk-path";
-  # and if that still doesn't work, fall back to asking xcodebuild,
-  # which is often a good deal slower.
+  # setting of Apple's compilers.
   PG_SYSROOT=`xcrun --show-sdk-path 2>/dev/null`
+  # That may fail, or produce a result that is not version-specific (i.e.,
+  # just ".../SDKs/MacOSX.sdk").  Using a version-specific sysroot seems
+  # desirable, so if the path is a non-version-specific symlink, expand it.
+  if test -L "$PG_SYSROOT"; then
+if expr x"$PG_SYSROOT" : '.*[0-9]\.[0-9][^/]*$' >/dev/null ; then : okay
+else
+  PG_SYSROOT=`expr "$PG_SYSROOT" : '\(.*\)/'`/`readlink "$PG_SYSROOT"`
+fi
+  fi
+  # If there are still not digits in the directory name, try
+  # "xcrun --sdk macosx --show-sdk-path"; and if that still doesn't work,
+  # fall back to asking xcodebuild, which is often a good deal slower.
   if expr x"$PG_SYSROOT" : '.*[0-9]\.[0-9][^/]*$' >/dev/null ; then : okay
   else
 PG_SYSROOT=`xcrun --sdk macosx --show-sdk-path 2>/dev/null`


compression libraries and CF bot

2021-01-19 Thread Justin Pryzby
Do you know if the old travis build environment had liblz4 installed ?

I'm asking regarding Dilip's patch, which was getting to "check world" 2 weeks
ago but now failing to even compile, apparently not due to any change in the
patch.  Also, are the historic logs available somewhere ?
http://cfbot.cputube.org/dilip-kumar.html

Also, what's the process for having new libraries installed in the CI
environment ?

There are 3 compression patches going around, so I think eventually we'll ask to
get libzstd-devel (for libpq and pg_dump) and liblz4-devel (for toast and
libpq).  Maybe all compression methods would be supported in each place - I
hope the patches will share common code.

-- 
Justin




Re: Printing backtrace of postgres processes

2021-01-19 Thread Robert Haas
On Tue, Jan 19, 2021 at 12:50 PM Tom Lane  wrote:
> The thing that is scaring me the most is the "broadcast" aspect.
> For starters, I think that that is going to be useless in the
> field because of the likelihood that different backends' stack
> traces will get interleaved in whatever log file the traces are
> going to.  Also, having hundreds of processes spitting dozens of
> lines to the same place at the same time is going to expose any
> little weaknesses in your logging arrangements, such as rsyslog's
> tendency to drop messages under load.

+1. I don't think broadcast is a good idea.

> I think it's got security hazards as well.  If we restricted the
> feature to cause a trace of only one process at a time, and required
> that process to be logged in as the same database user as the
> requestor (or at least someone with the privs of that user, such
> as a superuser), that'd be less of an issue.

I am not sure I see a security hazard here. I think the checks for
this should have the same structure as pg_terminate_backend() and
pg_cancel_backend(); whatever is required there should be required
here, too, but not more, unless we have a real clear reason for such
an inconsistency.

> Beyond that, well, maybe it's all right.  In theory anyplace that
> there's a CHECK_FOR_INTERRUPTS should be okay to call elog from;
> but it's completely untested whether we can do that and then
> continue, as opposed to aborting the transaction or whole session.

I guess that's a theoretical risk but it doesn't seem very likely.
And, if we do have such a problem, I think that'd probably be a case
of bad code that we would want to fix either way.

> I share your estimate that there'll be small-fraction-of-a-percent
> hazards that could still add up to dangerous instability if people
> go wild with this.

Right. I was more concerned about whether we could, for example, crash
while inside the function that generates the backtrace, on some
platforms or in some scenarios. That would be super-sad. I assume that
the people who wrote the code tried to make sure that didn't happen
but I don't really know what's reasonable to expect.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: {CREATE INDEX, REINDEX} CONCURRENTLY improvements

2021-01-19 Thread Matthias van de Meent
On Mon, 18 Jan 2021, 21:25 Álvaro Herrera,  wrote:
>
> On 2021-Jan-18, Matthias van de Meent wrote:
>
> > Example:
> >
> > 1.) RI starts
> > 2.) PHASE 2: filling the index:
> > 2.1.) scanning the heap (live tuple is cached)
> > < tuple is deleted
> > < last transaction other than RI commits, only snapshot of RI exists
> > < vacuum drops the tuple, and cannot remove it from the new index
> > because this new index is not yet populated.
> > 2.2.) sorting tuples
> > 2.3.) index filled with tuples, incl. deleted tuple
> > 3.) PHASE 3: wait for transactions
> > 4.) PHASE 4: validate does not remove the tuple from the index,
> > because it is not built to do so: it will only insert new tuples.
> > Tuples that are marked for deletion are removed from the index only
> > through VACUUM (and optimistic ALL_DEAD detection).
> >
> > According to my limited knowledge of RI, it requires VACUUM to not run
> > on the table during the initial index build process (which is
> > currently guaranteed through the use of a snapshot).
>
> VACUUM cannot run concurrently with CIC or RI in a table -- both acquire
> ShareUpdateExclusiveLock, which conflicts with itself, so this cannot
> occur.

Yes, you are correct. Vacuum indeed has a ShareUpdateExclusiveLock.
Are there no other ways that pages are optimistically pruned?

But the base case still stands, ignoring CIC snapshots in  would give
the semantic of all_dead to tuples that are actually still considered
alive in some context, and should not yet be deleted (you're deleting
data from an in-use snapshot). Any local pruning optimizations using
all_dead mechanics now cannot be run on the table unless they hold an
ShareUpdateExclusiveLock; though I'm unaware of any such mechanisms
(other than below).

> I do wonder if the problem you suggest (or something similar) can occur
> via HOT pruning, though.

It could not, at least not at the current HEAD, as only one tuple in a
HOT-chain can be alive at one point, and all indexes point to the root
of the HOT-chain, which is never HOT-pruned. See also the
src/backend/access/heap/README.HOT.

Regards,

Matthias van de Meent




Re: [PATCH v2 1/1] Fix detection of pwritev support for OSX.

2021-01-19 Thread Tom Lane
James Hilliard  writes:
> I also don't think this is a weak symbol.

> From the header file it does not have __attribute__((weak_import)):
> ssize_t pwritev(int, const struct iovec *, int, off_t)
> __DARWIN_NOCANCEL(pwritev) __API_AVAILABLE(macos(11.0), ios(14.0),
> watchos(7.0), tvos(14.0));

See the other thread.  I found by looking at the asm output that
what __API_AVAILABLE actually does is cause the compiler to emit
a ".weak_reference" directive when calling a function it thinks
might not be available.  So there's some sort of weak linking
going on, though it's certainly possible that it's not shaped
in a way that'd help us do this the way we'd prefer.

regards, tom lane




Re: Add primary keys to system catalogs

2021-01-19 Thread Joel Jacobson
On Tue, Jan 19, 2021, at 18:25, Mark Rofail wrote:
>Dear Joel,
>
>I would love to start working on this again, I tried to revive the patch again 
>in 2019, however, I faced the same problems as last time (detailed in the 
>thread) and I didn't get the support I needed to continue with this patch.
>
>Are you willing to help rebase and finally publish this patch?

Willing, sure, though perhaps not able;
if there are complex problems that need a deep understanding of the code base,
I'm not sure I will be of much help, but I can for sure do testing and 
reviewing.

/Joel

Re: create table like: ACCESS METHOD

2021-01-19 Thread Justin Pryzby
On Wed, Dec 30, 2020 at 12:33:56PM +, Simon Riggs wrote:
> There are no tests for the new functionality, please could you add some?

Did you look at the most recent patch?

+CREATE ACCESS METHOD heapdup TYPE TABLE HANDLER heap_tableam_handler;
+CREATE TABLE likeam() USING heapdup;
+CREATE TABLE likeamlike(LIKE likeam INCLUDING ALL);

   

Also, I just realized that Dilip's toast compression patch adds "INCLUDING
COMPRESSION", which is stored in pg_am.  That's an implementation detail of
that patch, but it's not intuitive that "including access method" wouldn't
include the compression stored there.  So I think this should use "INCLUDING
TABLE ACCESS METHOD" not just ACCESS METHOD.  

-- 
Justin
>From f27fd6291aa10af1ca0be4bc72a656811c8e0c9f Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Sun, 15 Nov 2020 16:54:53 -0600
Subject: [PATCH v3] create table (like .. including ACCESS METHOD)

---
 doc/src/sgml/ref/create_table.sgml  | 12 +++-
 src/backend/parser/gram.y   |  1 +
 src/backend/parser/parse_utilcmd.c  | 10 ++
 src/include/nodes/parsenodes.h  |  1 +
 src/test/regress/expected/create_table_like.out | 12 
 src/test/regress/sql/create_table_like.sql  |  8 
 6 files changed, 43 insertions(+), 1 deletion(-)

diff --git a/doc/src/sgml/ref/create_table.sgml b/doc/src/sgml/ref/create_table.sgml
index 569f4c9da7..e3c607f6b1 100644
--- a/doc/src/sgml/ref/create_table.sgml
+++ b/doc/src/sgml/ref/create_table.sgml
@@ -87,7 +87,7 @@ class="parameter">referential_action ] [ ON UPDATE and like_option is:
 
-{ INCLUDING | EXCLUDING } { COMMENTS | CONSTRAINTS | DEFAULTS | GENERATED | IDENTITY | INDEXES | STATISTICS | STORAGE | ALL }
+{ INCLUDING | EXCLUDING } { COMMENTS | CONSTRAINTS | DEFAULTS | GENERATED | IDENTITY | INDEXES | STATISTICS | STORAGE | TABLE ACCESS METHOD | ALL }
 
 and partition_bound_spec is:
 
@@ -689,6 +689,16 @@ WITH ( MODULUS numeric_literal, REM
 

 
+   
+INCLUDING TABLE ACCESS METHOD
+
+ 
+  The table's access method will be copied.  By default, the
+  default_table_access_method is used.
+ 
+
+   
+

 INCLUDING ALL
 
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 31c95443a5..719ac838e3 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -3708,6 +3708,7 @@ TableLikeOption:
 | INDEXES			{ $$ = CREATE_TABLE_LIKE_INDEXES; }
 | STATISTICS		{ $$ = CREATE_TABLE_LIKE_STATISTICS; }
 | STORAGE			{ $$ = CREATE_TABLE_LIKE_STORAGE; }
+| TABLE ACCESS METHOD { $$ = CREATE_TABLE_LIKE_TABLE_ACCESS_METHOD; }
 | ALL{ $$ = CREATE_TABLE_LIKE_ALL; }
 		;
 
diff --git a/src/backend/parser/parse_utilcmd.c b/src/backend/parser/parse_utilcmd.c
index b31f3afa03..f34f42aae3 100644
--- a/src/backend/parser/parse_utilcmd.c
+++ b/src/backend/parser/parse_utilcmd.c
@@ -96,6 +96,7 @@ typedef struct
 	bool		ispartitioned;	/* true if table is partitioned */
 	PartitionBoundSpec *partbound;	/* transformed FOR VALUES */
 	bool		ofType;			/* true if statement contains OF typename */
+	char		*accessMethod;	/* table access method */
 } CreateStmtContext;
 
 /* State shared by transformCreateSchemaStmt and its subroutines */
@@ -252,6 +253,7 @@ transformCreateStmt(CreateStmt *stmt, const char *queryString)
 	cxt.ispartitioned = stmt->partspec != NULL;
 	cxt.partbound = stmt->partbound;
 	cxt.ofType = (stmt->ofTypename != NULL);
+	cxt.accessMethod = NULL;
 
 	Assert(!stmt->ofTypename || !stmt->inhRelations);	/* grammar enforces */
 
@@ -346,6 +348,9 @@ transformCreateStmt(CreateStmt *stmt, const char *queryString)
 	stmt->tableElts = cxt.columns;
 	stmt->constraints = cxt.ckconstraints;
 
+	if (cxt.accessMethod != NULL)
+		stmt->accessMethod = cxt.accessMethod;
+
 	result = lappend(cxt.blist, stmt);
 	result = list_concat(result, cxt.alist);
 	result = list_concat(result, save_alist);
@@ -1118,6 +1123,11 @@ transformTableLikeClause(CreateStmtContext *cxt, TableLikeClause *table_like_cla
 		cxt->likeclauses = lappend(cxt->likeclauses, table_like_clause);
 	}
 
+	/* ACCESS METHOD doesn't apply and isn't copied for partitioned tables */
+	if ((table_like_clause->options & CREATE_TABLE_LIKE_TABLE_ACCESS_METHOD) != 0 &&
+		!cxt->ispartitioned)
+		cxt->accessMethod = get_am_name(relation->rd_rel->relam);
+
 	/*
 	 * We may copy extended statistics if requested, since the representation
 	 * of CreateStatsStmt doesn't depend on column numbers.
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index dc2bb40926..600856c229 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -685,6 +685,7 @@ typedef enum TableLikeOption
 	CREATE_TABLE_LIKE

cfbot building docs - serving results

2021-01-19 Thread Erik Rijkers

Hi Thomas,

I am wondering if the cfbot at the moment is building the docs 
(html+pdf), for the patches that it tests.  I suppose that it does?  If 
so, what happens with the resulting (doc)files? To /dev/null?   They are 
not available as far as I can see.  Would it be feasible to make them 
available, either serving the html, or to make docs html+pdf a 
downloadable zipfile?


(it would also be useful to be able see at a glance somewhere if the 
patch contains sgml-changes at all...)



Thanks,

Erik Rijkers




Re: Printing backtrace of postgres processes

2021-01-19 Thread Tom Lane
Robert Haas  writes:
> On Tue, Jan 19, 2021 at 12:50 PM Tom Lane  wrote:
>> I think it's got security hazards as well.  If we restricted the
>> feature to cause a trace of only one process at a time, and required
>> that process to be logged in as the same database user as the
>> requestor (or at least someone with the privs of that user, such
>> as a superuser), that'd be less of an issue.

> I am not sure I see a security hazard here. I think the checks for
> this should have the same structure as pg_terminate_backend() and
> pg_cancel_backend(); whatever is required there should be required
> here, too, but not more, unless we have a real clear reason for such
> an inconsistency.

Yeah, agreed.  The "broadcast" option seems inconsistent with doing
things that way, but I don't have a problem with being able to send
a trace signal to the same processes you could terminate.

>> I share your estimate that there'll be small-fraction-of-a-percent
>> hazards that could still add up to dangerous instability if people
>> go wild with this.

> Right. I was more concerned about whether we could, for example, crash
> while inside the function that generates the backtrace, on some
> platforms or in some scenarios. That would be super-sad. I assume that
> the people who wrote the code tried to make sure that didn't happen
> but I don't really know what's reasonable to expect.

Recursion is scary, but it should (I think) not be possible if this
is driven off CHECK_FOR_INTERRUPTS.  There will certainly be none
of those in libbacktrace.

One point here is that it might be a good idea to suppress elog.c's
calls to functions in the error context stack.  As we saw in another
connection recently, allowing that to happen makes for a *very*
large increase in the footprint of code that you are expecting to
work at any random CHECK_FOR_INTERRUPTS call site.

BTW, it also looks like the patch is doing nothing to prevent the
backtrace from being sent to the connected client.  I'm not sure
what I think about whether it'd be okay from a security standpoint
to do that on the connection that requested the trace, but I sure
as heck don't want it to happen on connections that didn't.  Also,
whatever you think about security concerns, it seems like there'd be
pretty substantial reentrancy hazards if the interrupt occurs
anywhere in code dealing with normal client communication.

Maybe, given all of these things, we should forget using elog at
all and just emit the trace with fprintf(stderr).  That seems like
it would decrease the odds of trouble by about an order of magnitude.
It would make it more painful to collect the trace in some logging
setups, of course.
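
For what it's worth, here is a minimal sketch of the fprintf(stderr)-style
approach, assuming glibc/libexecinfo's backtrace() and backtrace_symbols_fd();
the helper name is made up for illustration and is not from the patch:

#include <execinfo.h>
#include <unistd.h>

/*
 * Hypothetical helper, not from the patch: dump the current stack straight
 * to stderr, bypassing elog, the error context stack, and client output.
 */
static void
dump_backtrace_to_stderr(void)
{
    void   *frames[100];
    int     nframes;

    nframes = backtrace(frames, 100);   /* capture raw return addresses */

    /*
     * backtrace_symbols_fd() writes directly to the file descriptor and
     * allocates no memory, so very little can go wrong even when this is
     * reached from an arbitrary CHECK_FOR_INTERRUPTS call site.
     */
    backtrace_symbols_fd(frames, nframes, STDERR_FILENO);
}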

regards, tom lane




Re: compression libraries and CF bot

2021-01-19 Thread Thomas Munro
On Wed, Jan 20, 2021 at 9:56 AM Justin Pryzby  wrote:
> Do you know if the old travis build environment had liblz4 installed ?

It sounds like it.

> I'm asking regarding Dilip's patch, which was getting to "check world" 2 weeks
> ago but now failing to even compile, not apparently due to any change in the
> patch.  Also, are the historic logs available somewhere ?
> http://cfbot.cputube.org/dilip-kumar.html

I can find some of them but not that one, because Travis's "branches"
page truncates well before our ~250 active branches, and that one
isn't in there.

https://travis-ci.org/github/postgresql-cfbot/postgresql/branches

> Also, what's the process for having new libraries installed in the CI
> environment ?

I have added lz4 to the FreeBSD and Ubuntu build tasks, so we'll see
if that helps at the next periodic build or when a new patch is
posted.  It's failing on Windows because there is no HAVE_LIBLZ4 in
Solution.pm, and I don't know how to install that on a Mac.  Is this
patch supposed to be adding a new required dependency, or a new
optional dependency?

In general, you could ask for changes here, or send me a pull request for eg:

https://github.com/macdice/cfbot/blob/master/cirrus/.cirrus.yml

If we eventually think the CI control file is good enough, and can get
past the various political discussions required to put CI
vendor-specific material in our tree, it'd be just a regular patch
proposal and could even be tweaked as part of a feature submission.

> There's 3 compression patches going around, so I think eventually we'll ask to
> get libzstd-devel (for libpq and pg_dump) and liblz4-devel (for toast and
> libpq).  Maybe all compression methods would be supported in each place - I
> hope the patches will share common code.

+1, nice to see modern compression coming to PostgreSQL.




Re: compression libraries and CF bot

2021-01-19 Thread Tom Lane
Thomas Munro  writes:
> On Wed, Jan 20, 2021 at 9:56 AM Justin Pryzby  wrote:
>> Also, what's the process for having new libraries installed in the CI
>> environment ?

> I have added lz4 to the FreeBSD and Ubuntu build tasks, so we'll see
> if that helps at the next periodic build or when a new patch is
> posted.  It's failing on Windows because there is no HAVE_LIBLZ4 in
> Solution.pm, and I don't know how to install that on a Mac.  Is this
> patch supposed to be adding a new required dependency, or a new
> optional dependency?

It had better be optional.

regards, tom lane




Re: cfbot building docs - serving results

2021-01-19 Thread Thomas Munro
On Wed, Jan 20, 2021 at 10:22 AM Erik Rijkers  wrote:
> I am wondering if the cfbot at the moment is building the docs
> (html+pdf), for the patches that it tests.  I suppose that it does?  If
> so, what happens with the resulting (doc)files? To /dev/null?   They are
> not available as far as I can see.  Would it be feasible to make them
> available, either serving the html, or to make docs html+pdf a
> downloadable zipfile?

It does build the docs as part of the Linux build.  I picked that
because Cirrus has more Linux horsepower available than the other
OSes, and there's no benefit to doing that on all the OSes.

That's a good idea, and I suspect it could be handled as an
"artifact", though I haven't looked into that:

https://cirrus-ci.org/guide/writing-tasks/#artifacts-instruction

It'd also be nice to (somehow) know which .html pages changed so you
could go straight to the new stuff without the intermediate step of
wondering where .sgml changes come out!

Another good use for artifacts that I used once or twice is the
ability to allow the results of the Windows build to be downloaded in
a .zip file and tested by non-developers without the build tool chain.

> (it would also be useful to be able see at a glance somewhere if the
> patch contains sgml-changes at all...)

True.  Basically you want to be able to find the diffstat output quickly.




Re: Odd, intermittent failure in contrib/pageinspect

2021-01-19 Thread Tom Lane
Michael Paquier  writes:
> On Mon, Jan 18, 2021 at 05:47:40PM -0500, Tom Lane wrote:
>> So, do we have any other tests that are invoking a manual vacuum
>> and assuming it won't skip any pages?  By this theory, they'd
>> all be failures waiting to happen.

> check_heap.sql and heap_surgery.sql have one VACUUM FREEZE each and it
> seems to me that we had better be sure that no pages are skipped for
> their cases?

It looks to me like heap_surgery ought to be okay, because it's operating
on a temp table; if there are any page access conflicts on that, we've
got BIG trouble ;-)

Poking around, I found a few other places where it looked like a skipped
page could produce diffs in the expected output:
contrib/amcheck/t/001_verify_heapam.pl
contrib/pg_visibility/sql/pg_visibility.sql

There are lots of other vacuums of course, but they don't look like
a missed page would have any effect on the visible results, so I think
we should leave them alone.

In short I propose the attached patch, which also gets rid of
that duplicate query.

regards, tom lane

diff --git a/contrib/amcheck/expected/check_heap.out b/contrib/amcheck/expected/check_heap.out
index 882f853d56..1fb3823142 100644
--- a/contrib/amcheck/expected/check_heap.out
+++ b/contrib/amcheck/expected/check_heap.out
@@ -109,7 +109,7 @@ ERROR:  ending block number must be between 0 and 0
 SELECT * FROM verify_heapam(relation := 'heaptest', startblock := 1, endblock := 11000);
 ERROR:  starting block number must be between 0 and 0
 -- Vacuum freeze to change the xids encountered in subsequent tests
-VACUUM FREEZE heaptest;
+VACUUM (FREEZE, DISABLE_PAGE_SKIPPING) heaptest;
 -- Check that valid options are not rejected nor corruption reported
 -- for a non-empty frozen table
 SELECT * FROM verify_heapam(relation := 'heaptest', skip := 'none');
diff --git a/contrib/amcheck/sql/check_heap.sql b/contrib/amcheck/sql/check_heap.sql
index c10a25f21c..298de6886a 100644
--- a/contrib/amcheck/sql/check_heap.sql
+++ b/contrib/amcheck/sql/check_heap.sql
@@ -51,7 +51,7 @@ SELECT * FROM verify_heapam(relation := 'heaptest', startblock := 0, endblock :=
 SELECT * FROM verify_heapam(relation := 'heaptest', startblock := 1, endblock := 11000);
 
 -- Vacuum freeze to change the xids encountered in subsequent tests
-VACUUM FREEZE heaptest;
+VACUUM (FREEZE, DISABLE_PAGE_SKIPPING) heaptest;
 
 -- Check that valid options are not rejected nor corruption reported
 -- for a non-empty frozen table
diff --git a/contrib/amcheck/t/001_verify_heapam.pl b/contrib/amcheck/t/001_verify_heapam.pl
index 1581e51f3c..a2f65b826d 100644
--- a/contrib/amcheck/t/001_verify_heapam.pl
+++ b/contrib/amcheck/t/001_verify_heapam.pl
@@ -46,7 +46,7 @@ detects_heap_corruption(
 # Check a corrupt table with all-frozen data
 #
 fresh_test_table('test');
-$node->safe_psql('postgres', q(VACUUM FREEZE test));
+$node->safe_psql('postgres', q(VACUUM (FREEZE, DISABLE_PAGE_SKIPPING) test));
 corrupt_first_page('test');
 detects_heap_corruption("verify_heapam('test')",
 	"all-frozen corrupted table");
diff --git a/contrib/pageinspect/expected/page.out b/contrib/pageinspect/expected/page.out
index 4cd0db8018..4da28f0a1d 100644
--- a/contrib/pageinspect/expected/page.out
+++ b/contrib/pageinspect/expected/page.out
@@ -1,7 +1,7 @@
 CREATE EXTENSION pageinspect;
 CREATE TABLE test1 (a int, b int);
 INSERT INTO test1 VALUES (16777217, 131584);
-VACUUM test1;  -- set up FSM
+VACUUM (DISABLE_PAGE_SKIPPING) test1;  -- set up FSM
 -- The page contents can vary, so just test that it can be read
 -- successfully, but don't keep the output.
 SELECT octet_length(get_raw_page('test1', 'main', 0)) AS main_0;
@@ -87,18 +87,8 @@ SELECT * FROM fsm_page_contents(get_raw_page('test1', 'fsm', 0));
 (1 row)
 
 -- If we freeze the only tuple on test1, the infomask should
--- always be the same in all test runs. we show raw flags by
--- default: HEAP_XMIN_COMMITTED and HEAP_XMIN_INVALID.
-VACUUM FREEZE test1;
-SELECT t_infomask, t_infomask2, raw_flags, combined_flags
-FROM heap_page_items(get_raw_page('test1', 0)),
- LATERAL heap_tuple_infomask_flags(t_infomask, t_infomask2);
- t_infomask | t_infomask2 | raw_flags |   combined_flags   
-+-+---+
-   2816 |   2 | {HEAP_XMIN_COMMITTED,HEAP_XMIN_INVALID,HEAP_XMAX_INVALID} | {HEAP_XMIN_FROZEN}
-(1 row)
-
--- output the decoded flag HEAP_XMIN_FROZEN instead
+-- always be the same in all test runs.
+VACUUM (FREEZE, DISABLE_PAGE_SKIPPING) test1;
 SELECT t_infomask, t_infomask2, raw_flags, combined_flags
 FROM heap_page_items(get_raw_page('test1', 0)),
  LATERAL heap_tuple_infomask_flags(t_infomask, t_infomask2);
diff --git a/contrib/pageinspect/sql/page.sql b/contrib/pageinspect/sql/page.sql
index 01844cb629..d333b763d7 100644
--- a/contrib/pageinspect/sql/page.sql
+++ b/contrib/pageinspect/sql/page.sql

Re: Add primary keys to system catalogs

2021-01-19 Thread Mark Rofail
I'll post the latest rebased patch tomorrow with a summary of the issues.
Let's continue this in the foreign key array thread, so as not to clutter
this thread.

Regards, Mark Rofail.

On Tue, Jan 19, 2021, 11:00 PM Joel Jacobson  wrote:

> On Tue, Jan 19, 2021, at 18:25, Mark Rofail wrote:
> >Dear Joel,
> >
> >I would love to start working on this again, I tried to revive the patch
> again in 2019, however, I faced the same problems as last time (detailed
> in the thread) and I didn't get the support I needed to continue with this
> patch.
> >
> >Are you willing to help rebase and finally publish this patch?
>
> Willing, sure, though perhaps not able;
> if there are complex problems that need a deep understanding of the code base,
> I'm not sure I will be of much help, but I can for sure do testing and
> reviewing.
>
> /Joel
>


Re: [PATCH 1/1] Fix detection of pwritev support for OSX.

2021-01-19 Thread James Hilliard
On Tue, Jan 19, 2021 at 1:54 PM Tom Lane  wrote:
>
> James Hilliard  writes:
> > On Tue, Jan 19, 2021 at 10:17 AM Tom Lane  wrote:
> >> Ah, got it.  So "xcrun --show-sdk-path" tells us the right thing (that
> >> is, it *does* give us a symlink to a 10.15 SDK) but by refusing to
> >> believe we've got the right thing, we end up picking MacOSX11.1.sdk.
> >> Drat.  I suppose we could drop the heuristic about wanting a version
> >> number in the SDK path, but I really don't want to do that.  Now I'm
> >> thinking about trying to dereference the symlink after the first step.
>
> > The MacOSX11.1.sdk can build for a 10.15 target just fine when passed
> > an appropriate MACOSX_DEPLOYMENT_TARGET, so that SDK should be
> > fine.
>
> But our out-of-the-box default should be to build for the current
> platform; we don't want users to have to set MACOSX_DEPLOYMENT_TARGET
> for that case.  Besides, the problem we're having is exactly that Apple's
> definition of "builds for a 10.15 target just fine" is different from
> ours.  They think you should use a run-time test not a compile-time test
> to discover whether preadv is available, and we don't want to do that.
The default for MACOSX_DEPLOYMENT_TARGET is always the current
running OS version from my understanding. So if I build with MacOSX11.1.sdk
on 10.15 with default settings the binaries will work fine because the
MACOSX_DEPLOYMENT_TARGET gets set to 10.15 automatically even
if the same SDK is capable of producing incompatible binaries if you set
MACOSX_DEPLOYMENT_TARGET to 11.0.
>
> In almost all of the cases I've seen so far, Apple's compiler actually
> does default to using an SDK matching the platform.  The problem we
> have is that we try to name the SDK explicitly, and the current
> method is failing to pick the right one in your case.  There are
> several reasons for using an explicit -isysroot rather than just
> letting the compiler default:
No, it's only the MACOSX_DEPLOYMENT_TARGET that matches the
platform, SDK can be arbitrary more or less, but it will work fine because
the autoselected MACOSX_DEPLOYMENT_TARGET will force compatibility
no matter what SDK version you use. This is always how it has worked
from what I've seen.
>
> * We have seen cases in which the compiler acts as though it has
> *no* default sysroot, and we have to help it out.
>
> * The explicit root reduces version-skew build hazards for extensions
> that are not built at the same time as the core system.
The deployment target is effectively entirely separate from SDK version,
so it really shouldn't make a difference unless the SDK is significantly
older or newer than the running version from what I can tell.
>
> * There are a few tests in configure itself that need to know the
> sysroot path to check for files there.
>
> Anyway, the behavior you're seeing shows that 4823621db is still a
> bit shy of a load.  I'm thinking about the attached as a further
> fix --- can you verify it helps for you?
Best I can tell it provides no change for me (this patch is tested on top of it)
because it does not provide any MACOSX_DEPLOYMENT_TARGET
based feature detection for pwritev at all.
>
> regards, tom lane
>




Re: New IndexAM API controlling index vacuum strategies

2021-01-19 Thread Peter Geoghegan
On Sun, Jan 17, 2021 at 9:18 PM Masahiko Sawada  wrote:
> After more thought, I think that ambulkdelete needs to be able to
> refer the answer to amvacuumstrategy. That way, the index can skip
> bulk-deletion when lazy vacuum doesn't vacuum heap and it also doesn’t
> want to do that.

Makes sense.

BTW, your patch has bitrot already. Peter E's recent pageinspect
commit happens to conflict with this patch. It might make sense to
produce a new version that just fixes the bitrot, so that other people
don't have to deal with it each time.

> I’ve attached the updated version patch that includes the following changes:

Looks good. I'll give this version a review now. I will do a lot more
soon. I need to come up with a good benchmark for this, that I can
return to again and again during review as needed.

Some feedback on the first patch:

* Just so you know: I agree with you about handling
VACOPT_TERNARY_DISABLED in the index AM's amvacuumstrategy routine. I
think that it's better to do that there, even though this choice may
have some downsides.

* Can you add some "stub" sgml doc changes for this? Doesn't have to
be complete in any way. Just a placeholder for later, that has the
correct general "shape" to orientate the reader of the patch. It can
just be a FIXME comment, plus basic mechanical stuff -- details of the
new amvacuumstrategy_function routine and its signature.

Some feedback on the second patch:

* Why do you move around IndexVacuumStrategy in the second patch?
Looks like a rebasing oversight.

* Actually, do we really need the first and second patches to be
separate patches? I agree that the nbtree patch should be a separate
patch, but dividing the first two sets of changes doesn't seem like it
adds much. Did I miss some something?

* Is the "MAXALIGN(sizeof(ItemIdData)))" change in the definition of
MaxHeapTuplesPerPage appropriate? Here is the relevant section from
the patch:

diff --git a/src/include/access/htup_details.h
b/src/include/access/htup_details.h
index 7c62852e7f..038e7cd580 100644
--- a/src/include/access/htup_details.h
+++ b/src/include/access/htup_details.h
@@ -563,17 +563,18 @@ do { \
 /*
  * MaxHeapTuplesPerPage is an upper bound on the number of tuples that can
  * fit on one heap page.  (Note that indexes could have more, because they
- * use a smaller tuple header.)  We arrive at the divisor because each tuple
- * must be maxaligned, and it must have an associated line pointer.
+ * use a smaller tuple header.)  We arrive at the divisor because each line
+ * pointer must be maxaligned.
*** SNIP ***
 #define MaxHeapTuplesPerPage\
-((int) ((BLCKSZ - SizeOfPageHeaderData) / \
-(MAXALIGN(SizeofHeapTupleHeader) + sizeof(ItemIdData
+((int) ((BLCKSZ - SizeOfPageHeaderData) / (MAXALIGN(sizeof(ItemIdData)

It's true that ItemIdData structs (line pointers) are aligned, but
they're not MAXALIGN()'d. If they were then the on-disk size of line
pointers would generally be 8 bytes, not 4 bytes.

* Maybe it would be better if you just changed the definition such
that "MAXALIGN(SizeofHeapTupleHeader)" became "MAXIMUM_ALIGNOF", with
no other changes? (Some variant of this suggestion might be better,
not sure.)

For some reason that feels a bit safer: we still have an "imaginary
tuple header", but it's just 1 MAXALIGN() quantum now. This is still
much less than the current 3 MAXALIGN() quantums (i.e. what
MaxHeapTuplesPerPage treats as the tuple header size). Do you think
that this alternative approach will be noticeably less effective
within vacuumlazy.c?
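
To put rough numbers on the difference (back-of-the-envelope only; I'm
assuming the standard 8 kB BLCKSZ, an 8-byte MAXIMUM_ALIGNOF, a 24-byte page
header, MAXALIGN(SizeofHeapTupleHeader) = 24, and sizeof(ItemIdData) = 4):

#include <stdio.h>

/* Compare the three divisors discussed above on a standard 8 kB page. */
int
main(void)
{
    const int usable = 8192 - 24;   /* space left after the page header */

    printf("master,    divisor 24 + 4:          %d\n", usable / (24 + 4));  /* 291 */
    printf("patch,     divisor MAXALIGN(4) = 8: %d\n", usable / 8);         /* 1021 */
    printf("suggested, divisor 8 + 4:           %d\n", usable / (8 + 4));   /* 680 */
    return 0;
}

So the patch as written lets MaxHeapTuplesPerPage grow to roughly 3.5x its
current value, while the MAXIMUM_ALIGNOF variant grows it to a bit over 2x.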

Note that you probably understand the issue with MaxHeapTuplesPerPage
for vacuumlazy.c better than I do currently. I'm still trying to
understand your choices, and to understand what is really important
here.

* Maybe add a #define for the value 0.7? (I refer to the value used in
choose_vacuum_strategy() to calculate a "this is the number of LP_DEAD
line pointers that we consider too many" cut off point, which is to be
applied throughout lazy_scan_heap() processing.)

* I notice that your new lazy_vacuum_table_and_indexes() function is
the only place that calls lazy_vacuum_table_and_indexes(). I think
that you should merge them together -- replace the only remaining call
to lazy_vacuum_table_and_indexes() with the body of the function
itself. Having a separate lazy_vacuum_table_and_indexes() function
doesn't seem useful to me -- it doesn't actually hide complexity, and
might even be harder to maintain.

* I suggest thinking about what the last item will mean for the
reporting that currently takes place in
lazy_vacuum_table_and_indexes(), but will now go in an expanded
lazy_vacuum_table_and_indexes() -- how do we count the total number of
index scans now?

I don't actually believe that the logic needs to change, but some kind
of consolidation and streamlining seems like it might be helpful.
Maybe just a comment that says "note that all index scans might just
be no-ops because..." -- stuff like that.

* Any idea

Re: [PATCH 1/1] Fix detection of pwritev support for OSX.

2021-01-19 Thread James Hilliard
On Tue, Jan 19, 2021 at 3:47 PM James Hilliard
 wrote:
>
> On Tue, Jan 19, 2021 at 1:54 PM Tom Lane  wrote:
> >
> > James Hilliard  writes:
> > > On Tue, Jan 19, 2021 at 10:17 AM Tom Lane  wrote:
> > >> Ah, got it.  So "xcrun --show-sdk-path" tells us the right thing (that
> > >> is, it *does* give us a symlink to a 10.15 SDK) but by refusing to
> > >> believe we've got the right thing, we end up picking MacOSX11.1.sdk.
> > >> Drat.  I suppose we could drop the heuristic about wanting a version
> > >> number in the SDK path, but I really don't want to do that.  Now I'm
> > >> thinking about trying to dereference the symlink after the first step.
> >
> > > The MacOSX11.1.sdk can build for a 10.15 target just fine when passed
> > > an appropriate MACOSX_DEPLOYMENT_TARGET, so that SDK should be
> > > fine.
> >
> > But our out-of-the-box default should be to build for the current
> > platform; we don't want users to have to set MACOSX_DEPLOYMENT_TARGET
> > for that case.  Besides, the problem we're having is exactly that Apple's
> > definition of "builds for a 10.15 target just fine" is different from
> > ours.  They think you should use a run-time test not a compile-time test
> > to discover whether preadv is available, and we don't want to do that.
> The default for MACOSX_DEPLOYMENT_TARGET is always the current
> running OS version from my understanding. So if I build with MacOSX11.1.sdk
> on 10.15 with default settings the binaries will work fine because the
> MACOSX_DEPLOYMENT_TARGET gets set to 10.15 automatically even
> if the same SDK is capable of producing incompatible binaries if you set
> MACOSX_DEPLOYMENT_TARGET to 11.0.
> >
> > In almost all of the cases I've seen so far, Apple's compiler actually
> > does default to using an SDK matching the platform.  The problem we
> > have is that we try to name the SDK explicitly, and the current
> > method is failing to pick the right one in your case.  There are
> > several reasons for using an explicit -isysroot rather than just
> > letting the compiler default:
> No, it's only the MACOSX_DEPLOYMENT_TARGET that matches the
> platform, SDK can be arbitrary more or less, but it will work fine because
> the autoselected MACOSX_DEPLOYMENT_TARGET will force compatibility
> no matter what SDK version you use. This is always how it has worked
> from what I've seen.
> >
> > * We have seen cases in which the compiler acts as though it has
> > *no* default sysroot, and we have to help it out.
> >
> > * The explicit root reduces version-skew build hazards for extensions
> > that are not built at the same time as the core system.
> The deployment target is effectively entirely separate from SDK version,
> so it really shouldn't make a difference unless the SDK is significantly
> older or newer than the running version from what I can tell.
> >
> > * There are a few tests in configure itself that need to know the
> > sysroot path to check for files there.
> >
> > Anyway, the behavior you're seeing shows that 4823621db is still a
> > bit shy of a load.  I'm thinking about the attached as a further
> > fix --- can you verify it helps for you?
> Best I can tell it provides no change for me (this patch is tested on top of
> it)
> because it does not provide any MACOSX_DEPLOYMENT_TARGET
> based feature detection for pwritev at all.
Actually, this path looks wrong in general: the value for
"xcrun --sdk macosx --show-sdk-path" should take precedence over
"xcrun --show-sdk-path", as the latter may potentially be used for iOS.
On my system "xcodebuild -version -sdk macosx Path" and
"xcrun --sdk macosx --show-sdk-path" both point to the
correct latest MacOSX11.1.sdk SDK while "xcrun --show-sdk-path"
points to the older one.
> >
> > regards, tom lane
> >




Re: Feature improvement: can we add queryId for pg_catalog.pg_stat_activity view?

2021-01-19 Thread Tatsuro Yamada

Hi Julien,


Rebase only, thanks to the cfbot!  V16 attached.


I tested the v16 patch on a0efda88a using "make installcheck-parallel", and
the result is as follows. The attached file is regression.diffs.


 1 of 202 tests failed.


The differences that caused some tests to fail can be viewed in the
file "/home/postgres/PG140/src/test/regress/regression.diffs".  A copy of the 
test summary that you see
above is saved in the file 
"/home/postgres/PG140/src/test/regress/regression.out".


src/test/regress/regression.diffs
-
diff -U3 /home/postgres/PG140/src/test/regress/expected/rules.out 
/home/postgres/PG140/src/test/regress/results/rules.out
--- /home/postgres/PG140/src/test/regress/expected/rules.out2021-01-20 
08:41:16.383175559 +0900
+++ /home/postgres/PG140/src/test/regress/results/rules.out 2021-01-20 
08:43:46.589171774 +0900
@@ -1760,10 +1760,9 @@
 s.state,
 s.backend_xid,
 s.backend_xmin,
-s.queryid,
 s.query,
 s.backend_type
-   FROM ((pg_stat_get_activity(NULL::integer) s(datid, pid, usesysid, 
application_name, state, query, wait_event_type, wait_event, xact_start, 
query_start, backend_start, state_change, client_addr, client_hostname, 
client_port, backend_xid, backend_xmin, backend_type, ssl, sslversion, 
sslcipher, sslbits, sslcompression, ssl_client_dn, ssl_client_serial, 
ssl_issuer_dn, gss_auth, gss_princ, gss_enc, leader_pid, queryid)
+   FROM ((pg_stat_get_activity(NULL::integer) s(datid, pid, usesysid, 
application_name, state, query, wait_event_type, wait_event, xact_start, 
query_start, backend_start, state_change, client_addr, client_hostname, 
client_port, backend_xid, backend_xmin, backend_type, ssl, sslversion, 
sslcipher, sslbits, sslcompression, ssl_client_dn, ssl_client_serial, 
ssl_issuer_dn, gss_auth, gss_princ, gss_enc, leader_pid)
...

Thanks,
Tatsuro Yamada
diff -U3 /home/postgres/PG140/src/test/regress/expected/rules.out 
/home/postgres/PG140/src/test/regress/results/rules.out
--- /home/postgres/PG140/src/test/regress/expected/rules.out2021-01-20 
08:41:16.383175559 +0900
+++ /home/postgres/PG140/src/test/regress/results/rules.out 2021-01-20 
08:52:10.891159065 +0900
@@ -1760,10 +1760,9 @@
 s.state,
 s.backend_xid,
 s.backend_xmin,
-s.queryid,
 s.query,
 s.backend_type
-   FROM ((pg_stat_get_activity(NULL::integer) s(datid, pid, usesysid, 
application_name, state, query, wait_event_type, wait_event, xact_start, 
query_start, backend_start, state_change, client_addr, client_hostname, 
client_port, backend_xid, backend_xmin, backend_type, ssl, sslversion, 
sslcipher, sslbits, sslcompression, ssl_client_dn, ssl_client_serial, 
ssl_issuer_dn, gss_auth, gss_princ, gss_enc, leader_pid, queryid)
+   FROM ((pg_stat_get_activity(NULL::integer) s(datid, pid, usesysid, 
application_name, state, query, wait_event_type, wait_event, xact_start, 
query_start, backend_start, state_change, client_addr, client_hostname, 
client_port, backend_xid, backend_xmin, backend_type, ssl, sslversion, 
sslcipher, sslbits, sslcompression, ssl_client_dn, ssl_client_serial, 
ssl_issuer_dn, gss_auth, gss_princ, gss_enc, leader_pid)
  LEFT JOIN pg_database d ON ((s.datid = d.oid)))
  LEFT JOIN pg_authid u ON ((s.usesysid = u.oid)));
 pg_stat_all_indexes| SELECT c.oid AS relid,
@@ -1875,7 +1874,7 @@
 s.gss_auth AS gss_authenticated,
 s.gss_princ AS principal,
 s.gss_enc AS encrypted
-   FROM pg_stat_get_activity(NULL::integer) s(datid, pid, usesysid, 
application_name, state, query, wait_event_type, wait_event, xact_start, 
query_start, backend_start, state_change, client_addr, client_hostname, 
client_port, backend_xid, backend_xmin, backend_type, ssl, sslversion, 
sslcipher, sslbits, sslcompression, ssl_client_dn, ssl_client_serial, 
ssl_issuer_dn, gss_auth, gss_princ, gss_enc, leader_pid, queryid)
+   FROM pg_stat_get_activity(NULL::integer) s(datid, pid, usesysid, 
application_name, state, query, wait_event_type, wait_event, xact_start, 
query_start, backend_start, state_change, client_addr, client_hostname, 
client_port, backend_xid, backend_xmin, backend_type, ssl, sslversion, 
sslcipher, sslbits, sslcompression, ssl_client_dn, ssl_client_serial, 
ssl_issuer_dn, gss_auth, gss_princ, gss_enc, leader_pid)
   WHERE (s.client_port IS NOT NULL);
 pg_stat_progress_analyze| SELECT s.pid,
 s.datid,
@@ -2032,7 +2031,7 @@
 w.sync_priority,
 w.sync_state,
 w.reply_time
-   FROM ((pg_stat_get_activity(NULL::integer) s(datid, pid, usesysid, 
application_name, state, query, wait_event_type, wait_event, xact_start, 
query_start, backend_start, state_change, client_addr, client_hostname, 
client_port, backend_xid, backend_xmin, backend_type, ssl, sslversion, 
sslcipher, sslbits, sslcompression, ssl_client_dn, ssl_client_serial, 
ssl_issuer_dn, gss_auth, gss_princ, gss_enc, leader_pid, queryid)
+   FROM ((pg_stat_get_activity(NULL::integer)

Re: WIP: BRIN multi-range indexes

2021-01-19 Thread Tomas Vondra

On 1/19/21 9:44 PM, John Naylor wrote:
On Tue, Jan 12, 2021 at 1:42 PM Tomas Vondra <tomas.von...@enterprisedb.com> wrote:

 > I suspect it's due to minmax having to decide which "ranges" to merge,
 > which requires repeated sorting, etc. I certainly don't dare to claim
 > the current algorithm is perfect. I wouldn't have expected such big
 > difference, though - so definitely worth investigating.

It seems that monotonically increasing (or decreasing) values in a table 
are a worst case scenario for multi-minmax indexes, or basically, unique 
values within a range. I'm guessing it's because it requires many passes 
to fit all the values into a limited number of ranges. I tried using 
smaller pages_per_range numbers, 32 and 8, and that didn't help.


Now, with a different data distribution, using only 10 values that 
repeat over and over, the results are much more sympathetic to multi-minmax:


insert into iot (num, create_dt)
select random(), '2020-01-01 0:00'::timestamptz + (x % 10 || ' seconds')::interval
from generate_series(1,5*365*24*60*60) x;

create index cd_single on iot using brin(create_dt);
27.2s

create index cd_multi on iot using brin(create_dt timestamptz_minmax_multi_ops);
30.4s

create index cd_bt on iot using btree(create_dt);
61.8s

Circling back to the monotonic case, I tried running a simple perf 
record on a backend creating a multi-minmax index on a timestamptz 
column and these were the highest non-kernel calls:
+   21.98%    21.91%  postgres         postgres            [.] FunctionCall2Coll
+    9.31%     9.29%  postgres         postgres            [.] compare_combine_ranges
+    8.60%     8.58%  postgres         postgres            [.] qsort_arg
+    5.68%     5.66%  postgres         postgres            [.] brin_minmax_multi_add_value
+    5.63%     5.60%  postgres         postgres            [.] timestamp_lt
+    4.73%     4.71%  postgres         postgres            [.] reduce_combine_ranges
+    3.80%     0.00%  postgres         [unknown]           [.] 0x032001680004
+    3.51%     3.50%  postgres         postgres            [.] timestamp_eq

There's no one place that's pathological enough to explain the 4x 
slowness over traditional BRIN and nearly 3x slowness over btree when 
using a large number of unique values per range, so making progress here 
would have to involve a more holistic approach.




Yeah. This very much seems like the primary problem is in how we build
the ranges incrementally - with monotonic sequences, we end up having to
merge the ranges over and over again. I don't know what the structure of
the table was, but I guess it was kinda narrow (very few columns), which
exacerbates the problem further, because the number of rows per range
will be way higher than in the real world.


I do think the solution to this might be to allow more values during 
batch index creation, and only "compress" to the requested number at the 
very end (when serializing to on-disk format).


There are a couple of additional comments about possibly replacing the
sequential scan with a binary search, which could help a bit too.



regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: Add table access method as an option to pgbench

2021-01-19 Thread David Zhang

On 2021-01-15 1:22 p.m., Andres Freund wrote:


Hi,

On 2020-11-25 12:41:25 +0900, Michael Paquier wrote:

On Tue, Nov 24, 2020 at 03:32:38PM -0800, David Zhang wrote:

But, providing another option for the end user may not be a bad idea, and it
might make the tests easier at some points.

My first thought is that we have no need to complicate pgbench with
this option because there is a GUC able to do that, but we do that for
tablespaces, so...  No objections from here.

I think that objection is right. All that's needed to change this from
the client side is to do something like
PGOPTIONS='-c default_table_access_method=foo' pgbench ...
Yeah, this is a better solution for me too. Thanks a lot for all the 
feedback.

I don't think adding pgbench options for individual GUCs really is a
useful exercise?

Greetings,

Andres Freund

--
David

Software Engineer
Highgo Software Inc. (Canada)
www.highgo.ca




Re: Support for NSS as a libpq TLS backend

2021-01-19 Thread Jacob Champion
On Tue, 2021-01-19 at 21:21 +0100, Daniel Gustafsson wrote:
> There is something iffy with these certs (the test fails
> on mismatching ciphers and/or signature algorithms) that I haven't been able 
> to
> pin down, but to get more eyes on this I'm posting the patch with the test
> enabled.

Removing `--keyUsage keyEncipherment` from the native_server-* CSR
generation seems to let the tests pass for me, but I'm wary of just
pushing that as a solution because I don't understand why that would
have anything to do with the failure mode
(SSL_ERROR_NO_SUPPORTED_SIGNATURE_ALGORITHM).

> The NSS toolchain requires interactive input which makes the Makefile
> a bit hacky, ideas on cleaning that up are appreciated.

Hm. I got nothing, short of a feature request to NSS...

--Jacob


Re: New IndexAM API controlling index vacuum strategies

2021-01-19 Thread Peter Geoghegan
On Tue, Jan 19, 2021 at 2:57 PM Peter Geoghegan  wrote:
> * Maybe it would be better if you just changed the definition such
> that "MAXALIGN(SizeofHeapTupleHeader)" became "MAXIMUM_ALIGNOF", with
> no other changes? (Some variant of this suggestion might be better,
> not sure.)
>
> For some reason that feels a bit safer: we still have an "imaginary
> tuple header", but it's just 1 MAXALIGN() quantum now. This is still
> much less than the current 3 MAXALIGN() quantums (i.e. what
> MaxHeapTuplesPerPage treats as the tuple header size). Do you think
> that this alternative approach will be noticeably less effective
> within vacuumlazy.c?

BTW, I think that increasing MaxHeapTuplesPerPage will make it
necessary to consider tidbitmap.c. Comments at the top of that file
say that it is assumed that MaxHeapTuplesPerPage is about 256. So
there is a risk of introducing performance regressions affecting
bitmap scans here.

Apparently some other DB systems make the equivalent of
MaxHeapTuplesPerPage dynamically configurable at the level of heap
tables. It usually doesn't matter, but it can matter with on-disk
bitmap indexes, where the bitmap must be encoded from raw TIDs (this
must happen before the bitmap is compressed -- there must be a simple
mapping from every possible TID to some bit in a bitmap first). The
item offset component of each heap TID is not usually very large, so
there is a trade-off between keeping the representation of bitmaps
efficient and not unduly restricting the number of distinct heap
tuples on each heap page. I think that there might be a similar
consideration here, in tidbitmap.c (even though it's not concerned
about on-disk bitmaps).
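
To illustrate the kind of mapping being described (a generic sketch, not how
tidbitmap.c or any particular on-disk bitmap implementation actually lays
things out):

#include <stdint.h>

/*
 * Dense TID-to-bit encoding: every possible (block, offset) pair gets its
 * own bit, so the number of bits reserved per heap page is exactly
 * max_tuples_per_page -- which is why raising that constant directly
 * inflates the raw (pre-compression) bitmap.
 */
static inline uint64_t
tid_to_bitno(uint32_t blockno, uint16_t offsetnum, uint32_t max_tuples_per_page)
{
    /* offset numbers are 1-based, bit numbers 0-based */
    return (uint64_t) blockno * max_tuples_per_page + (offsetnum - 1);
}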

-- 
Peter Geoghegan




Re: New IndexAM API controlling index vacuum strategies

2021-01-19 Thread Peter Geoghegan
On Tue, Jan 19, 2021 at 4:45 PM Peter Geoghegan  wrote:
> BTW, I think that increasing MaxHeapTuplesPerPage will make it
> necessary to consider tidbitmap.c. Comments at the top of that file
> say that it is assumed that MaxHeapTuplesPerPage is about 256. So
> there is a risk of introducing performance regressions affecting
> bitmap scans here.

More concretely, WORDS_PER_PAGE increases from 5 on the master branch
to 16 with the latest version of the patch series on most platforms
(while WORDS_PER_CHUNK is 4 with or without the patches).
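
For anyone following along, that arithmetic works out roughly like this,
assuming 64-bit bitmapwords and the default 8 kB BLCKSZ (the real
definitions are in tidbitmap.c):

#include <stdio.h>

#define BITS_PER_BITMAPWORD 64
#define WORDNUM(x) ((x) / BITS_PER_BITMAPWORD)

int
main(void)
{
    /* master: MaxHeapTuplesPerPage = (8192 - 24) / (24 + 4) = 291 */
    printf("WORDS_PER_PAGE (master):  %d\n", WORDNUM(291 - 1) + 1);   /* 5 */
    /* patched: MaxHeapTuplesPerPage = (8192 - 24) / 8 = 1021 */
    printf("WORDS_PER_PAGE (patched): %d\n", WORDNUM(1021 - 1) + 1);  /* 16 */
    /* PAGES_PER_CHUNK = BLCKSZ / 32 = 256 */
    printf("WORDS_PER_CHUNK:          %d\n", WORDNUM(256 - 1) + 1);   /* 4 */
    return 0;
}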

-- 
Peter Geoghegan




RE: Add Nullif case for eval_const_expressions_mutator

2021-01-19 Thread Hou, Zhijie
Hi

Thanks for the review.

> It's a bit unfortunate now that between OpExpr, DistinctExpr, NullIfExpr,
> and to a lesser extent ScalarArrayOpExpr we will now have several different
> implementations of nearly the same thing, without any explanation why one
> approach was chosen here and another there.  We should at least document
> this.

I am not quite sure where to document the difference.
For now, I tried to add some comments for the NullIf case to explain why this
one is different.

+   /*
+    * Since NullIf is not common enough to deserve specially
+    * optimized code, use ece_generic_processing and
+    * ece_evaluate_expr to simplify the code as much as possible.
+    */

Any suggestions ?

> Some inconsistencies I found: The code for DistinctExpr calls
> expression_tree_mutator() directly, but your code for NullIfExpr calls
> ece_generic_processing(), even though the explanation in the comment for
> DistinctExpr would apply there as well.
> 
> Your code for NullIfExpr doesn't appear to call set_opfuncid() anywhere.

IMO, we will call set_opfuncid in function ece_evaluate_expr.

Like the following flow:
ece_evaluate_expr --> evaluate_expr --> fix_opfuncids --> fix_opfuncids_walker --> set_opfuncid

And we do not need the opfuncid till we call ece_evaluate_expr.
So, to simplify the code as much as possible, I did not call set_opfuncid in 
eval_const_expressions_mutator again.


> I would move your new block for NullIfExpr after the block for DistinctExpr.
> That's the order in which these blocks appear elsewhere for generic node
> processing.
> 

Changed.


> Check your whitespace usage:
> 
>  if(!has_nonconst_input)
> 
> should have a space after the "if".  (It's easy to fix of course, but I
> figure I'd point it out here since you have submitted several patches with
> this style, so it's perhaps a habit to break.)

Changed.


> Perhaps add a comment to the tests like this so it's clear what they are
> for:
> 
> diff --git a/src/test/regress/sql/case.sql
> b/src/test/regress/sql/case.sql index 4742e1d0e0..98e3fb8de5 100644
> --- a/src/test/regress/sql/case.sql
> +++ b/src/test/regress/sql/case.sql
> @@ -137,6 +137,7 @@ CREATE TABLE CASE2_TBL (
> FROM CASE_TBL a, CASE2_TBL b
> WHERE COALESCE(f,b.i) = 2;
> 
> +-- Tests for constant subexpression simplification
>   explain (costs off)
>   SELECT * FROM CASE_TBL WHERE NULLIF(1, 2) = 2;

Added.

Attaching the v3 patch; please consider it for further review.

Best regards,
houzj





v3_0001-add-nullif-case-for-eval_const_expressions.patch
Description: v3_0001-add-nullif-case-for-eval_const_expressions.patch


RE: Parallel INSERT (INTO ... SELECT ...)

2021-01-19 Thread tsunakawa.ta...@fujitsu.com
From: Tang, Haiying 
> Execute EXPLAIN on Patched:
>  Insert on public.test_part  (cost=0.00..15.00 rows=0 width=0) (actual
> time=44.139..44.140 rows=0 loops=1)
>Buffers: shared hit=1005 read=1000 dirtied=3000 written=2000
>->  Seq Scan on public.test_data1  (cost=0.00..15.00 rows=1000
> width=8) (actual time=0.007..0.201 rows=1000 loops=1)
>  Output: test_data1.a, test_data1.b
>  Buffers: shared hit=5

> Execute EXPLAIN on non-Patched:
>  Insert on public.test_part  (cost=0.00..15.00 rows=0 width=0) (actual
> time=72.656..72.657 rows=0 loops=1)
>Buffers: shared hit=22075 read=1000 dirtied=3000 written=2000
>->  Seq Scan on public.test_data1  (cost=0.00..15.00 rows=1000
> width=8) (actual time=0.010..0.175 rows=1000 loops=1)
>  Output: test_data1.a, test_data1.b
>  Buffers: shared hit=5

I don't know if this is related to this issue, but I felt "shared hit=5" for 
Seq Scan is strange.  This test case reads 1,000 rows from 1,000 partitions, 
one row per partition, so I thought the shared hit should be 1,000 in Seq Scan. 
 I wonder if the 1,000 is included in Insert node?


Regards
Takayuki Tsunakawa


