Re: Role Self-Administration

2021-10-08 Thread Mark Dilger



> On Oct 7, 2021, at 7:44 PM, Stephen Frost  wrote:
> 
> I don't actually think REVOKE ROLE CASCADE must not fail, nor do I see
> that as explicit in anything you quote above.

I don't see that myself, but I thought that you would, given your other 
statements about how we shouldn't take a spec requirement to do X and turn it 
into doing X+Y, because the user wouldn't be expecting Y.  So I thought that if 
DROP ROLE bob was defined in the spec to basically just do REVOKE bob FROM 
EVERYBODY, and if the CASCADE version of that wasn't supposed to fail, then 
you'd say that DROP ROLE bob CASCADE wasn't supposed to fail either.  (Failing 
is the unexpected action Y that I expected your rule to prohibit.)
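
For reference, the equivalence under discussion can be sketched with
hypothetical role names (PostgreSQL's REVOKE of a role membership does accept
a CASCADE clause):

REVOKE bob FROM alice, carol CASCADE;  -- the building block DROP ROLE is read as being defined by
DROP ROLE bob;                         -- the statement whose CASCADE form is at issue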

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company







Re: should we allow users with a predefined role to access pg_backend_memory_contexts view and pg_log_backend_memory_contexts function?

2021-10-08 Thread Bharath Rupireddy
On Sat, Oct 9, 2021 at 12:45 AM Stephen Frost  wrote:
>
> Greetings,
>
> * Bossart, Nathan (bossa...@amazon.com) wrote:
> > On 10/8/21, 12:01 AM, "Bharath Rupireddy" 
> >  wrote:
> > > I think we can remove the below revoke statements from
> > > system_views.sql and place the checks shown at (2) in the underlying
> > > functions pg_get_shmem_allocations, pg_get_backend_memory_contexts,
> > > also in pg_log_backend_memory_contexts.
> > >
> > > REVOKE ALL ON pg_shmem_allocations FROM PUBLIC;
> > > REVOKE EXECUTE ON FUNCTION pg_get_shmem_allocations() FROM PUBLIC;
> > > REVOKE ALL ON pg_backend_memory_contexts FROM PUBLIC;
> > > REVOKE EXECUTE ON FUNCTION pg_get_backend_memory_contexts() FROM PUBLIC;
> > >
> > > Thoughts?
> >
> > This approach would add a restriction that a role must have SUPERUSER
> > or be a member of pg_monitor to use the views/functions.  I think
> > there is value in allowing any role to use them (if granted the proper
> > privileges).  In any case, users may already depend on being able to
> > do that.
> >
> > Instead, I think we should just grant privileges to pg_monitor.  I've
> > attached a (basically untested) patch to demonstrate what I'm
> > thinking.
>
> I'm not necessarily against this, but I will point out that we've stayed
> away, so far, from explicitly GRANT'ing privileges to pg_monitor itself,
> intending that to be a role which just combines privileges of certain
> other predefined roles together.
>
> I would think that these would fall under "pg_read_all_stats", in
> particular, which is explicitly documented as: Read all pg_stat_* views
> and use various statistics related extensions, even those normally
> visible only to superusers.
>
> (the last bit being particularly relevant in this case)

+1. I will prepare the patch with the pg_read_all_stats role.
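
For illustration only, a sketch of what such grants could look like -- the
actual patch decides the exact object list:

GRANT SELECT ON pg_backend_memory_contexts TO pg_read_all_stats;
GRANT EXECUTE ON FUNCTION pg_get_backend_memory_contexts() TO pg_read_all_stats;
GRANT SELECT ON pg_shmem_allocations TO pg_read_all_stats;
GRANT EXECUTE ON FUNCTION pg_get_shmem_allocations() TO pg_read_all_stats;
GRANT EXECUTE ON FUNCTION pg_log_backend_memory_contexts(integer) TO pg_read_all_stats;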

Regards,
Bharath Rupireddy.




Re: Running tests under valgrind is getting slower at an alarming pace

2021-10-08 Thread Andres Freund
On 2021-10-08 15:41:09 -0400, Andrew Dunstan wrote:
> see
> 

Thanks!




Re: Running tests under valgrind is getting slower at an alarming pace

2021-10-08 Thread Andrew Dunstan


On 10/6/21 10:03 PM, Andres Freund wrote:
> Hi,
>
> On 2021-10-06 09:47:54 -0700, Andres Freund wrote:
>> I'll also try to figure out how to print a bit more detail about timing for
>> each tap test; looks like I need to figure out how to pass
>> PROVE_TEST='--timer' through the buildfarm. Shouldn't be too hard.
> Turns out that the buildfarm already adds --timer. I added -j4 to allow for
> some concurrency in tap tests, but unfortunately my animals fell over after
> that (thanks Michael for noticing).
>
> Looks like the buildfarm client code isn't careful enough quoting PROVE_FLAGS?
>
> my $pflags = "PROVE_FLAGS=--timer";
> if (exists $ENV{PROVE_FLAGS})
> {
> $pflags =
>   $ENV{PROVE_FLAGS}
>   ? "PROVE_FLAGS=$ENV{PROVE_FLAGS}"
>   : "";
> }
>
> @makeout =
>   run_log("cd $dir && $make NO_LOCALE=1 $pflags $instflags 
> $taptarget");
>
> Which doesn't work if pflags ends up as '-j4 --timer' or such...


see



cheers


andrew


-- 

Andrew Dunstan
EDB: https://www.enterprisedb.com





Re: Time to upgrade buildfarm coverage for some EOL'd OSes?

2021-10-08 Thread Thomas Munro
On Sat, Oct 9, 2021 at 5:55 AM Mikael Kjellström
 wrote:
> On 2021-10-08 18:12, Tom Lane wrote:
> > =?UTF-8?Q?Mikael_Kjellstr=c3=b6m?=  writes:
> >> Yes, it's getting long in the tooth.  I will upgrade the NetBSD 7
> >> (sidewinder) to 9.2.
> >> And I will also upgrade loach to 12.x if that's the version that is
> >> needed the most.
> >> I will upgrade conchuela to DragonFlyBSD 6.0.
> >> I will remove the 5.9 (curculio) and upgrade the 6.5 (morepork) to 6.9.
> >
> > +1 to all of that except retiring curculio.  That one has shown us issues
> > we've not seen on other animals (recent examples at [1][2]), so unless
> > you're feeling resource-constrained I'd vote for keeping it going.
>
> Sure I can keep curculio as is.  Will just upgrade morepork to OpenBSD
> 6.9 then.

Thanks very much for doing all these upgrades!

Here's a nice recording of a morepork, a kind of owl that we often
hear in the still of the night where I live.  Supposedly it's saying
'more pork!' but I don't hear that myself.

https://www.doc.govt.nz/nature/native-animals/birds/birds-a-z/morepork-ruru/




Re: RFC: compression dictionaries for JSONB

2021-10-08 Thread Alvaro Herrera
On 2021-Oct-08, Matthias van de Meent wrote:

> That's a good point, but if we're extending this syntax to allow the
> ability of including other types, then I'd instead extend the syntax
> that of below, so that the type of the dictionary entries is required
> in the syntax:
> 
> CREATE TYPE name AS DICTIONARY OF jsonb [ ( ...entries ) ] [ WITH (
> ...options ) ];

I don't think this gives you any guarantees of the sort you seem to
expect.  See CREATE AGGREGATE as a precedent where there are some
options in the parenthesized options list you cannot omit.

> > The pg_type entry would have to provide some support procedure that
> > makes use of the dictionary in some way.  This seems better than tying
> > the SQL object to a specific type.
> 
> Agreed, but this might mean that much more effort would be required to
> get such a useful quality-of-life feature committed.

I don't understand what you mean by that.  I'm not saying that the patch
has to provide support for any additional datatypes.  Its only
obligation would be to provide a new column in pg_type which is zero for
all rows except jsonb, and in that row it is the OID of a
jsonb_dictionary() function that's called from all the right places and
receives all the right arguments.

-- 
Álvaro Herrera  Valdivia, Chile  —  https://www.EnterpriseDB.com/
"No tengo por qué estar de acuerdo con lo que pienso"
 (Carlos Caszeli)




Re: should we allow users with a predefined role to access pg_backend_memory_contexts view and pg_log_backend_memory_contexts function?

2021-10-08 Thread Stephen Frost
Greetings,

* Bossart, Nathan (bossa...@amazon.com) wrote:
> On 10/8/21, 12:01 AM, "Bharath Rupireddy" 
>  wrote:
> > I think we can remove the below revoke statements from
> > system_views.sql and place the checks shown at (2) in the underlying
> > functions pg_get_shmem_allocations, pg_get_backend_memory_contexts,
> > also in pg_log_backend_memory_contexts.
> >
> > REVOKE ALL ON pg_shmem_allocations FROM PUBLIC;
> > REVOKE EXECUTE ON FUNCTION pg_get_shmem_allocations() FROM PUBLIC;
> > REVOKE ALL ON pg_backend_memory_contexts FROM PUBLIC;
> > REVOKE EXECUTE ON FUNCTION pg_get_backend_memory_contexts() FROM PUBLIC;
> >
> > Thoughts?
> 
> This approach would add a restriction that a role must have SUPERUSER
> or be a member of pg_monitor to use the views/functions.  I think
> there is value in allowing any role to use them (if granted the proper
> privileges).  In any case, users may already depend on being able to
> do that.
> 
> Instead, I think we should just grant privileges to pg_monitor.  I've
> attached a (basically untested) patch to demonstrate what I'm
> thinking.

I'm not necessarily against this, but I will point out that we've stayed
away, so far, from explicitly GRANT'ing privileges to pg_monitor itself,
intending that to be a role which just combines privileges of certain
other predefined roles together.

I would think that these would fall under "pg_read_all_stats", in
particular, which is explicitly documented as: Read all pg_stat_* views
and use various statistics related extensions, even those normally
visible only to superusers.

(the last bit being particularly relevant in this case)

Thanks,

Stephen




Re: storing an explicit nonce

2021-10-08 Thread Stephen Frost
Greetings,

* Antonin Houska (a...@cybertec.at) wrote:
> Stephen Frost  wrote:
> > * Robert Haas (robertmh...@gmail.com) wrote:
> > > On Thu, Oct 7, 2021 at 3:38 PM Stephen Frost  wrote:
> > > > While I certainly also appreciate that we want to get this as right as
> > > > we possibly can from the start, I strongly suspect we'll have one of two
> > > > reactions- either we'll be more-or-less ignored and it'll be crickets
> > > > from the security folks, or we're going to get beat up by them for
> > > > $reasons, almost regardless of what we actually do.  Best bet to
> > > > limit the risk ( ;) ) of the latter happening would be to try our best
> > > > to do what existing solutions already do- such as by using XTS.
> > > > There's things we can do to limit the risk of known-plaintext attacks,
> > > > like simply not encrypting empty pages, or about possible known-IV
> > > > risks, like using the LSN as part of the IV/tweak.  Will we get
> > > > everything?  Probably not, but I don't think that we're going to really
> > > > go wrong by using XTS as it's quite popularly used today and it's
> > > > explicitly used for cases where you haven't got a place to store the
> > > > extra nonce that you would need for AEAD encryption schemes.
> > > 
> > > I agree that using a popular approach is a good way to go. If we do
> > > what other people do, then hopefully our stuff won't be significantly
> > > more broken than their stuff, and whatever is can be fixed.
> > 
> > Right.
> > 
> > > > As long as we're clear that this initial version of TDE is with XTS then
> > > > I really don't think we'll end up with anyone showing up and saying we
> > > > screwed up by not generating a per-page nonce to store with it- the 
> > > > point
> > > > of XTS is that you don't need that.
> > > 
> > > I agree that we shouldn't really catch flack for any weaknesses of the
> > > underlying algorithm: if XTS turns out to be secure even when used
> > > properly, and we use it properly, the resulting weakness is somebody
> > > else's fault. On the other hand, if we use it improperly, that's our
> > > fault, so we need to be really sure that we understand what guarantees
> > > we need to provide from our end, and that we are providing them. Like
> > > if we pick an encryption mode that requires nonces to be unique, we
> > > will be at fault if they aren't; if it requires nonces to be
> > > unpredictable, we will be at fault if they aren't; and so on.
> > 
> > Sure, I get that.  Would be awesome if all these things were clearly
> > documented somewhere but I've never been able to find it quite as
> > explicitly laid out as one would like.
> > 
> > > So that's what is making me nervous here ... it doesn't seem likely we
> > > have complete unanimity about whether XTS is the right thing, though
> > > that does seem to be the majority position certainly, and it is not
> > > really clear to me that any of us can speak with authority about what
> > > the requirements are around the nonces in particular.
> > 
> > The authority to look at, in my view anyway, are NIST publications.
> > Following a bit more digging, I came across something which makes sense
> > to me as intuitive but explains it in a way that might help everyone
> > understand a bit better what's going on here:
> > 
> > https://nvlpubs.nist.gov/nistpubs/SpecialPublications/NIST.SP.800-38G.pdf
> > 
> > specifically: Appendix C: Tweaks
> > 
> > Quoting a couple of paragraphs from that appendix:
> > 
> > """
> > In general, if there is information that is available and statically
> > associated with a plaintext, it is recommended to use that information
> > as a tweak for the plaintext. Ideally, the non-secret tweak associated
> > with a plaintext is associated only with that plaintext.
> > 
> > Extensive tweaking means that fewer plaintexts are encrypted under any
> > given tweak. This corresponds, in the security model that is described
> > in [1], to fewer queries to the target instance of the encryption.
> > """
> > 
> > The gist of this being- the more diverse the tweaking being used, the
> > better.  That's where I was going with my "limit the risk" comment.  If
> > we can make the tweak vary more for a given encryption invocation,
> > that's going to be better, pretty much by definition, and as explained
> > in publications by NIST.
> > 
> > That isn't to say that using the same tweak for the same block over and
> > over "breaks" the encryption (unlike with CTR/GCM, where IV reuse leads
> > directly to plaintext being recoverable), but it does mean that an
> > observer who can see the block writes over time could see what parts are
> > changing (and which aren't) and may be able to derive insight from that.
> 
> This reminds me of Joe Conway's response to me email earlier:
> 
> https://www.postgresql.org/message-id/50335f56-041b-1a1f-59ea-b5f7bf917352%40joeconway.com
> 
> In the document he recommended
> 
> 

Re: Query rewrite(optimization) using constraints

2021-10-08 Thread Tom Lane
Lily Liu  writes:
> I propose two new types of query rewrites using constraints here

> 1) Remove DISTINCT
> A simple example is SELECT DISTINCT(name) FROM R. If there is a unique
> constraint on the name column. The DISTINCT keyword can be removed safely.
> Query plans without the DISTINCT keyword might be much cheaper since
> DISTINCT is expensive.

There's already been a fair amount of effort in that direction, cf [1].
However, I can't avoid the impression that that patch series is adding
way too much overcomplicated infrastructure.  If you have an idea for
an easier way, let's hear it.

> 2) Add LIMIT 1
> An example of this optimization will be SELECT name from R WHERE name =
> ‘foo’. If there is a unique constraint on the name column, the selection
> result has at most one record. Therefore, we can add LIMIT 1 safely. If the
> original query plan performs a sequential scan on the R, adding LIMIT 1
> might speed up the query because of the early return.

I strongly suspect that this idea isn't actually useful.  If there is a
matching unique constraint, the planner will almost surely choose an
indexscan on the unique index, and adding an additional plan node to that
is likely to add more overhead than it removes.  For example,

regression=# explain analyze select * from tenk1 where unique1 = 42;
                                                    QUERY PLAN
------------------------------------------------------------------------------------------------------------------------
 Index Scan using tenk1_unique1 on tenk1  (cost=0.29..8.30 rows=1 width=244) (actual time=0.012..0.013 rows=1 loops=1)
   Index Cond: (unique1 = 42)
 Planning Time: 0.059 ms
 Execution Time: 0.030 ms
(4 rows)

regression=# explain analyze select * from tenk1 where unique1 = 42 limit 1;
                                                          QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.29..8.30 rows=1 width=244) (actual time=0.013..0.013 rows=1 loops=1)
   ->  Index Scan using tenk1_unique1 on tenk1  (cost=0.29..8.30 rows=1 width=244) (actual time=0.012..0.012 rows=1 loops=1)
 Index Cond: (unique1 = 42)
 Planning Time: 0.067 ms
 Execution Time: 0.034 ms
(5 rows)

This test case shouldn't be taken too seriously, because it was just a
quick one-off check with little attempt to control for noise, besides
which it's a debug-enabled build.  Nonetheless, if you want to pursue
this idea I think you first need to prove that it actually is a win.
"Might speed up" won't cut it.

Another concern here is that the planner is famously bad about making
good decisions for queries involving LIMIT.  The cost estimates for
that require a bunch of assumptions that we don't need elsewhere, and
some of them aren't very trustworthy.  Now maybe for the restricted
cases where you want to add LIMIT, this isn't really a problem.  But
you haven't shown that to be true, so I'm afraid that this transformation
will sometimes lead us into worse plan choices than we'd make otherwise.
I'm pretty leery of adding LIMIT when we don't have to.

regards, tom lane

[1] https://commitfest.postgresql.org/35/2433/




Re: Query rewrite(optimization) using constraints

2021-10-08 Thread Justin Pryzby
On Fri, Oct 08, 2021 at 10:24:33AM -0700, Lily Liu wrote:
> 1) Remove DISTINCT
> 
> A simple example is SELECT DISTINCT(name) FROM R. If there is a unique
> constraint on the name column. The DISTINCT keyword can be removed safely.
> Query plans without the DISTINCT keyword might be much cheaper since
> DISTINCT is expensive.

There's an ongoing discussion and patches for this here.

Erase the distinctClause if the result is unique by definition
https://commitfest.postgresql.org/35/2433/

Perhaps you could help to test or review the patch ?

-- 
Justin




Re: pg_stat_bgwriter.buffers_backend is pretty meaningless (and more?)

2021-10-08 Thread Andres Freund
Hi,

On 2021-10-01 16:05:31 -0400, Melanie Plageman wrote:
> From 40c809ad1127322f3462e85be080c10534485f0d Mon Sep 17 00:00:00 2001
> From: Melanie Plageman 
> Date: Fri, 24 Sep 2021 17:39:12 -0400
> Subject: [PATCH v13 1/4] Allow bootstrap process to beinit
>
> ---
>  src/backend/utils/init/postinit.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
>
> diff --git a/src/backend/utils/init/postinit.c 
> b/src/backend/utils/init/postinit.c
> index 78bc64671e..fba5864172 100644
> --- a/src/backend/utils/init/postinit.c
> +++ b/src/backend/utils/init/postinit.c
> @@ -670,8 +670,7 @@ InitPostgres(const char *in_dbname, Oid dboid, const char 
> *username,
>   EnablePortalManager();
>
>   /* Initialize status reporting */
> - if (!bootstrap)
> - pgstat_beinit();
> + pgstat_beinit();
>
>   /*
>* Load relcache entries for the shared system catalogs.  This must 
> create
> --
> 2.27.0
>

I think it's good to remove more and more of these !bootstrap cases - they
really make it harder to understand the state of the system at various
points. Optimizing for the rarely executed bootstrap mode at the cost of
checks in very common codepaths...


> From a709ddb30b2b747beb214f0b13cd1e1816094e6b Mon Sep 17 00:00:00 2001
> From: Melanie Plageman 
> Date: Thu, 30 Sep 2021 16:16:22 -0400
> Subject: [PATCH v13 2/4] Add utility to make tuplestores for pg stat views
>
> Most of the steps to make a tuplestore for those pg_stat views requiring
> one are the same. Consolidate them into a single helper function for
> clarity and to avoid bugs.
> ---
>  src/backend/utils/adt/pgstatfuncs.c | 129 ++--
>  1 file changed, 44 insertions(+), 85 deletions(-)
>
> diff --git a/src/backend/utils/adt/pgstatfuncs.c 
> b/src/backend/utils/adt/pgstatfuncs.c
> index ff5aedc99c..513f5aecf6 100644
> --- a/src/backend/utils/adt/pgstatfuncs.c
> +++ b/src/backend/utils/adt/pgstatfuncs.c
> @@ -36,6 +36,42 @@
>
>  #define HAS_PGSTAT_PERMISSIONS(role)  (is_member_of_role(GetUserId(), 
> ROLE_PG_READ_ALL_STATS) || has_privs_of_role(GetUserId(), role))
>
> +/*
> + * Helper function for views with multiple rows constructed from a tuplestore
> + */
> +static Tuplestorestate *
> +pg_stat_make_tuplestore(FunctionCallInfo fcinfo, TupleDesc *tupdesc)
> +{
> + Tuplestorestate *tupstore;
> + ReturnSetInfo *rsinfo = (ReturnSetInfo *) fcinfo->resultinfo;
> + MemoryContext per_query_ctx;
> + MemoryContext oldcontext;
> +
> + /* check to see if caller supports us returning a tuplestore */
> + if (rsinfo == NULL || !IsA(rsinfo, ReturnSetInfo))
> + ereport(ERROR,
> + (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
> +  errmsg("set-valued function called in context 
> that cannot accept a set")));
> + if (!(rsinfo->allowedModes & SFRM_Materialize))
> + ereport(ERROR,
> + (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
> +  errmsg("materialize mode required, but it is 
> not allowed in this context")));
> +
> + /* Build a tuple descriptor for our result type */
> + if (get_call_result_type(fcinfo, NULL, tupdesc) != TYPEFUNC_COMPOSITE)
> + elog(ERROR, "return type must be a row type");
> +
> + per_query_ctx = rsinfo->econtext->ecxt_per_query_memory;
> + oldcontext = MemoryContextSwitchTo(per_query_ctx);
> +
> + tupstore = tuplestore_begin_heap(true, false, work_mem);
> + rsinfo->returnMode = SFRM_Materialize;
> + rsinfo->setResult = tupstore;
> + rsinfo->setDesc = *tupdesc;
> + MemoryContextSwitchTo(oldcontext);
> + return tupstore;
> +}

Is pgstattuple the best place for this helper? It's not really pgstatfuncs
specific...

It also looks vaguely familiar - I wonder if we have a helper roughly like
this somewhere else already...




> From e9a5d2a021d429fdbb2daa58ab9d75a069f334d4 Mon Sep 17 00:00:00 2001
> From: Melanie Plageman 
> Date: Wed, 29 Sep 2021 15:39:45 -0400
> Subject: [PATCH v13 3/4] Add system view tracking IO ops per backend type
>

> diff --git a/src/backend/postmaster/checkpointer.c 
> b/src/backend/postmaster/checkpointer.c
> index be7366379d..0d18e7f71a 100644
> --- a/src/backend/postmaster/checkpointer.c
> +++ b/src/backend/postmaster/checkpointer.c
> @@ -1104,6 +1104,7 @@ ForwardSyncRequest(const FileTag *ftag, SyncRequestType 
> type)
>*/
>   if (!AmBackgroundWriterProcess())
>   CheckpointerShmem->num_backend_fsync++;
> + pgstat_inc_ioop(IOOP_FSYNC, IOPATH_SHARED);
>   LWLockRelease(CheckpointerCommLock);
>   return false;
>   }

ISTM this doens't need to happen while holding CheckpointerCommLock?




> @@ -1461,7 +1467,25 @@ pgstat_reset_shared_counters(const char *target)
>errhint("Target must be \"archiver\", 
> \"bgwriter\", or \"wal\".")));
>
>   

Query rewrite(optimization) using constraints

2021-10-08 Thread Lily Liu
Hi, hackers


I notice that postgres use constraints to optimize/rewrite queries in
limited cases as
https://www.postgresql.org/docs/current/runtime-config-query.html#GUC-CONSTRAINT-EXCLUSION
.


I propose two new types of query rewrites using constraints here

1) Remove DISTINCT

A simple example is SELECT DISTINCT(name) FROM R. If there is a unique
constraint on the name column. The DISTINCT keyword can be removed safely.
Query plans without the DISTINCT keyword might be much cheaper since
DISTINCT is expensive.


2) Add LIMIT 1

An example of this optimization will be SELECT name from R WHERE name =
‘foo’. If there is a unique constraint on the name column, the selection
result has at most one record. Therefore, we can add LIMIT 1 safely. If the
original query plan performs a sequential scan on the R, adding LIMIT 1
might speed up the query because of the early return.
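
Concretely, with a hypothetical table r carrying a unique constraint on name,
the two rewrites amount to the following (NOT NULL sidesteps the wrinkle that
a plain unique constraint still allows multiple NULLs, which DISTINCT would
collapse):

CREATE TABLE r (name text UNIQUE NOT NULL, payload text);

-- 1) name is unique, so DISTINCT cannot eliminate any rows:
SELECT DISTINCT name FROM r;                    -- could be planned as
SELECT name FROM r;

-- 2) an equality match on a unique column yields at most one row:
SELECT name FROM r WHERE name = 'foo';          -- could be planned as
SELECT name FROM r WHERE name = 'foo' LIMIT 1;  -- allows early return on a seqscan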


We designed an algorithm to decide if 1), 2) can be performed safely.
Rewriting queries manually and experimenting on a table with 10K records
shows 2X ~ 3X improvement for both rewrites. We have some other rewrite
rules, but the two are most obvious ones. With this feature, the optimizer
can consider the query plans both before and after the rewrite and choose
the one with minimum cost.


Will that feature be useful? How hard to implement the feature in the
current system? Any thoughts or comments are highly appreciated!


Best,

Lily


Re: Parallel vacuum workers prevent the oldest xmin from advancing

2021-10-08 Thread Peter Geoghegan
On Fri, Oct 8, 2021 at 8:13 AM Alvaro Herrera  wrote:
> On 2021-Oct-06, Masahiko Sawada wrote:
> > A customer reported that during parallel index vacuum, the oldest xmin
> > doesn't advance. Normally, the calculation of oldest xmin
> > (ComputeXidHorizons()) ignores xmin/xid of processes having
> > PROC_IN_VACUUM flag in MyProc->statusFlags. But since parallel vacuum
> > workers don’t set their statusFlags, the xmin of the parallel vacuum
> > worker is considered to calculate the oldest xmin. This issue happens
> > from PG13 where the parallel vacuum was introduced. I think it's a
> > bug.
>
> Augh, yeah, I agree this is a pretty serious problem.

So is this comparable problem, which happens to be much older:
https://postgr.es/m/cah2-wzkjrk556envtflmyxedw91xguwiyzvep2kp5yqt_-3...@mail.gmail.com

In both cases we see bugs (or implementation deficiencies) that
accidentally block ComputeXidHorizons() for hours, when that isn't
truly necessary. Practically no user can tell whether or not VACUUM already
behaves like a long-running transaction in general, so we shouldn't be
surprised that it takes so long for us to hear about issues like this.

I think that we should try to find a way of making this whole class of
problems easier to identify in production. There needs to be greater
visibility into what process holds back VACUUM, and how long that
lasts -- something easy to use, and *obvious*. That would be a very
useful feature in general. It would also make catching these issues
early far more likely. It's just *not okay* that you have to follow long
and complicated instructions [1] to get just some of this information.
How can something this important just be an afterthought?

[1] 
https://www.cybertec-postgresql.com/en/reasons-why-vacuum-wont-remove-dead-rows/
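
For anyone who wants to see that pain first-hand, the instructions boil down
to manually checking a handful of catalogs, roughly:

-- long-running backends holding back the horizon
SELECT pid, backend_type, state, xact_start, backend_xmin
FROM pg_stat_activity
WHERE backend_xmin IS NOT NULL
ORDER BY age(backend_xmin) DESC;

-- plus prepared transactions and replication slots
SELECT gid, prepared, transaction FROM pg_prepared_xacts;
SELECT slot_name, xmin, catalog_xmin FROM pg_replication_slots;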

--
Peter Geoghegan




Re: RFC: compression dictionaries for JSONB

2021-10-08 Thread Matthias van de Meent
On Fri, 8 Oct 2021 at 17:19, Alvaro Herrera  wrote:
>
> On 2021-Oct-08, Matthias van de Meent wrote:
>
> > On Fri, 8 Oct 2021 at 11:47, Aleksander Alekseev
> >  wrote:
>
> > > In order to do this, the SQL syntax should be modified. The proposed
> > > syntax is based on Matthias van de Meent's idea [6]:
> >
> > Seems fine
> >
> > > ```
> > > CREATE TYPE  AS JSONB_DICTIONARY (
> > >   learn_by = { {"table_a", "column_a"}, {"table_b", "column_b"}, ... },
>
> Actually, why is it a JSONB_DICTIONARY and not like
>
> CREATE TYPE name AS DICTIONARY (
>   base_type = JSONB, ...
> );

That's a good point, but if we're extending this syntax to allow the
ability of including other types, then I'd instead extend the syntax
that of below, so that the type of the dictionary entries is required
in the syntax:

CREATE TYPE name AS DICTIONARY OF jsonb [ ( ...entries ) ] [ WITH (
...options ) ];

> so that it is possible to use the infrastructure for other things?  For
> example, perhaps PostGIS geometries could benefit from it -- or even
> text or xml columns.
>
> The pg_type entry would have to provide some support procedure that
> makes use of the dictionary in some way.  This seems better than tying
> the SQL object to a specific type.

Agreed, but this might mean that much more effort would be required to
get such a useful quality-of-life feature committed.

Kind regards,

Matthias




Re: Time to upgrade buildfarm coverage for some EOL'd OSes?

2021-10-08 Thread Mikael Kjellström

On 2021-10-08 18:12, Tom Lane wrote:

=?UTF-8?Q?Mikael_Kjellstr=c3=b6m?=  writes:

Yes, it's getting long in the tooth.  I will upgrade the NetBSD 7
(sidewinder) to 9.2.
And I will also upgrade loach to 12.x if that's the version that is
needed the most.
I will upgrade conchuela to DragonFlyBSD 6.0.
I will remove the 5.9 (curculio) and upgrade the 6.5 (morepork) to 6.9.


+1 to all of that except retiring curculio.  That one has shown us issues
we've not seen on other animals (recent examples at [1][2]), so unless
you're feeling resource-constrained I'd vote for keeping it going.


Sure I can keep curculio as is.  Will just upgrade morepork to OpenBSD 
6.9 then.


/Mikael




Re: More business with $Test::Builder::Level in the TAP tests

2021-10-08 Thread Andrew Dunstan


On 10/6/21 9:53 PM, Michael Paquier wrote:
> On Wed, Oct 06, 2021 at 07:33:22AM -0400, Andrew Dunstan wrote:
>> We should probably state a requirement for this somewhere. Maybe in
>> src/test/perl/README. AIUI, the general rule is that any subroutine that
>> directly or indirectly calls ok() and friends should increase the level.
>> Such subroutines that don't increase it should probably contain a
>> comment stating why, so we can know in future that it's not just an
>> oversight.
> That makes sense.  How about something like that after the part about
> Test::More::like and qr// in the section about writing tests?  Here it
> is:
> +Test::Builder::Level controls how far up in the call stack a test will look
> +at when reporting a failure.  This should be incremented by any subroutine
> +calling test routines from Test::More, like ok() or is():
> +
> +local $Test::Builder::Level = $Test::Builder::Level + 1;


I think we need to be more explicit about it, especially w.r.t. indirect
calls. Every subroutine in the call stack below where you want the error
reported as coming from should contain this line.

Suppose we have


sub a { b();  }

sub b { c();  }

sub c { local $Test::Builder::Level = $Test::Builder::Level + 1;
ok(0,"should succeed"); }


AIUI a call to a() will show the call in b() as the error source. If we
want the error source to be the call to a() we need to add that
increment to both b() and a();


cheers


andrew


--
Andrew Dunstan
EDB: https://www.enterprisedb.com





Re: Time to upgrade buildfarm coverage for some EOL'd OSes?

2021-10-08 Thread Tom Lane
=?UTF-8?Q?Mikael_Kjellstr=c3=b6m?=  writes:
> Yes, it's getting long in the tooth.  I will upgrade the NetBSD 7 
> (sidewinder) to 9.2.
> And I will also upgrade loach to 12.x if that's the version that is 
> needed the most.
> I will upgrade conchuela to DragonFlyBSD 6.0.
> I will remove the 5.9 (curculio) and upgrade the 6.5 (morepork) to 6.9.

+1 to all of that except retiring curculio.  That one has shown us issues
we've not seen on other animals (recent examples at [1][2]), so unless
you're feeling resource-constrained I'd vote for keeping it going.

regards, tom lane

[1] https://www.postgresql.org/message-id/ypadf9r5ajbdo...@paquier.xyz
[2] 
https://www.postgresql.org/message-id/caa4ek1+uw1ugdhdz-hwmhmen76mkp7njebotzn4uwbymjay...@mail.gmail.com




Re: pgsql: Adjust configure to insist on Perl version >= 5.8.3.

2021-10-08 Thread Tom Lane
Daniel Gustafsson  writes:
> On 8 Oct 2021, at 06:24, Noah Misch  wrote:
>> That's obvious from "cpanm install IPC::Run".  Surely if any other non-core
>> module were allowed, the recipe would list it in a similar way.

> The proposed change talks about which core modules we are allowed to use, I think
> that's a different thing.  The distinction between core and non-core modules
> may not be known/clear to people who haven't used Perl in the past.

Yeah, I don't really think that this recipe makes it plain that we have
a policy.  It certainly fails to explain that you're allowed to use
additional modules if you're willing to skip the relevant tests.

> This README isn't primarily targeting committers though IMO, but new 
> developers
> onboarding onto postgres who are trying to learn the dev environment.

Right.

>> If there's something to change, it's improving the actual recipe:

> That we should do as well.

You're not going to get far with "improving the recipe", because it's
just not possible.  To check this, I installed perlbrew on a Fedora 34
machine, and found that it actually can install a mostly-working 5.8.3
(nice!).  But as I suspected earlier, it can't reproduce the old module
configuration:

$ cpanm install Test::More@0.87
--> Working on install
Fetching http://www.cpan.org/authors/id/D/DA/DAGOLDEN/install-0.01.tar.gz ... OK
==> Found dependencies: ExtUtils::MakeMaker
--> Working on ExtUtils::MakeMaker
Fetching 
http://www.cpan.org/authors/id/B/BI/BINGOS/ExtUtils-MakeMaker-7.62.tar.gz ... OK
Configuring ExtUtils-MakeMaker-7.62 ... OK
Building and testing ExtUtils-MakeMaker-7.62 ... OK
Successfully installed ExtUtils-MakeMaker-7.62 (upgraded from 6.17)
Configuring install-0.01 ... OK
Building and testing install-0.01 ... OK
Successfully installed install-0.01
! Finding Test::More (== 0.87) on cpanmetadb failed.
Found Test::More 1.302188 which doesn't satisfy == 0.87.
2 distributions installed

Not only is that a fail on Test::More itself, but as un-asked-for side
effects, it upgraded ExtUtils::MakeMaker to current, and installed a
module that should not have been there (which kinda defeats the point
of the exercise).

I did find I could install IPC::Run 0.79, which matches prairiedog's
version of that module:

$ cpanm install IPC::Run@0.79  
install is up to date. (0.01)
! Finding IPC::Run (== 0.79) on cpanmetadb failed.
--> Working on IPC::Run
Fetching http://cpan.metacpan.org/authors/id/R/RS/RSOD/IPC-Run-0.79.tar.gz ... 
OK
Configuring IPC-Run-0.79 ... OK
Building and testing IPC-Run-0.79 ... OK
Successfully installed IPC-Run-0.79
1 distribution installed

However, this just reflects the fact that prairiedog's installation
is itself a bit Frankensteinan.  What it has for Test::More and
IPC::Run are just the oldest versions I could find on CPAN back in
2017 when I built that installation.  I can't claim that they have
any historical relevance.  They are, however, a lot more likely to
still be duplicatable from current CPAN than actually-old versions.

So while this recipe is a lot more useful than I thought, it can't
entirely reproduce the Perl environment of older buildfarm members.
I think we really ought to document that.  I also think it is
useful to explicitly state the policy and then give this recipe
as one way to (partially) test against the policy.

regards, tom lane




Re: pgbench bug candidate: negative "initial connection time"

2021-10-08 Thread Fujii Masao



On 2021/10/01 15:27, Michael Paquier wrote:

On Wed, Sep 29, 2021 at 10:11:53PM +0900, Fujii Masao wrote:

BTW, when logfile fails to be opened, pgbench gets stuck due to commit
aeb57af8e6. So even if we decided not to back-patch those changes,
we should improve the handling of logfile open failure, to fix the issue.


There is an entry in the CF for this thread:
https://commitfest.postgresql.org/34/3219/

I have moved that to the next one as some pieces are missing.  If you
are planning to handle the rest, could you register your name as a
committer?


Thanks for letting me know that!
I registered myself as a committer of the patch again.


pg_time_usec_t conn_duration;   /* cumulated connection and deconnection
                                 * delays */

BTW, while reading the patch, I found the above comment in pgbench.c.
"deconnection" seems a valid word in French (?), but isn't it better to
replace it with "disconnection"? Patch attached.


Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION
diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index d6b31bbf53..81693e1765 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -509,7 +509,7 @@ typedef struct
pg_time_usec_t create_time; /* thread creation time */
pg_time_usec_t started_time;/* thread is running */
pg_time_usec_t bench_start; /* thread is benchmarking */
-   pg_time_usec_t conn_duration;   /* cumulated connection and deconnection
+   pg_time_usec_t conn_duration;   /* cumulated connection and disconnection
                                     * delays */
 
StatsData   stats;


Re: Fix pg_log_backend_memory_contexts() 's delay

2021-10-08 Thread Fujii Masao




On 2021/10/06 17:14, bt21tanigaway wrote:

Thanks for your review.


Thanks for the patch. Do we also need to do the change in
HandleMainLoopInterrupts, HandleCheckpointerInterrupts,
HandlePgArchInterrupts, HandleWalWriterInterrupts as we don't call
CHECK_FOR_INTERRUPTS() there?



Yeah, that's still some information that the user asked for.  Looking
at the things that have a PGPROC entry, should we worry about the main
loop of the logical replication launcher?


・Now, the target of “pg_log_backend_memory_contexts()” is “autovacuum launcher” 
and “logical replication launcher”.  I observed that the delay occurred only in 
“autovacuum launcher” not in “logical replication launcher”.
・”autovacuum launcher” used “HandleAutoVacLauncherInterrupts()” ( not including 
“ProcessLogMemoryContextInterrupt()” ) instead of “ProcessInterrupts() ( 
including “ProcessLogMemoryContextInterrupt()” ). 
“ProcessLogMemoryContextInterrupt()” will not be executed until the next 
“ProcessInterrupts()” is executed. So, I added 
“ProcessLogMemoryContextInterrupt()”.
・”logical replication launcher” uses only “ProcessInterrupts()”. So, We don’t 
have to fix it.


Yes.



IMHO, we can support all the processes which return a PGPROC entry by
both BackendPidGetProc and AuxiliaryPidGetProc where the
AuxiliaryPidGetProc would cover the following processes. I'm not sure
one is interested in the  memory context info of auxiliary processes.


I like this idea because it seems helpful at least for debug purpose.
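
For context, the function in question is normally driven off pg_stat_activity;
the delay that started this thread shows up when targeting the launcher
processes, e.g.:

SELECT backend_type, pg_log_backend_memory_contexts(pid)
FROM pg_stat_activity
WHERE backend_type IN ('autovacuum launcher', 'logical replication launcher');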



・The purpose of this patch is to solve the delay problem, so I would like 
another patch to deal with “ BackendPidGetProc” and “AuxiliaryPidGetProc”.


+1 to improve those things separately.

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION




Re: RFC: compression dictionaries for JSONB

2021-10-08 Thread Alvaro Herrera
On 2021-Oct-08, Matthias van de Meent wrote:

> On Fri, 8 Oct 2021 at 11:47, Aleksander Alekseev
>  wrote:

> > In order to do this, the SQL syntax should be modified. The proposed
> > syntax is based on Matthias van de Meent's idea [6]:
> 
> Seems fine
> 
> > ```
> > CREATE TYPE  AS JSONB_DICTIONARY (
> >   learn_by = { {"table_a", "column_a"}, {"table_b", "column_b"}, ... },

Actually, why is it a JSONB_DICTIONARY and not like

CREATE TYPE name AS DICTIONARY (
  base_type = JSONB, ...
);

so that it is possible to use the infrastructure for other things?  For
example, perhaps PostGIS geometries could benefit from it -- or even
text or xml columns.

The pg_type entry would have to provide some support procedure that
makes use of the dictionary in some way.  This seems better than tying
the SQL object to a specific type.

-- 
Álvaro Herrera   39°49'30"S 73°17'W  —  https://www.EnterpriseDB.com/
#error "Operator lives in the wrong universe"
  ("Use of cookies in real-time system development", M. Gleixner, M. Mc Guire)




Re: Parallel vacuum workers prevent the oldest xmin from advancing

2021-10-08 Thread Alvaro Herrera
On 2021-Oct-06, Masahiko Sawada wrote:

> Hi all,
> 
> A customer reported that during parallel index vacuum, the oldest xmin
> doesn't advance. Normally, the calculation of oldest xmin
> (ComputeXidHorizons()) ignores xmin/xid of processes having
> PROC_IN_VACUUM flag in MyProc->statusFlags. But since parallel vacuum
> workers don’t set their statusFlags, the xmin of the parallel vacuum
> worker is considered to calculate the oldest xmin. This issue happens
> from PG13 where the parallel vacuum was introduced. I think it's a
> bug.

Augh, yeah, I agree this is a pretty serious problem.

> But ISTM it’d be better to copy the leader’s status flags to workers
> in ParallelWorkerMain(). I've attached a patch for HEAD.

Hmm, this affects not only PROC_IN_VACUUM and PROC_IN_SAFE_CIC (the bug
you're fixing), but also:

* PROC_IS_AUTOVACUUM.  That seems reasonable to me -- should a parallel
worker for autovacuum be considered autovacuum too?  AFAICS it's only
used by the deadlock detector, so it should be okay.  However, in the
normal path, that flag is set much earlier.

* PROC_VACUUM_FOR_WRAPAROUND.  Should be innocuous I think, since the
"parent" process already has this flag and thus shouldn't be cancelled.

* PROC_IN_LOGICAL_DECODING.  Surely not set for parallel vacuum workers,
so not a problem.

-- 
Álvaro Herrera   39°49'30"S 73°17'W  —  https://www.EnterpriseDB.com/
"Doing what he did amounts to sticking his fingers under the hood of the
implementation; if he gets his fingers burnt, it's his problem."  (Tom Lane)




Re: dfmgr additional ABI version fields

2021-10-08 Thread Peter Eisentraut

On 07.10.21 21:15, Tom Lane wrote:

Chapman Flack  writes:

On 10/07/21 12:42, Tom Lane wrote:

Can we make the addition be a string not a number, so that we
could include something more useful than "1234" in the error
message?



Just using a string like "EDB v" + something would probably rule out
collisions in practice. To be more formal about it, something like
the tag URI scheme [0] could be recommended.


Hmm.  Personally I'm more interested in the string being comprehensible to
end users than in whether there's any formal rule guaranteeing uniqueness.
I really doubt that we will have any practical problem with collisions,
so I'd rather go with something like "EnterpriseDB v1.2.3" than with
something like "tag:enterprisedb.com,2021:1.2.3".


Yeah, just a string should be fine.




Re: RFC: compression dictionaries for JSONB

2021-10-08 Thread Matthias van de Meent
On Fri, 8 Oct 2021 at 11:47, Aleksander Alekseev
 wrote:
> This is a follow-up thread to `Add ZSON extension to /contrib/` [1].
> The ZSON extension introduces a new type called ZSON, which is 100%
> compatible with JSONB but uses a shared dictionary of strings most
> frequently used by given JSONB documents for compression. See the
> thread for more details.

Great to see that you're still working on this! It would be great if
we could get this into postgres. As such, I hope you can provide some
clarifications on my questions and comments.

> According to the feedback I got, the community generally liked the
> idea of adding an across-rows and across-tables compression capability
> to JSONB. What the community didn't like was:
>
> 1. Introducing a new data type in order to archive this;
> 2. Updating compression dictionaries manually;

Well, I for one would like access to manually add entries to the
dictionary. What I'm not interested in is being required to manually
update the dictionary; the ability to manually insert into the
dictionary, however, is much appreciated.

> 3. Some implementation details of ZSON, such as limited dictionary
> size (2 ** 16 entries) and an extensive usage of gettimeofday() system
> call;
>
> There was also a request for proof of the usefulness of this feature
> in practice.

More compact JSONB is never a bad idea: one reason to stick with JSON
over JSONB is that JSON can use significantly less space than JSONB,
if stored properly. So, improving the disk usage of JSONB is not
really a bad idea.

>
> == Proposal ==
>
> The proposal is to add the support of compression dictionaries to JSONB.
>
> In order to do this, the SQL syntax should be modified. The proposed
> syntax is based on Matthias van de Meent's idea [6]:

Seems fine

> ```
> CREATE TYPE  AS JSONB_DICTIONARY (
>   learn_by = { {"table_a", "column_a"}, {"table_b", "column_b"}, ... },

I'm having trouble understanding how this learn_by field would be used:

If stored as strings, they would go out of date when tables or columns
are renamed or dropped.
Similarly, you'd want to update the dictionary with common values in
columns of that type; generally not columns of arbitrary other types.
You can't in advance know the names of tables and columns, so that
would add a burden of maintenance to the user when they add / change /
remove a column of the dictionary type. Instead of storing 'use update
data from table X column Y' in the type, I think that adding it as a
column option would be the better choice.

I agree with an option for auto-update, though I don't think we have
enough information to determine the default value (I'd err to the side
of caution, going with 'off').

>   autoupdate = false, -- true by default
>   -- optional: other arguments, min/max string lengths, etc
> );
> ```

For dump/restore I think it would be very useful to allow export &
import of these dictionaries, so that restored databases don't have
the problem of starting cold.

As such, `ALTER TYPE  jsondict ADD ENTRY entry_value` would probably
be useful, and maybe even `CREATE TYPE dict AS JSONB_DICTIONARY
('"entry_1"', '"entry_two"', '"entry_three"') WITH (option =
optional)`

> Basically, this is an equivalent of zson_learn [7]. It will create an
> id -> string dictionary in the PostgreSQL catalog. When the user
> chooses `autoupdate = true`, the dictionary will be updated
> automatically by PostgreSQL (e.g. during the VACUUM). This is the
> default value. The dictionary can also be updated manually:
>
> ```
> SELECT jsonb_update_dictionary("type-name");
> ```

I'm a bit on the fence about this. We do use this for sequences, but
alternatively we might want to use ALTER TYPE jsondict;

> Other than that, the type works like a regular one. All the usual
> ALTER TYPE / DROP TYPE semantics are applicable. All the operators
> available to JSONB are also available to .
>
> Internally  is represented similar to JSONB. However, the
> strings from the dictionary are replaced with varints.

How do you propose to differentiate actual integers from these keyed
strings, and/or actual strings from varints? Replacing _all_ strings
doesn't seem like such a great idea.

Related comment below.

> This idea was
> borrowed from Tomas Vondra [8]. The dictionary size is limited to
> 2**28 entries. The limit can be easily extended in the future if
> necessary. Also  stores the version of the dictionary used
> to compress the data. All in all, this is similar to how ZSON works.

I appreciate this idea, but using that varint implementation is not a
choice I'd make. In the jsonb storage format, we already encode the
length of each value, so varint shouldn't be necessary here. Next, as
everything in jsonb storage is 4-byte aligned, a uint32 should
suffice, or if we're being adventurous, we might even fit a uint29
identifier in the length field instead (at the cost of full backwards
incompatibility).

Lastly, we don't have a good format for varint now (numeric is 

Re: Compressing temporary files

2021-10-08 Thread Bharath Rupireddy
 On Sat, Sep 11, 2021, 6:01 PM Andrey Borodin  wrote:
>
> Hi hackers!
>
> There's a lot of compression discussions nowadays. And that's cool!
> Recently Naresh Chainani in private discussion shared with me the idea to 
> compress temporary files on disk.
> And I was thrilled to find no evidence of implementation of this interesting 
> idea.
>
> I've prototyped Random Access Compressed File for fun[0]. The code is very 
> dirty proof-of-concept.
> I compress BufFile one block at a time. There are directory pages to store
> information about the size of each compressed block. If any byte of the block
> is changed, the whole block is recompressed. Wasted space is never reused. If
> a compressed block ends up larger than BLCKSZ, unknown bad things will happen :)
>
> Here are some my observations.
>
> 0. The idea seems feasible. API of fd.c used by buffile.c can easily be 
> abstracted for compressed temporary files. Seeks are necessary, but they are 
> not very frequent. It's easy to make temp file compression GUC-controlled.
>
> 1. Temp file footprint can be easily reduced. For example query
> create unlogged table y as select random()::text t from 
> generate_series(0,999) g;
> uses for toast index build 14000 bytes of temp file. With patch this 
> value is reduced to 40841704 (x3.42 smaller).
>
> 2. I have not found any evidence of performance improvement. I've only 
> benchmarked patch on my laptop. And RAM (page cache) diminished any 
> difference between writing compressed block and uncompressed block.
>
> How do you think: does it worth to pursue the idea? OLTP systems rarely rely 
> on data spilled to disk.
> Are there any known good random access compressed file libs? So we could 
> avoid reinventing the wheel.
> Maybe someone tried this approach before?

Are you proposing to compress the temporary files being created by the
postgres processes under $PGDATA/base/pgsql_tmp? Are there any other
directories that postgres processes would write temporary files to?

Are you proposing to compress the temporary files that get generated
during the execution of queries? IIUC, the temp files under the
pgsql_tmp directory get cleaned up at the end of each txn right? In
what situations the temporary files under the pgsql_tmp directory
would remain even after the txns that created them are
committed/aborted? Here's one scenario: if a backend crashes while
executing a huge analytic query, I can understand that the temp files
would remain in pgsql_tmp and we have the commit [1] cleaning them on
restart. Any other scenarios that fill up the pgsql_tmp directory?

[1] commit cd91de0d17952b5763466cfa663e98318f26d357
Author: Tomas Vondra 
Date:   Thu Mar 18 16:05:03 2021 +0100

Remove temporary files after backend crash
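
Side note: for sizing how much there is to gain from compressing temporary
files, the cumulative per-database counters give a quick first look:

SELECT datname, temp_files, pg_size_pretty(temp_bytes) AS temp_bytes
FROM pg_stat_database
ORDER BY temp_bytes DESC;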

Regards,
Bharath Rupireddy.




Re: Time to upgrade buildfarm coverage for some EOL'd OSes?

2021-10-08 Thread Mikael Kjellström

On 2021-10-08 00:15, Thomas Munro wrote:


I noticed that for NetBSD we only have one animal, and it's running
EOL'd release 7.  To give decent visibility of relevant portability
problems it'd be nice to have one of the current supported releases[1]
in there.  CC'ing owner; any interest in updating this animal to 9.x?


Yes, it's getting long in the tooth.  I will upgrade the NetBSD 7 
(sidewinder) to 9.2.




For FreeBSD the situation is better, we have HEAD (bleeding edge 14),
13.x, and then loach running 10.3 which is dead.  Given that 12.x and
13.x are supported[2] (well, 11.4 is just about done), perhaps it'd
make sense to cover 12.x rather than 10.x?


And I will also upgrade loach to 12.x if that's the version that is 
needed the most.




I don't know too much about DragonflyBSD, but I happened to be
surveying operating systems we support (by the "it's in the build farm
so we're going to keep it green" definition) in the context of some
AIO work, and I learned that they'd ripped the native AIO support out
of this one at some point, which caused me to focus on the versions.
Animal conchuela is running 4.4 (2016) while 6.0 is current[3].
Again, if we're going to have one example of a rare OS that someone
cares about, I think it'd be useful to have a current one?


I will upgrade conchuela to DragonFlyBSD 6.0.



For OpenBSD we have the current[4] and previous major releases
covered, so that's cool, and then there's a 5.9 system, which is long
dead and could probably be put to better use, but at least we don't
lack coverage there.


I will remove the 5.9 (curculio) and upgrade the 6.5 (morepork) to 6.9.

Would these changes be acceptable?

/Mikael




Re: document the need to analyze partitioned tables

2021-10-08 Thread Justin Pryzby
Cleaned up and attached as a .patch.

The patch implementing autoanalyze on partitioned tables should revert relevant
portions of this patch.
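
In practice, the advice the attached patch documents boils down to scheduling
something like this (hypothetical partitioned table name):

-- autovacuum never analyzes the partitioned parent itself, so refresh its
-- whole-hierarchy statistics manually (or from cron/pg_cron):
ANALYZE measurements;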
>From cec31df3772ca51bbf14ebee207bcfd22e498073 Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Thu, 22 Jul 2021 16:06:18 -0500
Subject: [PATCH] documentation deficiencies for ANALYZE of partitioned tables

This is partially extracted from 1b5617eb844cd2470a334c1d2eec66cf9b39c41a,
which was reverted.
---
 doc/src/sgml/maintenance.sgml | 27 ++
 doc/src/sgml/perform.sgml |  2 ++
 doc/src/sgml/ref/analyze.sgml | 36 ++-
 3 files changed, 52 insertions(+), 13 deletions(-)

diff --git a/doc/src/sgml/maintenance.sgml b/doc/src/sgml/maintenance.sgml
index 36f975b1e5..b7c806cc90 100644
--- a/doc/src/sgml/maintenance.sgml
+++ b/doc/src/sgml/maintenance.sgml
@@ -290,6 +290,14 @@
 to meaningful statistical changes.

 
+   
+Tuples changed in partitions and inheritance children do not count towards
+analyze on the parent table.  If the parent table is empty or rarely
+changed, it may never be processed by autovacuum.  It is necessary to
+periodically run a manual ANALYZE on the parent table to
+keep the statistics of its table hierarchy up to date.
+   
+

 As with vacuuming for space recovery, frequent updates of statistics
 are more useful for heavily-updated tables than for seldom-updated
@@ -347,6 +355,18 @@
  ANALYZE commands on those tables on a suitable schedule.
 

+
+   
+
+ The autovacuum daemon does not issue ANALYZE commands for
+ partitioned tables.  Inheritance parents will only be analyzed if the
+ parent itself is changed - changes to child tables do not trigger
+ autoanalyze on the parent table.  It is necessary to periodically run a
+ manual ANALYZE to keep the statistics of the table
+ hierarchy up to date.
+
+   
+
   
 
   
@@ -819,6 +839,13 @@ analyze threshold = analyze base threshold + analyze scale factor * number of tu
 since the last ANALYZE.

 
+   
+Partitioned tables are not processed by autovacuum.  Statistics
+should be collected by running a manual ANALYZE when it is
+first populated, and updated whenever the distribution of data in its
+partitions changes significantly.
+   
+

 Temporary tables cannot be accessed by autovacuum.  Therefore,
 appropriate vacuum and analyze operations should be performed via
diff --git a/doc/src/sgml/perform.sgml b/doc/src/sgml/perform.sgml
index 9cf8ebea80..caf703129d 100644
--- a/doc/src/sgml/perform.sgml
+++ b/doc/src/sgml/perform.sgml
@@ -1765,9 +1765,11 @@ SELECT * FROM x, y, a, b, c WHERE something AND somethingelse;
Run ANALYZE Afterwards
 

+
 Whenever you have significantly altered the distribution of data
 within a table, running ANALYZE is strongly recommended. This
 includes bulk loading large amounts of data into the table.  Running
+
 ANALYZE (or VACUUM ANALYZE)
 ensures that the planner has up-to-date statistics about the
 table.  With no statistics or obsolete statistics, the planner might
diff --git a/doc/src/sgml/ref/analyze.sgml b/doc/src/sgml/ref/analyze.sgml
index c423aeeea5..9a904262df 100644
--- a/doc/src/sgml/ref/analyze.sgml
+++ b/doc/src/sgml/ref/analyze.sgml
@@ -250,22 +250,32 @@ ANALYZE [ VERBOSE ] [ table_and_columns
 
   
-If the table being analyzed has one or more children,
-ANALYZE will gather statistics twice: once on the
-rows of the parent table only, and a second time on the rows of the
-parent table with all of its children.  This second set of statistics
-is needed when planning queries that traverse the entire inheritance
-tree.  The autovacuum daemon, however, will only consider inserts or
-updates on the parent table itself when deciding whether to trigger an
-automatic analyze for that table.  If that table is rarely inserted into
-or updated, the inheritance statistics will not be up to date unless you
-run ANALYZE manually.
+If the table being analyzed is partitioned, ANALYZE
+will gather statistics by sampling blocks randomly from its partitions;
+in addition, it will recurse into each partition and update its statistics.
+(However, in multi-level partitioning scenarios, each leaf partition
+will only be analyzed once.)
+By contrast, if the table being analyzed has inheritance children,
+ANALYZE will gather statistics for it twice:
+once on the rows of the parent table only, and a second time on the
+rows of the parent table with all of its children.  This second set of
+statistics is needed when planning queries that traverse the entire
+inheritance tree.  The child tables themselves are not individually
+analyzed in this case.
   
 
   
-If any of the child tables are foreign tables whose foreign data wrappers
-do not support ANALYZE, those child tables are 

RE: Skipping logical replication transactions on subscriber side

2021-10-08 Thread osumi.takami...@fujitsu.com
On Thursday, September 30, 2021 2:45 PM Masahiko Sawada  
wrote:
> I've attached updated patches that incorporate all comments I got so far. 
> Please
> review them.
Hello


Minor two comments for v15-0001 patch.

(1) a typo in pgstat_vacuum_subworker_stat()

+   /*
+    * This subscription is live.  The next step is that we search errors
+    * of the table sync workers who are already in sync state.  These
+    * errors should be removed.
+    */

This subscription is "alive" ?


(2) Suggestion to add one comment next to '0' in ApplyWorkerMain()

+   /* report the table sync error */
+   pgstat_report_subworker_error(MyLogicalRepWorker->subid,
+                                 MyLogicalRepWorker->relid,
+                                 MyLogicalRepWorker->relid,
+                                 0,
+                                 InvalidTransactionId,
+                                 errdata->message);

How about writing /* no corresponding message type for table synchronization */ 
or something ?


Best Regards,
Takamichi Osumi



Re: psql - add SHOW_ALL_RESULTS option

2021-10-08 Thread Peter Eisentraut

On 02.10.21 16:31, Fabien COELHO wrote:

Attached v9 integrates your tests and makes them work.


Attached v11 is a rebase.


This patch still has a few of the problems reported earlier this year.

In [0], it was reported that certain replication commands result in 
infinite loops because of faulty error handling.  This still happens.  I 
wrote a test for it, attached here.  (I threw in a few more basic tests, 
just to have some more coverage that was lacking, and to have a file to 
put the new test in.)


In [1], it was reported that server crashes produce duplicate error 
messages.  This also still happens.  I didn't write a test for it, maybe 
you have an idea.  (Obviously, we could check whether the error message 
is literally there twice in the output, but that doesn't seem very 
general.)  But it's easy to test manually: just have psql connect, shut 
down the server, then run a query.


Additionally, I looked into the Coverity issue reported in [2].  That 
one is fixed, but I figured it would be good to be able to check your 
patches with a static analyzer in a similar way.  I don't have the 
ability to run Coverity locally, so I looked at scan-build and fixed a 
few minor warnings, also attached as a patch.  Your current patch 
appears to be okay in that regard.



[0]: 
https://www.postgresql.org/message-id/69c0b369-570c-4524-8ee4-bccacecb6...@amazon.com


[1]: https://www.postgresql.org/message-id/2902362.1618244...@sss.pgh.pa.us

[2]: https://www.postgresql.org/message-id/2680034.1618157...@sss.pgh.pa.us
>From e8dbe137737f94a2eaff86dc1676f9df39c60d00 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut 
Date: Thu, 7 Oct 2021 22:12:40 +0200
Subject: [PATCH] psql: More tests

---
 src/bin/psql/t/001_basic.pl | 42 +
 1 file changed, 42 insertions(+)
 create mode 100644 src/bin/psql/t/001_basic.pl

diff --git a/src/bin/psql/t/001_basic.pl b/src/bin/psql/t/001_basic.pl
new file mode 100644
index 00..cd899e851e
--- /dev/null
+++ b/src/bin/psql/t/001_basic.pl
@@ -0,0 +1,42 @@
+
+# Copyright (c) 2021, PostgreSQL Global Development Group
+
+use strict;
+use warnings;
+
+use PostgresNode;
+use TestLib;
+use Test::More tests => 25;
+
+program_help_ok('psql');
+program_version_ok('psql');
+program_options_handling_ok('psql');
+
+my ($stdout, $stderr);
+my $result;
+
+# test --help=foo, analogous to program_help_ok()
+foreach my $arg (qw(commands variables))
+{
+   $result = IPC::Run::run [ 'psql', "--help=$arg" ], '>', \$stdout, '2>', 
\$stderr;
+   ok($result, "psql --help=$arg exit code 0");
+   isnt($stdout, '', "psql --help=$arg goes to stdout");
+   is($stderr, '', "psql --help=$arg nothing to stderr");
+}
+
+my $node = PostgresNode->new('main');
+$node->init;
+$node->append_conf(
+   'postgresql.conf', q{
+wal_level = 'logical'
+max_replication_slots = 4
+max_wal_senders = 4
+});
+$node->start;
+
+$node->command_like([ 'psql', '-c', '\copyright' ], qr/Copyright/, 
'\copyright');
+$node->command_like([ 'psql', '-c', '\help' ], qr/ALTER/, '\help without 
arguments');
+$node->command_like([ 'psql', '-c', '\help SELECT' ], qr/SELECT/, '\help');
+
+$node->command_fails_like([ 'psql', 'replication=database', '-c', 
'START_REPLICATION 123/456' ],
+   qr/^unexpected PQresultStatus: 8$/, 'handling of unexpected 
PQresultStatus');
-- 
2.33.0

>From 96634e3dc5f775b73a4142e9a5d83190bd9aecbb Mon Sep 17 00:00:00 2001
From: Peter Eisentraut 
Date: Fri, 8 Oct 2021 13:30:15 +0200
Subject: [PATCH] psql: Fix scan-build warnings

---
 src/bin/psql/common.c   | 32 ++--
 src/bin/psql/copy.c |  1 -
 src/bin/psql/describe.c |  1 -
 3 files changed, 18 insertions(+), 16 deletions(-)

diff --git a/src/bin/psql/common.c b/src/bin/psql/common.c
index 5640786678..1b224bf9e4 100644
--- a/src/bin/psql/common.c
+++ b/src/bin/psql/common.c
@@ -594,6 +594,7 @@ PSQLexec(const char *query)
 int
 PSQLexecWatch(const char *query, const printQueryOpt *opt, FILE 
*printQueryFout)
 {
+   booltiming = pset.timing;
PGresult   *res;
double  elapsed_msec = 0;
instr_time  before;
@@ -608,7 +609,7 @@ PSQLexecWatch(const char *query, const printQueryOpt *opt, 
FILE *printQueryFout)
 
SetCancelConn(pset.db);
 
-   if (pset.timing)
+   if (timing)
INSTR_TIME_SET_CURRENT(before);
 
res = PQexec(pset.db, query);
@@ -621,7 +622,7 @@ PSQLexecWatch(const char *query, const printQueryOpt *opt, 
FILE *printQueryFout)
return 0;
}
 
-   if (pset.timing)
+   if (timing)
{
INSTR_TIME_SET_CURRENT(after);
INSTR_TIME_SUBTRACT(after, before);
@@ -674,7 +675,7 @@ PSQLexecWatch(const char *query, const printQueryOpt *opt, 
FILE *printQueryFout)
fflush(fout);
 
/* Possible microtiming output */
-   if (pset.timing)
+   if (timing)
PrintTiming(elapsed_msec);
 
return 

Re: Skipping logical replication transactions on subscriber side

2021-10-08 Thread Greg Nancarrow
On Thu, Sep 30, 2021 at 3:45 PM Masahiko Sawada  wrote:
>
> I've attached updated patches that incorporate all comments I got so
> far. Please review them.
>

Some comments about the v15-0001 patch:

(1) patch adds a whitespace error

Applying: Add a subscription errors statistics view
"pg_stat_subscription_errors".
.git/rebase-apply/patch:1656: new blank line at EOF.
+
warning: 1 line adds whitespace errors.

(2) Patch comment says "This commit adds a new system view
pg_stat_logical_replication_errors ..."
BUT this is the wrong name, it should be "pg_stat_subscription_errors".


doc/src/sgml/monitoring.sgml

(3)
"Message of the error" doesn't sound right. I suggest just saying "The
error message".

(4) view column "last_failed_time"
I think it would be better to name this "last_error_time".


src/backend/postmaster/pgstat.c

(5) pgstat_vacuum_subworker_stats()

Spelling mistake in the following comment:

/* Create a map for mapping subscriptoin OID and database OID */

subscriptoin -> subscription

(6)
In the following functions:

pgstat_read_statsfiles
pgstat_read_db_statsfile_timestamp

The following comment should say "... struct describing subscription
worker statistics."
(i.e. need to remove the "a")

+ * 'S' A PgStat_StatSubWorkerEntry struct describing a
+ * subscription worker statistics.


(7) pgstat_get_subworker_entry

Suggest comment change:

BEFORE:
+ * Return the entry of subscription worker entry with the subscription
AFTER:
+ * Return subscription worker entry with the given subscription

(8) pgstat_recv_subworker_error

+ /*
+ * Update only the counter and timestamp if we received the same error
+ * again
+ */
+ if (wentry->relid == msg->m_relid &&
+ wentry->command == msg->m_command &&
+ wentry->xid == msg->m_xid &&
+ strncmp(wentry->message, msg->m_message, strlen(wentry->message)) == 0)
+ {

Is there a reason that the above check uses strncmp() with
strlen(wentry->message), instead of just strcmp()?  With strncmp(),
msg->m_message is treated as the same error message whenever it merely
starts with wentry->message (see the small standalone illustration after
this list of comments).  Perhaps if that is intentional, the comment
should be updated.

src/tools/pgindent/typedefs.list

(9)
The added "PgStat_SubWorkerError" should be removed from the
typedefs.list (as there is no such new typedef).
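
Regarding point (8) above, here is a tiny standalone illustration (with
made-up strings, not taken from the patch) of how the two comparisons
behave when the incoming message merely starts with the stored one:

```c
#include <stdio.h>
#include <string.h>

int
main(void)
{
    const char *stored = "duplicate key";                       /* wentry->message */
    const char *incoming = "duplicate key value violates ...";  /* msg->m_message */

    /* prefix comparison: reports a match even though the strings differ */
    printf("strncmp: %s\n",
           strncmp(stored, incoming, strlen(stored)) == 0 ? "same" : "different");

    /* full comparison: reports a difference */
    printf("strcmp:  %s\n",
           strcmp(stored, incoming) == 0 ? "same" : "different");

    return 0;
}
```

This prints "same" for the strncmp case and "different" for the strcmp case.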


Regards,
Greg Nancarrow
Fujitsu Australia




logical decoding/replication: new functions pg_ls_logicaldir and pg_ls_replslotdir

2021-10-08 Thread Bharath Rupireddy
Hi,

At times, users want to know which files (snapshot and mapping files) are
available under the pg_logical directory, which spill files are under the
pg_replslot directory, and how much space they occupy. This helps in
understanding the storage usage pattern of these directories. Can we have
two new functions, pg_ls_logicaldir and pg_ls_replslotdir, along the same
lines as pg_ls_logdir, pg_ls_tmpdir and pg_ls_archive_statusdir [1]?

[1] - https://www.postgresql.org/docs/devel/functions-admin.html
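
A rough sketch of what these could look like, assuming the existing
pg_ls_dir_files() helper in src/backend/utils/adt/genfile.c can be reused
the same way pg_ls_logdir() and pg_ls_waldir() use it. The function names,
signatures and the per-slot argument below are assumptions for
illustration, not an actual patch:

```c
/* Hypothetical: list pg_logical/snapshots, like the other pg_ls_*dir functions */
Datum
pg_ls_logicalsnapdir(PG_FUNCTION_ARGS)
{
    return pg_ls_dir_files(fcinfo, "pg_logical/snapshots", false);
}

/* Hypothetical: list the spill files of one replication slot */
Datum
pg_ls_replslotdir(PG_FUNCTION_ARGS)
{
    text       *slotname = PG_GETARG_TEXT_PP(0);

    return pg_ls_dir_files(fcinfo,
                           psprintf("pg_replslot/%s",
                                    text_to_cstring(slotname)),
                           false);
}
```

Catalog entries (pg_proc.dat), default REVOKE/GRANT handling and
documentation would of course be needed as well.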

Regards,
Bharath Rupireddy.




Re: Added schema level support for publication.

2021-10-08 Thread Amit Kapila
On Thu, Oct 7, 2021 at 5:19 PM Amit Kapila  wrote:
>
> On Wed, Oct 6, 2021 at 11:12 AM vignesh C  wrote:
> >
> > Attached v37 patch has the changes for the same.
> >
>
> Few comments:
> ==
>

Few more comments:

v37-0002-Client-side-changes-to-support-FOR-ALL-TABLES-IN
1.
+ else if (Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "ALL",
"TABLES", "IN", "SCHEMA"))
+ COMPLETE_WITH_QUERY(Query_for_list_of_schemas
+ " UNION SELECT 'CURRENT_SCHEMA' "
+ "UNION SELECT 'WITH ('");

What is the need to display WITH here? It will be displayed after
Schema name with the below rule:
+ else if (Matches("CREATE", "PUBLICATION", MatchAny,  "FOR", "ALL",
"TABLES", "IN", "SCHEMA", MatchAny))
+ COMPLETE_WITH("WITH (");

2. It seems tab-completion happens incorrectly for the below case:
create publication pub for all tables in schema s1,

If I press the tab after above, it completes with below which is wrong
because it will lead to incorrect syntax.

create publication pub for all tables in schema s1, WITH (

v37-0003-Tests-for-FOR-ALL-TABLES-IN-SCHEMA-publication
3.
--- a/src/bin/pg_dump/t/002_pg_dump.pl
+++ b/src/bin/pg_dump/t/002_pg_dump.pl
..
+ 'ALTER PUBLICATION pub3 ADD ALL TABLES IN SCHEMA dump_test' => {
+ create_order => 51,
+ create_sql =>
+   'ALTER PUBLICATION pub3 ADD ALL TABLES IN SCHEMA dump_test;',
+ regexp => qr/^
+ \QALTER PUBLICATION pub3 ADD ALL TABLES IN SCHEMA dump_test;\E
+ /xm,
+ like   => { %full_runs, section_post_data => 1, },
+ unlike => { exclude_dump_test_schema => 1, },

In this test, won't it exclude the schema dump_test because of unlike?
If so, then we don't have coverage for "ALL Tables In Schema" except
for public schema?

4.
--- a/src/test/regress/expected/publication.out
+++ b/src/test/regress/expected/publication.out
..
+-- fail - can't add schema to for all tables publication
+ALTER PUBLICATION testpub_foralltables ADD ALL TABLES IN SCHEMA pub_test;

In the above and all similar comments, it is better to either quote
'for all tables', write it in caps as FOR ALL TABLES, or both: 'FOR ALL
TABLES'.

5.
+\dRp+ testpub1_forschema
+   Publication testpub1_forschema
+  Owner   | All tables | Inserts | Updates | Deletes
| Truncates | Via root
+--++-+-+-+---+--
+ regress_publication_user | f  | t   | t   | t
| t | f
+Tables from schemas:
+"pub_test1"
+
+SELECT p.pubname FROM pg_catalog.pg_publication p,
pg_catalog.pg_namespace n, pg_catalog.pg_publication_namespace pn
WHERE n.oid = pn.pnnspid AND p.oid = pn.pnpubid AND n.nspname =
'pub_test1' ORDER BY 1;
+  pubname
+
+ testpub1_forschema
+(1 row)

I don't think that in the above and similar tests we need to separately
check the presence of the publication via a SELECT query if we have already
tested it via the psql command. Let's try to keep only the meaningful tests.

6.
+INSERT INTO pub_test1.tbl VALUES(1, 'test');
+-- fail
+UPDATE pub_test1.tbl SET id = 2;
+ERROR:  cannot update table "tbl" because it does not have a replica
identity and publishes updates
+HINT:  To enable updating the table, set REPLICA IDENTITY using ALTER TABLE.
+ALTER PUBLICATION testpub1_forschema DROP ALL TABLES IN SCHEMA pub_test1;
+-- success
+UPDATE pub_test1.tbl SET id = 2;
+ALTER PUBLICATION testpub1_forschema SET ALL TABLES IN SCHEMA pub_test1;
+-- fail
+UPDATE pub_test1.tbl SET id = 2;
+ERROR:  cannot update table "tbl" because it does not have a replica
identity and publishes updates
+HINT:  To enable updating the table, set REPLICA IDENTITY using ALTER TABLE.
+ALTER PUBLICATION testpub1_forschema DROP ALL TABLES IN SCHEMA pub_test1;
+-- success
+UPDATE pub_test1.tbl SET id = 2;
+ALTER PUBLICATION testpub1_forschema ADD ALL TABLES IN SCHEMA pub_test1;
+-- fail
+UPDATE pub_test1.tbl SET id = 2;
+ERROR:  cannot update table "tbl" because it does not have a replica
identity and publishes updates
+HINT:  To enable updating the table, set REPLICA IDENTITY using ALTER TABLE.

I think here we don't need to test both SET and ADD variants, one of
them is sufficient.

7.
+-- verify invalidation of partition table having partition on different schema

This comment is not very clear to me. Can we change it to:
"verify invalidation of partition table having parent and child tables
in different schema"?

8.
+-- verify invalidation of schema having partition parent table and
partition child table

Similarly, let's change this to: "verify invalidation of partition
tables for schema publication that has parent and child tables of
different partition hierarchies".  Keep the comment line boundary at 80
characters; that way they stay readable.

9.
+++ b/src/test/subscription/t/025_rep_changes_for_schema.pl
..
+# Basic logical replication test

Let's change this comment to: "Logical replication tests for schema
publications"


-- 
With Regards,
Amit Kapila.




Reword docs of feature "Remove temporary files after backend crash"

2021-10-08 Thread Bharath Rupireddy
Hi,

The commit [1] for the feature "Remove temporary files after backend
crash" introduced following in the docs:
+   
+When set to on, which is the default,
+PostgreSQL will automatically remove
+temporary files after a backend crash. If disabled, the files will be
+retained and may be used for debugging, for example. Repeated crashes
+may however result in accumulation of useless files.
+   

The term backend means a user session (see the glossary at [2]). It looks
like the code introduced by commit [1], i.e. the temporary file removal,
gets hit not only after a backend crash, but also after a crash of the
checkpointer, bgwriter, walwriter, autovacuum launcher, logical replication
launcher and so on. This is somewhat misleading to normal users. With
commit [3] clarifying these processes in the master branch [4], do we also
need to modify the doc added by commit [1], at least in PG master?

[1] commit cd91de0d17952b5763466cfa663e98318f26d357
Author: Tomas Vondra 
Date:   Thu Mar 18 16:05:03 2021 +0100

Remove temporary files after backend crash

[2] PG 14 - 
https://www.postgresql.org/docs/current/glossary.html#GLOSSARY-BACKEND

[3] commit d3014fff4cd4dcaf4b2764d96ad038f3be7413b0
Author: Alvaro Herrera 
Date:   Mon Sep 20 12:22:02 2021 -0300

Doc: add glossary term for "auxiliary process"

[4] PG master - https://www.postgresql.org/docs/devel/glossary.html

Regards,
Bharath Rupireddy.




Re: Capitalization of localized month and day names (to_char() with 'TMmonth', 'TMday', etc.)

2021-10-08 Thread Magnus Holmgren
onsdag 6 oktober 2021 kl. 16:01:49 CEST skrev du:
> =?UTF-8?Q?Juan_Jos=C3=A9_Santamar=C3=ADa_Flecha?= 
 writes:
> > On Wed, Oct 6, 2021 at 11:09 AM Magnus Holmgren
> > > 
> > wrote:
> >> There's just this tiny but seemingly obvious issue that I can't believe I
> >> haven't noticed until now: to_date(now(), 'TMmonth') returns 'october' in
> >> an
> >> English locale (en_US.UTF-8 at least). Names of months and weekdays are
> >> proper
> >> nouns and as such *always* capitalized in English, so that seems wrong to
> >> me.
> > 
> > IMHO, the patterns of TO_CHAR() do as promised in the documentation [1]:
> > MONTH full upper case month name (blank-padded to 9 chars)
> > Month full capitalized month name (blank-padded to 9 chars)
> > month full lower case month name (blank-padded to 9 chars)
> > 
> > What you are proposing looks more like a new feature than a bug.
> 
> Yeah, this is operating as designed and documented.  The idea that
> there should be a way to get "month name as it'd be spelled mid-sentence"
> is an interesting one, but I really doubt that anyone would thank us for
> changing TMmonth to act that way.  (Perhaps a new format code or modifier
> would be easier to swallow?)

Yes, I see that it's working as designed and documented, but I contend that 
the design is flawed for the reason I gave. I mean, you can't deny that names 
of months and weekdays are always capitalized in English and certain other 
languages, whereas in another set of languages they are not, can you? Perhaps 
this is a conscious design choice with some reason behind it, but if so, 
neither the PostgreSQL nor the Oracle documentation (https://docs.oracle.com/
cd/B12037_01/server.101/b10759/sql_elements004.htm#i34510) reveal it. What is 
the use case for linguistically incorrectly lowercased localized month and day 
names? What would such a change break?

I still suspect that whoever designed this didn't consider locale switching.
(Interestingly, "month", "mon", "day", and "dy" are locale-specific by
themselves; there is no "TM" prefix needed.)

> I also wonder exactly how the code would figure out what to do ---
> language-specific conventions for this are not information available
> from the libc locale APIs, AFAIR.

I checked the code, and it looks like cache_locale_time() in src/backend/
utils/adt/pg_locale.c uses strftime(3) to produce the correctly capitalized 
day and month names and abbreviations (format codes %A, %B, %a, and %b). All 
that would be needed is not to force them to lowercase in DCH_to_char() in 
src/backend/utils/adt/formatting.c.
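
For what it's worth, here is a minimal standalone sketch (outside the
PostgreSQL tree; the locale name is assumed to be installed) showing that
strftime() already hands back the capitalized forms in an English locale:

```c
#include <locale.h>
#include <stdio.h>
#include <time.h>

int
main(void)
{
    char        buf[64];
    time_t      now = time(NULL);
    struct tm  *tm = localtime(&now);

    setlocale(LC_TIME, "en_US.UTF-8");          /* assumed to be available */
    strftime(buf, sizeof(buf), "%A %B", tm);    /* full day and month names */
    printf("%s\n", buf);                        /* e.g. "Friday October" */
    return 0;
}
```

So the lowercasing really is applied on the PostgreSQL side, not by libc.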

What could a new, separate format code that doesn't do this look like?

-- 
Magnus Holmgren, developer
MILLNET AB



-- 
When contacting Millnet by e-mail, it is normal that at least some personal
data about you is stored. You can read more about which data is stored and
how we handle it at https://www.millnet.se/integritetspolicy/ .




Re: Add ZSON extension to /contrib/

2021-10-08 Thread Aleksander Alekseev
Hi hackers,

Many thanks for all the great feedback!

Please see the follow-up thread `RFC: compression dictionaries for JSONB`:

https://www.postgresql.org/message-id/CAJ7c6TPx7N-bVw0dZ1ASCDQKZJHhBYkT6w4HV1LzfS%2BUUTUfmA%40mail.gmail.com

-- 
Best regards,
Aleksander Alekseev
Open-Source PostgreSQL Contributor at Timescale


RFC: compression dictionaries for JSONB

2021-10-08 Thread Aleksander Alekseev
Hi hackers,

== Background ==

This is a follow-up thread to `Add ZSON extension to /contrib/` [1].
The ZSON extension introduces a new type called ZSON, which is 100%
compatible with JSONB but uses a shared dictionary of strings most
frequently used by given JSONB documents for compression. See the
thread for more details.

According to the feedback I got, the community generally liked the
idea of adding an across-rows and across-tables compression capability
to JSONB. What the community didn't like was:

1. Introducing a new data type in order to achieve this;
2. Updating compression dictionaries manually;
3. Some implementation details of ZSON, such as limited dictionary
size (2 ** 16 entries) and an extensive usage of gettimeofday() system
call;

There was also a request for proof of the usefulness of this feature
in practice.

To be honest, I don't have solid proof that many users require
this feature, nor of how many users that would be, exactly. ZSON was
originally developed because a customer of Postgres Professional
requested it back in 2016. People approach me with questions from time
to time. E.g. one user asked me recently how the extension can be
compiled on Windows [2].

Andrew Dunstan reported that 2nd Quadrant (now EDB) has a fork of ZSON
with some enhancements [3]. Konstantin Knizhnik reported that he
worked on a similar extension [4]. Unfortunately, Andrew and
Konstantin didn't give any more details (hopefully they will). But all
in all, this indicates some demand.

Speaking of performance, some time ago I benchmarked ZSON on data
similar to the one that the customer had [5]. The benchmark showed
~10% performance improvements in terms of TPS and ~50% of saved disk
space. The extension saved memory as well, which was expected given
the implementation. The exact amount of saved memory was not measured.
This benchmark shouldn't be considered as proof that all users will
necessarily benefit from such a feature. But it indicates that some
users could.

== Proposal ==

The proposal is to add support for compression dictionaries to JSONB.

In order to do this, the SQL syntax should be modified. The proposed
syntax is based on Matthias van de Meent's idea [6]:

```
CREATE TYPE  AS JSONB_DICTIONARY (
  learn_by = { {"table_a", "column_a"}, {"table_b", "column_b"}, ... },
  autoupdate = false, -- true by default
  -- optional: other arguments, min/max string lengths, etc
);
```

Basically, this is an equivalent of zson_learn [7]. It will create an
id -> string dictionary in the PostgreSQL catalog. When the user
chooses `autoupdate = true`, the dictionary will be updated
automatically by PostgreSQL (e.g. during the VACUUM). This is the
default value. The dictionary can also be updated manually:

```
SELECT jsonb_update_dictionary("type-name");
```

Other than that, the type works like a regular one. All the usual
ALTER TYPE / DROP TYPE semantics are applicable. All the operators
available to JSONB are also available to the new type.

Internally, the new type is represented similarly to JSONB. However, the
strings from the dictionary are replaced with varints. This idea was
borrowed from Tomas Vondra [8]. The dictionary size is limited to
2**28 entries. The limit can be easily extended in the future if
necessary. The new type also stores the version of the dictionary used
to compress the data. All in all, this is similar to how ZSON works.
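
To make the varint idea a bit more concrete, here is a minimal sketch
(the function name and the exact wire format are illustrative only, not
taken from any existing patch) of encoding a dictionary id with 7 bits
per byte, so that small, frequently used ids cost a single byte:

```c
#include <stddef.h>
#include <stdint.h>

/* Encode 'id' as a little-endian base-128 varint; returns bytes written. */
static size_t
encode_varint(uint32_t id, unsigned char *out)
{
    size_t      n = 0;

    do
    {
        unsigned char b = id & 0x7F;

        id >>= 7;
        if (id != 0)
            b |= 0x80;          /* high bit set: more bytes follow */
        out[n++] = b;
    } while (id != 0);

    return n;
}
```

With this scheme a 2**28 id needs at most four bytes, while the most
frequent strings stay at one or two bytes.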

The first implementation always decompresses the value entirely.
Partial compression and decompression can be added later,
transparently to the user.

== Looking for feedback ==

I would appreciate your feedback on this RFC.

Is anything missing in the description of the feature? Do you think
users need it? Can you think of a better user interface? Are there any
corner cases worth considering? Any other comments and questions are
welcome too!

I would like to implement this once consensus is reached on
how the feature should look (and whether we need it). Any help
(from co-authors, REVIEWERS!!!, technical writers, ...) would be much
appreciated.

== Links ==

[1]: 
https://www.postgresql.org/message-id/flat/CAJ7c6TP3fCC9TNKJBQAcEf4c%3DL7XQZ7QvuUayLgjhNQMD_5M_A%40mail.gmail.com
[2]: https://github.com/postgrespro/zson/issues?q=is%3Aissue+is%3Aclosed
[3]: 
https://www.postgresql.org/message-id/6f3944ad-6924-5fed-580c-e72477733f04%40dunslane.net
[4]: https://github.com/postgrespro/jsonb_schema
[5]: https://github.com/postgrespro/zson/blob/master/docs/benchmark.md
[6]: 
https://www.postgresql.org/message-id/CAEze2WheMusc73UZ5TpfiAGQ%3DrRwSSgr0y3j9DEVAQgQFwneRA%40mail.gmail.com
[7]: https://github.com/postgrespro/zson#usage
[8]: 
https://www.postgresql.org/message-id/77356556-0634-5cde-f55e-cce739dc09b9%40enterprisedb.com

-- 
Best regards,
Aleksander Alekseev
Open-Source PostgreSQL Contributor at Timescale




Re: Improve the HINT message of the ALTER command for postgres_fdw

2021-10-08 Thread Daniel Gustafsson
> On 8 Oct 2021, at 09:38, Bharath Rupireddy 
>  wrote:
> 
> On Fri, Oct 8, 2021 at 12:48 PM bt21masumurak
>  wrote:
>> 
>> Hi
>> 
>> When 'ALTER FOREIGN DATA WRAPPER OPTIONS' is executed against
>> postgres_fdw, the HINT message is printed as shown below, even though
>> there are no valid options in this context.
>> 
>> =# ALTER FOREIGN DATA WRAPPER postgres_fdw OPTIONS (format 'csv');
>> ERROR: invalid option "format"
>> HINT: Valid options in this context are:
>> 
>> I made a patch for this problem.
> 
> Good catch.

+1

> It seems like the change proposed for postgres_fdw_validator is similar to 
> what
> file_fdw is doing in file_fdw_validator.  I think we also need to do the same
> change in dblink_fdw_validator and postgresql_fdw_validator as well.

Agreed.

> While on this, it's better to add test cases for the error message
> "There are no valid options in this context." for all the three fdws
> i.e. file_fdw, postgres_fdw and dblink_fdw may be in their respective
> contrib modules test files, and for postgresql_fdw_validator in
> src/test/regress/sql/foreign_data.sql.

+1.

--
Daniel Gustafsson   https://vmware.com/





Re: Map WAL segment files on PMEM as WAL buffers

2021-10-08 Thread Takashi Menjo
Hello Matthias,

Thank you for your comment!

> > [ v3-0002-Add-wal_pmem_map-to-GUC.patch ]
> > +extern bool wal_pmem_map;
>
> A lot of the new code in these patches is gated behind this one flag,
> but the flag should never be true on !pmem systems. Could you instead
> replace it with something like the following?
>
> +#ifdef USE_LIBPMEM
> +extern bool wal_pmem_map;
> +#else
> +#define wal_pmem_map false
> +#endif
>
> A good compiler would then eliminate all the dead code from being
> generated on non-pmem builds (instead of the compiler needing to keep
> that code around just in case some extension decides to set
> wal_pmem_map to true on !pmem systems because it has access to that
> variable).

That sounds good. I will introduce it in the next update.

> > [ v3-0004-Map-WAL-segment-files-on-PMEM-as-WAL-buffers.patch ]
> > +if ((uintptr_t) addr & ~PG_DAX_HUGEPAGE_MASK)
> > +elog(WARNING,
> > + "file not mapped on DAX hugepage boundary: path \"%s\" addr 
> > %p",
> > + path, addr);
>
> I'm not sure that we should want to log this every time we detect the
> issue; It's likely that once it happens it will happen for the next
> file as well. Maybe add a timeout, or do we generally not deduplicate
> such messages?

Let me give it some thought.  I had believed this WARNING is very
unlikely to happen, and that occurrences are mutually independent.
I will try to find a case where the WARNING happens repeatedly, or I
will de-duplicate the messages if that is easier.
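
In case it helps, a minimal sketch of the "warn only once per backend"
approach (illustrative only, not from the patch; the helper name and the
file-scope flag are assumptions), reusing the message and
PG_DAX_HUGEPAGE_MASK from the hunk quoted above:

```c
/* file-scope flag: emit the boundary warning at most once per backend */
static bool reported_dax_misalignment = false;

static void
check_dax_alignment(const char *path, const void *addr)
{
    if (((uintptr_t) addr & ~PG_DAX_HUGEPAGE_MASK) != 0 &&
        !reported_dax_misalignment)
    {
        elog(WARNING,
             "file not mapped on DAX hugepage boundary: path \"%s\" addr %p",
             path, addr);
        reported_dax_misalignment = true;
    }
}
```

Whether a per-backend flag is enough, or whether the message should be
rate-limited instead, depends on how the mappings actually fail in
practice.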

Best regards,
Takashi

-- 
Takashi Menjo 




Re: More business with $Test::Builder::Level in the TAP tests

2021-10-08 Thread Daniel Gustafsson
> On 8 Oct 2021, at 09:51, Michael Paquier  wrote:
> 
> On Fri, Oct 08, 2021 at 09:28:04AM +0200, Daniel Gustafsson wrote:
>> LGTM.  Maybe it should be added that it *must* be called before any 
>> Test::More
>> function is called, it's sort of self-explanatory but not everyone writing 
>> TAP
>> tests will be deeply familiar with Perl.
> 
> I think that "must" is too strong in this context, as in some cases it
> does not really make sense to increment the level, when using for
> example a rather long routine that's labelled with one of the
> routine arguments like for pg_rewind.  So I would stick with
> "should".

Fair enough.

--
Daniel Gustafsson   https://vmware.com/





Re: pgsql: Adjust configure to insist on Perl version >= 5.8.3.

2021-10-08 Thread Daniel Gustafsson
> On 8 Oct 2021, at 06:24, Noah Misch  wrote:
> 
> On Thu, Oct 07, 2021 at 11:39:11PM -0400, Tom Lane wrote:
>> Noah Misch  writes:
>>> On Thu, Oct 07, 2021 at 03:44:48PM -0400, Tom Lane wrote:
> (1) I'm distrustful of the idea that perl 5.8.x will compile
> cleanly, or at all, on modern platforms.  Certainly Postgres
> releases of similar vintage won't.
>> 
>>> perlbrew uses the patchperl system to build old Perl in modern environments.
>>> This year, I used it to get 5.8.0.  Building unpatched 5.8.0 does fail.
>> 
>> Oh, cool.
>> 
 I propose that what might be more useful than the existing last
 section of src/test/perl/README is something along the lines of:
>> 
>>> -1.  This would replace a useful recipe with, essentially, a restatement of
>>> that recipe in English words.  That just leaves the user to rediscover the
>>> actual recipe.
>> 
>> Well, I think the existing text does the reader a disservice
>> by stating a specific recipe without any context.  Notably,
>> it says nothing about restricting which Perl modules you use.
> 
> That's obvious from "cpanm install IPC::Run".  Surely if any other non-core
> module were allowed, the recipe would list it in a similar way.

The proposed change talks about which core modules are allowed to be used; I think
that's a different thing.  The distinction between core and non-core modules
may not be known or clear to people who haven't used Perl in the past.

> This is a source tree README; it shouldn't try to hold the reader's hand like
> the user-facing docs do.  We've not had commits add usage of other modules, so
> there's no evidence of actual doubt on this point.


This README isn't primarily targeting committers, though, IMO, but new developers
onboarding onto Postgres who are trying to learn the dev environment.

>> What do you think of using my proposed text followed by
>> 
>>One way to test against an old Perl version is to use
>>perlbrew.
>><< more or less the existing text here >>
>>Bear in mind that you will still need to install IPC::Run,
>>and what you will get is a current version not the one
>>distributed with Perl 5.8.3.  You will also need to update
>>Test::More because the version distributed with Perl 5.8.3
>>is too old to run our TAP tests.  So this recipe does not create
>>a perfect reproduction of a back-in-the-day Perl installation,
>>but it will probably catch any problems that might surface in
>>the buildfarm.
> 
> I don't see an improvement in there.

I respectfully disagree; the current text reads as if 5.8.0 is required for
running the tests, not that using perlbrew is a great way to verify that your
tests pass on all supported Perl versions.

> If there's something to change, it's improving the actual recipe:

That we should do as well.

--
Daniel Gustafsson   https://vmware.com/





Re: More business with $Test::Builder::Level in the TAP tests

2021-10-08 Thread Michael Paquier
On Fri, Oct 08, 2021 at 09:28:04AM +0200, Daniel Gustafsson wrote:
> LGTM.  Maybe it should be added that it *must* be called before any Test::More
> function is called, it's sort of self-explanatory but not everyone writing TAP
> tests will be deeply familiar with Perl.

I think that "must" is too strong in this context, as in some cases it
does not really make sense to increment the level, when using for
example a rather long routine that's labelled with one of the
routine arguments like for pg_rewind.  So I would stick with
"should".
--
Michael


signature.asc
Description: PGP signature


Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns

2021-10-08 Thread Kyotaro Horiguchi
At Thu, 7 Oct 2021 13:20:14 +0900, Masahiko Sawada  
wrote in 
> Another idea to fix this problem would be that before calling
> SnapBuildCommitTxn() we create transaction entries in ReorderBuffer
> for (sub)transactions whose COMMIT record has XACT_XINFO_HAS_INVALS,
> and then mark all of them as catalog-changed by calling
> ReorderBufferXidSetCatalogChanges(). I've attached a PoC patch for
> this idea. What the patch does is essentially the same as what the
> proposed patch does. But the patch doesn't modify the
> SnapBuildCommitTxn(). And we remember the list of last running
> transactions in reorder buffer and the list is periodically purged
> during decoding RUNNING_XACTS records, eventually making it empty.

I came up with a third way.  SnapBuildCommitTxn already properly
handles the case where a ReorderBufferTXN has
RBTXN_HAS_CATALOG_CHANGES set.  So this issue can be resolved by creating
such ReorderBufferTXNs in SnapBuildProcessRunningXacts.

One problem with this is that the change creates a case where multiple
ReorderBufferTXNs share the same first_lsn.  I haven't come up with a
clean idea to avoid relaxing the restriction in AssertTXNLsnOrder.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
diff --git a/src/backend/replication/logical/reorderbuffer.c 
b/src/backend/replication/logical/reorderbuffer.c
index 46e66608cf..503116764f 100644
--- a/src/backend/replication/logical/reorderbuffer.c
+++ b/src/backend/replication/logical/reorderbuffer.c
@@ -887,9 +887,14 @@ AssertTXNLsnOrder(ReorderBuffer *rb)
if (cur_txn->end_lsn != InvalidXLogRecPtr)
Assert(cur_txn->first_lsn <= cur_txn->end_lsn);
 
-   /* Current initial LSN must be strictly higher than previous */
+   /*
+* Current initial LSN must be strictly higher than previous. 
except
+* this transaction is created by XLOG_RUNNING_XACTS.  If one
+* XLOG_RUNNING_XACTS creates multiple transactions, they share 
the
+* same LSN. See SnapBuildProcessRunningXacts.
+*/
if (prev_first_lsn != InvalidXLogRecPtr)
-   Assert(prev_first_lsn < cur_txn->first_lsn);
+   Assert(prev_first_lsn <= cur_txn->first_lsn);
 
/* known-as-subtxn txns must not be listed */
Assert(!rbtxn_is_known_subxact(cur_txn));
diff --git a/src/backend/replication/logical/snapbuild.c 
b/src/backend/replication/logical/snapbuild.c
index a549a8..58859112dc 100644
--- a/src/backend/replication/logical/snapbuild.c
+++ b/src/backend/replication/logical/snapbuild.c
@@ -1097,6 +1097,20 @@ SnapBuildProcessRunningXacts(SnapBuild *builder, 
XLogRecPtr lsn, xl_running_xact
 */
if (builder->state < SNAPBUILD_CONSISTENT)
{
+   /*
+* At the time we passed the first XLOG_RUNNING_XACTS record, 
the
+* transactions notified by the record may have updated
+* catalogs. Register the transactions with marking them as 
having
+* caused catalog changes.  The worst misbehavior here is some 
spurious
+* invalidation at decoding start.
+*/
+   if (builder->state == SNAPBUILD_START)
+   {
+   for (int i = 0 ; i < running->xcnt + running->subxcnt ; 
i++)
+   
ReorderBufferXidSetCatalogChanges(builder->reorder,
+   
  running->xids[i], lsn);
+   }
+
/* returns false if there's no point in performing cleanup just 
yet */
if (!SnapBuildFindSnapshot(builder, lsn, running))
return;


Re: Improve the HINT message of the ALTER command for postgres_fdw

2021-10-08 Thread Bharath Rupireddy
On Fri, Oct 8, 2021 at 12:48 PM bt21masumurak
 wrote:
>
> Hi
>
> When 'ALTER FOREIGN DATA WRAPPER OPTIONS' is executed against
> postgres_fdw, the HINT message is printed as shown below, even though
> there are no valid options in this context.
>
> =# ALTER FOREIGN DATA WRAPPER postgres_fdw OPTIONS (format 'csv');
> ERROR: invalid option "format"
> HINT: Valid options in this context are:
>
> I made a patch for this problem.

Good catch. It seems like the change proposed for
postgres_fdw_validator is similar to what file_fdw is doing in
file_fdw_validator. I think we also need to do the same change in
dblink_fdw_validator and postgresql_fdw_validator as well.

While on this, it's better to add test cases for the error message
"There are no valid options in this context." for all the three fdws
i.e. file_fdw, postgres_fdw and dblink_fdw may be in their respective
contrib modules test files, and for postgresql_fdw_validator in
src/test/regress/sql/foreign_data.sql.

Regards,
Bharath Rupireddy.




Re: Logical replication timeout problem

2021-10-08 Thread Fabrice Chapuis
Thanks Tang for your script.
Our debugging environment will be ready soon. I will test your script and
we will try to reproduce the problem by integrating the patch provided by
Amit. As soon as I have results I will let you know.

Regards

Fabrice

On Thu, Sep 30, 2021 at 3:15 AM Tang, Haiying/唐 海英 
wrote:

> On Friday, September 24, 2021 12:04 AM, Fabrice Chapuis <
> fabrice636...@gmail.com> wrote:
>
> >
>
> > Thanks for your patch, we are going to set up a lab in order to debug
> the function.
>
>
>
> Hi
>
>
>
> I tried to reproduce this timeout problem on version10.18 but failed.
>
> In my trial, I inserted large amounts of data at publisher, which took
> more than 1 minute to replicate.
>
> And with the patch provided by Amit, I saw that the frequency of invoking
>
> WalSndKeepaliveIfNecessary function is raised after I inserted data.
>
>
>
> The test script is attached. Maybe you can try it on your machine and
> check if this problem could happen.
>
> If I miss something in the script, please let me know.
>
> Of course, it will be better if you can provide your script to reproduce
> the problem.
>
>
>
> Regards
>
> Tang
>
>


Re: More business with $Test::Builder::Level in the TAP tests

2021-10-08 Thread Daniel Gustafsson
> On 7 Oct 2021, at 03:53, Michael Paquier  wrote:
> 
> On Wed, Oct 06, 2021 at 07:33:22AM -0400, Andrew Dunstan wrote:
>> We should probably state a requirement for this somewhere. Maybe in
>> src/test/perl/README. AIUI, the general rule is that any subroutine that
>> directly or indirectly calls ok() and friends should increase the level.
>> Such subroutines that don't increase it should probably contain a
>> comment stating why, so we can know in future that it's not just an
>> oversight.
> 
> That makes sense.  How about something like that after the part about
> Test::More::like and qr// in the section about writing tests?  Here it
> is:
> +Test::Builder::Level controls how far up in the call stack a test will look
> +at when reporting a failure.  This should be incremented by any subroutine
> +calling test routines from Test::More, like ok() or is():
> +
> +local $Test::Builder::Level = $Test::Builder::Level + 1;

LGTM.  Maybe it should be added that it *must* be called before any Test::More
function is called, it's sort of self-explanatory but not everyone writing TAP
tests will be deeply familiar with Perl.

--
Daniel Gustafsson   https://vmware.com/





Improve the HINT message of the ALTER command for postgres_fdw

2021-10-08 Thread bt21masumurak

Hi

When 'ALTER FOREIGN DATA WRAPPER OPTIONS' is executed against 
postgres_fdw, the HINT message is printed as shown below, even though 
there are no valid options in this context.


=# ALTER FOREIGN DATA WRAPPER postgres_fdw OPTIONS (format 'csv');
ERROR: invalid option "format"
HINT: Valid options in this context are:

I made a patch for this problem.


regards,
Kosei Masumuradiff --git a/contrib/postgres_fdw/option.c b/contrib/postgres_fdw/option.c
index 5bb1af4084..19edf98360 100644
--- a/contrib/postgres_fdw/option.c
+++ b/contrib/postgres_fdw/option.c
@@ -107,8 +107,10 @@ postgres_fdw_validator(PG_FUNCTION_ARGS)
 			ereport(ERROR,
 	(errcode(ERRCODE_FDW_INVALID_OPTION_NAME),
 	 errmsg("invalid option \"%s\"", def->defname),
-	 errhint("Valid options in this context are: %s",
-			 buf.data)));
+	 buf.len > 0
+ ? errhint("Valid options in this context are: %s",
+ buf.data)
+ : errhint("There are no valid options in this context.")));
 		}
 
 		/*


RE: Skipping logical replication transactions on subscriber side

2021-10-08 Thread osumi.takami...@fujitsu.com
On Thursday, September 30, 2021 2:45 PM Masahiko Sawada  
wrote:
> I've attached updated patches that incorporate all comments I got so far. 
> Please
> review them.
Hi


Sorry if I misunderstand something, but has anyone checked what happens when we
execute ALTER SUBSCRIPTION ... RESET (streaming)
in the middle of one transaction that streams data to the subscriber several times,
especially after part of the transaction has already been streamed?
My intention is that *if* we can find actual harm in this,
I would like to suggest adding a safeguard or some other measure to the patch.

An example)

Set the logical_decoding_work_mem = 64kB on the pub.
and create a table and subscription with streaming = true.
In addition, log_min_messages = DEBUG1 on the sub
is helpful to check the LOG on the sub in stream_open_file().

 connect to the publisher

BEGIN;
INSERT INTO tab VALUES (generate_series(1, 1000)); -- this exceeds the memory 
limit
SELECT * FROM pg_stat_replication_slots;  -- check the actual streaming 
bytes just in case

 connect to the subscriber
-- after checking some logs of "open file  for streamed changes" on the sub
ALTER SUBSCRIPTION mysub RESET (streaming)


INSERT INTO tab VALUES (generate_series(1001, 2000)); -- again, exceeds the 
limit
COMMIT;


I observed that in this case the subscriber doesn't
receive STREAM_COMMIT but gets BEGIN instead at the end.
I couldn't find any apparent, immediate issue from those steps,
but is that really not a problem?
This kind of situation probably applies to other reset target options as well, doesn't it?


Best Regards,
Takamichi Osumi



Re: PATCH: psql tab completion for SELECT

2021-10-08 Thread David Fetter
On Sat, Sep 22, 2018 at 06:53:55PM +1200, Edmund Horner wrote:
> Hi all,
> 
> Here are some rebased versions of the last two patches.  No changes in
> functionality, except a minor case sensitivity fix in the "completion
> after commas" patch.
> 
> Edmund

I've rebased and updated these to be more principled about what
functions could be tab completed.

Still missing: tests.

What precisely is this supposed to do?

Best,
David.
-- 
David Fetter  http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate
>From 87fb63d70f5e08f853a2b8aa94e03c0d665f5000 Mon Sep 17 00:00:00 2001
From: David Fetter 
Date: Thu, 7 Oct 2021 15:59:06 -0700
Subject: [PATCH v9 1/2] Infrastructure for tab completion in the SELECT list

---
 src/bin/psql/tab-complete.c | 65 ++---
 1 file changed, 54 insertions(+), 11 deletions(-)

diff --git src/bin/psql/tab-complete.c src/bin/psql/tab-complete.c
index ecae9df8ed..5db143523b 100644
--- src/bin/psql/tab-complete.c
+++ src/bin/psql/tab-complete.c
@@ -1017,6 +1017,45 @@ static const VersionedQuery Query_for_list_of_subscriptions[] = {
 	{0, NULL}
 };
 
+static const SchemaQuery Query_for_list_of_selectable_functions[] = {
+	{
+		/* min_server_version */
+		70400,
+		/* catname */
+		"pg_catalog.pg_proc p",
+		/* Disallow functions which take or return pseudotypes which are other than all* or *record */
+		"NOT((ARRAY[p.prorettype] || coalesce(p.proallargtypes, p.proargtypes)) "
+		"&& ARRAY(SELECT oid FROM pg_catalog.pg_type WHERE typtype='p' AND typname !~ '(^all|record$)'))"
+		/* Disallow stored procedures XXX this may change later. */
+		" AND p.prokind <> 'p'"
+		/* viscondition */
+		"pg_catalog.pg_function_is_visible(p.oid)",
+		/* namespace */
+		"p.pronamespace",
+		/* result */
+		"pg_catalog.quote_ident(p.proname)||'('",
+		/* qualresult */
+		NULL
+	},
+	{0, NULL}
+};
+
+/*
+ * This addon is used to find (unqualified) column names to include
+ * alongside the function names from the query above.
+ */
+static const VersionedQuery Query_addon_for_list_of_selectable_attributes[] = {
+	{70400,
+		"UNION ALL "
+		"   SELECT DISTINCT pg_catalog.quote_ident(attname) "
+		" FROM pg_catalog.pg_attribute "
+		"WHERE attnum > 0 "
+		"  AND NOT attisdropped "
+		"  AND substring(pg_catalog.quote_ident(attname),1,%d)='%s'"
+	},
+	{0, NULL}
+};
+
 /*
  * This is a list of all "things" in Pgsql, which can show up after CREATE or
  * DROP; and there is also a query to get a list of them.
@@ -3768,7 +3807,9 @@ psql_completion(const char *text, int start, int end)
 		COMPLETE_WITH("IS");
 
 /* SELECT */
-	/* naah . . . */
+	else if (TailMatches("SELECT") || TailMatches("SELECT", "ALL|DISTINCT"))
+		COMPLETE_WITH_VERSIONED_SCHEMA_QUERY(Query_for_list_of_selectable_functions,
+			 Query_addon_for_list_of_selectable_attributes);
 
 /* SET, RESET, SHOW */
 	/* Complete with a variable name */
@@ -4432,7 +4473,8 @@ complete_from_versioned_schema_query(const char *text, int state)
  * where %d is the string length of the text and %s the text itself.
  *
  * If both simple_query and schema_query are non-NULL, then we construct
- * a schema query and append the (uninterpreted) string simple_query to it.
+ * a schema query and append the simple_query to it, replacing the %d and %s
+ * as described above.
  *
  * It is assumed that strings should be escaped to become SQL literals
  * (that is, what is in the query is actually ... '%s' ...)
@@ -4579,21 +4621,22 @@ _complete_from_query(const char *simple_query,
 			  " WHERE substring(pg_catalog.quote_ident(nspname) || '.',1,%d) ="
 			  " substring('%s',1,pg_catalog.length(pg_catalog.quote_ident(nspname))+1)) = 1",
 			  char_length, e_text);
-
-			/* If an addon query was provided, use it */
-			if (simple_query)
-appendPQExpBuffer(_buffer, "\n%s", simple_query);
 		}
 		else
 		{
 			Assert(simple_query);
-			/* simple_query is an sprintf-style format string */
-			appendPQExpBuffer(_buffer, simple_query,
-			  char_length, e_text,
-			  e_info_charp, e_info_charp,
-			  e_info_charp2, e_info_charp2);
 		}
 
+		/*
+		 * simple_query is an sprintf-style format string (or it could be NULL, but
+		 * only if this is a schema query with no addon).
+		 */
+		if (simple_query)
+			appendPQExpBuffer(_buffer, simple_query,
+	  char_length, e_text,
+	  e_info_charp, e_info_charp,
+	  e_info_charp2, e_info_charp2);
+
 		/* Limit the number of records in the result */
 		appendPQExpBuffer(_buffer, "\nLIMIT %d",
 		  completion_max_records);
-- 
2.31.1

>From 3764b9fe211db1fa322fa83103d39356928942ae Mon Sep 17 00:00:00 2001
From: David Fetter 
Date: Thu, 7 Oct 2021 18:56:52 -0700
Subject: [PATCH v9 2/2] SELECT completion after commas

---
 src/bin/psql/tab-complete.c | 64 -
 1 file changed, 63 insertions(+), 1 deletion(-)

diff --git 

Re: should we allow users with a predefined role to access pg_backend_memory_contexts view and pg_log_backend_memory_contexts function?

2021-10-08 Thread Bharath Rupireddy
On Fri, Oct 8, 2021 at 12:27 AM Bossart, Nathan  wrote:
>
> On 10/7/21, 10:42 AM, "Bharath Rupireddy" 
>  wrote:
> > In a typical production environment, the user (not necessarily a
> > superuser) sometimes wants to analyze the memory usage via
> > pg_backend_memory_contexts view or pg_log_backend_memory_contexts
> > function which are accessible to only superusers. Isn't it better to
> > allow non-superusers with an appropriate predefined role (I'm thinking
> > of pg_monitor) to access them?
>
> It looks like this was discussed previously [0].  From the description
> of pg_monitor [1], I think it's definitely arguable that this view and
> function should be accessible by roles that are members of pg_monitor.
>
> The pg_monitor, pg_read_all_settings, pg_read_all_stats and
> pg_stat_scan_tables roles are intended to allow administrators
> to easily configure a role for the purpose of monitoring the
> database server. They grant a set of common privileges
> allowing the role to read various useful configuration
> settings, statistics and other system information normally
> restricted to superusers.

Hm.

> AFAICT the current permissions were chosen as a safe default, but
> maybe it can be revisited.  The view and function appear to only
> reveal high level information about the memory contexts in use (e.g.,
> name, size, amount used), so I'm not seeing any obvious reason why
> they should remain superuser-only.  pg_log_backend_memory_contexts()
> directly affects the server log, which might be a bit beyond what
> pg_monitor should be able to do.  My currently thinking is that we
> should give pg_monitor access to pg_backend_memory_contexts (and maybe
> even pg_shmem_allocations).

pg_shmem_allocations is also a good candidate.

> However, one interesting thing I see is
> that there is no mention of any predefined roles in system_views.sql.
> Instead, the convention seems to be to add hard-coded checks for
> predefined roles in the backing functions.  I don't know if that's a
> hard and fast rule, but I do see that predefined roles are given
> special privileges in system_functions.sql.

There are two approaches: 1) We revoke the permissions from non-superusers in
system_views.sql with the following:
REVOKE ALL ON pg_shmem_allocations FROM PUBLIC;
REVOKE EXECUTE ON FUNCTION pg_get_shmem_allocations() FROM PUBLIC;
REVOKE ALL ON pg_backend_memory_contexts FROM PUBLIC;
REVOKE EXECUTE ON FUNCTION pg_get_backend_memory_contexts() FROM PUBLIC;

2) We don't revoke any permissions in system_views.sql, but we
have the following kind of check in the underlying function:
/*
* Only superusers or members of pg_monitor can <>.
*/
if (!superuser() && !is_member_of_role(GetUserId(), ROLE_PG_MONITOR))
ereport(ERROR,
(errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
errmsg("must be a superuser or a member of the pg_monitor role to
<>")));

I think we can remove the below revoke statements from
system_views.sql and place checks like the one shown at (2) in the underlying
functions pg_get_shmem_allocations and pg_get_backend_memory_contexts,
and also in pg_log_backend_memory_contexts.

REVOKE ALL ON pg_shmem_allocations FROM PUBLIC;
REVOKE EXECUTE ON FUNCTION pg_get_shmem_allocations() FROM PUBLIC;
REVOKE ALL ON pg_backend_memory_contexts FROM PUBLIC;
REVOKE EXECUTE ON FUNCTION pg_get_backend_memory_contexts() FROM PUBLIC;

Thoughts?

Regards,
Bharath Rupireddy.




Re: storing an explicit nonce

2021-10-08 Thread Antonin Houska
Stephen Frost  wrote:

> Greetings,
> 
> * Robert Haas (robertmh...@gmail.com) wrote:
> > On Thu, Oct 7, 2021 at 3:38 PM Stephen Frost  wrote:
> > > While I certainly also appreciate that we want to get this as right as
> > > we possibly can from the start, I strongly suspect we'll have one of two
> > > reactions- either we'll be more-or-less ignored and it'll be crickets
> > > from the security folks, or we're going to get beat up by them for
> > > $reasons, almost regardless of what we actually do.  Best bet to
> > > limit the risk ( ;) ) of the latter happening would be to try our best
> > > to do what existing solutions already do- such as by using XTS.
> > > There's things we can do to limit the risk of known-plaintext attacks,
> > > like simply not encrypting empty pages, or about possible known-IV
> > > risks, like using the LSN as part of the IV/tweak.  Will we get
> > > everything?  Probably not, but I don't think that we're going to really
> > > go wrong by using XTS as it's quite popularly used today and it's
> > > explicitly used for cases where you haven't got a place to store the
> > > extra nonce that you would need for AEAD encryption schemes.
> > 
> > I agree that using a popular approach is a good way to go. If we do
> > what other people do, then hopefully our stuff won't be significantly
> > more broken than their stuff, and whatever is can be fixed.
> 
> Right.
> 
> > > As long as we're clear that this initial version of TDE is with XTS then
> > > I really don't think we'll end up with anyone showing up and saying we
> > > screwed up by not generating a per-page nonce to store with it- the point
> > > of XTS is that you don't need that.
> > 
> > I agree that we shouldn't really catch flack for any weaknesses of the
> > underlying algorithm: if XTS turns out to be secure even when used
> > properly, and we use it properly, the resulting weakness is somebody
> > else's fault. On the other hand, if we use it improperly, that's our
> > fault, so we need to be really sure that we understand what guarantees
> > we need to provide from our end, and that we are providing them. Like
> > if we pick an encryption mode that requires nonces to be unique, we
> > will be at fault if they aren't; if it requires nonces to be
> > unpredictable, we will be at fault if they aren't; and so on.
> 
> Sure, I get that.  Would be awesome if all these things were clearly
> documented somewhere but I've never been able to find it quite as
> explicitly laid out as one would like.
> 
> > So that's what is making me nervous here ... it doesn't seem likely we
> > have complete unanimity about whether XTS is the right thing, though
> > that does seem to be the majority position certainly, and it is not
> > really clear to me that any of us can speak with authority about what
> > the requirements are around the nonces in particular.
> 
> The authority to look at, in my view anyway, are NIST publications.
> Following a bit more digging, I came across something which makes sense
> to me as intuitive but explains it in a way that might help everyone
> understand a bit better what's going on here:
> 
> https://nvlpubs.nist.gov/nistpubs/SpecialPublications/NIST.SP.800-38G.pdf
> 
> specifically: Appendix C: Tweaks
> 
> Quoting a couple of paragraphs from that appendix:
> 
> """
> In general, if there is information that is available and statically
> associated with a plaintext, it is recommended to use that information
> as a tweak for the plaintext. Ideally, the non-secret tweak associated
> with a plaintext is associated only with that plaintext.
> 
> Extensive tweaking means that fewer plaintexts are encrypted under any
> given tweak. This corresponds, in the security model that is described
> in [1], to fewer queries to the target instance of the encryption.
> """
> 
> The gist of this being- the more diverse the tweaking being used, the
> better.  That's where I was going with my "limit the risk" comment.  If
> we can make the tweak vary more for a given encryption invokation,
> that's going to be better, pretty much by definition, and as explained
> in publications by NIST.
> 
> That isn't to say that using the same tweak for the same block over and
> over "breaks" the encryption (unlike with CTR/GCM, where IV reuse leads
> directly to plaintext being recoverable), but it does mean that an
> observer who can see the block writes over time could see what parts are
> changing (and which aren't) and may be able to derive insight from that.

This reminds me of Joe Conway's response to my email earlier:

https://www.postgresql.org/message-id/50335f56-041b-1a1f-59ea-b5f7bf917352%40joeconway.com

In the document he recommended

https://nvlpubs.nist.gov/nistpubs/Legacy/SP/nistspecialpublication800-38a.pdf

specifically, in Appendix C I read:

"""
For the CBC and CFB modes, the IVs must be unpredictable.  In particular, for
any given plaintext, it must not be possible to predict the IV that will be
associated to the