Re: persist logical slots to disk during shutdown checkpoint

2023-08-19 Thread Amit Kapila
On Sat, Aug 19, 2023 at 12:46 PM Julien Rouhaud  wrote:
>
> On Sat, 19 Aug 2023, 14:16 Amit Kapila,  wrote:
>>
>
>> The idea discussed in the thread [1] is to always persist logical
>> slots to disk during the shutdown checkpoint. I have extracted the
>> patch to achieve the same from that thread and attached it here. This
>> could lead to some overhead during shutdown (checkpoint) if there are
>> many slots but it is probably a one-time work.
>>
>> I couldn't think of better ideas but another possibility is to mark
>> the slot as dirty when we update the confirm_flush LSN (see
>> LogicalConfirmReceivedLocation()). However, that would be a bigger
>> overhead in the running server as it could be a frequent operation and
>> could lead to more writes.
>
>
> Yeah I didn't find any better option either at that time. I still think that 
> forcing persistence on shutdown is the best compromise. If we tried to always 
> mark the slot as dirty, we would be sure to add regular overhead but we would 
> probably end up persisting the slot on disk on shutdown anyway most of the 
> time, so I don't think it would be a good compromise.
>

The other possibility is that we introduce yet another dirty flag for
slots, say dirty_for_shutdown_checkpoint, which will be set when we
update confirmed_flush LSN. The flag will be cleared each time we
persist the slot but we won't persist if only this flag is set. We can
then use it during the shutdown checkpoint to decide whether to
persist the slot.
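
FWIW, the value under discussion can be observed at the SQL level (the
last on-disk copy itself is not visible there), e.g.:

SELECT slot_name, confirmed_flush_lsn
FROM pg_replication_slots
WHERE slot_type = 'logical';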

-- 
With Regards,
Amit Kapila.




Re: PostgreSQL 16 release announcement draft

2023-08-19 Thread jian he
>
Additionally, this release adds a new field to the pg_stat_all_tables view,
capturing a timestamp representing when a table or index was last scanned.
PostgreSQL also makes auto_explain more readable by logging values passed into
parameterized statements, and improves accuracy of pg_stat_activity’s
normalization algorithm.
>

I am not sure if it's "capturing a timestamp representing" or
"capturing the timestamp representing".
"pg_stat_activity’s normalization algorithm", I think you are
referring to "pg_stat_statements"?




Re: ci: Improve macos startup using a cached macports installation

2023-08-19 Thread Andres Freund
On 2023-08-19 11:47:33 -0700, Andres Freund wrote:
> Given how significant an improvement this is in test time, and the limited
> blast radius, I am planning to push this soon, unless somebody opposes that
> soon.

And done.




PostgreSQL 16 release announcement draft

2023-08-19 Thread Jonathan S. Katz

Hi,

Attached is the first draft of the PostgreSQL 16 release announcement, 
authored by Chelsea Dole & myself.


To frame this up, the goal of the GA release announcement is to help 
folks discover the awesome new features of PostgreSQL. It's impossible 
to list out every single feature in the release and still have a 
coherent announcement, so we try to target features that have the 
broadest range of impact.


It's possible we missed or incorrectly stated something, so please 
provide feedback if we did so.


(Note I have not added in all of the links etc. to the Markdown yet, as 
I want to wait for the first pass of feedback to come through).


**Please provide feedback by August 26, 12:00 UTC**. After that point, 
we need to freeze all changes so we can begin the release announcement 
translation effort.


Thanks,

Jonathan
September 14, 2023 - The PostgreSQL Global Development Group today announced the
release of PostgreSQL 16, the latest version of the world’s most advanced open
source database.

PostgreSQL 16 raises its performance, with notable improvements to query
parallelism, bulk data loading, and logical replication. There are many features
in this release for developers and administrators alike, including more SQL/JSON
syntax, new monitoring stats for your workloads, and greater flexibility in
defining access control rules for large-scale workloads.



PostgreSQL, an innovative data management system known for its reliability and
robustness, benefits from over 25 years of open source development from a global
developer community and has become the preferred open source relational database
for organizations of all sizes.

### Performance Improvements

PostgreSQL 16 improves the performance of existing PostgreSQL functionality
through new query planner optimizations. In this latest release, the query
planner can parallelize `FULL` and `RIGHT` joins, utilize incremental sorts for
`SELECT DISTINCT` queries, and execute window functions more efficiently. It
also introduces `RIGHT` and `OUTER` "anti-joins", which enable users to identify
rows not present in a joined table.
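
For example, queries of these shapes can now benefit (an illustrative
sketch; the table and column names are hypothetical):

```sql
-- A FULL join, which the planner can now parallelize
SELECT * FROM orders o FULL JOIN refunds r ON o.id = r.order_id;

-- A SELECT DISTINCT that can use an incremental sort when a partial
-- ordering (e.g. an index on region) already exists
SELECT DISTINCT region, city FROM customers;
```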

This release includes improvements for bulk loading using `COPY` in both single
and concurrent operations, with tests showing up to a 300% performance
improvement in some cases. PostgreSQL adds support for load balancing in clients
that use `libpq`, and improvements to vacuum strategy that reduce the necessity
of full-table freezes. Additionally, PostgreSQL 16 introduces CPU acceleration
using `SIMD` in both x86 and ARM architectures, resulting in performance gains
when processing ASCII and JSON strings, and performing array and subtransaction
searches.

### Logical Replication

Logical replication lets PostgreSQL users stream data to other PostgreSQL
instances or subscribers that can interpret the PostgreSQL logical replication
protocol. In PostgreSQL 16, users can perform logical decoding from a standby
instance, meaning a standby can publish logical changes to other servers. This
provides developers with new workload distribution options – for example, using
a standby rather than the busier primary to logically replicate changes to
downstream systems.
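
For example, a logical replication slot can now be created directly on a hot
standby (an illustrative sketch; the slot name is hypothetical, and the
primary must run with `wal_level = logical`):

```sql
-- Run on the standby in PostgreSQL 16
SELECT pg_create_logical_replication_slot('standby_slot', 'pgoutput');
```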

Additionally, there are several performance improvements in PostgreSQL 16 to
logical replication. Subscribers can now apply large transactions using parallel
workers. For tables that do not have a `PRIMARY KEY`, subscribers can use B-tree
indexes instead of sequential scans to find rows. Under certain conditions,
users can also speed up initial table synchronization using the binary format.
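
These behaviors are controlled through subscription options, roughly as in
the following sketch (the subscription, connection, and publication names are
hypothetical):

```sql
CREATE SUBSCRIPTION app_sub
    CONNECTION 'host=primary dbname=app'
    PUBLICATION app_pub
    WITH (streaming = parallel, binary = true);
```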

There are several access control improvements to logical replication in
PostgreSQL 16, including the new predefined role pg_create_subscription, which
grants users the ability to create a new logical subscription. Finally, this
release begins adding support for bidirectional logical replication, introducing
functionality to replicate data between two tables from different publishers.
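
For example, an administrator can delegate subscription creation roughly as
follows (the role and database names are hypothetical; `CREATE` on the
database is also required):

```sql
GRANT pg_create_subscription TO deploy_role;
GRANT CREATE ON DATABASE app TO deploy_role;
```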

### Developer Experience

PostgreSQL 16 adds more syntax from the SQL/JSON standard, including
constructors and predicates such as `JSON_ARRAY()`, `JSON_ARRAYAGG()`, and
`IS JSON`. This release also introduces the ability to use underscores for
thousands separators (e.g. `5_432_000`) and non-decimal integer literals, such
as `0x1538`, `0o12470`, and `0b1010100111000`.
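
For example (results shown as comments):

```sql
SELECT JSON_ARRAY(1, 2, 3);               -- [1, 2, 3]
SELECT '{"a": 1}' IS JSON;                -- true
SELECT 5_432_000;                         -- 5432000
SELECT 0x1538, 0o12470, 0b1010100111000;  -- 5432, 5432, 5432
```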

Developers using PostgreSQL 16 will also benefit from the addition of multiple
commands to the `psql` client, including the `\bind` command, which allows
users to execute parameterized queries (e.g. `SELECT $1 + $2`) and then use
`\bind` to substitute the variables.
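
An illustrative `psql` session (casts are added here so the parameter types
resolve):

```sql
SELECT $1::int + $2::int \bind 1 2 \g
--  ?column?
-- ----------
--         3
```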

PostgreSQL 16 improves general support for text collations, which provide rules
for how text is sorted. PostgreSQL 16 builds with ICU support by default,
determines the default ICU locale from the environment, and allows users to
define custom ICU collation rules.
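
For example, a custom collation rule can be defined roughly as follows (the
collation name and rule are illustrative, and an ICU-enabled build is
assumed):

```sql
CREATE COLLATION custom_rules
    (provider = icu, locale = 'und', rules = '&V << w <<< W');
```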

### Monitoring

A key aspect of tuning the performance of database workloads is understanding
the impact of your 

Re: postgres_fdw: wrong results with self join + enable_nestloop off

2023-08-19 Thread Andres Freund
Hi,

On 2023-08-19 20:09:25 +0900, Etsuro Fujita wrote:
> Maybe my explanation was not enough, so let me explain:
> 
> * I think you could use the set_join_pathlist_hook hook as you like at
> your own responsibility, but typical use cases of the hook that are
> designed to support in the core system would be just to add custom paths
> for replacing joins with scans, as described in custom-scan.sgml (this
> note is about set_rel_pathlist_hook, but it should also apply to
> set_join_pathlist_hook):
> 
> Although this hook function can be used to examine, modify, or remove
> paths generated by the core system, a custom scan provider will typically
> confine itself to generating CustomPath
> objects and adding
> them to rel using add_path.

That supports citus' use more than not: "this hook function can be used to
examine ... paths generated by the core system".


> * The problem we had with the set_join_pathlist_hook hook is that in
> such a typical use case, previously, if the replaced joins had any
> pseudoconstant clauses, the planner would produce incorrect query
> plans, due to the lack of support for handling such quals in
> createplan.c.  We could fix the extensions side, as you proposed, but
> the cause of the issue is 100% the planner's deficiency, so it would
> be unreasonable to force the authors to do so, which would also go
> against our policy of ABI compatibility.  So I fixed the core side, as
> in the FDW case, so that extensions created for such a typical use
> case, which I guess are the majority of the hook extensions, need not
> be modified/recompiled.  I think it is unfortunate that that breaks
> the use case of the Citus extension, though.

I'm not neutral - I don't work on citus, but work in the same Unit as
Onder. With that said: I don't think that's really a justification for
breaking a pre-existing, not absurd, use case in a minor release.

Except that this was only noticed after it was released in a set of minor
versions, I would say that 6f80a8d9c should just straight up be reverted.
Skimming the thread there wasn't really any analysis done about breaking
extensions etc - and that ought to be done before a substantial semantics
change in a somewhat commonly used hook.  I'm inclined to think that that
might still be the right path.


> BTW: commit 9e9931d2b removed the restriction on the call to the hook
> extensions, so you might want to back-patch it.

Citus is an extension, not a fork, there's not really a way to just backpatch
a random commit.


> Though, I think it would be better if the hook was well implemented from the
> beginning.

Sure, but that's neither here nor there.

Greetings,

Andres Freund




Re: CREATE FUNCTION ... SEARCH { DEFAULT | SYSTEM | SESSION }

2023-08-19 Thread Andres Freund
Hi,

On 2023-08-14 12:25:30 -0700, Jeff Davis wrote:
> On Sat, 2023-08-12 at 11:25 -0700, Andres Freund wrote:
> >
> > I'm not sure that anything based, directly or indirectly, on
> > search_path
> > really is a realistic way to get there.
>
> Can you explain a little more? I see what you mean generally, that
> search_path is an imprecise thing, and that it leaves room for
> ambiguity and mistakes.

It just doesn't seem to provide enough control and it's really painful for
users to manage. If you install a bunch of extensions into public - very very
common from what I have seen - you can't really remove public from the search
path. Which then basically makes all approaches of resolving any of the
security issues via search path pretty toothless.


> > I think that'd be pretty painful from a UX perspective. Having to
> > write
> > e.g. operators as operator(schema, op) just sucks as an experience.
>
> I'm not suggesting that the user fully-qualify everything; I'm
> suggesting that they include a "SET search_path" clause if they depend
> on anything other than pg_catalog.

I don't think that really works in practice, due to the very common practice
of installing extensions into the same schema as the application. Then that
schema needs to be in search path (if you don't want to schema qualify
everything), which leaves you wide open.
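
For reference, the per-function pinning being debated looks roughly like
this (schema and function names are hypothetical):

CREATE FUNCTION app.transfer(src int, dst int) RETURNS void
LANGUAGE plpgsql
SET search_path = myschema, pg_catalog, pg_temp
AS $$
BEGIN
    NULL;  -- body elided
END
$$;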


> > And with
> > extensions plenty of operators will live outside of pg_catalog, so
> > there is
> > plenty things that will need qualifying.
>
> In my proposal, that would still involve a "SET search_path TO
> myextension, pg_catalog, pg_temp".

myextension is typically public. Which means that there's zero protection due
to such a search path.


> >   And because of things like type
> > coercion search, which prefers "better fitting" coercions over
> > search path
> > order, you can't just put "less important" things later in search
> > path.
>
> I understand this introduces some ambiguity, but you just can't include
> schemas in the search_path that you don't trust, for similar reasons as
> $PATH. If you have a few objects you'd like to access in another user's
> schema, fully-qualify them.

I think the more common attack paths are things like tricking extension
scripts into evaluating arbitrary code, to gain "real superuser" privileges.


> > We can't just store the oids at the time, because that'd end up very
> > fragile -
> > tables/functions/... might be dropped and recreated etc and thus
> > change their
> > oid.
>
> Robert suggested something along those lines[1]. I won't rule it out,
> but I agree that there are quite a few things left to figure out.
>
> >  But we could change the core PLs to rewrite all the queries (*) so
> > that
> > they schema qualify absolutely everything, including operators and
> > implicit
> > type casts.
>
> So not quite like "SET search_path FROM CURRENT": you resolve it to a
> specific "schemaname.objectname", but stop just short of resolving to a
> specific OID?
>
> An interesting compromise, but I'm not sure what the benefit is vs. SET
> search_path FROM CURRENT (or some defined search_path).

Search path does not reliably protect things involving "type matching". If you
have a better fitting cast, or a function call with parameters that won't need
coercion, later in search path, they'll win, even if there's another fit
earlier on.

IOW, search path is a bandaid for this kind of thing, at best.

If we instead store something that avoids the need for such search, the
"better fitting cast" logic wouldn't add these kind of security issues
anymore.


> > That way objects referenced by functions can still be replaced, but
> > search
> > path can't be used to "inject" objects in different schemas.
> > Obviously it
> > could lead to errors on some schema changes - e.g. changing a column
> > type
> > might mean that a relevant cast lives in a different place than with
> > the old
> > type - but I think that'll be quite rare. Perhaps we could offer a
> > ALTER
> > FUNCTION ... REFRESH REFERENCES; or such?
>
> Hmm. I feel like that's making things more complicated. I'd find it
> more straightforward to use something like Robert's approach of fully
> parsing something, and then have the REFRESH command reparse it when
> something needs updating. Or perhaps just create all of the dependency
> entries more like a view query and then auto-refresh.

Hm, I'm not quite sure I follow on what exactly you see as different here.

Greetings,

Andres Freund




Re: ci: Improve macos startup using a cached macports installation

2023-08-19 Thread Andres Freund
Hi,

On 2023-08-05 13:25:39 -0700, Andres Freund wrote:
> We have some issues with CI on macos and windows being too expensive (more on
> that soon in a separate email). For macos most of the obviously wasted time is
> spent installing packages with homebrew. Even with the package downloads being
> cached, it takes about 1m20s to install them.  We can't just cache the whole
> homebrew installation, because it contains a lot of pre-installed packages.
> 
> After a bunch of experimenting, I found a way to do this a lot faster: The
> attached patch installs macports and installs our dependencies from
> that. Because there aren't pre-existing macports packages, we can just cache
> the whole thing.  Doing so naively wouldn't yield that much of a speedup,
> because it takes a while to unpack a tarball (or whatnot) with as many files
> as our dependencies have - that's a lot of filesystem metadata
> operations. Instead the script creates a HFS+ filesystem in a file and caches
> that - that's mounted within a few seconds. To further keep the size in check,
> that file is compressed with zstd in the cache.
> 
> As macports has a package for IPC::Run and IO::Pty, we can use those instead
> of the separate cache we had for the perl installation.
> 
> After the patch, the cached case takes ~5s to "install" and the cache is half
> the size than the one for homebrew.
> 
> The comparison isn't entirely fair, because I used the occasion to not install
> 'make' (since meson is used for building) and llvm (we didn't enable it for
> the build anyway). That gain is a bit smaller without that, but still
> significant.
> 
> 
> An alternative implementation would be to create the "base" .dmg file outside
> of CI and download it onto the CI instances. But I didn't want to figure out
> the hosting situation for such files, so I thought this was the best
> near-medium term path.

Given how significant an improvement this is in test time, and the limited
blast radius, I am planning to push this soon, unless somebody opposes that
soon.

Greetings,

Andres Freund




Re: BUG #18059: Unexpected error 25001 in stored procedure

2023-08-19 Thread Tom Lane
[ redirected to -hackers ]

PG Bug reporting form  writes:
> Steps to reproduce:
> 1. Create stored procedure

> create or replace procedure test_proc()
> language plpgsql as $procedure$
> begin
>commit;
>set transaction isolation level repeatable read;
>-- here follows some useful code which is omitted for brevity
> end
> $procedure$;

> 2. Open new connection

> 3. Execute the following 3 queries one by one:
> a) call test_proc();
> b) create temporary table "#tmp"(c int) on commit drop;
> c) call test_proc();
> On step c) you'll get an error
> [25001]: ERROR: SET TRANSACTION ISOLATION LEVEL must be called before any
> query

Thanks for the report!

I looked into this.  The issue is that the plancache decides it needs
to revalidate the plan for the SET command, and that causes a
transaction start (or at least acquisition of a snapshot), which then
causes check_transaction_isolation to complain.  The weird sequence
that you have to go through to trigger the failure is conditioned by
the need to get the plancache entry into the needs-revalidation state
at the right time.  This wasn't really a problem when the plancache
code was written, but now that we have procedures it's not good.

We could imagine trying to terminate the new transaction once we've
finished revalidating the plan, but that direction seems silly to me.
A SET command has no plan to rebuild, while for commands that do need
that, terminating and restarting the transaction adds useless overhead.
So the right fix seems to be to just do nothing.  plancache.c already
knows revalidation should do nothing for TransactionStmts, but that
amount of knowledge is insufficient, as shown by this report.

One reasonable precedent is found in PlannedStmtRequiresSnapshot:
we could change plancache.c to exclude exactly the same utility
commands that it does, viz

if (IsA(utilityStmt, TransactionStmt) ||
IsA(utilityStmt, LockStmt) ||
IsA(utilityStmt, VariableSetStmt) ||
IsA(utilityStmt, VariableShowStmt) ||
IsA(utilityStmt, ConstraintsSetStmt) ||
/* efficiency hacks from here down */
IsA(utilityStmt, FetchStmt) ||
IsA(utilityStmt, ListenStmt) ||
IsA(utilityStmt, NotifyStmt) ||
IsA(utilityStmt, UnlistenStmt) ||
IsA(utilityStmt, CheckPointStmt))
return false;

However, this feels unsatisfying.  "Does it require a snapshot?" is not
the same question as "does it have a plan that could need rebuilding?".
The vast majority of utility statements do not have any such plan:
they are left untouched by parse analysis, rewriting, and planning.

What I'm inclined to propose, therefore, is that we make revalidation
be a no-op for every statement type for which transformStmt() reaches
its default: case.  (When it does so, the resulting CMD_UTILITY Query
will not get any processing from the rewriter or planner either.)
That gives us this list of statements requiring revalidation:

case T_InsertStmt:
case T_DeleteStmt:
case T_UpdateStmt:
case T_MergeStmt:
case T_SelectStmt:
case T_ReturnStmt:
case T_PLAssignStmt:
case T_DeclareCursorStmt:
case T_ExplainStmt:
case T_CreateTableAsStmt:
case T_CallStmt:

For maintainability's sake I'd suggest writing a new function along
the line of RawStmtRequiresParseAnalysis() and putting it beside
transformStmt(), rather than allowing plancache.c to know directly
which statement types require analysis.

Thoughts?

regards, tom lane




Re: PG 16 draft release notes ready

2023-08-19 Thread Bruce Momjian
On Thu, Aug 17, 2023 at 08:37:28AM +0300, Pavel Luzanov wrote:
> On 17.08.2023 05:36, Bruce Momjian wrote:
> > On Wed, Aug  9, 2023 at 08:35:21PM -0400, Bruce Momjian wrote:
> > > On Sat, Aug  5, 2023 at 04:08:47PM -0700, Noah Misch wrote:
> > > > > Author: Robert Haas 
> > > > > 2022-08-25 [e3ce2de09] Allow grant-level control of role inheritance 
> > > > > behavior.
> > > > > -->
> > > > > 
> > > > > 
> > > > > 
> > > > > Allow GRANT to control role inheritance behavior (Robert Haas)
> > > > > 
> > > > > 
> > > > > 
> > > > > By default, role inheritance is controlled by the inheritance status 
> > > > > of the member role.  The new GRANT clauses WITH INHERIT and WITH 
> > > > > ADMIN can now override this.
> > > > > 
> > > > > 
> > > > > 
> > > > > 
> > > > > 
> > > > > 
> > > > > 
> > > > > Allow roles that create other roles to automatically inherit the new 
> > > > > role's rights or SET ROLE to the new role (Robert Haas, Shi Yu)
> > > > > 
> > > > > 
> > > > > 
> > > > > This is controlled by server variable createrole_self_grant.
> > > > > 
> > > > > 
> > > > Similarly, v16 radically changes the CREATE ROLE ... WITH INHERIT 
> > > > clause.  The
> > > > clause used to "change the behavior of already-existing grants."  Let's 
> > > > merge
> > > > these two and move the combination to the incompatibilities section.
> > > I need help with this.  I don't understand how they can be combined, and
> > > I don't understand the incompatibility text in commit e3ce2de09d:
> > > 
> > >  If a GRANT does not specify WITH INHERIT, the behavior based on
> > >  whether the member role is marked INHERIT or NOINHERIT. This means
> > >  that if all roles are marked INHERIT or NOINHERIT before any role
> > >  grants are performed, the behavior is identical to what we had 
> > > before;
> > >  otherwise, it's different, because ALTER ROLE [NO]INHERIT now only
> > >  changes the default behavior of future grants, and has no effect on
> > >  existing ones.
> > I am waiting for an answer to this question, or can I assume the release
> > notes are acceptable?
> 
> I can try to explain how I understand it myself.
> 
> In v15 and earlier, inheritance of privileges granted to a role depends on
> the INHERIT attribute of the role:
> 
> create user alice;
> grant pg_read_all_settings to alice;
> 
> By default, privileges are inherited:
> \c - alice
> show data_directory;
>    data_directory
> -----------------------------
>  /var/lib/postgresql/15/main
> (1 row)
> 
> After disabling the INHERIT attribute, privileges are not inherited:
> 
> \c - postgres
> alter role alice noinherit;
> 
> \c - alice
> show data_directory;
> ERROR:  must be superuser or have privileges of pg_read_all_settings to
> examine "data_directory"
> 
> In v16, changing the INHERIT attribute on the alice role doesn't change the
> inheritance behavior of already-granted roles.
> If we repeat the example, Alice still inherits pg_read_all_settings
> privileges after the INHERIT attribute is disabled for the role.
> 
> Information for making decisions about role inheritance has been moved from
> the role attribute to the GRANT role TO role [WITH INHERIT|NOINHERIT] command
> and can be viewed with the new \drg command:
> 
> \drg
>                      List of role grants
>  Role name |      Member of       |   Options    | Grantor
> -----------+----------------------+--------------+----------
>  alice     | pg_read_all_settings | INHERIT, SET | postgres
> (1 row)
> 
> Changing the INHERIT attribute for a role now affects (as the default
> value) only future GRANT commands that lack an INHERIT clause.

I was able to create this simple example to illustrate it:

CREATE ROLE a1;
CREATE ROLE a2;
CREATE ROLE a3;
CREATE ROLE a4;
CREATE ROLE b INHERIT;

GRANT a1 TO b WITH INHERIT TRUE;
GRANT a2 TO b WITH INHERIT FALSE;

GRANT a3 TO b;
ALTER USER b NOINHERIT;
GRANT a4 TO b;

\drg
           List of role grants
 Role name | Member of |   Options    | Grantor
-----------+-----------+--------------+----------
 b         | a1        | INHERIT, SET | postgres
 b         | a2        | SET          | postgres
 b         | a3        | INHERIT, SET | postgres
 b         | a4        | SET          | postgres

I will work on the release notes adjustments for this and reply in a few
days.

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  Only you can decide what is important to you.




Re: WIP: new system catalog pg_wait_event

2023-08-19 Thread Drouvot, Bertrand

Hi,

On 8/19/23 12:00 PM, Michael Paquier wrote:

On Sat, Aug 19, 2023 at 11:35:01AM +0200, Drouvot, Bertrand wrote:

Hi,

On 8/18/23 12:31 PM, Michael Paquier wrote:
Thanks! Please find attached v9 fixing this typo.


I was looking at v8, and this looks pretty good to me. 


Great!


I have
spotted a few minor things.

+  proretset => 't', provolatile => 's', prorettype => 'record',
This function should be volatile.  


Oh right, I copied/pasted this and should have paid more attention to
that part. Fixed in v10 attached.


By the way, shouldn't the results of GetWaitEventExtensionNames() in
the SQL function be freed?  I mean that:
 for (int idx = 0; idx < nbextwaitevents; idx++)
 pfree(waiteventnames[idx]);
 pfree(waiteventnames);



I don't think it's needed. The reason is that they are palloc'd in a
short-lived memory context while executing pg_get_wait_events().


+   /* Handling extension custom wait events */
Nit warning: s/Handling/Handle/, or "Handling of".


Thanks, fixed in v10.

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

From 40eae29b80920d3859ec78bd2485036ca4ab4f6d Mon Sep 17 00:00:00 2001
From: bdrouvotAWS 
Date: Sat, 5 Aug 2023 12:39:42 +
Subject: [PATCH v10] Add catalog pg_wait_events

Adding a new system view, namely pg_wait_events, that describes the wait
events.
---
 doc/src/sgml/monitoring.sgml  | 13 ++-
 doc/src/sgml/system-views.sgml| 64 +
 src/backend/Makefile  |  3 +-
 src/backend/catalog/system_views.sql  |  3 +
 src/backend/utils/activity/.gitignore |  1 +
 src/backend/utils/activity/Makefile   |  8 +-
 .../activity/generate-wait_event_types.pl | 54 ++-
 src/backend/utils/activity/meson.build|  1 +
 src/backend/utils/activity/wait_event.c   | 44 +
 src/backend/utils/activity/wait_event_funcs.c | 93 +++
 src/include/catalog/pg_proc.dat   |  6 ++
 src/include/utils/meson.build |  4 +-
 src/include/utils/wait_event.h|  1 +
 .../modules/worker_spi/t/001_worker_spi.pl|  6 ++
 src/test/regress/expected/rules.out   |  4 +
 src/test/regress/expected/sysviews.out| 16 
 src/test/regress/sql/sysviews.sql |  4 +
 src/tools/msvc/Solution.pm|  3 +-
 src/tools/msvc/clean.bat  |  1 +
 19 files changed, 318 insertions(+), 11 deletions(-)
  21.1% doc/src/sgml/
  59.5% src/backend/utils/activity/
   3.5% src/include/catalog/
   4.5% src/test/regress/expected/
   4.5% src/test/
   3.0% src/tools/msvc/
   3.6% src/

diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 70511a2388..6c53c1ed91 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -1103,7 +1103,7 @@ postgres   27093  0.0  0.0  30096  2752 ?Ss   
11:34   0:00 postgres: ser
   _event_types;
 

- Here is an example of how wait events can be viewed:
+ Here are examples of how wait events can be viewed:
 
 
 SELECT pid, wait_event_type, wait_event FROM pg_stat_activity WHERE wait_event 
is NOT NULL;
@@ -1112,6 +1112,17 @@ SELECT pid, wait_event_type, wait_event FROM 
pg_stat_activity WHERE wait_event i
  2540 | Lock| relation
  6644 | LWLock  | ProcArray
 (2 rows)
+
+
+
+SELECT a.pid, a.wait_event, w.description
+FROM pg_stat_activity a JOIN pg_wait_events w
+ON (a.wait_event_type = w.type AND a.wait_event = w.name)
+WHERE wait_event is NOT NULL and a.state = 'active';
+-[ RECORD 1 ]--
+pid | 686674
+wait_event  | WALInitSync
+description | Waiting for a newly initialized WAL file to reach durable storage
 

 
diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index 57b228076e..2b35c2f91b 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -221,6 +221,11 @@
   views
  
 
+ 
+  pg_wait_events
+  wait events
+ 
+
 

   
@@ -4825,4 +4830,63 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
   
  
 
+
+ 
+  pg_wait_events
+
+  
+   pg_wait_events
+  
+
+  
+   The view pg_wait_events provides description about 
the
+   wait events.
+  
+
+  
+   pg_wait_events Columns
+   
+
+ 
+  
+   Column Type
+  
+  
+   Description
+  
+ 
+
+
+
+ 
+  
+   type text
+  
+  
+   Wait event type
+  
+ 
+
+ 
+  
+   name text
+  
+  
+   Wait event name
+  
+ 
+
+ 
+  
+   description text
+  
+  
+   Wait event description
+  
+ 
+
+   
+  
+ 
+
 
diff --git a/src/backend/Makefile b/src/backend/Makefile
index 1c929383c4..3e275ac759 100644
--- a/src/backend/Makefile
+++ b/src/backend/Makefile
@@ 

Re: pg_upgrade fails with in-place tablespace

2023-08-19 Thread Rui Zhao
On Sat, Aug 19, 2023 at 12:45 PM +0800, Michael Paquier wrote:
> I am not sure I follow the meaning of this logic. There could be
> in-place tablespaces that got created with the GUC enabled during a
> session, while the GUC has since been disabled. Shouldn't this check be
> more restrictive and not require a lookup at the GUC, instead only looking
> at the paths returned by pg_tablespace_location()?
There are two cases when we check in-place tablespaces:
1. The GUC allow_in_place_tablespaces is turned on in the old node.
In this case, we assume that the caller is aware of what he is doing
and allow the check to pass.
2. The GUC allow_in_place_tablespaces is turned off in the old node.
In this case, we will fail the check if we find any in-place tablespaces.
However, we provide an option to pass the check by adding
--old-options="-c allow_in_place_tablespaces=on" in pg_upgrade.
This option will turn on allow_in_place_tablespaces just as the GUC does.
So I look up the GUC and check whether it is turned on.
We add this check for in-place tablespaces to deal with the situation
where one would want to be warned about in-place tablespaces
when doing upgrades. I think we can take it a step further by providing
an option that allows the check to pass if the caller explicitly adds
--old-options="-c allow_in_place_tablespaces=on".
Please refer to the TAP test I have included for a better understanding
of my suggestion.
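
For context, an in-place tablespace is one created with an empty LOCATION,
which requires the GUC discussed above, e.g.:

SET allow_in_place_tablespaces = on;
CREATE TABLESPACE ts_inplace LOCATION '';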
--
Best regards,
Rui Zhao


Re: Fix typo in src/interfaces/libpq/po/zh_CN.po

2023-08-19 Thread Zhang Mingli


> On Aug 16, 2023, at 22:24, Peter Eisentraut  wrote:
> 
> On 16.08.23 09:34, Zhang Mingli wrote:
>> The Chinese words there are ok, but the `Unix-domian` should be
>> `Unix-domain`.
> 
> fixed, thanks
> 


Hi Peter, thanks, and I just want to make sure that it has been pushed?


Zhang Mingli
HashData https://www.hashdata.xyz



Re: postgres_fdw: wrong results with self join + enable_nestloop off

2023-08-19 Thread Etsuro Fujita
Hi Onder,

On Wed, Aug 16, 2023 at 10:58 PM Önder Kalacı  wrote:

>> Maybe we could do so by leaving to extensions the decision whether
>> they replace joins with pseudoconstant clauses, but I am not sure that
>> that is a good idea, because that would require the authors to modify
>> and recompile their extensions to fix the issue...

> I think I cannot easily follow this argument. The decision to push down the 
> join
> (or not) doesn't seem to be related to calling set_join_pathlist_hook. It 
> seems like the
> extension should decide what to do with the hook.
>
> That seems the generic theme of the hooks that Postgres provides. For 
> example, the extension
> is allowed to even override the whole planner/executor, and there is no 
> condition that would
> prevent it from happening. In other words, an extension can easily return 
> wrong results with the
> wrong actions taken with the hooks, and that should be the responsibility of
> the extension, not Postgres

>> I am not familiar with the Citus extension, but such pseudoconstant
>> clauses are handled within the Citus extension?

> As I noted earlier, Citus relies on this hook for collecting information 
> about all the joins that Postgres
> knows about, there is nothing specific to pseudoconstants. Some parts of 
> creating the (distributed)
> plan relies on the information gathered from this hook. So, if information 
> about some of the joins
> are not passed to the extension, then the decisions that the extension gives 
> are broken (and as a result
> the queries are broken).

Thanks for the explanation!

Maybe my explanation was not enough, so let me explain:

* I think you could use the set_join_pathlist_hook hook as you like at
your own responsibility, but typical use cases of the hook that are
designed to support in the core system would be just to add custom paths
for replacing joins with scans, as described in custom-scan.sgml (this
note is about set_rel_pathlist_hook, but it should also apply to
set_join_pathlist_hook):

Although this hook function can be used to examine, modify, or remove
paths generated by the core system, a custom scan provider will typically
confine itself to generating CustomPath
objects and adding
them to rel using add_path.

* The problem we had with the set_join_pathlist_hook hook is that in
such a typical use case, previously, if the replaced joins had any
pseudoconstant clauses, the planner would produce incorrect query
plans, due to the lack of support for handling such quals in
createplan.c.  We could fix the extensions side, as you proposed, but
the cause of the issue is 100% the planner's deficiency, so it would
be unreasonable to force the authors to do so, which would also go
against our policy of ABI compatibility.  So I fixed the core side, as
in the FDW case, so that extensions created for such a typical use
case, which I guess are the majority of the hook extensions, need not
be modified/recompiled.  I think it is unfortunate that that breaks
the use case of the Citus extension, though.

BTW: commit 9e9931d2b removed the restriction on the call to the hook
extensions, so you might want to back-patch it.  Though, I think it
would be better if the hook was well implemented from the beginning.

Best regards,
Etsuro Fujita




Re: pgbench: allow to exit immediately when any client is aborted

2023-08-19 Thread Tatsuo Ishii
> I have tested the v4 patch with default_transaction_isolation =
> 'repeatable read'.
> 
> pgbench --exit-on-abort -N -p 11002 -c 10 -T 30 test
> pgbench (17devel, server 15.3)
> starting vacuum...end.
transaction type: <builtin: simple update>
> scaling factor: 1
> query mode: simple
> number of clients: 10
> number of threads: 1
> maximum number of tries: 1
> duration: 30 s
> number of transactions actually processed: 64854
> number of failed transactions: 4 (0.006%)
> latency average = 4.628 ms (including failures)
> initial connection time = 49.526 ms
> tps = 2160.827556 (without initial connection time)
> 
> Despite the 4 failed transactions (could not serialize access due to
> concurrent update) pgbench ran normally, rather than aborting the
> test. Is this an expected behavior?

I am starting to think this behavior is ok and consistent with the previous
behavior of pgbench, because serialization (and deadlock) errors have
been treated differently from other types of errors, such as accessing
non-existent tables. However, I suggest adding more sentences to the
explanation of this option so that users are not confused by this.

+ 
+  --exit-on-abort
+  
+   
+Exit immediately when any client is aborted due to some error. Without
+this option, even when a client is aborted, other clients could 
continue
+their run as specified by -t or -T 
option,
+and pgbench will print an incomplete results
+in this case.
+   
+  
+ 
+

What about inserting "Note that serialization failures or deadlock
failures will not abort client.  See  for more information." into the end
of this paragraph?

BTW, I think:
Exit immediately when any client is aborted due to some error. Without

should be:
Exit immediately when any client is aborted due to some errors. Without

(error vs. errors)

Also:
+ --exit-on-abort is specified . Otherwise in the worst

There is an extra space between "specified" and ".".

Best regards,
--
Tatsuo Ishii
SRA OSS LLC
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp




Re: [PoC] pg_upgrade: allow to upgrade publisher node

2023-08-19 Thread Amit Kapila
On Fri, Aug 18, 2023 at 7:21 PM Hayato Kuroda (Fujitsu)
 wrote:
>

Few comments on new patches:
1.
+ ALTER SUBSCRIPTION ... DISABLE.
+ After the upgrade is complete, execute the
+ ALTER SUBSCRIPTION ... CONNECTION command to update the
+ connection string, and then re-enable the subscription.

Why does one need to update the connection string?

2.
+ /*
+ * Checking for logical slots must be done before
+ * check_new_cluster_is_empty() because the slot_arr attribute of the
+ * new_cluster will be checked in that function.
+ */
+ if (count_logical_slots(&new_cluster))
+ {
+ get_logical_slot_infos(&new_cluster, false);
+ check_for_logical_replication_slots(&new_cluster);
+ }
+
  check_new_cluster_is_empty();

Can't we simplify this checking by simply querying
pg_replication_slots for any usable slot, similar to what we
are doing in check_for_prepared_transactions()? We can add this check
in the function check_for_logical_replication_slots(). Also, do we
need a count function, or can we instead have a simple function like
is_logical_slot_present() that returns as soon as it finds one slot
present?
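
Something minimal along these lines is what I have in mind (a sketch):

SELECT slot_name FROM pg_replication_slots
WHERE slot_type = 'logical' AND NOT temporary
LIMIT 1;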

Apart from this, (a) I have made a few changes (changed comments) in
patch 0001 as shared in the email [1]; (b) some modifications in the
docs as you can see in the attached. Please include those changes in
the next version if you think they are okay.

[1] - 
https://www.postgresql.org/message-id/CAA4eK1JzJagMmb_E8D4au%3DGYQkxox0AfNBm1FbP7sy7t4YWXPQ%40mail.gmail.com

-- 
With Regards,
Amit Kapila.


mod_amit_1.patch
Description: Binary data


Re: WIP: new system catalog pg_wait_event

2023-08-19 Thread Michael Paquier
On Sat, Aug 19, 2023 at 11:35:01AM +0200, Drouvot, Bertrand wrote:
> Hi,
> 
> On 8/18/23 12:31 PM, Michael Paquier wrote:
> Thanks! Please find attached v9 fixing this typo.

I was looking at v8, and this looks pretty good to me.  I have
spotted a few minor things.

+  proretset => 't', provolatile => 's', prorettype => 'record',
This function should be volatile.  For example, a backend could add a
new wait event while a different backend queries this view twice in
the same transaction, resulting in a different set of rows sent back.

By the way, shouldn't the results of GetWaitEventExtensionNames() in
the SQL function be freed?  I mean that:
for (int idx = 0; idx < nbextwaitevents; idx++)
pfree(waiteventnames[idx]);
pfree(waiteventnames);

+   /* Handling extension custom wait events */
Nit warning: s/Handling/Handle/, or "Handling of".
--
Michael


signature.asc
Description: PGP signature


Re: WIP: new system catalog pg_wait_event

2023-08-19 Thread Drouvot, Bertrand

Hi,

On 8/18/23 12:31 PM, Michael Paquier wrote:

On Fri, Aug 18, 2023 at 10:56:55AM +0200, Drouvot, Bertrand wrote:

Okay, using the plural form in v8 attached.


Noting in passing:
- Here is an example of how wait events can be viewed:
+ Here is are examples of how wait events can be viewed:
s/is are/are/.


Thanks! Please find attached v9 fixing this typo.

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

From 430cc8cb79c7cf3ea811760e8aadd53cab7ff9e3 Mon Sep 17 00:00:00 2001
From: bdrouvotAWS 
Date: Sat, 5 Aug 2023 12:39:42 +
Subject: [PATCH v9] Add catalog pg_wait_events

Adding a new system view, namely pg_wait_events, that describes the wait
events.
---
 doc/src/sgml/monitoring.sgml  | 13 ++-
 doc/src/sgml/system-views.sgml| 64 +
 src/backend/Makefile  |  3 +-
 src/backend/catalog/system_views.sql  |  3 +
 src/backend/utils/activity/.gitignore |  1 +
 src/backend/utils/activity/Makefile   |  8 +-
 .../activity/generate-wait_event_types.pl | 54 ++-
 src/backend/utils/activity/meson.build|  1 +
 src/backend/utils/activity/wait_event.c   | 44 +
 src/backend/utils/activity/wait_event_funcs.c | 93 +++
 src/include/catalog/pg_proc.dat   |  6 ++
 src/include/utils/meson.build |  4 +-
 src/include/utils/wait_event.h|  1 +
 .../modules/worker_spi/t/001_worker_spi.pl|  6 ++
 src/test/regress/expected/rules.out   |  4 +
 src/test/regress/expected/sysviews.out| 16 
 src/test/regress/sql/sysviews.sql |  4 +
 src/tools/msvc/Solution.pm|  3 +-
 src/tools/msvc/clean.bat  |  1 +
 19 files changed, 318 insertions(+), 11 deletions(-)
  21.0% doc/src/sgml/
  59.5% src/backend/utils/activity/
   3.5% src/include/catalog/
   4.5% src/test/regress/expected/
   4.5% src/test/
   3.0% src/tools/msvc/
   3.6% src/

diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 70511a2388..6c53c1ed91 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -1103,7 +1103,7 @@ postgres   27093  0.0  0.0  30096  2752 ?Ss   
11:34   0:00 postgres: ser
   _event_types;
 

- Here is an example of how wait events can be viewed:
+ Here are examples of how wait events can be viewed:
 
 
 SELECT pid, wait_event_type, wait_event FROM pg_stat_activity WHERE wait_event 
is NOT NULL;
@@ -1112,6 +1112,17 @@ SELECT pid, wait_event_type, wait_event FROM 
pg_stat_activity WHERE wait_event i
  2540 | Lock| relation
  6644 | LWLock  | ProcArray
 (2 rows)
+
+
+
+SELECT a.pid, a.wait_event, w.description
+FROM pg_stat_activity a JOIN pg_wait_events w
+ON (a.wait_event_type = w.type AND a.wait_event = w.name)
+WHERE wait_event is NOT NULL and a.state = 'active';
+-[ RECORD 1 ]--
+pid | 686674
+wait_event  | WALInitSync
+description | Waiting for a newly initialized WAL file to reach durable storage
 

 
diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index 57b228076e..2b35c2f91b 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -221,6 +221,11 @@
   views
  
 
+ 
+  pg_wait_events
+  wait events
+ 
+
 

   
@@ -4825,4 +4830,63 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
   
  
 
+
+ 
+  pg_wait_events
+
+  
+   pg_wait_events
+  
+
+  
+   The view pg_wait_events provides description about 
the
+   wait events.
+  
+
+  
+   pg_wait_events Columns
+   
+
+ 
+  
+   Column Type
+  
+  
+   Description
+  
+ 
+
+
+
+ 
+  
+   type text
+  
+  
+   Wait event type
+  
+ 
+
+ 
+  
+   name text
+  
+  
+   Wait event name
+  
+ 
+
+ 
+  
+   description text
+  
+  
+   Wait event description
+  
+ 
+
+   
+  
+ 
+
 
diff --git a/src/backend/Makefile b/src/backend/Makefile
index 1c929383c4..3e275ac759 100644
--- a/src/backend/Makefile
+++ b/src/backend/Makefile
@@ -134,7 +134,7 @@ storage/lmgr/lwlocknames.h: 
storage/lmgr/generate-lwlocknames.pl storage/lmgr/lw
$(MAKE) -C storage/lmgr lwlocknames.h lwlocknames.c
 
 utils/activity/wait_event_types.h: utils/activity/generate-wait_event_types.pl 
utils/activity/wait_event_names.txt
-   $(MAKE) -C utils/activity wait_event_types.h pgstat_wait_event.c
+   $(MAKE) -C utils/activity wait_event_types.h pgstat_wait_event.c 
wait_event_funcs_data.c
 
 # run this unconditionally to avoid needing to know its dependencies here:
 submake-catalog-headers:
@@ -311,6 +311,7 @@ maintainer-clean: distclean
  storage/lmgr/lwlocknames.c \
  

Re: [PATCH] Add function to_oct

2023-08-19 Thread Dean Rasheed
On Thu, 17 Aug 2023 at 16:26, Nathan Bossart  wrote:
>
> Works for me.  I did it that way in v7.
>

I note that there are no tests for negative inputs.

Doing a quick test shows that this changes the current behaviour,
because all inputs are now treated as 64-bit:

HEAD:

select to_hex((-1234)::int);
  to_hex
----------
 fffffb2e

With patch:

select to_hex((-1234)::int);
      to_hex
------------------
 fffffffffffffb2e

The way that negative inputs are handled really should be documented,
or at least it should include a couple of examples.

Regards,
Dean




Re: persist logical slots to disk during shutdown checkpoint

2023-08-19 Thread Julien Rouhaud
On Sat, 19 Aug 2023, 14:16 Amit Kapila,  wrote:

> It's entirely possible for a logical slot to have a confirmed_flush
> LSN higher than the last value saved on disk while not being marked as
> dirty.  It's currently not a problem to lose that value during a clean
> shutdown / restart cycle but to support the upgrade of logical slots
> [1] (see latest patch at [2]), we seem to rely on that value being
> properly persisted to disk. During the upgrade, we need to verify that
> all the data prior to shutdown_checkpoint for the logical slots has
> been consumed, otherwise, the downstream may miss some data. Now, to
> ensure the same, we are planning to compare the confirm_flush LSN
> location with the latest shutdown_checkpoint location which means that
> the confirm_flush LSN should be updated after restart.
>
> I think this is inefficient even without an upgrade because, after the
> restart, this may lead to decoding some data again. Say, we process
> some transactions for which we didn't send anything downstream (the
> changes got filtered) but the confirm_flush LSN is updated due to
> keepalives. As we don't flush the latest value of confirm_flush LSN,
> it may lead to processing the same changes again.
>

In most cases there shouldn't be a lot of records to decode after restart,
but I agree it's better to avoid decoding those again.

The idea discussed in the thread [1] is to always persist logical
> slots to disk during the shutdown checkpoint. I have extracted the
> patch to achieve the same from that thread and attached it here. This
> could lead to some overhead during shutdown (checkpoint) if there are
> many slots but it is probably a one-time work.
>
> I couldn't think of better ideas but another possibility is to mark
> the slot as dirty when we update the confirm_flush LSN (see
> LogicalConfirmReceivedLocation()). However, that would be a bigger
> overhead in the running server as it could be a frequent operation and
> could lead to more writes.
>

Yeah I didn't find any better option either at that time. I still think
that forcing persistence on shutdown is the best compromise. If we tried to
always mark the slot as dirty, we would be sure to add regular overhead but
we would probably end up persisting the slot on disk on shutdown anyway
most of the time, so I don't think it would be a good compromise.

My biggest concern was that some switchover scenario might be a bit slower
in some cases, but if that really is a problem it's hard to imagine what
workload would be possible without having to persist them anyway due to
continuous activity needing to be sent just before the shutdown.

>


persist logical slots to disk during shutdown checkpoint

2023-08-19 Thread Amit Kapila
It's entirely possible for a logical slot to have a confirmed_flush
LSN higher than the last value saved on disk while not being marked as
dirty.  It's currently not a problem to lose that value during a clean
shutdown / restart cycle but to support the upgrade of logical slots
[1] (see latest patch at [2]), we seem to rely on that value being
properly persisted to disk. During the upgrade, we need to verify that
all the data prior to shutdown_checkpoint for the logical slots has
been consumed, otherwise, the downstream may miss some data. Now, to
ensure the same, we are planning to compare the confirm_flush LSN
location with the latest shutdown_checkpoint location which means that
the confirm_flush LSN should be updated after restart.

I think this is inefficient even without an upgrade because, after the
restart, this may lead to decoding some data again. Say, we process
some transactions for which we didn't send anything downstream (the
changes got filtered) but the confirm_flush LSN is updated due to
keepalives. As we don't flush the latest value of confirm_flush LSN,
it may lead to processing the same changes again.

The idea discussed in the thread [1] is to always persist logical
slots to disk during the shutdown checkpoint. I have extracted the
patch to achieve the same from that thread and attached it here. This
could lead to some overhead during shutdown (checkpoint) if there are
many slots but it is probably a one-time work.

I couldn't think of better ideas but another possibility is to mark
the slot as dirty when we update the confirm_flush LSN (see
LogicalConfirmReceivedLocation()). However, that would be a bigger
overhead in the running server as it could be a frequent operation and
could lead to more writes.

Thoughts?

[1] - 
https://www.postgresql.org/message-id/TYAPR01MB58664C81887B3AF2EB6B16E3F5939%40TYAPR01MB5866.jpnprd01.prod.outlook.com
[2] - 
https://www.postgresql.org/message-id/TYAPR01MB5866562EF047F2C9DDD1F9DEF51BA%40TYAPR01MB5866.jpnprd01.prod.outlook.com

-- 
With Regards,
Amit Kapila.


v1-0001-Always-persist-to-disk-logical-slots-during-a-sh.patch
Description: Binary data