Re: [Doc] Glossary Term Definitions Edits

2023-10-13 Thread Erik Wienhold
On 2023-10-14 06:16 +0200, Andrew Atkinson wrote:
>- When describing options for a command, changed to “option of” instead
>of “option to”

I think "option to" is not wrong (maybe less common).  I've seen this
in other texts and took it as "the X option [that applies] to Y".

>- “system- or user-supplied”, removed the dash after system. Or I’d
>suggest system-supplied or user-supplied, to hyphenate both.

That's a suspended hyphen and is common usage.

>- Changed “volume of records has been written” to “volume of records
>were written”

"Has been" means that something happened just now.  This is perfectly
fine when talking about checkpoints IMO.

>- Many examples of “an SQL”. I changed those to “a SQL...”. For example
>I changed “An SQL command which” to “A SQL command that”. I'm not an
>English major so maybe I'm missing something here.

Depends on how you pronounce SQL (ess-cue-el or sequel).  "An SQL"
is more common in the docs whereas "a SQL" is more common in code
comments.

-- 
Erik




Re: [Doc] Glossary Term Definitions Edits

2023-10-13 Thread David Rowley
On Sat, 14 Oct 2023, 5:20 pm Andrew Atkinson wrote:

>
>- Many examples of “an SQL”. I changed those to “a SQL...”. For
>example I changed “An SQL command which” to “A SQL command that”. I'm not
>an English major so maybe I'm missing something here.
>
It would depend on how you pronounce SQL.  For those that say es-que-el,
"An" is the correct article. If you say sequel then it's "a". We've
standardised our docs to use "an SQL", so any changes we make would be the
opposite way.

David



RE: [PoC] pg_upgrade: allow to upgrade publisher node

2023-10-13 Thread Hayato Kuroda (Fujitsu)
Dear hackers,

Here is a new patch.

Previously I wrote:
> Based on the above idea, I made a new version of the patch in which some
> functionality was exported from pg_resetwal. In this approach, pg_upgrade
> itself removes WALs and then creates logical slots; pg_resetwal is then
> called with a new option --no-switch, which avoids switching to a new WAL
> segment file. The option is only used for the upgrade purpose, so it is not
> documented or shown in usage(). This option would not be required if
> pg_resetwal -o did not discard WAL records. Please see the fork thread [1].

For now, these changes have been reverted because changing the pg_resetwal -o
behavior may be a bit risky: that code has been in place for more than ten
years, so we should be careful about modifying it.
Also, I cannot think of any problems if slots are created after pg_resetwal
runs. Background processes would not generate decodable changes (listed in
[1]), and bgworkers started by extensions can be ignored [2].
If the approach discussed on the forked thread [3] is accepted, we will apply
these changes again.

Also, some comments and function names were improved.

[1]: 
https://www.postgresql.org/message-id/TYAPR01MB58660273EACEFC5BF256B133F50DA%40TYAPR01MB5866.jpnprd01.prod.outlook.com
[2]: 
https://www.postgresql.org/message-id/CAA4eK1L4JB%2BKH_4EQryDEhyaLBPW6V20LqjdzOxCWyL7rbxqsA%40mail.gmail.com
[3]: 
https://www.postgresql.org/message-id/flat/CAA4eK1KRyPMiY4fW98qFofsYrPd87Oc83zDNxSeHfTYh_asdBg%40mail.gmail.com

Best Regards,
Hayato Kuroda
FUJITSU LIMITED



v50-0001-pg_upgrade-Allow-to-replicate-logical-replicatio.patch
Description:  v50-0001-pg_upgrade-Allow-to-replicate-logical-replicatio.patch


[Doc] Glossary Term Definitions Edits

2023-10-13 Thread Andrew Atkinson
Hello. I started reading through the Glossary[^1] terms to learn from the
definitions, and to double check them against what I'd written elsewhere. I
found myself making edits. :)

I've put the edits together into a patch. My goal was to focus on wording
simplifications that are smoother to read, without big changes.

I realized I should check with others though, so this is a mid-point
check-in. For now I went through terms from "A" through "I".


Here's a recap of the changes:



   - Changed places like “to make” to use the verb directly, i.e. “make”
   - When describing options for a command, changed to “option of” instead
   of “option to”
   - “system- or user-supplied”, removed the dash after system. Or I’d
   suggest system-supplied or user-supplied, to hyphenate both.
   - Changed “will access”  to “access”
   - Changed “helps to prevent” to “helps prevent”
   - Changed “volume of records has been written” to “volume of records
   were written”
   - “It is required that this user exist” changed to “This user is
   required to exist...” (I’d also suggest “This user must exist before...”
   as a simplification, but that’s a bigger difference from what’s there now.)
   - Changed “operating-system” to remove the hyphen, which is how it’s
   written elsewhere in the Glossary.
   - Many examples of “an SQL”. I changed those to “a SQL...”. For example
   I changed “An SQL command which” to “A SQL command that”. I'm not an
   English major so maybe I'm missing something here.
   - I often thought “that” was easier to read than “which”, and there are
   several examples in the patch. For example “Space in data pages that…”
   replaced “Space in data pages which…”
   - Simplifications like: “There also exist two secondary forks” to “There
   are two secondary forks”

I was able to build the documentation locally and preview the HTML version.


If these types of changes are helpful, I can continue a consistent style
through all the terms and provide a new (larger) v2 patch.



Thanks for taking a look.

Andrew Atkinson

[^1]: https://www.postgresql.org/docs/current/glossary.html


glossary_terms_grammar_edits_v1.patch
Description: Binary data


Re: SLRU optimization - configurable buffer pool and partitioning the SLRU lock

2023-10-13 Thread Amit Kapila
On Wed, Oct 11, 2023 at 4:35 PM Dilip Kumar  wrote:
>
> The small size of the SLRU buffer pools can sometimes become a
> performance problem because it’s not difficult to have a workload
> where the number of buffers actively in use is larger than the
> fixed-size buffer pool. However, just increasing the size of the
> buffer pool doesn’t necessarily help, because the linear search that
> we use for buffer replacement doesn’t scale, and also because
> contention on the single centralized lock limits scalability.
>
> There are a couple of patches proposed in the past to address the
> problem of increasing the buffer pool size; one of them [1] was
> proposed by Thomas Munro, where we make the size of the buffer pool
> configurable.  And, in order to deal with the linear search in the
> large buffer pool, we divide the SLRU buffer pool into associative
> banks so that searching in the buffer pool doesn't get affected by the
> large size of the buffer pool.  This does well for the workloads which
> are mainly impacted by frequent buffer replacement, but it still
> doesn't stand up well in workloads where the centralized control
> lock is the bottleneck.
>
> So I have taken this patch as my base patch (v1-0001) and further
> added 2 more improvements to it: 1) in v1-0002, instead of a
> centralized control lock for the SLRU I have introduced a bank-wise
> control lock; 2) in v1-0003, I have removed the global LRU counter and
> introduced a bank-wise counter.  The second change (v1-0003) is in
> order to avoid the CPU/OS cache invalidation due to frequent updates
> of a single variable; later in my performance tests I will show how
> much gain we have gotten because of these 2 changes.
>
> Note: This is going to be a long email but I have summarised the main
> idea above this point and now I am going to discuss more internal
> information in order to show that the design idea is valid, and also
> going to show 2 performance tests where one is specific to the
> contention on the centralized lock and the other is mainly contention due
> to frequent buffer replacement in the SLRU buffer pool. We are getting ~2x
> TPS compared to the head by these patches and in later sections I am
> going to discuss this in more detail, i.e. exact performance numbers and
> analysis of why we are seeing the gain.
>
...
>
> Performance Test:
> Exp1: Show problems due to CPU/OS cache invalidation caused by frequent
> updates of the centralized lock and a common LRU counter.  So here we
> run, in parallel with the pgbench script, a transaction that frequently
> creates subtransaction overflow and thus forces the visibility-check
> mechanism to access the subtrans SLRU.
> Test machine: 8 CPU / 64 cores / 128 threads with HT / 512 GB RAM / SSD
> scale factor: 300
> shared_buffers=20GB
> checkpoint_timeout=40min
> max_wal_size=20GB
> max_connections=200
>
> Workload: Run these 2 scripts in parallel:
> ./pgbench -c $ -j $ -T 600 -P5 -M prepared postgres
> ./pgbench -c 1 -j 1 -T 600 -f savepoint.sql postgres
>
> savepoint.sql (create subtransaction overflow)
> BEGIN;
> SAVEPOINT S1;
> INSERT INTO test VALUES(1)
> ← repeat 70 times →
> SELECT pg_sleep(1);
> COMMIT;
>
> Code under test:
> Head: PostgreSQL head code
> SlruBank: The first patch applied to convert the SLRU buffer pool into
> the bank (0001)
> SlruBank+BankwiseLockAndLru: Applied 0001+0002+0003
>
> Results:
> Clients    Head     SlruBank    SlruBank+BankwiseLockAndLru
> 1          457      491         475
> 8          3753     3819        3782
> 32         14594    14328       17028
> 64         15600    16243       25944
> 128        15957    16272       31731
>
> So we can see that at 128 clients, we get ~2x TPS (with SlruBank +
> BankwiseLock and bank-wise LRU counter) as compared to HEAD.
>

This and other results shared by you look promising. Will there be any
improvement in workloads related to clog buffer usage? BTW, I remember
that there was also a discussion of moving SLRU into a regular buffer
pool [1]. You have not provided any explanation as to whether that
approach will have any merits after we do this or whether that
approach is not worth pursuing at all.

[1] - https://commitfest.postgresql.org/43/3514/

-- 
With Regards,
Amit Kapila.




Re: JSON Path and GIN Questions

2023-10-13 Thread Erik Wienhold
On 2023-10-09 01:13 +0200, David E. Wheeler wrote:
> On Sep 12, 2023, at 21:00, Erik Wienhold  wrote:
> 
> >> I posted this question on Stack Overflow
> >> (https://stackoverflow.com/q/77046554/79202),
> >> and from the suggestion I got there, it seems that @@ expects a boolean
> >> to be returned by the path query, while @? wraps it in an implicit
> >> exists(). Is that right?
> > 
> > That's also my understanding.  We had a discussion about the docs on
> > @@, @?, and jsonb_path_query on -general a while back [1].  Maybe it's
> > useful also.
> 
> Hi, finally getting back to this, still fiddling to figure out the
> differences. From the thread you reference [1], is the point that @@
> and jsonb_path_match() can only be properly used with a JSON Path
> expression that’s a predicate check?

I think so.  That's also supported by the existing docs which only
mention "JSON path predicate" for @@ and jsonb_path_match().

> If so, as far as I can tell, only exists() around the entire path
> query, or the deviation from the SQL standard that allows an
> expression to be a predicate?

Looks like it.  But note that exists() is also a filter expression.
So wrapping the entire jsonpath in exists() is also a deviation from the
SQL standard, which only allows predicates in filter expressions, i.e.
'<path> ? (<predicate>)'.
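
For example, sketching both uses (examples mine, not from the earlier thread):

    -- exists() as the entire jsonpath, i.e. a predicate check expression:
    select jsonb_path_match('{"a":[1]}', 'exists($.a)');

    -- exists() inside a filter expression, which is what the standard allows:
    select jsonb_path_query('{"a":[{"b":1}]}', '$.a[*] ? (exists(@.b))');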

> This suggests to me that the "Only the first item of the result is
> taken into account” bit from the docs may not be quite right.

Yes, this was also the issue in the referenced thread [1].  I think my
suggestion in [2] explains it (as far as I understand it).

> Consider this example:
> 
> david=# select jsonb_path_query('{"a":[false,true,false]}', '$.a ?(@[*] == false)');
>  jsonb_path_query
> ------------------
>  false
>  false
> (2 rows)
> 
> david=# select jsonb_path_match('{"a":[false,true,false]}', '$.a ?(@[*] == false)');
> ERROR:  single boolean result is expected
> 
> jsonb_path_match(), it turns out, only wants a single result. But
> furthermore perhaps the use of a filter predicate rather than a
> predicate expression for the entire path query is an error?

Yes, I think @@ and jsonb_path_match() should not be used with filter
expressions because the jsonpath returns whatever the path expression
yields (which may be an actual boolean value in the jsonb).  The filter
expression only filters (as the name suggests) what the path expression
yields.
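
Rewritten as a predicate check expression, the whole jsonpath yields a
single boolean and therefore works with jsonb_path_match() (examples mine):

    select jsonb_path_match('{"a":[false,true,false]}', '$.a[*] == false');
    -- true: the comparison itself is the result, not a value from the data

    select jsonb_path_match('{"a":[false,true,false]}', 'exists($.a[*] ? (@ == false))');
    -- also true, using an explicit exists() around the filter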

> Curiously, @@ seems okay with it:
> 
> david=#  select '{"a":[false,true,false]}'@@ '$.a ?(@[*] == false)';
>  ?column? 
> --
>  t
> 
> Not a predicate query, and somehow returns true even though the first
> item of the result is false? Is that how it should be?

Your example does a text search equivalent to:

    select to_tsvector('{"a":[false,true,false]}') @@ plainto_tsquery('$.a ? (@[*] == true)')

You forgot the cast to jsonb.  jsonb @@ jsonpath actually returns null:

test=# select '{"a":[false,true,false]}'::jsonb @@ '$.a ? (@[*] == 
false)';
 ?column?
--
 
(1 row)

This matches the note right after the docs for @@:

"The jsonpath operators @? and @@ suppress the following errors: missing
 object field or array element, unexpected JSON item type, datetime and
 numeric errors. The jsonpath-related functions described below can also
 be told to suppress these types of errors. This behavior might be
 helpful when searching JSON document collections of varying structure."

That would be the silent argument of jsonb_path_match():

test=# select jsonb_path_match('{"a":[false,true,false]}', '$.a ? (@[*] == false)', silent => true);
 jsonb_path_match 
------------------
 
(1 row)

[1] 
https://www.postgresql.org/message-id/CACJufxE01sxgvtG4QEvRZPzs_roggsZeVvBSGpjM5tzE5hMCLA%40mail.gmail.com
[2] 
https://www.postgresql.org/message-id/880194083.579916.1680598906819%40office.mailbox.org

-- 
Erik




Re: Included xid in restoring reorder buffer changes from disk log message

2023-10-13 Thread vignesh C
On Tue, 10 Oct 2023 at 06:59, Kyotaro Horiguchi  wrote:
>
> At Fri, 6 Oct 2023 14:58:13 +0530, vignesh C  wrote in
> > On Fri, 30 Apr 2021 at 11:53, Dilip Kumar  wrote:
> > > It makes sense to include xid in the debug message, earlier I thought
> > > that will it be a good idea to also print the toplevel_xid.  But I
> > > think it is for debug purposes and only we have the xid we can fetch
> > > the other parameters so maybe adding xid is good enough.
>
> +1
>
> > While having a look at the reorderbuffer code, I noticed that these
> > changes were still not committed.
> > Here is a rebased version of the patch.
>
> While this patch makes the following change on the de-serializing side;
>
> -   elog(DEBUG2, "restored %u/%u changes from disk",
> +   elog(DEBUG2, "restored %u/%u changes of XID %u from disk",
>
> the counter part ReorderBufferSerializeTXN() has the following
> message.
>
> >   elog(DEBUG2, "spill %u changes in XID %u to disk",
> >        (uint32) txn->nentries_mem, txn->xid);
>
> It might be preferable for those two messages to have a corresponding
> appearance.

We cannot include nentries in ReorderBufferSerializeTXN as the number
of entries will not be known at that time. Modified to keep it
consistent by changing it to "... changes in XID ...". The attached v3
patch has the changes for the same.

Regards,
Vignesh
From bbd3ea2376b76a14d5e59e22e3f36c4637193cab Mon Sep 17 00:00:00 2001
From: Vignesh C 
Date: Fri, 6 Oct 2023 14:21:32 +0530
Subject: [PATCH v3] Include xid in restoring reorder buffer changes from disk
 log message.

Include xid in restoring reorder buffer changes from disk log message.
---
 src/backend/replication/logical/reorderbuffer.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/src/backend/replication/logical/reorderbuffer.c b/src/backend/replication/logical/reorderbuffer.c
index 12edc5772a..6ebda5cc5f 100644
--- a/src/backend/replication/logical/reorderbuffer.c
+++ b/src/backend/replication/logical/reorderbuffer.c
@@ -1442,9 +1442,10 @@ ReorderBufferIterTXNNext(ReorderBuffer *rb, ReorderBufferIterTXNState *state)
 dlist_head_element(ReorderBufferChange, node,
     &entry->txn->changes);
 
-			elog(DEBUG2, "restored %u/%u changes from disk",
+			elog(DEBUG2, "restored %u/%u changes in XID %u from disk",
  (uint32) entry->txn->nentries_mem,
- (uint32) entry->txn->nentries);
+ (uint32) entry->txn->nentries,
+ entry->txn->xid);
 
 			Assert(entry->txn->nentries_mem);
 			/* txn stays the same */
-- 
2.34.1



Re: Performance degradation on concurrent COPY into a single relation in PG16.

2023-10-13 Thread Andres Freund
Hi,

On 2023-10-13 11:30:35 -0700, Andres Freund wrote:
> On 2023-10-13 10:39:10 -0700, Andres Freund wrote:
> > On 2023-10-12 09:24:19 -0700, Andres Freund wrote:
> > > I kind of had hoped somebody would comment on the approach.  Given that
> > > nobody has, I'll push the minimal fix of resetting the state in
> > > ReleaseBulkInsertStatePin(), even though I think architecturally that's
> > > not great.
> > 
> > I spent some time working on a test that shows the problem more cheaply than
> > the case upthread. I think it'd be desirable to have a test that's likely to
> > catch an issue like this fairly quickly. We've had other problems in this
> > realm before - there's only a single test that fails if I remove the
> > ReleaseBulkInsertStatePin() call, and I don't think that's guaranteed at
> > all.
> > 
> > I'm a bit on the fence on how large to make the relation. For me the bug
> > triggers when filling both relations up to the 3rd block, but how many rows
> > that takes is somewhat dependent on space utilization on the page and stuff.
> > 
> > Right now the test uses data/desc.data and ends up with 328kB and 312kB in
> > two partitions. Alternatively I could make the test create a new file to
> > load with copy that has fewer rows than data/desc.data - I didn't see
> > another data file that works conveniently and has fewer rows.  The copy is
> > reasonably fast, even under something as expensive as rr (~60ms). So I'm
> > inclined to just go with that?
> 
> Patch with fix and test attached (0001).

Pushed that.

Greetings,

Andres




Re: PostgreSQL domains and NOT NULL constraint

2023-10-13 Thread Vik Fearing

On 10/13/23 06:37, Tom Lane wrote:

> Vik Fearing  writes:
>> Regardless of what the spec may or may not say about v1.d, it still
>> remains that nulls should not be allowed in a *base table* if the domain
>> says nulls are not allowed.  Not mentioned in this thread but the
>> constraints are also applied when CASTing to the domain.
>
> Hmph.  The really basic problem here, I think, is that the spec
> wants to claim that a domain is a data type, but then it backs
> off and limits where the domain's constraints need to hold.



I don't think that is an accurate depiction of domains.

First of all, I am not seeing where it says that a domain is a data 
type.  It allows domains to be used in some places where a data type is 
used, but that is not equivalent to a domain /being/ a data type.


Section 4.14 says, "A domain is a set of permissible values." and then 
goes on to say that that is a combination of a predefined type and zero 
or more search conditions.  It can also have a default value, but it 
does not seem relevant to talk about that in this discussion.


Section 4.25.4, "Domain constraints" has this to say (emphasis mine):

- A domain constraint is satisfied by SQL-data *if and only if*, for
every *table* T that has a column named C based on that domain, the
applicable <search condition> recorded in the appropriate domain
constraint usage evaluates to True or Unknown.

- A domain constraint is satisfied by the result of a <cast
specification> if and only if the specified template <search condition>,
with each occurrence of the <general value specification> VALUE replaced
by that result, evaluates to True or Unknown.


This tells me that the constraints should only be checked at those two 
points.


Secondly, why are you so concerned about outer join nulls here and not 
for any other column marked NOT NULL?




> That's fundamentally inconsistent.  It's like claiming that
> 'foobarbaz' is a valid value of type numeric as long as it's
> only in flight within a query and you haven't tried to store it
> into a table.



It's like claiming that null is a valid value of type numeric as long as 
it's only in flight within a query and you haven't tried to store it 
into a table with that column marked NOT NULL.




> Practical problems with this include:
> 
> * If a function declares its argument as being of a domain type,
> can it expect that the passed value obeys the constraints?
> 
> * If a function declares its result as being of a domain type,
> is it required to return a result that obeys the constraints?
> (This has particular force for RETURNS NULL ON NULL INPUT
> functions, for which we just automatically return NULL given
> a NULL input without any consideration of whether the result
> type nominally prohibits that.)
> 
> * If a plpgsql function has a variable that is declared to be of
> domain type, do we enforce the domain's constraints when assigning?



Routines are not allowed to have domains in their parameters or result 
types.


I am all for PostgreSQL expanding the spec wherever we can, but in the 
above cases we have to define things ourselves.




> * If a composite type has a column of a domain type, do we enforce
> the domain's constraints when assigning or casting to that?



I don't see that a composite type is able to have a member of a domain. 
As for what PostgreSQL should do in this case, my opinion is "yes".




> AFAICS, the spec's position leaves all of these as judgment calls,
> or else you might claim that none of the above cases are even allowed
> to be declared per spec.  I don't find either of those satisfactory,
> so I reiterate my position that the committee hasn't thought this
> through.



My claim is indeed that these cases are not allowed per-spec and 
therefore the spec doesn't *need* to think about them.  We do.




>> As you know, I am more than happy to (try to) amend the spec where
>> needed, but Erki's complaint of a null value being allowed in a base
>> table is clearly a bug in our implementation regardless of what we do
>> with views.
>
> I agree it's not a good behavior, but I still say it's traceable
> to schizophrenia in the spec.  If the result of a sub-select is
> nominally of a domain type, we should not have to recheck the
> domain constraints in order to assign it to a domain-typed target.



Well, yes, we should.

Allowing a null to be stored in a column where the user has specified 
NOT NULL, no matter how the user did that, is unacceptable and I am 
frankly surprised that you are defending it.




> If it's not nominally of a domain type, please cite chapter and
> verse that says it isn't.


I don't see anything for or against this, I just see that the domain 
constraints are only checked on storage or casting.


And therefore, I think with these definitions:

CREATE DOMAIN dom AS INTEGER CHECK (VALUE >= 0);
CREATE TABLE t (d dom);
INSERT INTO t (d) VALUES (1);

this should be valid according to the spec:

SELECT -d FROM t;

and this should error:

SELECT CAST(-d AS dom) FROM t;
--
Vik Fearing





Re: Fix output of zero privileges in psql

2023-10-13 Thread Erik Wienhold
On 2023-10-09 22:34 +0200, David G. Johnston wrote:
> On Mon, Oct 9, 2023 at 12:13 PM Tom Lane  wrote:
> > Yeah.  There is a lot of attraction in having \pset null affect these
> > displays just like all other ones.  The fact that they act differently
> > now is a wart, not something we should replace with a different special
> > case behavior.
> >
> > Also, I'm fairly concerned about not changing the default behavior here.
> > The fact that this behavior has stood for a couple dozen years without
> > many complaints indicates that there's not all that big a problem to be
> > solved.  I doubt that a new default behavior will be well received,
> > even if it's arguably better.
> >
> 
> My position is that the default behavior should be changed such that the
> distinction these reports need to make between empty privileges and default
> privileges is impossible for the user to remove.  I suppose the best direct
> solution given that goal is for the query to simply do away with any
> reliance on NULL being an indicator.  Output an i18n'd "no permissions
> present" line in the rare empty permissions situation and leave the empty
> string for built-in default.

My initial patch does exactly that.

> If the consensus is to not actually fix these views to make them
> environment invariant in their accuracy then so be it.  Having them obey
> \pset null seems like a half-measure but it at least is an improvement over
> the status quo such that users are capable of altering their system to make
> the reports distinguish the two cases if they realize the need.

I agree.

-- 
Erik




Re: Fix output of zero privileges in psql

2023-10-13 Thread Erik Wienhold
On 2023-10-09 10:29 +0200, Laurenz Albe wrote:
> On Sun, 2023-10-08 at 19:58 -0700, David G. Johnston wrote:
> > We probably should add the two terms to the glossary:
> > 
> > empty privileges: all privileges explicitly revoked from the owner and
> > PUBLIC (where applicable), and none otherwise granted.
> > 
> > built-in default privileges: owner having all privileges and no privileges
> > granted or removed via ALTER DEFAULT PRIVILEGES
> 
> "Empty privileges" are too unimportant to warrant an index entry.
> 
> I can see the value of an index entry
> 
> 
>  privilege
>  default
> 
> 
> Done in the attached v5 of the patch, even though this is not really
> the business of this patch.

+1
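
For anyone skimming, the two cases David defines above can be produced like
this (a sketch; table names invented):

    -- built-in default privileges: nothing granted or revoked yet
    CREATE TABLE t1 (a int);

    -- empty privileges: everything explicitly revoked, even from the owner
    CREATE TABLE t2 (a int);
    REVOKE ALL ON t2 FROM CURRENT_USER, PUBLIC;

Both currently show up as an empty Access privileges column in \dp, which is
exactly the ambiguity being discussed.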

-- 
Erik




Re: Fix output of zero privileges in psql

2023-10-13 Thread Erik Wienhold
On 2023-10-09 09:54 +0200, Laurenz Albe wrote:
> 
> I tinkered a bit with your documentation.  For example, the suggestion to
> "\pset null" seemed to be in an inappropriate place.  Tell me what you think.

+1  You're right to put that sentence right after the explanation of
empty privileges.

-- 
Erik




Re: On login trigger: take three

2023-10-13 Thread Alexander Korotkov
On Fri, Oct 13, 2023 at 11:26 AM Alexander Korotkov wrote:
> On Fri, Oct 13, 2023 at 4:18 AM Robert Haas  wrote:
> > On Thu, Oct 12, 2023 at 6:54 PM Alexander Korotkov wrote:
> > > On Thu, Oct 12, 2023 at 8:35 PM Robert Haas  wrote:
> >
> > > > Doesn't that mean that if you create the first login trigger in a
> > > > database and leave the transaction open, nobody can connect to that
> > > > database until the transaction ends?
> > >
> > > It doesn't mean that, because when trying to reset the flag v44 does a
> > > conditional lock.  So, if another transaction is holding the lock we
> > > will just skip resetting the flag.  So, the flag will be cleared on
> > > the first connection after that transaction ends.
> >
> > But in the scenario I am describing the flag is being set, not reset.
>
> Sorry, it seems I just missed some logical steps.  Imagine that
> transaction A created the first login trigger and hangs open.  Then
> the new connection B sees no visible triggers yet, but the dathasloginevt
> flag is set.  Therefore, connection B tries the conditional lock but just
> gives up because the lock is held by transaction A.
>
> Also, note that the lock is just some lock with a custom tag.
> It doesn't effectively block the database.  You may think of it as
> a custom advisory lock.

I've revised the comments about the lock a bit.  I've also run some
tests regarding the connection time (5 runs).

v45, event_triggers=on: avg=3.081ms, min=2.992ms, max=3.146ms
v45, event_triggers=off: avg=3.132ms, min=3.048ms, max=3.186ms
master: avg=3.080ms, min=3.023ms, max=3.167ms

So, no measurable overhead (not surprising since no extra catalog lookup).
I think this patch is in committable shape.  Any objections?
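
For context, the usage shape described in the commit message below would look
something like this (a sketch based on the patch, not its test suite):

    CREATE FUNCTION on_login_proc() RETURNS event_trigger AS $$
    BEGIN
        RAISE NOTICE 'you are welcome!';
    END;
    $$ LANGUAGE plpgsql;

    CREATE EVENT TRIGGER on_login_trigger ON login
        EXECUTE FUNCTION on_login_proc();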

--
Regards,
Alexander Korotkov
From 1d59d7fe6f510a4606600b9bf41b5dbaa6d2b283 Mon Sep 17 00:00:00 2001
From: Alexander Korotkov 
Date: Mon, 9 Oct 2023 15:00:26 +0300
Subject: [PATCH] Add support event triggers on authenticated login

This commit introduces a trigger on the login event, allowing some actions to
be fired right on user connection.  This can be useful for logging or
connection-check purposes as well as for some personalization of the
environment.  Usage details are described in the included documentation, but
in short, usage is the same as for other triggers: create a function returning
event_trigger and then create an event trigger on the login event.

In order to prevent connection-time overhead when there are no triggers, the
commit introduces the pg_database.dathasloginevt flag, which indicates that the
database has active login triggers.  This flag is set by the CREATE/ALTER EVENT
TRIGGER commands, and unset at connection time when no active triggers are
found.

Author: Konstantin Knizhnik, Mikhail Gribkov
Discussion: https://postgr.es/m/0d46d29f-4558-3af9-9c85-7774e14a7709%40postgrespro.ru
Reviewed-by: Pavel Stehule, Takayuki Tsunakawa, Greg Nancarrow, Ivan Panchenko
Reviewed-by: Daniel Gustafsson, Teodor Sigaev, Robert Haas, Andres Freund
Reviewed-by: Tom Lane, Andrey Sokolov, Zhihong Yu, Sergey Shinderuk
Reviewed-by: Gregory Stark, Nikita Malakhov, Ted Yu
---
 doc/src/sgml/bki.sgml |   2 +-
 doc/src/sgml/catalogs.sgml|  13 ++
 doc/src/sgml/ecpg.sgml|   2 +
 doc/src/sgml/event-trigger.sgml   |  94 +
 src/backend/commands/dbcommands.c |  17 +-
 src/backend/commands/event_trigger.c  | 179 +-
 src/backend/storage/lmgr/lmgr.c   |  38 
 src/backend/tcop/postgres.c   |   4 +
 src/backend/utils/cache/evtcache.c|   2 +
 src/backend/utils/init/globals.c  |   2 +
 src/backend/utils/init/postinit.c |   1 +
 src/bin/pg_dump/pg_dump.c |   5 +
 src/bin/psql/tab-complete.c   |   4 +-
 src/include/catalog/pg_database.dat   |   2 +-
 src/include/catalog/pg_database.h |   3 +
 src/include/commands/event_trigger.h  |   1 +
 src/include/miscadmin.h   |   2 +
 src/include/storage/lmgr.h|   2 +
 src/include/tcop/cmdtaglist.h |   1 +
 src/include/utils/evtcache.h  |   3 +-
 .../authentication/t/005_login_trigger.pl | 157 +++
 src/test/recovery/t/001_stream_rep.pl |  23 +++
 src/test/regress/expected/event_trigger.out   |  45 +
 src/test/regress/sql/event_trigger.sql|  26 +++
 24 files changed, 608 insertions(+), 20 deletions(-)
 create mode 100644 src/test/authentication/t/005_login_trigger.pl

diff --git a/doc/src/sgml/bki.sgml b/doc/src/sgml/bki.sgml
index f71644e3989..315ba819514 100644
--- a/doc/src/sgml/bki.sgml
+++ b/doc/src/sgml/bki.sgml
@@ -184,7 +184,7 @@
   descr => 'database\'s default template',
   datname => 'template1', encoding => 'ENCODING',
   datlocprovider => 'LOCALE_PROVIDER', datistemplate => 't',
-  datallowconn => 't', datconnlimit => '-1', datfrozenxid => '0',
+  

Re: LLVM 16 (opaque pointers)

2023-10-13 Thread Thomas Munro
On Sat, Oct 14, 2023 at 3:56 AM Andres Freund  wrote:
> On 2023-10-13 16:44:13 +0200, Dmitry Dolgov wrote:
> > Here is what I had in mind (only this part in the second patch was changed).
>
> Makes sense to me. I think we'll likely eventually want to use a custom
> pipeline anyway, and I think we should consider using an optimization level
> inbetween "not at all" "as hard as possible"...

Thanks Dmitry and Andres.  I'm planning to commit these today if there
are no further comments.




Re: LLVM 16 (opaque pointers)

2023-10-13 Thread Thomas Munro
On Wed, Oct 11, 2023 at 10:31 PM Ronan Dunklau  wrote:
> On Wednesday, 11 October 2023 at 10:59:50 CEST, Thomas Munro wrote:
> > The back-patch to 12 was a little trickier than anticipated, but after
> > taking a break and trying again I now have PG 12...17 patches that
> > I've tested against LLVM 10...18 (that's 54 combinations), in every
> > case only with the clang corresponding to LLVM.
>
> Thank you Thomas for those patches, and the extensive testing, I will run my
> own and let you know.

Thanks!  No news is good news, I hope?  I'm hoping to commit this today.

> > I've attached only the patches for master, but the 12-16 versions are
> > available at https://github.com/macdice/postgres/tree/llvm16-$N in
> > case anyone has comments on those.
>
> For PG13 and PG12, it looks like the ExecEvalBoolSubroutineTemplate is not
> used anywhere, as ExecEvalBoolSubroutine was introduced in PG14 if I'm not
> mistaken.

Right, looks like I can remove that in those branches.




Re: should frontend tools use syncfs() ?

2023-10-13 Thread Nathan Bossart
On Mon, Oct 09, 2023 at 02:34:27PM -0500, Nathan Bossart wrote:
> On Mon, Oct 09, 2023 at 11:14:39AM -0500, Nathan Bossart wrote:
>> Thanks.  I've made a couple of small changes, but otherwise I think this
>> one is just about ready.
> 
> I forgot to rename one thing.  Here's a v13 with that fixed.

Committed.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com




Re: Questions about the new subscription parameter: password_required

2023-10-13 Thread Jeff Davis
On Fri, 2023-10-13 at 11:18 +0200, Benoit Lobréau wrote:
> I tried adding a section in "Logical Replication > Subscription" with
> the text you suggested and links in the CREATE / ALTER SUBSCRIPTION
> commands.
> 
> Is it better?


Minor comments:

 * Use possessive "its" instead of the contraction, i.e. "before
transferring its ownership".
 * I like that the docs cover the case where a password is specified but
the remote server doesn't require one. But the warning is the wrong
place to explain that; it should be in the main behavioral description
in 31.2.2.
 * The warning feels like it has too many negatives and confused me at
first. I struggled myself a bit to come up with something less
confusing, but perhaps less is more: "Ensure that password_required is
properly set before transferring ownership of a subscription to a non-
superuser, otherwise the subscription may start to fail."
 * Missing space in the warning after "password_required = true"
 * Mention that a non-superuser-owned subscription with
password_required = false is partially locked down, e.g. the owner
can't change the connection string any more.
 * 31.2.2 could probably be in the CREATE SUBSCRIPTION docs instead,
and linked from the ALTER docs. That's fairly normal for other commands
and I'm not sure there needs to be a separate section in logical
replication. I don't have a strong opinion here.
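
To make the failure mode behind that warning concrete, a sketch (subscription
and role names invented):

    -- password_required defaults to true; if the subscription's connection
    -- string contains no password, this ownership change will make the
    -- subscription start to fail:
    ALTER SUBSCRIPTION mysub OWNER TO regular_user;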

I like the changes; this is a big improvement. I'll leave it to Robert
to commit it, so that he can ensure it matches how he expected the
feature to be used and sufficiently covers the behavioral aspects.

Regards,
Jeff Davis





Re: Performance degradation on concurrent COPY into a single relation in PG16.

2023-10-13 Thread Andres Freund
Hi,

On 2023-10-13 10:39:10 -0700, Andres Freund wrote:
> On 2023-10-12 09:24:19 -0700, Andres Freund wrote:
> > I kind of had hoped somebody would comment on the approach.  Given that
> > nobody has, I'll push the minimal fix of resetting the state in
> > ReleaseBulkInsertStatePin(), even though I think architecturally that's
> > not great.
> 
> I spent some time working on a test that shows the problem more cheaply than
> the case upthread. I think it'd be desirable to have a test that's likely to
> catch an issue like this fairly quickly. We've had other problems in this
> realm before - there's only a single test that fails if I remove the
> ReleaseBulkInsertStatePin() call, and I don't think that's guaranteed at all.
> 
> I'm a bit on the fence on how large to make the relation. For me the bug
> triggers when filling both relations up to the 3rd block, but how many rows
> that takes is somewhat dependent on space utilization on the page and stuff.
> 
> Right now the test uses data/desc.data and ends up with 328kB and 312kB in two
> partitions. Alternatively I could make the test create a new file to load with
> copy that has fewer rows than data/desc.data - I didn't see another data file
> that works conveniently and has fewer rows.  The copy is reasonably fast, even
> under something as expensive as rr (~60ms). So I'm inclined to just go with
> that?

Patch with fix and test attached (0001).

Given how easily a missing ReleaseBulkInsertStatePin() can cause corruption (not
due to this bug, but the issue fixed in b1ecb9b3fcf), I think we should
consider adding an assertion along the lines of 0002 to HEAD. Perhaps adding a
new bufmgr.c function to avoid having to get the fields in the buffer tag we
don't care about. Or perhaps we should just promote the check to an elog; we
already call BufferGetBlockNumber(), and using BufferGetTag() instead doesn't
cost much more.

Greetings,

Andres Freund
>From e202cf5c3cae0551ae8fabd2629b4f9c6a0734d5 Mon Sep 17 00:00:00 2001
From: Andres Freund 
Date: Fri, 13 Oct 2023 10:54:16 -0700
Subject: [PATCH v1 1/2] Fix bulk table extension when copying into multiple
 partitions

When COPYing into a partitioned table that does not permit the use of
table_multi_insert(), we could error out with
  ERROR: could not read block NN in file "base/...": read only 0 of 8192 bytes

because BulkInsertState->next_free was not reset between partitions. This
problem occurred only when not able to use table_multi_insert(), as a dedicated
BulkInsertState for each partition is used in that case.

The bug was introduced in 00d1e02be24, but it was hard to hit at that point as
commonly bulk relation extension is not used when not using
table_multi_insert(). It became more likely after 82a4edabd27, which expanded
the use of bulk extension.

To fix the bug, reset the bulk relation extension state in BulkInsertState in
ReleaseBulkInsertStatePin(). That was added (in b1ecb9b3fcf) to tackle a very
similar issue.  Obviously the name is not quite correct, but there might be
external callers, and bulk insert state needs to be reset precisely in the
situations in which ReleaseBulkInsertStatePin() already needed to be called.

Medium term the better fix likely is to disallow reusing BulkInsertState
across relations.

Add a test that, without the fix, reproduces #18130 in most
configurations. The test also catches the problem fixed in b1ecb9b3fcf when
run with small shared_buffers.

Reported-by: Ivan Kolombet 
Analyzed-by: Tom Lane 
Analyzed-by: Andres Freund 
Bug: #18130
Discussion: https://postgr.es/m/18130-7a86a7356a75209d%40postgresql.org
Discussion: https://postgr.es/m/257696.1695670946%40sss.pgh.pa.us
Backpatch: 16-
---
 src/backend/access/heap/heapam.c   | 11 +
 src/test/regress/expected/copy.out | 37 ++
 src/test/regress/sql/copy.sql  | 37 ++
 3 files changed, 85 insertions(+)

diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index 88a123d38a6..51eceaca6cf 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -1792,6 +1792,17 @@ ReleaseBulkInsertStatePin(BulkInsertState bistate)
 	if (bistate->current_buf != InvalidBuffer)
 		ReleaseBuffer(bistate->current_buf);
 	bistate->current_buf = InvalidBuffer;
+
+	/*
+	 * Despite the name, we also release information about bulk
+	 * inserts. Otherwise we can end up erroring out due to looking for free
+	 * space in ->next_free of one partition, even though ->next_free was set
+	 * when extending another partition. It obviously could also be bad for
+	 * efficiency to look at existing blocks at offsets from another
+	 * partition, even if we don't error out.
+	 */
+	bistate->next_free = InvalidBlockNumber;
+	bistate->last_free = InvalidBlockNumber;
 }
 
 
diff --git a/src/test/regress/expected/copy.out b/src/test/regress/expected/copy.out
index a5912c13a8c..b48365ec981 100644
--- a/src/test/regress/expected/copy.out
+++ 

Re: Performance degradation on concurrent COPY into a single relation in PG16.

2023-10-13 Thread Andres Freund
Hi,

On 2023-10-12 09:24:19 -0700, Andres Freund wrote:
> On 2023-10-12 11:44:09 -0400, Tom Lane wrote:
> > Andres Freund  writes:
> > >> On 2023-09-25 15:42:26 -0400, Tom Lane wrote:
> > >>> I just did a git bisect run to discover when the failure documented
> > >>> in bug #18130 [1] started.  And the answer is commit 82a4edabd.
> > 
> > > Uh, huh.  The problem is that COPY uses a single BulkInsertState for
> > > multiple partitions. Which to me seems to run counter to the following
> > > comment:
> > >  *	The caller can also provide a BulkInsertState object to optimize many
> > >  *	insertions into the same relation.  This keeps a pin on the current
> > >  *	insertion target page (to save pin/unpin cycles) and also passes a
> > >  *	BULKWRITE buffer selection strategy object to the buffer manager.
> > >  *	Passing NULL for bistate selects the default behavior.
> > 
> > > The reason this doesn't cause straight up corruption due to reusing a pin
> > > from another relation is that b1ecb9b3fcfb added
> > > ReleaseBulkInsertStatePin() and a call to it. But I didn't make
> > > ReleaseBulkInsertStatePin() reset the bulk insertion state, which is what
> > > leads to the errors from the bug report.
> > 
> > > Resetting the relevant BulkInsertState fields fixes the problem. But I'm
> > > not sure that's the right fix. ISTM that independent of whether we fix
> > > this via ReleaseBulkInsertStatePin() resetting the fields or via not
> > > reusing BulkInsertState, we should add assertions defending against
> > > future issues like this (e.g. by adding a relation field to
> > > BulkInsertState in cassert builds, and asserting that the relation is the
> > > same as in prior calls unless ReleaseBulkInsertStatePin() has been
> > > called).
> > 
> > Ping?  We really ought to have a fix for this committed in time for
> > 16.1.
> 
> I kind of had hoped somebody would comment on the approach.  Given that nobody
> has, I'll push the minimal fix of resetting the state in
> ReleaseBulkInsertStatePin(), even though I think architecturally that's not
> great.

I spent some time working on a test that shows the problem more cheaply than
the case upthread. I think it'd be desirable to have a test that's likely to
catch an issue like this fairly quickly. We've had other problems in this
realm before - there's only a single test that fails if I remove the
ReleaseBulkInsertStatePin() call, and I don't think that's guaranteed at all.

I'm a bit on the fence on how large to make the relation. For me the bug
triggers when filling both relations up to the 3rd block, but how many rows
that takes is somewhat dependent on space utilization on the page and stuff.

Right now the test uses data/desc.data and ends up with 328kB and 312kB in two
partitions. Alternatively I could make the test create a new file to load with
copy that has fewer rows than data/desc.data - I didn't see another data file
that works conveniently and has fewer rows.  The copy is reasonably fast, even
under something as expensive as rr (~60ms). So I'm inclined to just go with
that?

Greetings,

Andres Freund




Re: pg_stat_statements and "IN" conditions

2023-10-13 Thread Nathan Bossart
On Tue, Jul 04, 2023 at 09:02:56PM +0200, Dmitry Dolgov wrote:
>> On Mon, Jul 03, 2023 at 09:46:11PM -0700, Nathan Bossart wrote:
>> Also, it seems counterintuitive that queries with fewer than 10
>> constants are not merged.
> 
> Why? What would be your intuition using this feature?

For the "powers" setting, I would've expected queries with 0-9 constants to
be merged.  Then 10-99, 100-999, 1000-9999, etc.  I suppose there might be
an argument for separating 0 from 1-9, too.
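
As an illustration of that bucketing (examples mine, not the patch's actual
behavior):

    -- these two would merge into one pg_stat_statements entry (1-9 constants):
    SELECT * FROM t WHERE x IN (1, 2, 3);
    SELECT * FROM t WHERE x IN (4, 5, 6, 7);

    -- this one would land in the next bucket (10-99 constants):
    SELECT * FROM t WHERE x IN (1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11);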

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com




Re: BRIN minmax multi - incorrect distance for infinite timestamp/date

2023-10-13 Thread Dean Rasheed
On Fri, 13 Oct 2023 at 13:17, Tomas Vondra wrote:
>
> I do plan to backpatch this, yes. I don't think there are many people
> affected by this (few people are using infinite dates/timestamps, but
> maybe the overflow could be more common).
>

OK, though I doubt that such values are common in practice.

There's also an overflow problem in
brin_minmax_multi_distance_interval() though, and I think that's worse
because overflows there throw "interval out of range" errors, which
can prevent index creation or inserts.

There's a patch (0009 in [1]) as part of the infinite interval work,
but that just uses interval_mi(), and so doesn't fix the
interval-out-of-range errors, except for infinite intervals, which are
treated as special cases, which I don't think is really necessary.

I think this should be rewritten to compute delta from ia and ib
without going via an intermediate Interval value. I.e., instead of
computing "result", just do something like

dayfraction = (ib->time % USECS_PER_DAY) - (ia->time % USECS_PER_DAY);
days = (ib->time / USECS_PER_DAY) - (ia->time / USECS_PER_DAY);
days += (int64) ib->day - (int64) ia->day;
days += ((int64) ib->month - (int64) ia->month) * INT64CONST(30);

then convert to double precision as it does now:

delta = (double) days + dayfraction / (double) USECS_PER_DAY;

So the first part is exact 64-bit integer arithmetic, with no chance
of overflow, and it'll handle "infinite" intervals just fine, when
that feature gets added.

Regards,
Dean

[1] 
https://www.postgresql.org/message-id/CAExHW5u1JE7dxK=wlzqhcszntoxqzjdiermhrepw6r8w6kc...@mail.gmail.com




Re: LLVM 16 (opaque pointers)

2023-10-13 Thread Andres Freund
On 2023-10-13 16:44:13 +0200, Dmitry Dolgov wrote:
> Here is what I had in mind (only this part in the second patch was changed).

Makes sense to me. I think we'll likely eventually want to use a custom
pipeline anyway, and I think we should consider using an optimization level
in between "not at all" and "as hard as possible"...

Greetings,

Andres Freund




Re: LLVM 16 (opaque pointers)

2023-10-13 Thread Andres Freund
Hi,

On 2023-10-13 11:06:21 +0200, Dmitry Dolgov wrote:
> > On Thu, Oct 12, 2023 at 04:31:20PM -0700, Andres Freund wrote:
> > I also don't think we should add the mem2reg pass outside of -O0 - running
> > it after a real optimization pipeline doesn't seem useful and might even
> > make the code worse? mem2reg is included in default (and obviously also in
> > O3).
> 
> My understanding was that while mem2reg is included everywhere above
> -O0, this set of passes won't hurt. But yeah, if you say it could
> degrade the final result, it's better to not do this. I'll update this
> part.

It's indeed included anywhere above that, but adding it explicitly to the
schedule means it's executed twice:

echo 'int foo(int a) { return a / 343; }' | clang-16 -emit-llvm -x c -c -o - -S - \
  | sed -e 's/optnone//' \
  | opt-17 -debug-pass-manager -passes='default<O3>,mem2reg' -o /dev/null 2>&1 \
  | grep Promote
Running pass: PromotePass on foo (2 instructions)
Running pass: PromotePass on foo (2 instructions)

The second one is at a point in the pipeline where it doesn't help. It also
requires another analysis pass to be executed unnecessarily.

Greetings,

Andres Freund




Re: LLVM 16 (opaque pointers)

2023-10-13 Thread Dmitry Dolgov
> On Fri, Oct 13, 2023 at 11:06:21AM +0200, Dmitry Dolgov wrote:
> > On Thu, Oct 12, 2023 at 04:31:20PM -0700, Andres Freund wrote:
> > I don't think the "function(no-op-function),no-op-module" bit does something
> > particularly useful?
>
> Right, looks like leftovers after verifying which passes were actually
> applied. My bad, could be removed.
>
> > I also don't think we should add the mem2reg pass outside of -O0 - running
> > it after a real optimization pipeline doesn't seem useful and might even
> > make the code worse? mem2reg is included in default (and obviously also in
> > O3).
>
> My understanding was that while mem2reg is included everywhere above
> -O0, this set of passes won't hurt. But yeah, if you say it could
> degrade the final result, it's better to not do this. I'll update this
> part.

Here is what I had in mind (only this part in the second patch was changed).
>From 040121be6d150e8f18f154a64b409baef2b15ffb Mon Sep 17 00:00:00 2001
From: Thomas Munro 
Date: Mon, 9 May 2022 08:27:47 +1200
Subject: [PATCH v4 1/2] jit: Support opaque pointers in LLVM 16.

Remove use of LLVMGetElementType() and provide the type of all pointers
to LLVMBuildXXX() functions when emitting IR, as required by modern LLVM
versions[1].

 * For LLVM <= 14, we'll still use the old LLVMBuildXXX() functions.
 * For LLVM == 15, we'll continue to do the same, explicitly opting
   out of opaque pointer mode.
 * For LLVM >= 16, we'll use the new LLVMBuildXXX2() functions that take
   the extra type argument.

The difference is hidden behind some new IR emitting wrapper functions
l_load(), l_gep(), l_call() etc.  The change is mostly mechanical,
except that at each site the correct type had to be provided.

In some places we needed to do some extra work to get function types,
including some new wrappers for C++ APIs that are not yet exposed in
LLVM's C API, and some new "example" functions in llvmjit_types.c
because it's no longer possible to start from the function pointer type
and ask for the function type.

Back-patch to 12, because it's a little trickier in 11 and we agreed not
to put the latest LLVM support into the upcoming final release of 11.

[1] https://llvm.org/docs/OpaquePointers.html

Reviewed-by: Dmitry Dolgov <9erthali...@gmail.com>
Reviewed-by: Ronan Dunklau 
Reviewed-by: Andres Freund 
Discussion: https://postgr.es/m/CA%2BhUKGKNX_%3Df%2B1C4r06WETKTq0G4Z_7q4L4Fxn5WWpMycDj9Fw%40mail.gmail.com
---
 src/backend/jit/llvm/llvmjit.c|  59 ++--
 src/backend/jit/llvm/llvmjit_deform.c | 119 
 src/backend/jit/llvm/llvmjit_expr.c   | 401 --
 src/backend/jit/llvm/llvmjit_types.c  |  39 ++-
 src/backend/jit/llvm/llvmjit_wrap.cpp |  12 +
 src/backend/jit/llvm/meson.build  |   2 +-
 src/include/jit/llvmjit.h |   7 +
 src/include/jit/llvmjit_emit.h| 106 +--
 8 files changed, 481 insertions(+), 264 deletions(-)

diff --git a/src/backend/jit/llvm/llvmjit.c b/src/backend/jit/llvm/llvmjit.c
index 4dfaf797432..1c8e1ab3a76 100644
--- a/src/backend/jit/llvm/llvmjit.c
+++ b/src/backend/jit/llvm/llvmjit.c
@@ -64,8 +64,10 @@ LLVMTypeRef TypeStorageBool;
 LLVMTypeRef TypePGFunction;
 LLVMTypeRef StructNullableDatum;
 LLVMTypeRef StructHeapTupleData;
+LLVMTypeRef StructMinimalTupleData;
 LLVMTypeRef StructTupleDescData;
 LLVMTypeRef StructTupleTableSlot;
+LLVMTypeRef StructHeapTupleHeaderData;
 LLVMTypeRef StructHeapTupleTableSlot;
 LLVMTypeRef StructMinimalTupleTableSlot;
 LLVMTypeRef StructMemoryContextData;
@@ -76,8 +78,11 @@ LLVMTypeRef StructExprState;
 LLVMTypeRef StructAggState;
 LLVMTypeRef StructAggStatePerGroupData;
 LLVMTypeRef StructAggStatePerTransData;
+LLVMTypeRef StructPlanState;
 
 LLVMValueRef AttributeTemplate;
+LLVMValueRef ExecEvalSubroutineTemplate;
+LLVMValueRef ExecEvalBoolSubroutineTemplate;
 
 static LLVMModuleRef llvm_types_module = NULL;
 
@@ -451,11 +456,7 @@ llvm_pg_var_type(const char *varname)
 	if (!v_srcvar)
 		elog(ERROR, "variable %s not in llvmjit_types.c", varname);
 
-	/* look at the contained type */
-	typ = LLVMTypeOf(v_srcvar);
-	Assert(typ != NULL && LLVMGetTypeKind(typ) == LLVMPointerTypeKind);
-	typ = LLVMGetElementType(typ);
-	Assert(typ != NULL);
+	typ = LLVMGlobalGetValueType(v_srcvar);
 
 	return typ;
 }
@@ -467,12 +468,14 @@ llvm_pg_var_type(const char *varname)
 LLVMTypeRef
 llvm_pg_var_func_type(const char *varname)
 {
-	LLVMTypeRef typ = llvm_pg_var_type(varname);
+	LLVMValueRef v_srcvar;
+	LLVMTypeRef typ;
+
+	v_srcvar = LLVMGetNamedFunction(llvm_types_module, varname);
+	if (!v_srcvar)
+		elog(ERROR, "function %s not in llvmjit_types.c", varname);
 
-	/* look at the contained type */
-	Assert(LLVMGetTypeKind(typ) == LLVMPointerTypeKind);
-	typ = LLVMGetElementType(typ);
-	Assert(typ != NULL && LLVMGetTypeKind(typ) == LLVMFunctionTypeKind);
+	typ = LLVMGetFunctionType(v_srcvar);
 
 	return typ;
 }
@@ -502,7 +505,7 @@ llvm_pg_func(LLVMModuleRef mod, const char *funcname)
 
 	v_fn = LLVMAddFunction(mod,
 			

Re: odd buildfarm failure - "pg_ctl: control file appears to be corrupt"

2023-10-13 Thread David Steele

On 10/12/23 19:15, Michael Paquier wrote:

> On Thu, Oct 12, 2023 at 10:41:39AM -0400, David Steele wrote:
>> After some more thought, I think we could massage the "pg_control in
>> backup_label" method into something that could be back patched, with more
>> advanced features (e.g. error on backup_label and pg_control both present on
>> initial cluster start) saved for HEAD.
>
> I doubt that anything changed in this area would be in the
> backpatchable zone, particularly as it would involve protocol changes
> within the replication commands, so I'd recommend to focus on HEAD.


I can't see why there would be any protocol changes, but perhaps I am 
missing something?


One thing that does have to change, however, is the ordering of 
backup_label in the base tar file. Right now it is at the beginning but 
we need it to be at the end like pg_control is now.


I'm working up a POC patch now and hope to have something today or 
tomorrow. I think it makes sense to at least have a look at an 
alternative solution before going forward.


Regards,
-David





Re: pg_stat_statements and "IN" conditions

2023-10-13 Thread Dmitry Dolgov
> On Fri, Oct 13, 2023 at 05:07:00PM +0900, Michael Paquier wrote:
> Now, it doesn't mean that this approach with the "powers" will never
> happen, but based on the set of opinions I am gathering on this thread
> I would suggest to rework the patch as follows:
> - First implement an on/off switch that reduces the lists in IN and/or
> ANY to one parameter.  Simply.
> - Second, refactor the powers routine.
> - Third, extend the on/off switch, or just implement a threshold with
> a second switch.

Well, if it will help move this patch forward, why not. To clarify, I'm
going to split the current implementation into three patches, one for
each point you've mentioned.

> When it comes to my opinion, I am not seeing any objections to the
> feature as a whole, and I'm OK with the first step.  I'm also OK to
> keep the door open for more improvements in controlling how these
> IN/ANY lists show up, but there could be more than just the number of
> items as parameter (say the query size, different behaviors depending
> on the number of clauses in queries, subquery context or CTEs/WITH,
> etc. just to name a few things coming in mind).

Interesting point, but now it's my turn to have trouble imagining a
case where the list representation could be controlled by something
other than the number of elements in it. Do you have any specific
example in mind?




Re: PostgreSQL domains and NOT NULL constraint

2023-10-13 Thread Erki Eessaar
Hello

Equating a domain with a type is really confusing because then, for instance,
the following is possible without defining any additional operators.

CREATE DOMAIN d_name VARCHAR(50) NOT NULL;
CREATE DOMAIN d_description VARCHAR(1000) NOT NULL;
CREATE TABLE x(name d_name, description d_description);
SELECT *
FROM x
WHERE name=description;

Isn't it so that domains are not types and for this reason there are separate 
CREATE DOMAIN and CREATE TYPE statements?!

In my opinion the Notes section of the CREATE DOMAIN documentation should offer
better examples. The two examples that I provided in my demonstration seemed
very far-fetched and artificial. Frankly, I have difficulty imagining why
someone would want to write statements like that in a production environment,
and how the proper enforcement of NOT NULL constraints of domains could break
things.

Let's say I have a column that I have declared mandatory by using a domain, but
somehow I have added NULLs to the column; if that is not possible any more,
then things break down.

If I want to permit NULLs, then ALTER DOMAIN d DROP NOT NULL; will fix it with 
one stroke. If I do not want to permit NULLs but I have registered NULLs, then 
this is a data quality issue that has to be addressed.

Currently there is a feature (NOT NULL of a domain) that the documentation
explicitly suggests not to use. Isn't it better in this case to remove the
feature completely?! If that would break something, then it would mean that
systems actually rely on this constraint.

Best regards
Erki Eessaar


From: Tom Lane 
Sent: Friday, October 13, 2023 08:37
To: Vik Fearing 
Cc: Erki Eessaar ; pgsql-hackers@lists.postgresql.org 

Subject: Re: PostgreSQL domains and NOT NULL constraint

Vik Fearing  writes:
> Regardless of what the spec may or may not say about v1.d, it still
> remains that nulls should not be allowed in a *base table* if the domain
> says nulls are not allowed.  Not mentioned in this thread but the
> constraints are also applied when CASTing to the domain.

Hmph.  The really basic problem here, I think, is that the spec
wants to claim that a domain is a data type, but then it backs
off and limits where the domain's constraints need to hold.
That's fundamentally inconsistent.  It's like claiming that
'foobarbaz' is a valid value of type numeric as long as it's
only in flight within a query and you haven't tried to store it
into a table.

Practical problems with this include:

* If a function declares its argument as being of a domain type,
can it expect that the passed value obeys the constraints?

* If a function declares its result as being of a domain type,
is it required to return a result that obeys the constraints?
(This has particular force for RETURNS NULL ON NULL INPUT
functions, for which we just automatically return NULL given
a NULL input without any consideration of whether the result
type nominally prohibits that.)

* If a plpgsql function has a variable that is declared to be of
domain type, do we enforce the domain's constraints when assigning?

* If a composite type has a column of a domain type, do we enforce
the domain's constraints when assigning or casting to that?
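
As a concrete sketch of the second point (hypothetical names, purely
for illustration):

    CREATE DOMAIN nn_int AS int NOT NULL;
    CREATE FUNCTION f(int) RETURNS nn_int
        RETURNS NULL ON NULL INPUT
        LANGUAGE sql AS 'SELECT $1';
    SELECT f(NULL);   -- returns NULL despite the domain's NOT NULL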

AFAICS, the spec's position leaves all of these as judgment calls,
or else you might claim that none of the above cases are even allowed
to be declared per spec.  I don't find either of those satisfactory,
so I reiterate my position that the committee hasn't thought this
through.

> As you know, I am more than happy to (try to) amend the spec where
> needed, but Erki's complaint of a null value being allowed in a base
> table is clearly a bug in our implementation regardless of what we do
> with views.

I agree it's not a good behavior, but I still say it's traceable
to schizophrenia in the spec.  If the result of a sub-select is
nominally of a domain type, we should not have to recheck the
domain constraints in order to assign it to a domain-typed target.
If it's not nominally of a domain type, please cite chapter and
verse that says it isn't.

regards, tom lane


Re: [RFC] Add jit deform_counter

2023-10-13 Thread Daniel Gustafsson
> On 12 Oct 2023, at 15:40, Daniel Gustafsson  wrote:
>> On 12 Oct 2023, at 15:37, Nazir Bilal Yavuz  wrote:

>> I realized that pg_stat_statements is bumped to 1.11 with this patch
>> but oldextversions test is not updated. So, I attached a patch for
>> updating oldextversions.
> 
> Thanks for the patch, that was an oversight in the original commit for this.
> From a quick look it seems correct, I'll have another look later today and 
> will
> then apply it.

Applied, thanks!

--
Daniel Gustafsson





Re: BRIN minmax multi - incorrect distance for infinite timestamp/date

2023-10-13 Thread Tomas Vondra
On 10/13/23 14:04, Dean Rasheed wrote:
> On Fri, 13 Oct 2023 at 11:44, Tomas Vondra
>  wrote:
>>
>> On 10/13/23 11:21, Dean Rasheed wrote:
>>>
>>> Is this only inefficient? Or can it also lead to wrong query results?
>>
>> I don't think it can produce incorrect results. It only affects which
>> values we "merge" into an interval when building the summaries.
>>
> 
> Ah, I get it now. These "distance" support functions are only used to
> see how far apart 2 ranges are, for the purposes of the algorithm that
> merges the 2 closest ranges. So if it gets it wrong, it only leads to
> a poor choice of ranges to merge, making the query inefficient, but
> still correct.
> 

Right.

> Presumably, that also makes this kind of change safe to back-patch
> (not sure if you were planning to do that?), since it will only affect
> range merging choices when inserting new values into existing indexes.
> 

I do plan to backpatch this, yes. I don't think there are many people
affected by this (few people are using infinite dates/timestamps, but
maybe the overflow could be more common).


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: BRIN minmax multi - incorrect distance for infinite timestamp/date

2023-10-13 Thread Dean Rasheed
On Fri, 13 Oct 2023 at 11:44, Tomas Vondra
 wrote:
>
> On 10/13/23 11:21, Dean Rasheed wrote:
> >
> > Is this only inefficient? Or can it also lead to wrong query results?
>
> I don't think it can produce incorrect results. It only affects which
> values we "merge" into an interval when building the summaries.
>

Ah, I get it now. These "distance" support functions are only used to
see how far apart 2 ranges are, for the purposes of the algorithm that
merges the 2 closest ranges. So if it gets it wrong, it only leads to
a poor choice of ranges to merge, making the query inefficient, but
still correct.

Presumably, that also makes this kind of change safe to back-patch
(not sure if you were planning to do that?), since it will only affect
range merging choices when inserting new values into existing indexes.

Regards,
Dean




RE: pg_upgrade's interaction with pg_resetwal seems confusing

2023-10-13 Thread Hayato Kuroda (Fujitsu)
Dear hackers,

> >
> > > >  I mean instead of resetwal directly modifying the control file we
> > > > will modify that value in the server using the binary_upgrade function
> > > > and then have that value flush to the disk by shutdown checkpoint.
> > > >
> > >
> > > True, the alternative to consider is to let pg_upgrade update the
> > > control file by itself with the required value of OID. The point I am
> > > slightly worried about doing via server-side function is that some
> > > online and or shutdown checkpoint can update other values in the
> > > control file as well whereas if we do via pg_upgrade, we can have
> > > better control over just updating the OID.
> >
> > Yeah, thats a valid point.
> >
> 
> But OTOH, just updating the control file via pg_upgrade may not be
> sufficient because we should keep the shutdown checkpoint record also
> updated with that value. See how pg_resetwal achieves it via
> RewriteControlFile() and WriteEmptyXLOG(). So, considering both the
> approaches it seems better to go with a server-side function as Robert
> suggested.

Based on these discussions, I implemented a server-side approach. The attached 
patch adds an upgrade function which sets ShmemVariableCache->nextOid. It is 
called at the end of the upgrade. The comments and the name of 
issue_warnings_and_set_wal_level() are also updated because they had become 
outdated.
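
Roughly along these lines (a sketch, not the exact patch; the function
name is illustrative):

Datum
binary_upgrade_set_next_oid(PG_FUNCTION_ARGS)
{
	Oid		next_oid = PG_GETARG_OID(0);

	CHECK_IS_BINARY_UPGRADE;

	/* like SetNextObjectId(), but invoked near the end of pg_upgrade */
	LWLockAcquire(OidGenLock, LW_EXCLUSIVE);
	ShmemVariableCache->nextOid = next_oid;
	ShmemVariableCache->oidCount = 0;
	LWLockRelease(OidGenLock);

	PG_RETURN_VOID();
}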

Best Regards,
Hayato Kuroda
FUJITSU LIMITED


0001-pg_upgrade-use-upgrade-function-to-restore-OID.patch
Description: 0001-pg_upgrade-use-upgrade-function-to-restore-OID.patch


Re: BRIN minmax multi - incorrect distance for infinite timestamp/date

2023-10-13 Thread Tomas Vondra


On 10/13/23 11:21, Dean Rasheed wrote:
> On Thu, 12 Oct 2023 at 23:43, Tomas Vondra
>  wrote:
>>
>> Ashutosh Bapat reported me off-list a possible issue in how BRIN
>> minmax-multi calculate distance for infinite timestamp/date values.
>>
>> The current code does this:
>>
>> if (TIMESTAMP_NOT_FINITE(dt1) || TIMESTAMP_NOT_FINITE(dt2))
>> PG_RETURN_FLOAT8(0);
>>
> 
> Yes indeed, that looks wrong. I noticed the same thing while reviewing
> the infinite interval patch.
> 
>> so means infinite values are "very close" to any other value, and thus
>> likely to be merged into a summary range. That's exactly the opposite of
>> what we want to do, possibly resulting in inefficient indexes.
>>
> 
> Is this only inefficient? Or can it also lead to wrong query results?
> 

I don't think it can produce incorrect results. It only affects which
values we "merge" into an interval when building the summaries.

>> Attached is a patch fixing this
>>
> 
> I wonder if it's actually necessary to give infinity any special
> handling at all for dates and timestamps. For those types, "infinity"
> is actually just INT_MIN/MAX, which compares correctly with any finite
> value, and will be much larger/smaller than any common value, so it
> seems like it isn't necessary to give "infinite" values any special
> treatment. That would be consistent with date_cmp() and
> timestamp_cmp().
> 

Right, but 

> Something else that looks wrong about that BRIN code is that the
> integer subtraction might lead to overflow -- it is subtracting two
> integer values, then casting the result to float8. It should cast each
> input before subtracting, more like brin_minmax_multi_distance_int8().
> 

... it also needs to fix this, otherwise it overflows. Consider

 delta = dt2 - dt1;

and assume dt1 is INT64_MIN, or that dt2 is INT64_MAX.
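
For example, with dt1 = INT64_MIN and dt2 = 0 the int64 subtraction
wraps around to a negative value, whereas casting each side first,
(double) dt2 - (double) dt1, yields the expected ~9.2e18.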

> IOW, I think brin_minmax_multi_distance_date/timestamp could be made
> basically identical to brin_minmax_multi_distance_int4/8.
> 

Right. Attached is a patch doing it this way.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
diff --git a/src/backend/access/brin/brin_minmax_multi.c b/src/backend/access/brin/brin_minmax_multi.c
index f8b2a3f9bc6..3d5e6d0c8f6 100644
--- a/src/backend/access/brin/brin_minmax_multi.c
+++ b/src/backend/access/brin/brin_minmax_multi.c
@@ -2084,10 +2084,13 @@ brin_minmax_multi_distance_date(PG_FUNCTION_ARGS)
 	DateADT		dateVal1 = PG_GETARG_DATEADT(0);
 	DateADT		dateVal2 = PG_GETARG_DATEADT(1);
 
-	if (DATE_NOT_FINITE(dateVal1) || DATE_NOT_FINITE(dateVal2))
-		PG_RETURN_FLOAT8(0);
+	/*
+	 * We know the values are range boundaries, but the range may be collapsed
+	 * (i.e. single points), with equal values.
+	 */
+	Assert(dateVal1 <= dateVal2);
 
-	PG_RETURN_FLOAT8(dateVal1 - dateVal2);
+	PG_RETURN_FLOAT8((double) dateVal2 - (double) dateVal1);
 }
 
 /*
@@ -2136,19 +2139,16 @@ brin_minmax_multi_distance_timetz(PG_FUNCTION_ARGS)
 Datum
 brin_minmax_multi_distance_timestamp(PG_FUNCTION_ARGS)
 {
-	float8		delta = 0;
-
 	Timestamp	dt1 = PG_GETARG_TIMESTAMP(0);
 	Timestamp	dt2 = PG_GETARG_TIMESTAMP(1);
 
-	if (TIMESTAMP_NOT_FINITE(dt1) || TIMESTAMP_NOT_FINITE(dt2))
-		PG_RETURN_FLOAT8(0);
-
-	delta = dt2 - dt1;
-
-	Assert(delta >= 0);
+	/*
+	 * We know the values are range boundaries, but the range may be collapsed
+	 * (i.e. single points), with equal values.
+	 */
+	Assert(dt1 <= dt2);
 
-	PG_RETURN_FLOAT8(delta);
+	PG_RETURN_FLOAT8((double) dt2 - (double) dt1);
 }
 
 /*


Re: pg_upgrade's interaction with pg_resetwal seems confusing

2023-10-13 Thread Amit Kapila
On Fri, Oct 13, 2023 at 2:03 PM Dilip Kumar  wrote:
>
> On Fri, Oct 13, 2023 at 11:07 AM Amit Kapila  wrote:
> >
> > > But is this a problem? basically, we will set the
> > > ShmemVariableCache->nextOid counter to the value that we want in the
> > > IsBinaryUpgrade-specific function.  And then the shutdown checkpoint
> > > will flush that value to the control file and that is what we want no?
> > >
> >
> > I think that can work. Basically, we need to do something like what
> > SetNextObjectId() does and then let the shutdown checkpoint update the
> > actual value in the control file.
>
> Right.
>
> > >  I mean instead of resetwal directly modifying the control file we
> > > will modify that value in the server using the binary_upgrade function
> > > and then have that value flush to the disk by shutdown checkpoint.
> > >
> >
> > True, the alternative to consider is to let pg_upgrade update the
> > control file by itself with the required value of OID. The point I am
> > slightly worried about doing via server-side function is that some
> > online and or shutdown checkpoint can update other values in the
> > control file as well whereas if we do via pg_upgrade, we can have
> > better control over just updating the OID.
>
> Yeah, thats a valid point.
>

But OTOH, just updating the control file via pg_upgrade may not be
sufficient because we should keep the shutdown checkpoint record also
updated with that value. See how pg_resetwal achieves it via
RewriteControlFile() and WriteEmptyXLOG(). So, considering both the
approaches it seems better to go with a server-side function as Robert
suggested.

-- 
With Regards,
Amit Kapila.




Re: Oversight in reparameterize_path_by_child leading to executor crash

2023-10-13 Thread Andrei Lepikhov

On 23/8/2023 12:37, Richard Guo wrote:

To fix it we may need to modify RelOptInfos for Path, BitmapHeapPath,
ForeignPath and CustomPath, and modify IndexOptInfos for IndexPath.  It
seems that that is not easily done without postponing reparameterization
of paths until createplan.c.

Attached is a patch which is v5 + fix for this new issue.


Having looked into the patch for a while, I couldn't answer some 
stupid questions for myself:
1. If we postpone parameterization until plan creation, why do we 
still copy the path node in the FLAT_COPY_PATH macros? Do we really need it?
2. I see big switches on path nodes. Might it be time to create a 
path_walker function? I recall some thread where such a suggestion was 
declined, but I don't remember why.
3. Clause changing is postponed. Why does it not influence 
calc_joinrel_size_estimate? We will use statistics on the parent table 
here. Am I wrong?


--
regards,
Andrey Lepikhov
Postgres Professional





Re: pg_logical_emit_message() misses a XLogFlush()

2023-10-13 Thread Amit Kapila
On Mon, Sep 11, 2023 at 5:13 PM Michael Paquier  wrote:
>
> I'll need a bit more input from Fujii-san before doing anything about
> his comments, still it looks like a doc issue to me that may need a
> backpatch to clarify how the non-transactional case behaves.
>

I would prefer to associate the new parameter 'flush' with
non-transactional messages as per the proposed patch.

Few points for you to consider:
1.
+CREATE OR REPLACE FUNCTION pg_catalog.pg_logical_emit_message(
+transactional boolean,
+prefix text,
+message text,
+flush boolean DEFAULT false)
+RETURNS pg_lsn
+LANGUAGE INTERNAL
+VOLATILE STRICT
+AS 'pg_logical_emit_message_text';
+
+CREATE OR REPLACE FUNCTION pg_catalog.pg_logical_emit_message(
+transactional boolean,
+prefix text,
+message bytea,
+flush boolean DEFAULT false)
+RETURNS pg_lsn
+LANGUAGE INTERNAL
+VOLATILE STRICT

Is there a reason to make the functions strict now when they were not earlier?

2.
+The flush parameter (default set to
+false) controls if the message is immediately
+flushed to WAL or not. flush has no effect
+with transactional, as the message's WAL
+record is flushed when its transaction is committed.

The last part of the message sounds a bit too specific (".. as the
message's WAL record is flushed when its transaction is committed.")
because sometimes the WAL could be flushed by walwriter even before
the commit. Can we say something along the lines: ".. as the message's
WAL record is flushed along with its transaction."?
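
For reference, under the proposed signature a non-transactional message
would then be flushed like this (a sketch; prefix and payload are
placeholders):

SELECT pg_logical_emit_message(false, 'my_prefix', 'payload', true);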

3.
+ /*
+ * Make sure that the message hits disk before leaving if not emitting a
+ * transactional message, if flush is requested.
+ */
+ if (!transactional && flush)

Two ifs in the above comment sound a bit odd, but if we want to keep it
like that, then adding 'and' before the second if may slightly improve
it.

-- 
With Regards,
Amit Kapila.




Re: Removing unneeded self joins

2023-10-13 Thread a.rybakina

On 13.10.2023 12:03, Andrei Lepikhov wrote:

On 13/10/2023 15:56, a.rybakina wrote:



Also I've incorporated improvements from Alena Rybakina except one for
skipping SJ removal when no SJ quals is found.  It's not yet clear for
me if this check fix some cases. But at least optimization got skipped
in some useful cases (as you can see in regression tests).


Agree. I wouldn't say I like it too. But also, I suggest skipping 
some unnecessary assertions proposed in that patch:
Assert(toKeep->relid != -1); - quite strange. Why -1? Why not all 
the negative numbers, at least?
Assert(is_opclause(orinfo->clause)); - above we skip clauses with 
rinfo->mergeopfamilies == NIL. Each mergejoinable clause is already 
checked as is_opclause.

All these changes (see in the attachment) are optional.

I don't mind about asserts, maybe I misunderstood something in the 
patch.


About skipping SJ removal when no SJ quals are found, I assume it is 
about this:


split_selfjoin_quals(root, restrictlist, &selfjoinquals,
                     &otherjoinquals, inner->relid, outer->relid);


+            if (list_length(selfjoinquals) == 0)
+             {
+                 /*
+                  * XXX:
+                  * we would detect a self-join without quals like 'x==x'
+                  * if we had a foreign key constraint on some of the other
+                  * quals and this join didn't have any columns from the
+                  * outer in the target list. But it is still a complex task.
+                  */
+                 continue;
+             }

as far as I remember, this is the place where it is checked that the 
SJ list is empty, and it is logical, in my opinion, that no 
transformations should be performed if no elements are found for them.
You forget we have the "Degenerate" case, as Alexander mentioned above. 
What if you have something like this:

SELECT ... FROM A a1, A a2 WHERE a1.id=1 AND a2.id=1;
In this case, uniqueness can be achieved by the baserestrictinfo 
"A.id=1", if we have a unique index on this column.



Yes, sorry, I missed it. Thanks again for the explanation.




Re: BRIN minmax multi - incorrect distance for infinite timestamp/date

2023-10-13 Thread Dean Rasheed
On Thu, 12 Oct 2023 at 23:43, Tomas Vondra
 wrote:
>
> Ashutosh Bapat reported me off-list a possible issue in how BRIN
> minmax-multi calculate distance for infinite timestamp/date values.
>
> The current code does this:
>
> if (TIMESTAMP_NOT_FINITE(dt1) || TIMESTAMP_NOT_FINITE(dt2))
> PG_RETURN_FLOAT8(0);
>

Yes indeed, that looks wrong. I noticed the same thing while reviewing
the infinite interval patch.

> so means infinite values are "very close" to any other value, and thus
> likely to be merged into a summary range. That's exactly the opposite of
> what we want to do, possibly resulting in inefficient indexes.
>

Is this only inefficient? Or can it also lead to wrong query results?

> Attached is a patch fixing this
>

I wonder if it's actually necessary to give infinity any special
handling at all for dates and timestamps. For those types, "infinity"
is actually just INT_MIN/MAX, which compares correctly with any finite
value, and will be much larger/smaller than any common value, so it
seems like it isn't necessary to give "infinite" values any special
treatment. That would be consistent with date_cmp() and
timestamp_cmp().

Something else that looks wrong about that BRIN code is that the
integer subtraction might lead to overflow -- it is subtracting two
integer values, then casting the result to float8. It should cast each
input before subtracting, more like brin_minmax_multi_distance_int8().
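
In code terms, a sketch of the fix:

    PG_RETURN_FLOAT8((float8) dt2 - (float8) dt1);  /* not (float8) (dt2 - dt1) */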

IOW, I think brin_minmax_multi_distance_date/timestamp could be made
basically identical to brin_minmax_multi_distance_int4/8.

Regards,
Dean




Re: Questions about the new subscription parameter: password_required

2023-10-13 Thread Benoit Lobréau

On 9/23/23 03:57, Jeff Davis wrote:

IIUC there is really one use case here, which is for superuser to
define a subscription including the connection, and then change the
owner to a non-superuser to actually run it (without being able to
touch the connection string itself). I'd just document that in its own
section, and mention a few caveats / mistakes to avoid. For instance,
when the superuser is defining the connection, don't forget to set
password_required=false, so that when you reassign to a non-superuser
then the connection doesn't break.


Hi,

I tried adding a section in "Logical Replication > Subscription" with 
the text you suggested and links in the CREATE / ALTER SUBSCRIPTION commands.


Is it better?

--
Benoit Lobréau
Consultant
http://dalibo.comFrom f3f1b0ce8617971b173ea901c9735d8357955aa2 Mon Sep 17 00:00:00 2001
From: benoit 
Date: Thu, 12 Oct 2023 16:45:11 +0200
Subject: [PATCH] Doc patch for password_required

Add documentation regarding non-superuser subscriptions with
password_required=true.
---
 doc/src/sgml/logical-replication.sgml | 32 +++
 doc/src/sgml/ref/alter_subscription.sgml  |  3 ++-
 doc/src/sgml/ref/create_subscription.sgml |  3 ++-
 3 files changed, 36 insertions(+), 2 deletions(-)

diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml
index 3b2fa1129e..c3faaf88cd 100644
--- a/doc/src/sgml/logical-replication.sgml
+++ b/doc/src/sgml/logical-replication.sgml
@@ -329,6 +329,38 @@

   
 
+  
+   Password required
+
+   
+password_required is a subscription parameter which specifies whether
+connections to the publisher made as a result of this subscription must
+use password authentication. It defaults to true and is
+ignored when the subscription is owned by a superuser.
+   
+
+   
+If you want to have a subscription managed by a non-superuser with a connection string without
+a password, you have to set password_required = false before transferring its
+ownership. In that case, only superusers can modify the subscription.
+   
+test_pub=# CREATE SUBSCRIPTION test_sub CONNECTION 'host=somehost port=5432 user=repli dbname=tests_pub' PUBLICATION test_pub WITH (password_required=false);
+CREATE SUBSCRIPTION
+test_pub=# ALTER SUBSCRIPTION test_sub OWNER TO new_sub_owner;
+ALTER SUBSCRIPTION
+   
+   
+
+   
+   
+   If the connection string doesn't contain a password or the publication
+   side doesn't require a password during authentication and you have set
+   password_required = true before transferring ownership,
+   the subscription will start failing.
+   
+   
+  
+
   
 Examples: Set Up Logical Replication
 
diff --git a/doc/src/sgml/ref/alter_subscription.sgml b/doc/src/sgml/ref/alter_subscription.sgml
index a85e04e4d6..e061c96937 100644
--- a/doc/src/sgml/ref/alter_subscription.sgml
+++ b/doc/src/sgml/ref/alter_subscription.sgml
@@ -50,7 +50,8 @@ ALTER SUBSCRIPTION name RENAME TO <
CREATE permission on the database. In addition,
to alter the owner, you must be able to SET ROLE to the
new owning role. If the subscription has
-   password_required=false, only superusers can modify it.
+   password_required=false, only superusers can modify it
+   (See ).
   
 
   
diff --git a/doc/src/sgml/ref/create_subscription.sgml b/doc/src/sgml/ref/create_subscription.sgml
index c1bafbfa06..33ad3d12c7 100644
--- a/doc/src/sgml/ref/create_subscription.sgml
+++ b/doc/src/sgml/ref/create_subscription.sgml
@@ -361,7 +361,8 @@ CREATE SUBSCRIPTION subscription_nametrue. Only superusers can set
-  this value to false.
+  this value to false
+  (See ).
  
 

-- 
2.41.0



Re: LLVM 16 (opaque pointers)

2023-10-13 Thread Dmitry Dolgov
> On Thu, Oct 12, 2023 at 04:31:20PM -0700, Andres Freund wrote:
> Hi,
>
> On 2023-10-11 21:59:50 +1300, Thomas Munro wrote:
> > +#else
> > +   LLVMPassBuilderOptionsRef options;
> > +   LLVMErrorRef err;
> > +   int compile_optlevel;
> > +   char   *passes;
> > +
> > +   if (context->base.flags & PGJIT_OPT3)
> > +   compile_optlevel = 3;
> > +   else
> > +   compile_optlevel = 0;
> > +
> > +   passes = psprintf("default<O%d>,mem2reg,function(no-op-function),no-op-module",
> > +                      compile_optlevel);
>
> I don't think the "function(no-op-function),no-op-module" bit does something
> particularly useful?

Right, looks like leftovers after verifying which passes were actually
applied. My bad, could be removed.

> I also don't think we should add the mem2reg pass outside of -O0 - running it
> after a real optimization pipeline doesn't seem useful and might even make the
> code worse? mem2reg is included in default (and obviously also in O3).

My understanding was that while mem2reg is included everywhere above
-O0, this set of passes won't hurt. But yeah, if you say it could
degrade the final result, it's better not to do this. I'll update this
part.
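
I.e., keep mem2reg only at -O0, something along these lines (just a
sketch):

    if (context->base.flags & PGJIT_OPT3)
        passes = psprintf("default<O3>");
    else
        passes = psprintf("default<O0>,mem2reg");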




Re: Removing unneeded self joins

2023-10-13 Thread Andrei Lepikhov

On 13/10/2023 15:56, a.rybakina wrote:



Also I've incorporated improvements from Alena Rybakina except one for
skipping SJ removal when no SJ quals is found.  It's not yet clear for
me if this check fix some cases. But at least optimization got skipped
in some useful cases (as you can see in regression tests).


Agree. I wouldn't say I like it too. But also, I suggest skipping some 
unnecessary assertions proposed in that patch:
Assert(toKeep->relid != -1); - quite strange. Why -1? Why not all the 
negative numbers, at least?
Assert(is_opclause(orinfo->clause)); - above we skip clauses with 
rinfo->mergeopfamilies == NIL. Each mergejoinable clause is already 
checked as is_opclause.

All these changes (see in the attachment) are optional.


I don't mind about asserts, maybe I misunderstood something in the patch.

About skipping SJ removal when no SJ quals are found, I assume it is 
about this:


split_selfjoin_quals(root, restrictlist, &selfjoinquals,
                     &otherjoinquals, inner->relid, outer->relid);


+            if (list_length(selfjoinquals) == 0)
+             {
+                 /*
+                  * XXX:
+                  * we would detect a self-join without quals like 'x==x'
+                  * if we had a foreign key constraint on some of the other
+                  * quals and this join didn't have any columns from the
+                  * outer in the target list. But it is still a complex task.
+                  */
+                 continue;
+             }

as far as I remember, this is the place where it is checked that the SJ 
list is empty, and it is logical, in my opinion, that no transformations 
should be performed if no elements are found for them.
You forget we have the "Degenerate" case, as Alexander mentioned above. What 
if you have something like this:

SELECT ... FROM A a1, A a2 WHERE a1.id=1 AND a2.id=1;
In this case, uniqueness can be achieved by the baserestrictinfo 
"A.id=1", if we have a unique index on this column.


--
regards,
Andrey Lepikhov
Postgres Professional





Re: [PATCH] Clarify the behavior of the system when approaching XID wraparound

2023-10-13 Thread Aleksander Alekseev
Hi,

> Those all make sense to me.
>
> > [...]
>
> Of course. Your general approach seems wise.
>
> Thanks for working on this. I will be relieved once this is finally
> taken care of.

+1, and many thanks for your attention to the patchset and all the details!

-- 
Best regards,
Aleksander Alekseev




Re: Improve the log message output of basic_archive when basic_archive.archive_directory parameter is not set

2023-10-13 Thread Daniel Gustafsson
> On 13 Oct 2023, at 04:25, Nathan Bossart  wrote:
> 
> On Tue, Sep 26, 2023 at 08:13:45AM +0200, Daniel Gustafsson wrote:
>>> On 26 Sep 2023, at 00:20, Nathan Bossart  wrote:
>>> 
>>> On Thu, Sep 21, 2023 at 11:18:00AM +0900, bt23nguyent wrote:
 -basic_archive_configured(ArchiveModuleState *state)
 +basic_archive_configured(ArchiveModuleState *state, char **logdetail)
>>> 
>>> Could we do something more like GUC_check_errdetail() instead to maintain
>>> backward compatibility with v16?
>> 
>> We'd still need something exported to call into which isn't in 16, so it
>> wouldn't be more than optically backwards compatible since a module written 
>> for
>> 17 won't compile for 16, or am I missing something?
> 
> I only mean that a module written for v16 could continue to be used in v17
> without any changes.  You are right that a module that uses this new
> functionality wouldn't compile for v16.

Sure, but that also means that few if any existing modules will be updated to
provide this =).

> But IMHO the interface is nicer,

That's a more compelling reason IMO.  I'm not sure if I prefer the
GUC_check_errdetail-like approach better, I would for sure not be opposed to
reviewing a version of the patch doing it that way.

Tung Nguyen: are you interested in updating the patch along these lines
suggested by Nathan?

> since module authors wouldn't need to worry about allocating the space
> for the string or formatting the message.

Well, they still need to format it; and calling GUC_check_errdetail(msg),
pstrdup(msg) or psprintf(msg) isn't a world of difference.
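
Under the proposed signature, the module side might look roughly like
this (a sketch, not the exact patch):

static bool
basic_archive_configured(ArchiveModuleState *state, char **logdetail)
{
	if (archive_directory == NULL || archive_directory[0] == '\0')
	{
		*logdetail = pstrdup("basic_archive.archive_directory is not set");
		return false;
	}
	return true;
}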

--
Daniel Gustafsson





Re: Removing unneeded self joins

2023-10-13 Thread a.rybakina



Also I've incorporated improvements from Alena Rybakina except one for
skipping SJ removal when no SJ quals is found.  It's not yet clear for
me if this check fix some cases. But at least optimization got skipped
in some useful cases (as you can see in regression tests).


Agree. I wouldn't say I like it too. But also, I suggest skipping some 
unnecessary assertions proposed in that patch:
Assert(toKeep->relid != -1); - quite strange. Why -1? Why not all the 
negative numbers, at least?
Assert(is_opclause(orinfo->clause)); - above we skip clauses with 
rinfo->mergeopfamilies == NIL. Each mergejoinable clause is already 
checked as is_opclause.

All these changes (see in the attachment) are optional.


I don't mind about asserts, maybe I misunderstood something in the patch.

About skipping SJ removal when no SJ quals are found, I assume it is 
about this:


split_selfjoin_quals(root, restrictlist, &selfjoinquals,
                     &otherjoinquals, inner->relid, outer->relid);


+            if (list_length(selfjoinquals) == 0)
+             {
+                 /*
+                  * XXX:
+                  * we would detect a self-join without quals like 'x==x'
+                  * if we had a foreign key constraint on some of the other
+                  * quals and this join didn't have any columns from the
+                  * outer in the target list. But it is still a complex task.
+                  */
+                 continue;
+             }

as far as I remember, this is the place where it is checked that the SJ 
list is empty, and it is logical, in my opinion, that no transformations 
should be performed if no elements are found for them.


As for the cases where SJ did not work, maybe this is just right if 
there are no elements for processing these cases. I'll try to check or 
come up with tests for them. If I'm wrong, please let me know.


On 11.10.2023 06:51, Andrei Lepikhov wrote:

On 11/10/2023 02:29, Alena Rybakina wrote:

I have reviewed your patch and I noticed a few things.


Thanks for your efforts,

I have looked at the latest version of the code, I assume that the 
error lies in the replace_varno_walker function, especially in the 
place where we check the node by type Var, and does not form any 
NullTest node.


It's not a bug, it's an optimization we discussed with Alexander above.

Secondly, I added some code in some places to catch erroneous cases, 
and added a condition so that we do not try to apply the 
self-join-removal transformation when the self-join list turns out to 
be empty after searching for it, and in general when there are no 
joins in the query. Besides, I added a query for testing and wrote 
about it above. I have attached my diff file.

Ok, I will look at this
In addition, I found a comment for myself that was not clear to me. I 
would be glad if you could explain it to me.


You mentioned a superior outer join in the comment; unfortunately, I 
didn't find anything about it in the PostgreSQL code, and the 
explanation remained unclear to me. Could you explain in more detail 
what you meant?
I meant here that only clauses pushed by 
reconsider_outer_join_clauses() can be found in the joininfo list, and 
they are not relevant, as you can understand.
Having written that, I realized that it was a false statement. ;) - 
joininfo can also contain results of previous SJE iterations, look:


CREATE TABLE test (oid int PRIMARY KEY);
CREATE UNIQUE INDEX ON test((oid*oid));
explain
SELECT count(*)
FROM test c1, test c2, test c3
WHERE c1.oid=c2.oid AND c1.oid*c2.oid=c3.oid*c3.oid;
explain
SELECT count(*)
FROM test c1, test c2, test c3
WHERE c1.oid=c3.oid AND c1.oid*c3.oid=c2.oid*c2.oid;
explain
SELECT count(*)
FROM test c1, test c2, test c3
WHERE c3.oid=c2.oid AND c3.oid*c2.oid=c1.oid*c1.oid; 


Ok, I understood. Thank you for explanation.


--
Regards,
Alena Rybakina


Re: A new strategy for pull-up correlated ANY_SUBLINK

2023-10-13 Thread Alena Rybakina

On 13.10.2023 10:04, Andy Fan wrote:


It seems to me that the expressions "=" and "IN" are equivalent
here due to the fact that the aggregated subquery returns only one
value, and the result with the "IN" operation can be considered as
the intersection of elements on the left and right. In this query,
we have some kind of set on the left, among which there will be
found or not only one element on the right.


Yes, they are equivalent in the final result, but there are some
differences at the execution level.  The '=' case will be transformed
to a Subplan whose subPlanType is EXPR_SUBLINK, so if more than
1 row is returned in the subplan, an error will be raised.

select * from tenk1 where
  ten =  (select ten from tenk1 i where i.two = tenk1.two );

ERROR:  more than one row returned by a subquery used as an expression

However the IN case would not.
select * from tenk1 where
  ten in (select ten from tenk1 i where i.two = tenk1.two ) is OK.

I think the test case you added is not related to this feature. the
difference is there even without the patch.  so I kept the code
you changed, but not for the test  case.
Yes, I understand and agree with you that we should delete the last 
queries, except for one.


The query below has a different plan compared to master, and it is 
correct.



Without your patch:

explain (costs off)
SELECT * FROM tenk1 A LEFT JOIN tenk2 B
ON B.hundred in (SELECT min(c.hundred) FROM tenk2 C WHERE c.odd = b.odd);
 QUERY PLAN
-
 Nested Loop Left Join
   ->  Seq Scan on tenk1 a
   ->  Materialize
 ->  Seq Scan on tenk2 b
   Filter: (SubPlan 2)
   SubPlan 2
 ->  Result
   InitPlan 1 (returns $1)
 ->  Limit
   ->  Index Scan using tenk2_hundred on 
tenk2 c

 Index Cond: (hundred IS NOT NULL)
 Filter: (odd = b.odd)
(12 rows)


After your patch:

postgres=# explain (costs off)
SELECT * FROM tenk1 A LEFT JOIN tenk2 B
ON B.hundred in (SELECT min(c.hundred) FROM tenk2 C WHERE c.odd = b.odd);

   QUERY PLAN
--
  Nested Loop Left Join
    ->  Seq Scan on tenk1 a
    ->  Materialize
  ->  Nested Loop
    ->  Seq Scan on tenk2 b
->  Subquery Scan on "ANY_subquery"
  Filter: (b.hundred = "ANY_subquery".min)
  ->  Aggregate
    ->  Seq Scan on tenk2 c
  Filter: (odd = b.odd)
(10 rows)




I took the liberty of adding this to your patch and added
myself as reviewer, if you don't mind.

Sure, the patch after your modification looks better than the
original.
I'm not sure how the test case around "because of got one row" is
relevant to the current changes.  After we reach to some agreement
on the above discussion, I think v4 is good for committer to review!


Thank you!) I am ready to discuss it.

Actually I meant to discuss the "Unfortunately, I found a request.."; looks
like we have reached an agreement there :)


Yes, we have)

--
Regards,
Alena Rybakina
diff --git a/src/test/regress/expected/subselect.out b/src/test/regress/expected/subselect.out
index 17df6b5dc9c..e41b728df83 100644
--- a/src/test/regress/expected/subselect.out
+++ b/src/test/regress/expected/subselect.out
@@ -2028,3 +2028,27 @@ ON A.hundred in (SELECT c.hundred FROM tenk2 C WHERE c.odd = b.odd);
  ->  Seq Scan on tenk2 b
 (11 rows)
 
+-- we can pull up the aggregate sublink into RHS of a left join.
+explain (costs off)
+SELECT * FROM tenk1 A LEFT JOIN tenk2 B
+ON B.hundred in (SELECT min(c.hundred) FROM tenk2 C WHERE c.odd = b.odd);
+  QUERY PLAN   
+---
+ Nested Loop Left Join
+   ->  Seq Scan on tenk1 a
+   ->  Materialize
+ ->  Nested Loop
+   ->  Seq Scan on tenk2 b
+   ->  Memoize
+ Cache Key: b.hundred, b.odd
+ Cache Mode: binary
+ ->  Subquery Scan on "ANY_subquery"
+   Filter: (b.hundred = "ANY_subquery".min)
+   ->  Result
+ InitPlan 1 (returns $1)
+   ->  Limit
+ ->  Index Scan using tenk2_hundred on tenk2 c
+   Index Cond: (hundred IS NOT NULL)
+   Filter: (odd = b.odd)
+(16 rows)
+
diff --git a/src/test/regress/sql/subselect.sql b/src/test/regress/sql/subselect.sql

Re: pg_upgrade's interaction with pg_resetwal seems confusing

2023-10-13 Thread Dilip Kumar
On Fri, Oct 13, 2023 at 11:07 AM Amit Kapila  wrote:
>
> > But is this a problem? basically, we will set the
> > ShmemVariableCache->nextOid counter to the value that we want in the
> > IsBinaryUpgrade-specific function.  And then the shutdown checkpoint
> > will flush that value to the control file and that is what we want no?
> >
>
> I think that can work. Basically, we need to do something like what
> SetNextObjectId() does and then let the shutdown checkpoint update the
> actual value in the control file.

Right.

> >  I mean instead of resetwal directly modifying the control file we
> > will modify that value in the server using the binary_upgrade function
> > and then have that value flush to the disk by shutdown checkpoint.
> >
>
> True, the alternative to consider is to let pg_upgrade update the
> control file by itself with the required value of OID. The point I am
> slightly worried about doing via server-side function is that some
> online and or shutdown checkpoint can update other values in the
> control file as well whereas if we do via pg_upgrade, we can have
> better control over just updating the OID.

Yeah, that's a valid point.

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com




Re: On login trigger: take three

2023-10-13 Thread Alexander Korotkov
On Fri, Oct 13, 2023 at 4:18 AM Robert Haas  wrote:
> On Thu, Oct 12, 2023 at 6:54 PM Alexander Korotkov  
> wrote:
> > On Thu, Oct 12, 2023 at 8:35 PM Robert Haas  wrote:
>
> > > Doesn't that mean that if you create the first login trigger in a
> > > database and leave the transaction open, nobody can connect to that
> > > database until the transaction ends?
> >
It doesn't mean that, because when trying to reset the flag v44 does a
conditional lock.  So, if another transaction is holding the lock, we
will just skip resetting the flag.  The flag will then be cleared on
the first connection after that transaction ends.
>
> But in the scenario I am describing the flag is being set, not reset.

Sorry, it seems I just missed some logical steps.  Imagine, that
transaction A created the first login trigger and hangs open.  Then
the new connection B sees no visible triggers yet, but dathasloginevt
flag is set.  Therefore, connection B tries conditional lock but just
gives up because the lock is held by transaction A.

Also, note that the lock is just some lock with a custom tag.
It doesn't effectively block the database.  You may think of it as
a custom advisory lock.

--
Regards,
Alexander Korotkov




Re: pg_stat_statements and "IN" conditions

2023-10-13 Thread Michael Paquier
On Tue, Jul 04, 2023 at 09:02:56PM +0200, Dmitry Dolgov wrote:
> On Mon, Jul 03, 2023 at 09:46:11PM -0700, Nathan Bossart wrote:
>> IMHO this adds way too much complexity to something that most users would
>> expect to be an on/off switch.
> 
> This documentation is exclusively to be precise about how does it work.
> Users don't have to worry about all this, and pretty much turn it
> on/off, as you've described. I agree though, I could probably write this
> text a bit differently.

FWIW, I am going to side with Nathan on this one, but not completely
either.  I was looking at the patch and it brings too much complexity
for a monitoring feature in this code path.  In my experience, I've
seen people complain about IN/ANY never being trimmed down to a single
parameter in pg_stat_statements, but I have yet to hear from somebody
outside this thread that they'd like to reduce an IN clause depending
on the number of items, or something else.

>> If I understand Álvaro's suggestion [0] correctly, he's saying that in
>> addition to allowing "on" and "off", it might be worth allowing
>> something like "powers" to yield roughly the behavior described above.
>> I don't think he's suggesting that this "powers" behavior should be
>> the only available option.
> 
> Independently of what Álvaro was suggesting, I find the "powers"
> approach more suitable, because it answers my own concerns about the
> previous implementation. Having "on"/"off" values means we would have to
> scratch heads coming up with a one-size-fit-all default value, or to
> introduce another option for the actual cut-off threshold. I would like
> to avoid both of those options, that's why I went with "powers" only.

Now, it doesn't mean that this approach with the "powers" will never
happen, but based on the set of opinions I am gathering on this thread
I would suggest to rework the patch as follows:
- First implement an on/off switch that reduces the lists in IN and/or
ANY to one parameter.  Simply.
- Second, refactor the powers routine.
- Third, extend the on/off switch, or just implement a threshold with
a second switch.
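
With the first step in place, for example, both of these would collapse
into a single pg_stat_statements entry (a sketch of the intended effect,
with made-up table and column names, not the exact normalized text):

SELECT * FROM t WHERE x IN (1, 2, 3);
SELECT * FROM t WHERE x IN (1, 2, 3, 4, 5, 6, 7, 8);
-- both shown roughly as: SELECT * FROM t WHERE x IN ($1)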

When it comes to my opinion, I am not seeing any objections to the
feature as a whole, and I'm OK with the first step.  I'm also OK to
keep the door open for more improvements in controlling how these
IN/ANY lists show up, but there could be more than just the number of
items as parameter (say the query size, different behaviors depending
on the number of clauses in queries, subquery context or CTEs/WITH,
etc. just to name a few things coming in mind).
--
Michael


signature.asc
Description: PGP signature


Re: Tab completion for AT TIME ZONE

2023-10-13 Thread Michael Paquier
On Fri, Oct 13, 2023 at 08:01:08AM +0200, Vik Fearing wrote:
> On 10/13/23 06:31, Michael Paquier wrote:
>> but after also removing
>> the completion for "ZONE" after typing "AT TIME" because AT would be
>> completed by "TIME ZONE".
> 
> Why?  The user can tab at any point.

IMO this leads to unnecessary bloat in tab-complete.c because we
finish with the full completion as long as "TIME" is not fully typed.
--
Michael


signature.asc
Description: PGP signature


Re: A new strategy for pull-up correlated ANY_SUBLINK

2023-10-13 Thread Andy Fan
Hi Tom,

Would you like to have a look at this?  The change is not big and the
optimization has also been asked for many times. Attached is the
v5 version, and I have also tried my best to write a good commit message.

Here is the commit fest entry:

https://commitfest.postgresql.org/45/4268/


v5-0001-Pull-up-ANY-SUBLINK-with-the-necessary-lateral-su.patch
Description: Binary data


Re: New WAL record to detect the checkpoint redo location

2023-10-13 Thread Michael Paquier
On Tue, Oct 10, 2023 at 02:43:34PM -0400, Robert Haas wrote:
> Here's a new patch set. I think I've incorporated the performance
> fixes that you've suggested so far into this version. I also adjusted
> a couple of other things:

Now looking at 0002, where you should be careful about the code
indentation or koel will complain.

> - After further study of a point previously raised by Amit, I adjusted
> CreateCheckPoint slightly to call WALInsertLockAcquireExclusive
> significantly later than before. I think that there's no real reason
> to do it so early and that the current coding is probably just a
> historical leftover, but it would be good to have some review here.

This makes the new code call LocalSetXLogInsertAllowed() and what we
set for checkPoint.PrevTimeLineID after taking the insertion locks,
which should be OK.

> - I added a cross-check that when starting redo from a checkpoint
> whose redo pointer points to an earlier LSN that the checkpoint
> itself, the record we read from that LSN must an XLOG_CHECKPOINT_REDO
> record.

I've mentioned as well a test in pg_walinspect after one of the
checkpoints generated there, but what you do here is enough for the
online case.

+   /*
+* XLogInsertRecord will have updated RedoRecPtr, but we need to copy
+* that into the record that will be inserted when the checkpoint is
+* complete.
+*/
+   checkPoint.redo = RedoRecPtr;

For online checkpoints, a very important point is that
XLogCtl->Insert.RedoRecPtr is also updated in XLogInsertRecord().
Perhaps that's worth an addition?  I was a bit confused at first that we
do the following for shutdown checkpoints:
RedoRecPtr = XLogCtl->Insert.RedoRecPtr = checkPoint.redo;

Then repeat this pattern for non-shutdown checkpoints a few lines down
without touching the copy of the redo LSN in XLogCtl->Insert, because
of course we don't hold the WAL insert locks in an exclusive fashion
here:
checkPoint.redo = RedoRecPtr;

My point is that this is not only about RedoRecPtr, but also about
XLogCtl->Insert.RedoRecPtr here.  The comment in ReserveXLogSwitch()
says that.
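
Perhaps something like this (the comment wording is just a sketch):

+	/*
+	 * XLogInsertRecord() has updated both RedoRecPtr and
+	 * XLogCtl->Insert.RedoRecPtr; copy the former into the record that
+	 * will be inserted when the checkpoint is complete.
+	 */
+	checkPoint.redo = RedoRecPtr;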
--
Michael


signature.asc
Description: PGP signature


Re: A new strategy for pull-up correlated ANY_SUBLINK

2023-10-13 Thread Andy Fan
>
> It seems to me that the expressions "=" and "IN" are equivalent here due
> to the fact that the aggregated subquery returns only one value, and the
> result with the "IN" operation can be considered as the intersection of
> elements on the left and right. In this query, we have some kind of set on
> the left, among which there will be found or not only one element on the
> right.
>

Yes, they are equivalent in the final result, but there are some
differences at the execution level.  The '=' case will be transformed
to a Subplan whose subPlanType is EXPR_SUBLINK, so if more than
1 row is returned in the subplan, an error will be raised.

select * from tenk1 where
  ten =  (select ten from tenk1 i where i.two = tenk1.two );

ERROR:  more than one row returned by a subquery used as an expression

However the IN case would not.
select * from tenk1 where
  ten in (select ten from tenk1 i where i.two = tenk1.two ) is OK.

I think the test case you added is not related to this feature. The
difference is there even without the patch, so I kept the code
you changed, but not the test case.

>> I took the liberty of adding this to your patch and added myself as
>> reviewer, if you don't mind.
>>
> Sure, the patch after your modification looks better than the original.
> I'm not sure how the test case around "because of got one row" is
> relevant to the current changes.  After we reach to some agreement
> on the above discussion, I think v4 is good for committer to review!
>
>
> Thank you!) I am ready to discuss it.
>

Actually I meant to discuss the "Unfortunately, I found a request.."; looks
like we have reached an agreement there :)

-- 
Best Regards
Andy Fan


Re: [PATCH] Add support function for containment operators

2023-10-13 Thread jian he
On Tue, Aug 1, 2023 at 10:07 AM Laurenz Albe  wrote:
> >
> >
> > > I had an idea about this:
> > > So far, you only consider constant ranges.  But if we have a STABLE range
> > > expression, you could use an index scan for "expr <@ range", for example
> > > Index Cond (expr >= lower(range) AND expr < upper(range)).
> > >

As for the above part, I am not sure how to implement it, nor whether it is doable.

Refactoring:
I dropped SupportRequestIndexCondition and related code since, as
mentioned upthread, it's dead code, and refactored the regression tests
(data types with infinity cover more cases than int4range, so I deleted
some tests).

Now there are 3 helper functions (build_bound_expr,
find_simplified_clause, match_support_request) and 2 entry functions
(elem_contained_by_range_support, range_contains_elem_support).

The collation problem seems solved. Putting the following test in
src/test/regress/sql/rangetypes.sql will not work, maybe because of
the order of the regression tests: in SQL-ASCII encoding, I cannot
create the collation "cs-CZ-x-icu".

drop type if EXISTS textrange1, textrange2;
drop table if EXISTS collate_test1, collate_test2;
CREATE TYPE textrange1 AS RANGE (SUBTYPE = text, collation="C");
create type textrange2 as range(subtype=text, collation="cs-CZ-x-icu");
CREATE TABLE collate_test1 (b text COLLATE "en-x-icu" NOT NULL);
INSERT INTO collate_test1(b) VALUES ('a'), ('c'), ('d'), ('ch');
CREATE TABLE collate_test2 (b text NOT NULL);
INSERT INTO collate_test2(b) VALUES ('a'), ('c'), ('d'), ('ch');

--should include 'ch'
SELECT * FROM collate_test1 WHERE b <@ textrange1('a', 'd');
--should not include 'ch'
SELECT * FROM collate_test1 WHERE b <@ textrange2('a', 'd');
--should include 'ch'
SELECT * FROM collate_test2 WHERE b <@ textrange1('a', 'd');
--should not include 'ch'
SELECT * FROM collate_test2 WHERE b <@ textrange2('a', 'd');
-
From bcae7f8a0640b48b04f243660539e8670de43d41 Mon Sep 17 00:00:00 2001
From: pgaddict 
Date: Fri, 13 Oct 2023 12:43:41 +0800
Subject: [PATCH v2 1/1] Optimize qual cases like Expr <@ RangeConst and
 RangeConst @> Expr

Previously these two quals would be processed as is.
This patch adds prosupport functions to range_contains_elem and
elem_contained_by_range.

So cases like Expr <@ rangeConst and rangeConst @> Expr
will be rewritten to "expr opr range_lower_bound and expr opr
range_upper_bound".

Expr <@ rangeConst is logically equivalent to rangeConst @> Expr
if rangeConst is the same. Added some tests to validate the generated
plans.
---
 src/backend/utils/adt/rangetypes.c  | 199 +++-
 src/backend/utils/adt/rangetypes_selfuncs.c |   6 +-
 src/include/catalog/pg_proc.dat |  12 +-
 src/test/regress/expected/rangetypes.out| 194 +++
 src/test/regress/sql/rangetypes.sql | 101 ++
 5 files changed, 505 insertions(+), 7 deletions(-)

diff --git a/src/backend/utils/adt/rangetypes.c b/src/backend/utils/adt/rangetypes.c
index d65e5625..647bd5e4 100644
--- a/src/backend/utils/adt/rangetypes.c
+++ b/src/backend/utils/adt/rangetypes.c
@@ -31,13 +31,19 @@
 #include "postgres.h"
 
 #include "access/tupmacs.h"
+#include "access/stratnum.h"
 #include "common/hashfn.h"
 #include "lib/stringinfo.h"
 #include "libpq/pqformat.h"
 #include "miscadmin.h"
+#include "nodes/supportnodes.h"
+#include "nodes/makefuncs.h"
+#include "nodes/nodeFuncs.h"
+#include "nodes/pg_list.h"
 #include "nodes/miscnodes.h"
 #include "port/pg_bitutils.h"
 #include "utils/builtins.h"
+#include "utils/fmgroids.h"
 #include "utils/date.h"
 #include "utils/lsyscache.h"
 #include "utils/rangetypes.h"
@@ -69,7 +75,11 @@ static Size datum_compute_size(Size data_length, Datum val, bool typbyval,
 			   char typalign, int16 typlen, char typstorage);
 static Pointer datum_write(Pointer ptr, Datum datum, bool typbyval,
 		   char typalign, int16 typlen, char typstorage);
-
+static Expr *build_bound_expr(Oid opfamily, TypeCacheEntry *typeCache,
+			  bool isLowerBound, bool isInclusive,
+			  Datum val, Expr *otherExpr, Oid rng_collation);
+static Node *find_simplified_clause(Const *rangeConst, Expr *otherExpr);
+static Node *match_support_request(Node *rawreq);
 
 /*
  *--
@@ -558,7 +568,6 @@ elem_contained_by_range(PG_FUNCTION_ARGS)
 	PG_RETURN_BOOL(range_contains_elem_internal(typcache, r, val));
 }
 
-
 /* range, range -> bool functions */
 
 /* equality (internal version) */
@@ -2173,6 +2182,29 @@ make_empty_range(TypeCacheEntry *typcache)
 	return make_range(typcache, , , true, NULL);
 }
 
+/*
+ * Planner support function for elem_contained_by_range operator
+ */
+Datum
+elem_contained_by_range_support(PG_FUNCTION_ARGS)
+{
+	Node	   *rawreq = (Node *) PG_GETARG_POINTER(0);
+	Node	   *ret = match_support_request(rawreq);
+
+	PG_RETURN_POINTER(ret);
+}
+
+/*
+ * Planner support function for range_contains_elem operator
+ */
+Datum
+range_contains_elem_support(PG_FUNCTION_ARGS)

Re: Pro et contra of preserving pg_proc oids during pg_upgrade

2023-10-13 Thread Laurenz Albe
On Thu, 2023-10-12 at 19:56 +0300, Nikita Malakhov wrote:
> Say, we have data processed by some user function and we want to keep 
> reference to this function
> in our data. In this case we have two ways - first - store string output of 
> regprocedure, which is not
> very convenient, and the second - store its OID, which requires slight 
> modification of pg_upgrade
> (pg_dump and func/procedure creation function).

So far, we have lived quite well with the rule "don't store any system OIDs in 
the database
if you want to pg_upgrade" (views on system objects, reg* data types, ...).

What is inconvenient about storing the output of regprocedure?
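
For example:

    SELECT 'lower(text)'::regprocedure::text;   -- 'lower(text)'

That text survives dump/restore and pg_upgrade, and can be cast back to
regprocedure to resolve the function again.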

Yours,
Laurenz Albe




Re: Tab completion for AT TIME ZONE

2023-10-13 Thread Vik Fearing

On 10/13/23 06:31, Michael Paquier wrote:

On Fri, Oct 13, 2023 at 03:07:25AM +0200, Vik Fearing wrote:

The SQL committee already has another operator starting with AT which is AT
LOCAL.


The other patch was the reason why I looked at this one. 



Thank you for updating and committing this patch!



but after also removing
the completion for "ZONE" after typing "AT TIME" because AT would be
completed by "TIME ZONE".



Why?  The user can tab at any point.
--
Vik Fearing