Re: Synchronizing slots from primary to standby

2023-08-15 Thread Ajin Cherian
On Mon, Aug 14, 2023 at 8:38 PM shveta malik  wrote:
>
> On Mon, Aug 14, 2023 at 3:22 PM shveta malik  wrote:
> >
> > On Tue, Aug 8, 2023 at 11:11 AM Drouvot, Bertrand
> >  wrote:
> > >
> > > Hi,
> > >
> > > On 8/8/23 7:01 AM, shveta malik wrote:
> > > > On Mon, Aug 7, 2023 at 3:17 PM Drouvot, Bertrand
> > > >  wrote:
> > > >>
> > > >> Hi,
> > > >>
> > > >> On 8/4/23 1:32 PM, shveta malik wrote:
> > > >>> On Fri, Aug 4, 2023 at 2:44 PM Drouvot, Bertrand
> > > >>>  wrote:
> > >  On 7/28/23 4:39 PM, Bharath Rupireddy wrote:
> > > >>
> > > >
> > > > Agreed. That is why in v10,v11 patches, we have different infra for
> > > > sync-slot worker i.e. it is not relying on "logical replication
> > > > background worker" anymore.
> > >
> > > yeah saw that, looks like the right way to go to me.
> > >
> > > >> Maybe we should start some tests/benchmark with only one sync worker 
> > > >> to get numbers
> > > >> and start from there?
> > > >
> > > > Yes, we can do that performance testing to figure out the difference
> > > > between the two modes. I will try to get some statistics on this.
> > > >
> > >
> > > Great, thanks!
> > >
> >
> > We (Ajin and myself) performed tests to compute the lag in standby
> > slots as compared to primary slots with different numbers of slot-sync
> > workers configured.
> >
> > 3 DBs were created, each with 30 tables and each table having one
> > logical-pub/sub configured, making a total of 90 logical replication
> > slots to be synced. The workload was then run for approximately 10
> > mins. During this workload, at regular intervals, the primary and
> > standby slots' LSNs were captured (from pg_replication_slots) and
> > compared. At each capture, the intent was to know how far each standby
> > slot was lagging behind the corresponding primary slot, by taking the
> > distance between the confirmed_flush_lsn of the primary and standby
> > slot. We then took the average (integer value) of this distance over
> > the span of the 10-min workload, and this is what we got:
> >
>
> I have attached the scripts for schema-setup, running workload and
> capturing lag. Please go through Readme for details.
>
>
I did some more tests for 10, 20 and 40 slots to calculate the average
lsn distance between slots, comparing 1 worker and 3 workers.

My results are as follows:

10 slots
1 worker: 5529.75527426 (average lsn distance between primary and standby per slot)
3 workers: 2224.57589134

20 slots
1 worker: 9592.87234043
3 workers: 3194.6293

40 slots
1 worker: 20566.093
3 workers: 7885.80952381

90 slots
1 worker: 36706.8405797
3 workers: 10236.6393162

regards,
Ajin Cherian
Fujitsu Australia


Re: Memory consumed by child SpecialJoinInfo in partitionwise join planning

2023-08-15 Thread Ashutosh Bapat
On Fri, Aug 4, 2023 at 2:11 PM Ashutosh Bapat
 wrote:
>
> Attached patchset fixing those.
> 0001 - patch to report planning memory, with a fix to the explain.out
> regression output. We may consider committing this as well.
> 0002 - with your comment addressed above.

0003 - Added this patch for handling SpecialJoinInfos for inner joins.
These SpecialJoinInfos are created on the fly for parent joins. They
can be created on the fly for child joins as well without requiring
any translations. Thus they do not require any new memory. This patch
is intended to be merged into 0002 finally.

Added to the upcoming commitfest https://commitfest.postgresql.org/44/4500/


--
Best Wishes,
Ashutosh Bapat
From 1c39902456e715902e0e24b7cfdaf7135c472ae8 Mon Sep 17 00:00:00 2001
From: Ashutosh Bapat 
Date: Mon, 7 Aug 2023 11:00:53 +0530
Subject: [PATCH 3/3] Do not translate dummy SpecialJoinInfos

Plain inner joins do not have SpecialJoinInfos associated with them.
They are crafted on the fly for parent joins. Do the same for child
joins.

Ashutosh Bapat
---
 src/backend/optimizer/path/costsize.c | 37 ++
 src/backend/optimizer/path/joinrels.c | 72 ++-
 src/include/optimizer/paths.h |  2 +
 3 files changed, 54 insertions(+), 57 deletions(-)

diff --git a/src/backend/optimizer/path/costsize.c b/src/backend/optimizer/path/costsize.c
index d6ceafd51c..43e0c9621c 100644
--- a/src/backend/optimizer/path/costsize.c
+++ b/src/backend/optimizer/path/costsize.c
@@ -5021,23 +5021,7 @@ compute_semi_anti_join_factors(PlannerInfo *root,
 	/*
 	 * Also get the normal inner-join selectivity of the join clauses.
 	 */
-	norm_sjinfo.type = T_SpecialJoinInfo;
-	norm_sjinfo.min_lefthand = outerrel->relids;
-	norm_sjinfo.min_righthand = innerrel->relids;
-	norm_sjinfo.syn_lefthand = outerrel->relids;
-	norm_sjinfo.syn_righthand = innerrel->relids;
-	norm_sjinfo.jointype = JOIN_INNER;
-	norm_sjinfo.ojrelid = 0;
-	norm_sjinfo.commute_above_l = NULL;
-	norm_sjinfo.commute_above_r = NULL;
-	norm_sjinfo.commute_below_l = NULL;
-	norm_sjinfo.commute_below_r = NULL;
-	/* we don't bother trying to make the remaining fields valid */
-	norm_sjinfo.lhs_strict = false;
-	norm_sjinfo.semi_can_btree = false;
-	norm_sjinfo.semi_can_hash = false;
-	norm_sjinfo.semi_operators = NIL;
-	norm_sjinfo.semi_rhs_exprs = NIL;
+	make_dummy_sjinfo(&norm_sjinfo, outerrel, innerrel);
 
 	nselec = clauselist_selectivity(root,
 	joinquals,
@@ -5190,23 +5174,8 @@ approx_tuple_count(PlannerInfo *root, JoinPath *path, List *quals)
 	/*
 	 * Make up a SpecialJoinInfo for JOIN_INNER semantics.
 	 */
-	sjinfo.type = T_SpecialJoinInfo;
-	sjinfo.min_lefthand = path->outerjoinpath->parent->relids;
-	sjinfo.min_righthand = path->innerjoinpath->parent->relids;
-	sjinfo.syn_lefthand = path->outerjoinpath->parent->relids;
-	sjinfo.syn_righthand = path->innerjoinpath->parent->relids;
-	sjinfo.jointype = JOIN_INNER;
-	sjinfo.ojrelid = 0;
-	sjinfo.commute_above_l = NULL;
-	sjinfo.commute_above_r = NULL;
-	sjinfo.commute_below_l = NULL;
-	sjinfo.commute_below_r = NULL;
-	/* we don't bother trying to make the remaining fields valid */
-	sjinfo.lhs_strict = false;
-	sjinfo.semi_can_btree = false;
-	sjinfo.semi_can_hash = false;
-	sjinfo.semi_operators = NIL;
-	sjinfo.semi_rhs_exprs = NIL;
+	make_dummy_sjinfo(&sjinfo, path->outerjoinpath->parent,
+					  path->innerjoinpath->parent);
 
 	/* Get the approximate selectivity */
 	foreach(l, quals)
diff --git a/src/backend/optimizer/path/joinrels.c b/src/backend/optimizer/path/joinrels.c
index 1a341c9b73..540eda612f 100644
--- a/src/backend/optimizer/path/joinrels.c
+++ b/src/backend/optimizer/path/joinrels.c
@@ -44,7 +44,8 @@ static void try_partitionwise_join(PlannerInfo *root, RelOptInfo *rel1,
    List *parent_restrictlist);
 static void build_child_join_sjinfo(PlannerInfo *root,
 	SpecialJoinInfo *parent_sjinfo,
-	Relids left_relids, Relids right_relids,
+	RelOptInfo *left_child,
+	RelOptInfo *right_child,
 	SpecialJoinInfo *child_sjinfo);
 static void free_child_sjinfo_members(SpecialJoinInfo *child_sjinfo);
 static void compute_partition_bounds(PlannerInfo *root, RelOptInfo *rel1,
@@ -655,6 +656,32 @@ join_is_legal(PlannerInfo *root, RelOptInfo *rel1, RelOptInfo *rel2,
 	return true;
 }
 
+/*
+ * make_dummy_sjinfo
+ * Populate the given SpecialJoinInfo for a plain inner join between rel1
+ * and rel2, which does not have a SpecialJoinInfo associated with it.
+ */
+void
+make_dummy_sjinfo(SpecialJoinInfo *sjinfo, RelOptInfo *rel1, RelOptInfo *rel2)
+{
+	sjinfo->type = T_SpecialJoinInfo;
+	sjinfo->min_lefthand = rel1->relids;
+	sjinfo->min_righthand = rel2->relids;
+	sjinfo->syn_lefthand = rel1->relids;
+	sjinfo->syn_righthand = rel2->relids;
+	sjinfo->jointype = JOIN_INNER;
+	sjinfo->ojrelid = 0;
+	sjinfo->commute_above_l = NULL;
+	sjinfo->commute_above_r = NULL;
+	sjinfo->commute_below_l = NULL;
+	sjinfo->commute_below_r = NULL;
+	/* we don't bother 

Re: WIP: new system catalog pg_wait_event

2023-08-15 Thread Drouvot, Bertrand

Hi,

On 8/14/23 6:37 AM, Michael Paquier wrote:

On Thu, Aug 10, 2023 at 08:09:34PM +0200, Drouvot, Bertrand wrote:

Agreed, it's worth it given that iterating one more time is not that
costly here.


I have reviewed v4, and finished by putting my hands on it to see what
I could do.


Thanks!



There are two things
that we could do:
- Hide that behind a macro defined in wait_event_funcs.c.
- Feed the data generated here into a static structure, like:
+static const struct
+{
+   const char *type;
+   const char *name;
+   const char *description;
+}

After experimenting with both, I've found the latter a tad cleaner, so
the attached version does that.


Yeah, looking at what you've done in v5, I also agree that's better
than what was done in v4 (I also think it will be easier to maintain).


I am not sure that "pg_wait_event" is a good idea for the name of the
new view.  How about "pg_wait_events" instead, in plural form?  There
is more than one wait event listed.



I'd prefer the singular form. There are a lot of places where it's already
used (pg_database, pg_user, pg_namespace... to name a few) and it looks
like using the plural form is the exception.


One log entry in Solution.pm has missed the addition of a reference to
wait_event_funcs_data.c.



Oh right, thanks for fixing it in v5.


And..  src/backend/Makefile missed two updates for maintainer-clean & co,
no?
 

Oh right, thanks for fixing it in v5.


One thing that the patch is still missing is the handling of custom
wait events for extensions. 


Yeah, now that af720b4c50 is done, I'll add the custom wait events handling
in v6.


So this still requires more code.  I was
thinking about listing these events as:
- Type: "Extension"
- Name: What a module has registered.
- Description: "Custom wait event \"%Name%\" defined by extension".


That sounds good to me, I'll do that.

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: remaining sql/json patches

2023-08-15 Thread Amit Langote
On Tue, Aug 15, 2023 at 5:58 PM jian he  wrote:
> Hi.
> in v11, json_query:
> +The returned data_type has the
> same semantics
> +as for constructor functions like 
> json_objectagg;
> +the default returned type is text.
> +The ON EMPTY clause specifies the behavior if the
> +path_expression yields no value at all; 
> the
> +default when ON ERROR is not specified is
> to return a
> +null value.
>
> the default returned type is jsonb?

You are correct.

> Also, in the second-last line quoted above, should it be ON EMPTY?

Correct too.

> Other than that, the doc looks good.

Thanks for the review.

I will post a new version after finishing a few other improvements I
am working on.

-- 
Thanks, Amit Langote
EDB: http://www.enterprisedb.com




Re: pg_logical_emit_message() misses a XLogFlush()

2023-08-15 Thread Andres Freund
Hi,

On 2023-08-16 12:37:21 +0900, Michael Paquier wrote:
> I won't fight much if there are objections to backpatching, but that's
> not really wise (no idea how much EDB's close flavor of BDR relies on
> that).

To be clear: I don't just object to backpatching, I also object to making
existing invocations flush WAL in HEAD. I do not at all object to adding a
parameter that indicates flushing, or a separate function to do so. The latter
might be better, because it allows you to flush a group of messages, rather
than a single one.

Greetings,

Andres Freund




Re: pg_logical_emit_message() misses a XLogFlush()

2023-08-15 Thread Andres Freund
Hi,

On 2023-08-16 03:20:53 +0200, Tomas Vondra wrote:
> On 8/16/23 02:33, Andres Freund wrote:
> > Hi,
> >
> > On 2023-08-16 06:58:56 +0900, Michael Paquier wrote:
> >> On Tue, Aug 15, 2023 at 11:37:32AM +0200, Tomas Vondra wrote:
> >>> Shouldn't the flush be done only for non-transactional messages? The
> >>> transactional case will be flushed by regular commit flush.
> >>
> >> Indeed, that would be better.  I am sending an updated patch.
> >>
> >> I'd like to backpatch that, would there be any objections to that?
> >
> > Yes, I object. This would completely cripple the performance of some uses of
> > logical messages - a slowdown of several orders of magnitude. It's not clear
> > to me that flushing would be the right behaviour if it weren't released, but
> > it certainly doesn't seem right to make such a change in a minor release.
> >
>
> So are you objecting to adding the flush in general, or just to the
> backpatching part?

Both, I think. I don't object to adding a way to trigger flushing, but I think
it needs to be optional.


> IMHO we either guarantee durability of non-transactional messages, in which
> case this would be a clear bug - and I'd say a fairly serious one. I'm
> curious what the workload that'd see order of magnitude of slowdown does
> with logical messages, but even if such workload exists, would it really
> be enough to fix any other durability bug?

Not sure what you mean with the last sentence?

I've e.g. used non-transactional messages for:

- A non-transactional queuing system. Where sometimes one would dump a portion
  of tables into messages, with something like
  SELECT pg_logical_emit_message(false, 'app:', to_json(r)) FROM r;
  Obviously flushing after every row would be bad.

  This is useful when you need to coordinate with other systems in a
  non-transactional way. E.g. letting other parts of the system know that
  files on disk (or in S3 or ...) were created/deleted, since a database
  rollback wouldn't unlink/revive the files.

- Audit logging, when you want to log in a way that isn't undone by rolling
  back transaction - just flushing every pg_logical_emit_message() would
  increase the WAL flush rate many times, because instead of once per
  transaction, you'd now flush once per modified row. It'd basically make it
  impractical to use for such things.

- Optimistic locking. Emitting things that need to be locked on logical
  replicas, to be able to commit on the primary. A pre-commit hook would wait
  for the WAL to be replayed sufficiently - but only once per transaction, not
  once per object.


> Or perhaps we don't want to guarantee durability for such messages, in
> which case we don't need to fix it at all (even in master).

Well, I can see adding an option to flush, or perhaps a separate function to
flush, to master.


> The docs are not very clear on what to expect, unfortunately. It says
> that non-transactional messages are "written immediately" which I could
> interpret in either way.

Yea, the docs certainly should be improved, regardless what we end up with.

Greetings,

Andres Freund




Re: [PATCH] Reuse Workers and Replication Slots during Logical Replication

2023-08-15 Thread Peter Smith
On Fri, Aug 11, 2023 at 11:45 PM Melih Mutlu  wrote:
>
> Again, I couldn't reproduce the cases where you saw significantly degraded 
> performance. I wonder if I'm missing something. Did you do anything not 
> included in the test scripts you shared? Do you think v26-0001 will perform 
> 84% worse than HEAD, if you try again? I just want to be sure that it was not 
> a random thing.
> Interestingly, I also don't see an improvement in above results as big as in 
> your results when inserts/tx ratio is smaller. Even though it certainly is 
> improved in such cases.
>

TEST ENVIRONMENTS

I am running the tests on a high-spec machine:

-- NOTE: Nobody else is using this machine during our testing, so
there are no unexpected influences messing up the results.


Linux

Architecture:          x86_64
CPU(s):                120
Thread(s) per core:    2
Core(s) per socket:    15

              total        used        free      shared  buff/cache   available
Mem:           755G        5.7G        737G         49M         12G        748G
Swap:          4.0G          0B        4.0G

~~~

The results I am seeing are not random. HEAD+v26-0001 is consistently
worse than HEAD but only for some settings. With these settings, I see
bad results (i.e. worse than HEAD) consistently every time using the
dedicated test machine.

Hou-san also reproduced bad results using a different high-spec machine.

Vignesh also reproduced bad results using just his laptop but in his
case, it did *not* occur every time. As discussed elsewhere the
problem is timing-related, so sometimes you may be lucky and sometimes
not.

~

I expect you are running everything correctly, but if you are using
just a laptop (like Vignesh) then like him you might need to try
multiple times before you can hit the problem happening in your
environment.

Anyway, in case there is some other reason you are not seeing the bad
results I have re-attached scripts and re-described every step below.

==

BUILDING

-- NOTE: I have a very minimal configuration without any
optimization/debug flags etc. See config.log

$ ./configure --prefix=/home/peter/pg_oss

-- NOTE: Of course, make sure to be running using the correct Postgres:

echo 'set environment variables for OSS work'
export PATH=/home/peter/pg_oss/bin:$PATH

-- NOTE: Be sure to do git stash or whatever so don't accidentally
build a patched version thinking it is the HEAD version
-- NOTE: Be sure to do a full clean build and apply (or don't apply
v26-0001) according to the test you wish to run.

STEPS
1. sudo make clean
2. make
3. sudo make install

==

SCRIPTS & STEPS

SCRIPTS
testrun.sh
do_one_test_setup.sh
do_one_test_PUB.sh
do_one_test_SUB.sh

---

STEPS

Step-1. Edit the testrun.sh

tables=( 100 )
workers=( 2 4 8 16 )
size="0"
prefix="0816headbusy" <-- edit to differentiate each test run

~

Step-2. Edit the do_one_test_PUB.sh
IF commit_counter = 1000 THEN <-- edit this if needed. I wanted 1000
inserts/tx so nothing to do

~

Step-3: Check nothing else is running. If yes, then clean it up
[peter@localhost testing_busy]$ ps -eaf | grep postgres
peter111924 100103  0 19:31 pts/000:00:00 grep --color=auto postgres

~

Step-4: Run the tests
[peter@localhost testing_busy]$ ./testrun.sh
num_tables=100, size=0, num_workers=2, run #1 <-- check that the echo
matches the config you set in Step-1
waiting for server to shut down done
server stopped
waiting for server to shut down done
server stopped
num_tables=100, size=0, num_workers=2, run #2
waiting for server to shut down done
server stopped
waiting for server to shut down done
server stopped
num_tables=100, size=0, num_workers=2, run #3
...

~

Step-5: Sanity check
When the test completes the current folder will be full of .log and .dat* files.
Check for sanity that no errors happened

[peter@localhost testing_busy]$ cat *.log | grep ERROR
[peter@localhost testing_busy]$

~

Step-6: Collect the results
The results are output (by the do_one_test_SUB.sh) into the *.dat_SUB files
Use grep to extract them

[peter@localhost testing_busy]$ cat 0816headbusy_100t_0_2w_*.dat_SUB |
grep RESULT | grep -v duration | awk '{print $3}'
11742.019
12157.355
11773.807
11582.981
12220.962
12546.325
12210.713
12614.892
12015.489
13527.05

Repeat grep for other files:
$ cat 0816headbusy_100t_0_4w_*.dat_SUB | grep RESULT | grep -v
duration | awk '{print $3}'
$ cat 0816headbusy_100t_0_8w_*.dat_SUB | grep RESULT | grep -v
duration | awk '{print $3}'
$ cat 0816headbusy_100t_0_16w_*.dat_SUB | grep RESULT | grep -v
duration | awk '{print $3}'

~

Step-7: Summarise the results
Now I just cut/paste the results from Step-6 into a spreadsheet and
report the median of the runs.

For example, for the above HEAD run, it was:
 2w4w   8w  16w
1   11742   5996   1919   1582
2   12157   5960   1871   1469
3   11774   5926   2101   1571
4   11583   6155   1883   1671
5   12221   6310   1895   1707
6   

Re: pg_logical_emit_message() misses a XLogFlush()

2023-08-15 Thread Michael Paquier
On Wed, Aug 16, 2023 at 03:20:53AM +0200, Tomas Vondra wrote:
> So are you objecting to adding the flush in general, or just to the
> backpatching part?
> 
> IMHO we either guarantee durability of non-transactional messages, in
> which case this would be a clear bug - and I'd say a fairly serious one.
> I'm curious what the workload that'd see order of magnitude of slowdown
> does with logical messages, but even if such workload exists, would it
> really be enough to fix any other durability bug?

Yes, I also think that this is a pretty serious issue to not ensure
durability in the non-transactional case if you have solutions
designed around that.

> Or perhaps we don't want to guarantee durability for such messages, in
> which case we don't need to fix it at all (even in master).

I mean, just look at the first message of the thread I am mentioning
at the top of this thread: it lists three cases where something like
pg_logical_emit_message() was wanted.  And, it looks like a serious
issue to me for the first two ones at least (out-of-order messages and
inter-node communication), because an application may want to send
something from node1 to node2, and node1 may just forget about it
entirely if it crashes, finishing WAL redo at an earlier point than
the record inserted.

> The docs are not very clear on what to expect, unfortunately. It says
> that non-transactional messages are "written immediately" which I could
> interpret in either way.

Agreed.  The original thread never mentions "flush", "sync" or
"durab".

I won't fight much if there are objections to backpatching, but that's
not really wise (no idea how much EDB's close flavor of BDR relies on
that).
--
Michael




Re: [PATCH] Add function to_oct

2023-08-15 Thread John Naylor
On Wed, Aug 16, 2023 at 12:17 AM Nathan Bossart wrote:
>
> On Tue, Aug 15, 2023 at 01:53:25PM +0700, John Naylor wrote:

> > - *ptr = '\0';
> > + Assert(base == 2 || base == 8 || base == 16);
> >
> > + *ptr = '\0';
> >
> > Spurious whitespace change?
>
> I think this might just be a weird artifact of the diff algorithm.

Don't believe everything you think. :-)

```
*ptr = '\0';

do
```

to

```
*ptr = '\0';
do
```

> > - char buf[32]; /* bigger than needed, but reasonable */
> > + char   *buf = palloc(sizeof(uint64) * BITS_PER_BYTE + 1);
> >
> > Why is this no longer allocated on the stack? Maybe needs a comment
> > about the size calculation.
>
> It really should be.  IIRC I wanted to avoid passing a pre-allocated
> buffer to convert_to_base(), but I don't remember why.  I fixed this in v5.

Now I'm struggling to understand why each and every instance has its own
nominal buffer, passed down to the implementation. All we care about is the
result -- is there some reason not to confine the buffer declaration to the
general implementation?

--
John Naylor
EDB: http://www.enterprisedb.com


RE: [PoC] pg_upgrade: allow to upgrade publisher node

2023-08-15 Thread Hayato Kuroda (Fujitsu)
Dear Hou,

Thanks for posting the patch! I want to open a question to gather opinions from 
others.

> > It was primarily for upgrade purposes only. So, as we can't see a good 
> > reason to
> > go via pg_dump let's do it in upgrade unless someone thinks otherwise.
> 
> Removed the new option in pg_dump and modified the pg_upgrade
> directly use the slot info to restore the slot in new cluster.

In this version, creations of logical slots are serialized, whereas in the
old one they were parallelized per DB. Do you think it should be
parallelized again? I have tested locally and it seems harmless. Also,
this approach allows logging the executed SQLs.

Best Regards,
Hayato Kuroda
FUJITSU LIMITED



Re: pg_logical_emit_message() misses a XLogFlush()

2023-08-15 Thread Tomas Vondra
On 8/16/23 02:33, Andres Freund wrote:
> Hi,
> 
> On 2023-08-16 06:58:56 +0900, Michael Paquier wrote:
>> On Tue, Aug 15, 2023 at 11:37:32AM +0200, Tomas Vondra wrote:
>>> Shouldn't the flush be done only for non-transactional messages? The
>>> transactional case will be flushed by regular commit flush.
>>
>> Indeed, that would be better.  I am sending an updated patch.
>>
>> I'd like to backpatch that, would there be any objections to that?
> 
> Yes, I object. This would completely cripple the performance of some uses of
> logical messages - a slowdown of several orders of magnitude. It's not clear
> to me that flushing would be the right behaviour if it weren't released, but
> it certainly doesn't seem right to make such a change in a minor release.
> 

So are you objecting to adding the flush in general, or just to the
backpatching part?

IMHO we either guarantee durability of non-transactional messages, in
which case this would be a clear bug - and I'd say a fairly serious one.
I'm curious what the workload that'd see order of magnitude of slowdown
does with logical messages, but even if such workload exists, would it
really be enough to fix any other durability bug?

Or perhaps we don't want to guarantee durability for such messages, in
which case we don't need to fix it at all (even in master).

The docs are not very clear on what to expect, unfortunately. It says
that non-transactional messages are "written immediately" which I could
interpret in either way.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: Test case for parameterized remote path in postgres_fdw

2023-08-15 Thread Richard Guo
On Tue, Aug 15, 2023 at 7:50 PM Etsuro Fujita 
wrote:

> So we should have modified the second one as well?  Attached is a
> small patch for that.


Agreed, nice catch!  +1 to the patch.

Thanks
Richard


Re: pg_logical_emit_message() misses a XLogFlush()

2023-08-15 Thread Andres Freund
Hi,

On 2023-08-16 06:58:56 +0900, Michael Paquier wrote:
> On Tue, Aug 15, 2023 at 11:37:32AM +0200, Tomas Vondra wrote:
> > Shouldn't the flush be done only for non-transactional messages? The
> > transactional case will be flushed by regular commit flush.
> 
> Indeed, that would be better.  I am sending an updated patch.
> 
> I'd like to backpatch that, would there be any objections to that?

Yes, I object. This would completely cripple the performance of some uses of
logical messages - a slowdown of several orders of magnitude. It's not clear
to me that flushing would be the right behaviour if it weren't released, but
it certainly doesn't seem right to make such a change in a minor release.

Greetings,

Andres Freund




Re: [PATCH] Reuse Workers and Replication Slots during Logical Replication

2023-08-15 Thread Peter Smith
Here is another review comment about patch v26-0001.

The tablesync worker processes include the 'relid' as part of their
name. See launcher.c:

snprintf(bgw.bgw_name, BGW_MAXLEN,
"logical replication tablesync worker for subscription %u sync %u",
subid,
relid);

~~

And if that worker is "reused" by v26-0001 to process another relation
there is a LOG

if (reuse_worker)
ereport(LOG,
errmsg("logical replication table synchronization worker for
subscription \"%s\" will be reused to sync table \"%s\" with relid
%u.",
MySubscription->name,
get_rel_name(MyLogicalRepWorker->relid),
MyLogicalRepWorker->relid));


AFAICT, when being "reused" the original process name remains
unchanged, and so I think it will continue to appear to any user
looking at it that the tablesync process is just taking a very long
time handling the original 'relid'.

Won't the stale process name cause confusion to the users?

--
Kind Regards,
Peter Smith.
Fujitsu Australia




Re: should frontend tools use syncfs() ?

2023-08-15 Thread Michael Paquier
On Tue, Aug 08, 2023 at 01:06:06PM -0700, Nathan Bossart wrote:
> I ran a couple of tests for pg_upgrade with 100k tables (created using the
> script here [0]) in order to demonstrate the potential benefits of this
> patch.

That shows some nice numbers with many files, indeed.  How does the
size of each file influence the difference in time?

+else
+{
+while (errno = 0, (de = readdir(dir)) != NULL)
+{
+char		subpath[MAXPGPATH * 2];
+
+if (strcmp(de->d_name, ".") == 0 ||
+strcmp(de->d_name, "..") == 0)
+continue;

It seems to me that there is no need to do that for in-place
tablespaces.  There are relative paths in pg_tblspc/, so they would be
taken care of by the syncfs() done on the main data folder.

This does not really check whether the mount points of the tablespaces
are different, either.  For example, imagine that you have two
tablespaces within the same disk: we would syncfs() twice.  Perhaps the
current logic is OK anyway as long as the behavior is optional, but it
should be explained in the docs, at least.

I'm finding a bit confusing that fsync_pgdata() is coded in such a way
that it does a silent fallback to the cascading syncs through
walkdir() when syncfs is specified but not available in the build.
Perhaps an error is more helpful because one would then know that they
are trying something that's not around?

+   pg_log_error("could not synchronize file system for file \"%s\": %m", path);
+   (void) close(fd);
+   exit(EXIT_FAILURE);

walkdir() reports errors and does not consider these fatal.  Why the
early exit()?

I am a bit concerned about the amount of duplication this patch
introduces in the docs.  Perhaps this had better be moved into a new
section of the docs to explain the tradeoffs, with each tool linking
to it?

Do we actually need --no-sync at all if --sync-method is around?  We
could have an extra --sync-method=none at option level with --no-sync
still around mainly for compatibility?  Or perhaps that's just
over-designing things?
--
Michael




Re: Avoid a potential unstable test case: xmlmap.sql

2023-08-15 Thread Andy Fan
>
>
> Please look at the bug #18014:
>
> https://www.postgresql.org/message-id/flat/18014-28c81cb79d44295d%40postgresql.org
> There were other aspects of the xmlmap test failure discussed in that
> thread as well.
>

Thank you Alexander for the information; I will go through the
discussion there.

-- 
Best Regards
Andy Fan


Re: Logging of matching pg_hba.conf entry during auth skips trust auth, potential security issue

2023-08-15 Thread Michael Paquier
On Tue, Aug 15, 2023 at 03:39:10PM -0700, Jacob Champion wrote:
> I'm not super comfortable with saying "connection authenticated" when
> it explicitly hasn't been (nor with switching the meaning of a
> non-NULL SYSTEM_USER from "definitely authenticated somehow" to "who
> knows; parse it apart to see"). But adding a log entry ("connection
> trusted:" or some such?) with the pointer to the HBA line that made it
> happen seems like a useful audit helper to me.

Yeah, thanks for confirming.  That's also the impression I get after
reading again the original thread and the idea of how this code path
is handled in this commit.

We could do something like a LOG "connection: method=%s user=%s
(%s:%d)", without the "authenticated" and "identity" terms from
set_authn_id().  Just to drop an idea.
--
Michael




Re: Would it be possible to backpatch Close support in libpq (28b5726) to PG16?

2023-08-15 Thread Alvaro Herrera
On 2023-Aug-16, Michael Paquier wrote:

> > Personally I think backpatching 28b5726 has a really low risk of
> > breaking anything.
> 
> I agree about the low-risk argument, though.  This is just new code.

Here's a way to think about it.  If 16.1 was already out, would we add
libpq support for Close to 16.2?

-- 
Álvaro Herrera PostgreSQL Developer  —  https://www.EnterpriseDB.com/
"La rebeldía es la virtud original del hombre" (Arthur Schopenhauer)




Re: Logging of matching pg_hba.conf entry during auth skips trust auth, potential security issue

2023-08-15 Thread Jacob Champion
On Tue, Aug 15, 2023 at 3:24 PM Michael Paquier  wrote:
> The first message from Jacob outlines the idea behind the handling of
> trust.  We could perhaps add one extra set_authn_id() for the uaTrust
> case (not uaCert!) if that's more helpful.

I'm not super comfortable with saying "connection authenticated" when
it explicitly hasn't been (nor with switching the meaning of a
non-NULL SYSTEM_USER from "definitely authenticated somehow" to "who
knows; parse it apart to see"). But adding a log entry ("connection
trusted:" or some such?) with the pointer to the HBA line that made it
happen seems like a useful audit helper to me.

--Jacob




Re: Would it be possible to backpatch Close support in libpq (28b5726) to PG16?

2023-08-15 Thread Michael Paquier
On Wed, Aug 16, 2023 at 12:14:21AM +0200, Jelte Fennema wrote:
> 28b5726 allows sending Close messages from libpq, as opposed to
> sending DEALLOCATE queries to deallocate prepared statements. Without
> support for Close messages, libpq based clients won't be able to
> deallocate prepared statements on PgBouncer, because PgBouncer does
> not parse SQL queries and only looks at protocol level messages (i.e.
> Close messages for deallocation).

The RMT has the final word on anything related to the release, but we
are discussing adding something new to a branch that has gone
through two beta cycles, with a GA targeted around the end of
September ~ beginning of October based on the trends of recent
years.  That's out of the picture, IMO.  This comes once every year.

> Personally I think backpatching 28b5726 has a really low risk of
> breaking anything.

I agree about the low-risk argument, though.  This is just new code.
--
Michael




Re: Logging of matching pg_hba.conf entry during auth skips trust auth, potential security issue

2023-08-15 Thread Michael Paquier
On Tue, Aug 15, 2023 at 04:49:47PM -0500, Shaun Thomas wrote:
> The switch statement that decodes port->hba->auth_method ends by
> simply setting status = STATUS_OK; with no supplementary output since
> it never calls set_authn_id. So in theory, a malicious user could add
> a trust line to pg_hba.conf and have unlimited unlogged access to the
> database.

That's the same as giving access to your data folder.  Updating
pg_hba.conf is only the tip of the iceberg if one has write access to
your data folder.

> Unless you happen to notice that the "connection
> authenticated" line has disappeared, it would look like normal
> activity.

"trust" is not really an authentication method because it does
nothing; it just authorizes things to go through, so you cannot
really say that it can have an authn ID (grep for "pedantic" around
here):
https://www.postgresql.org/message-id/c55788dd1773c521c862e8e0dddb367df51222be.ca...@vmware.com

> Would it make sense to decouple the hba info from set_authn_id so that
> it is always logged even when new auth methods get added in the
> future? Or alternatively create a function call specifically for that
> output so it can be produced from the trust case statement and
> anywhere else that needs to tag the auth line. I personally would love
> to see if someone got in through a trust line, ESPECIALLY if it isn't
> supposed to be there. Like:
>
> 2023-08-15 13:26:03.159 PDT [692166] postgres@snip LOG: connection
> authenticated: identity="postgres" method=trust
> (/etc/postgresql/15/main/pg_hba.conf:1)

You mean outside the switch/case in ClientAuthentication().  Some
methods have an authn that is implementation-dependent, like ldap.  I
am not sure if switching the logic would lead to a gain, like calling
once set_authn_id() vs passing up a string to set in a single call of
set_authn_id().

> I read through the discussion, and it doesn't seem like the security
> aspect of simply hiding trust auths from the log was considered. Since
> this is a new capability, I suppose nothing is really different from
> say Postgres 14 and below. Still, it never hurts to ask.

The first message from Jacob outlines the idea behind the handling of
trust.  We could perhaps add one extra set_authn_id() for the uaTrust
case (not uaCert!) if that's more helpful.
--
Michael




Would it be possible to backpatch Close support in libpq (28b5726) to PG16?

2023-08-15 Thread Jelte Fennema
Hi,

I know that this will probably get a staunch "No" as an answer, but...
I'm still going to ask: Would it be possible to backport 28b5726 to
the PG16 branch? Even though it's clearly a new feature?

I'm working on named prepared statement support in PgBouncer:
https://github.com/pgbouncer/pgbouncer/pull/845 That PR is pretty
close to mergable (IMO) and my intention is to release a PgBouncer
version with prepared statement support within a few months.

28b5726 allows sending Close messages from libpq, as opposed to
sending DEALLOCATE queries to deallocate prepared statements. Without
support for Close messages, libpq based clients won't be able to
deallocate prepared statements on PgBouncer, because PgBouncer does
not parse SQL queries and only looks at protocol level messages (i.e.
Close messages for deallocation).

Personally I think backpatching 28b5726 has a really low risk of
breaking anything. And since PgBouncer is used a lot in the Postgres
ecosystem, especially with libpq based clients, IMO it might be worth
deviating from the rule of not backporting features after a STABLE
branch has been cut. Otherwise all libpq based clients will have only
limited support for prepared statements with PgBouncer until PG17 is
released.

Jelte




Re: pg_logical_emit_message() misses a XLogFlush()

2023-08-15 Thread Michael Paquier
On Tue, Aug 15, 2023 at 11:37:32AM +0200, Tomas Vondra wrote:
> Shouldn't the flush be done only for non-transactional messages? The
> transactional case will be flushed by regular commit flush.

Indeed, that would be better.  I am sending an updated patch.

I'd like to backpatch that, would there be any objections to that?
This may depend on how much logical solutions depend on this routine.
--
Michael
diff --git a/src/backend/replication/logical/message.c b/src/backend/replication/logical/message.c
index c5de14afc6..cb3a806114 100644
--- a/src/backend/replication/logical/message.c
+++ b/src/backend/replication/logical/message.c
@@ -47,6 +47,7 @@ LogLogicalMessage(const char *prefix, const char *message, size_t size,
   bool transactional)
 {
 	xl_logical_message xlrec;
+	XLogRecPtr	lsn;
 
 	/*
 	 * Force xid to be allocated if we're emitting a transactional message.
@@ -71,7 +72,15 @@ LogLogicalMessage(const char *prefix, const char *message, size_t size,
 	/* allow origin filtering */
 	XLogSetRecordFlags(XLOG_INCLUDE_ORIGIN);
 
-	return XLogInsert(RM_LOGICALMSG_ID, XLOG_LOGICAL_MESSAGE);
+	lsn = XLogInsert(RM_LOGICALMSG_ID, XLOG_LOGICAL_MESSAGE);
+
+	/*
+	 * Make sure that the message hits disk before leaving if not emitting a
+	 * transactional message.
+	 */
+	if (!transactional)
+		XLogFlush(lsn);
+	return lsn;
 }
 
 /*




Logging of matching pg_hba.conf entry during auth skips trust auth, potential security issue

2023-08-15 Thread Shaun Thomas
Hi everyone,

This started as a conversation on Discord. Someone asked if Postgres
logs which line in pg_hba.conf matched against a certain login
attempt, and I said no. That's not quite right, as enabling
log_connections includes a line like this:

2023-08-15 13:26:03.159 PDT [692166] postgres@snip LOG: connection
authenticated: identity="postgres" method=md5
(/etc/postgresql/15/main/pg_hba.conf:107)

But I wasn't getting that output. I finally gave up and looked at the
code, where I found that this particular output is only generated by
the set_authn_id function. So if that function is never called,
there's no message saying which line from the pg_hba.conf file matched
a particular login.

The switch statement that decodes port->hba->auth_method ends by
simply setting status = STATUS_OK; with no supplementary output since
it never calls set_authn_id. So in theory, a malicious user could add
a trust line to pg_hba.conf and have unlimited unlogged access to the
database. Unless you happen to notice that the "connection
authenticated" line has disappeared, it would look like normal
activity.

Would it make sense to decouple the hba info from set_authn_id so that
it is always logged even when new auth methods get added in the
future? Or alternatively create a function call specifically for that
output so it can be produced from the trust case statement and
anywhere else that needs to tag the auth line. I personally would love
to see if someone got in through a trust line, ESPECIALLY if it isn't
supposed to be there. Like:

2023-08-15 13:26:03.159 PDT [692166] postgres@snip LOG: connection
authenticated: identity="postgres" method=trust
(/etc/postgresql/15/main/pg_hba.conf:1)

Perhaps I'm being too paranoid; it just seemed to be an odd omission.
Euler Taveira clued me into the initial patch which introduced the
pg_hba.conf tattling:

https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=9afffcb833d3c5e59a328a2af674fac7e7334fc1

I read through the discussion, and it doesn't seem like the security
aspect of simply hiding trust auths from the log was considered. Since
this is a new capability, I suppose nothing is really different from
say Postgres 14 and below. Still, it never hurts to ask.

Cheers!

-- 
Shaun Thomas
High Availability Architect
EDB
www.enterprisedb.com




Re: Using defines for protocol characters

2023-08-15 Thread Tatsuo Ishii
> On Wed, Aug 16, 2023 at 06:25:09AM +0900, Tatsuo Ishii wrote:
>> Currently pqcomm.h needs c.h which is not problem for Pgpool-II. But
>> what about other middleware?
> 
> Why do you need to include directly c.h?  There are definitions in
> there that are not intended to be exposed.

pqcomm.h has this:

#define UNIXSOCK_PATH(path, port, sockdir) \
	(AssertMacro(sockdir), \
	 AssertMacro(*(sockdir) != '\0'), \
	 snprintf(path, sizeof(path), "%s/.s.PGSQL.%d", \
			  (sockdir), (port)))

AssertMacro is defined in c.h.

Best regards,
--
Tatsuo Ishii
SRA OSS LLC
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp




Re: Using defines for protocol characters

2023-08-15 Thread Alvaro Herrera
On 2023-Aug-16, Michael Paquier wrote:

> On Wed, Aug 16, 2023 at 06:25:09AM +0900, Tatsuo Ishii wrote:
> > Currently pqcomm.h needs c.h which is not problem for Pgpool-II. But
> > what about other middleware?
> 
> Why do you need to include directly c.h?  There are definitions in
> there that are not intended to be exposed.

What this argument says is that these new defines should be in a
separate file, not in pqcomm.h.  IMO that makes sense, precisely because
these defines should be usable by third parties.

-- 
Álvaro Herrera PostgreSQL Developer  —  https://www.EnterpriseDB.com/




Re: Using defines for protocol characters

2023-08-15 Thread Michael Paquier
On Wed, Aug 16, 2023 at 06:25:09AM +0900, Tatsuo Ishii wrote:
> Currently pqcomm.h needs c.h which is not problem for Pgpool-II. But
> what about other middleware?

Why do you need to include directly c.h?  There are definitions in
there that are not intended to be exposed.
--
Michael




Re: Using defines for protocol characters

2023-08-15 Thread Tatsuo Ishii
> On Tue, Aug 15, 2023 at 02:46:24PM +0900, Tatsuo Ishii wrote:
>> Is it possible to put the new define stuff into protocol.h then let
>> pqcomm.h include protocol.h? This makes Pgpool-II and other middle
>> ware/drivers written in C easier to use the defines so that they only
>> include protocol.h to use the defines.
> 
> It is possible, of course, but are there any reasons such programs couldn't
> include pqcomm.h?  It looks relatively inexpensive to me.  That being said,
> I'm fine with the approach you describe if the folks in this thread agree.

Currently pqcomm.h needs c.h which is not problem for Pgpool-II. But
what about other middleware?

Best regards,
--
Tatsuo Ishii
SRA OSS LLC
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp




Re: run pgindent on a regular basis / scripted manner

2023-08-15 Thread Nathan Bossart
On Fri, Aug 11, 2023 at 01:59:40PM -0700, Peter Geoghegan wrote:
> I'm starting to have doubts about this policy. There have now been
> quite a few follow-up "fixes" to indentation issues that koel
> complained about. None of these fixups have been included in
> .git-blame-ignore-revs. If things continue like this then "git blame"
> is bound to become much less usable over time.

Should we add those?  Patch attached.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
>From f4f158e153f9027330ce841ca79b5650dfd24a0e Mon Sep 17 00:00:00 2001
From: Nathan Bossart 
Date: Tue, 15 Aug 2023 13:24:18 -0700
Subject: [PATCH v1 1/1] Add a few recent commits to .git-blame-ignore-revs.

---
 .git-blame-ignore-revs | 21 +
 1 file changed, 21 insertions(+)

diff --git a/.git-blame-ignore-revs b/.git-blame-ignore-revs
index 00b548c7b0..2ff86c8d0d 100644
--- a/.git-blame-ignore-revs
+++ b/.git-blame-ignore-revs
@@ -14,6 +14,27 @@
 #
 # $ git log --pretty=format:"%H # %cd%n# %s" $PGINDENTGITHASH -1 --date=iso
 
+89be0b89ae60c63856fd26d82a104781540e2312 # 2023-08-15 17:45:00 +0900
+# Fix code indentation vioaltion introduced in commit 9e9931d2b.
+
+5dc456b7dda4f7d0d7735b066607c190c74d174a # 2023-08-11 20:43:34 +0900
+# Fix code indentation violations introduced by recent commit
+
+62e9af4c63fbd36fb9af8450fb44bece76d7766f # 2023-07-25 12:35:58 +0530
+# Fix code indentation vioaltion introduced in commit d38ad8e31d.
+
+4e465aac36ce9a9533c68dbdc83e67579880e628 # 2023-07-18 14:04:31 +0900
+# Fix indentation in twophase.c
+
+328f492d2565cfbe383f13a69425d751fd79415f # 2023-07-13 22:26:10 +0900
+# Fix code indentation violation in commit b6e1157e7d
+
+69a674a170058e63e8178aec8a36a673efce8801 # 2023-07-06 11:49:18 +0530
+# Fix code indentation vioaltion introduced in commit cc32ec24fd.
+
+a4cfeeca5a97f2b5969c31aa69ba775af95ee5a3 # 2023-07-03 12:47:49 +0200
+# Fix code indentation violations
+
 b334612b8aee9f9a34378982d8938b201dfad323 # 2023-06-20 09:50:43 -0400
 # Pre-beta2 mechanical code beautification.
 
-- 
2.25.1



Re: [PATCH] Add function to_oct

2023-08-15 Thread Nathan Bossart
On Tue, Aug 15, 2023 at 01:53:25PM +0700, John Naylor wrote:
> If we're not going to have a general SQL conversion function, here are some
> comments on the current patch.

I appreciate the review.

> +static char *convert_to_base(uint64 value, int base);
> 
> Not needed if the definition is above the callers.

Done.

> + * Workhorse for to_binary, to_oct, and to_hex.  Note that base must be
> either
> + * 2, 8, or 16.
> 
> Why wouldn't it work with any base <= 16?

You're right.  I changed this in v5.

> - *ptr = '\0';
> + Assert(base == 2 || base == 8 || base == 16);
> 
> + *ptr = '\0';
> 
> Spurious whitespace change?

I think this might just be a weird artifact of the diff algorithm.

> - char buf[32]; /* bigger than needed, but reasonable */
> + char   *buf = palloc(sizeof(uint64) * BITS_PER_BYTE + 1);
> 
> Why is this no longer allocated on the stack? Maybe needs a comment about
> the size calculation.

It really should be.  IIRC I wanted to avoid passing a pre-allocated buffer
to convert_to_base(), but I don't remember why.  I fixed this in v5.

> +static char *
> +convert_to_base(uint64 value, int base)
> 
> On my machine this now requires a function call and a DIV instruction, even
> though the patch claims not to support anything but a power of two. It's
> tiny enough to declare it inline so the compiler can specialize for each
> call site.

This was on my list of things to check before committing.  I assumed that
it would be automatically inlined, but given your analysis, I went ahead
and added the inline keyword.  My compiler took the hint and inlined the
function, and it used SHR instead of DIV instructions.  The machine code
for to_hex32/64 is still a couple of instructions longer than before
(probably because of the uint64 casts), but I don't know if we need to
worry about micro-optimizing these functions any further.

> +{ oid => '5101', descr => 'convert int4 number to binary',
> 
> Needs to be over 8000.

Done.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
>From 2ad83f8781a013ef5e5e5a62a422f3b73c6c5f2d Mon Sep 17 00:00:00 2001
From: Nathan Bossart 
Date: Tue, 25 Jul 2023 16:09:01 -0700
Subject: [PATCH v5 1/1] add to_binary() and to_oct()

---
 doc/src/sgml/func.sgml|  42 +++
 src/backend/utils/adt/varlena.c   | 103 +++---
 src/include/catalog/pg_proc.dat   |  12 +++
 src/test/regress/expected/strings.out |  26 ++-
 src/test/regress/sql/strings.sql  |   9 ++-
 5 files changed, 163 insertions(+), 29 deletions(-)

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index be2f54c914..23b5ac7457 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -3737,6 +3737,48 @@ repeat('Pg', 4) PgPgPgPg

   
 
+  
+   
+
+ to_binary
+
+to_binary ( integer )
+text
+   
+   
+to_binary ( bigint )
+text
+   
+   
+Converts the number to its equivalent binary representation.
+   
+   
+to_binary(2147483647)
+1111111111111111111111111111111
+   
+  
+
+  
+   
+
+ to_oct
+
+to_oct ( integer )
+text
+   
+   
+to_oct ( bigint )
+text
+   
+   
+Converts the number to its equivalent octal representation.
+   
+   
+to_oct(2147483647)
+17777777777
+   
+  
+
   

 
diff --git a/src/backend/utils/adt/varlena.c b/src/backend/utils/adt/varlena.c
index b1ec5c32ce..b955e54611 100644
--- a/src/backend/utils/adt/varlena.c
+++ b/src/backend/utils/adt/varlena.c
@@ -4919,54 +4919,105 @@ array_to_text_internal(FunctionCallInfo fcinfo, ArrayType *v,
 	return result;
 }
 
-#define HEXBASE 16
 /*
- * Convert an int32 to a string containing a base 16 (hex) representation of
- * the number.
+ * We size the buffer for to_binary's longest possible return value, including
+ * the terminating '\0'.
  */
-Datum
-to_hex32(PG_FUNCTION_ARGS)
+#define CONVERT_TO_BASE_BUFSIZE		(sizeof(uint64) * BITS_PER_BYTE + 1)
+
+/*
+ * Workhorse for to_binary, to_oct, and to_hex.  Note that base must be > 1 and
+ * <= 16.  buf must be at least CONVERT_TO_BASE_BUFSIZE bytes.
+ */
+static inline char *
+convert_to_base(uint64 value, int base, char *buf)
 {
-	uint32		value = (uint32) PG_GETARG_INT32(0);
-	char	   *ptr;
 	const char *digits = "0123456789abcdef";
-	char		buf[32];		/* bigger than needed, but reasonable */
+	char	   *ptr = buf + CONVERT_TO_BASE_BUFSIZE - 1;
 
-	ptr = buf + sizeof(buf) - 1;
-	*ptr = '\0';
+	Assert(base > 1);
+	Assert(base <= 16);
 
+	*ptr = '\0';
 	do
 	{
-		*--ptr = digits[value % HEXBASE];
-		value /= HEXBASE;
+		*--ptr = digits[value % base];
+		value /= base;
 	} while (ptr > buf && value);
 
-	PG_RETURN_TEXT_P(cstring_to_text(ptr));
+	return ptr;
+}
+
+/*
+ * Convert an integer to a string containing a base-2 (binary) representation
+ * of the number.
+ 

Re: initial pruning in parallel append

2023-08-15 Thread Robert Haas
On Wed, Aug 9, 2023 at 8:57 AM Amit Langote  wrote:
> I checked enough to be sure that IsParallelWorker() is reliable at the
> time of ExecutorStart() / ExecInitNode() in ParallelQueryMain() in a
> worker.   However, ParallelWorkerContext is not available at that
> point.  Here's the relevant part of ParallelQueryMain():
>
> /* Start up the executor */
> queryDesc->plannedstmt->jitFlags = fpes->jit_flags;
> ExecutorStart(queryDesc, fpes->eflags);
>
> /* Special executor initialization steps for parallel workers */
> queryDesc->planstate->state->es_query_dsa = area;
> if (DsaPointerIsValid(fpes->param_exec))
> {
> char   *paramexec_space;
>
> paramexec_space = dsa_get_address(area, fpes->param_exec);
> RestoreParamExecParams(paramexec_space, queryDesc->estate);
> }
> pwcxt.toc = toc;
> pwcxt.seg = seg;
> ExecParallelInitializeWorker(queryDesc->planstate, &pwcxt);
>
> BTW, we do also use IsParallelWorker() in ExecGetRangeTableRelation()
> which also probably only runs during ExecInitNode(), same as
> ExecInitPartitionPruning() that this patch adds it to.

I don't know if that's a great idea, but I guess if we're already
doing it, it doesn't hurt to expand the use a little bit.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Faster "SET search_path"

2023-08-15 Thread Robert Haas
On Mon, Aug 7, 2023 at 7:24 PM Jeff Davis  wrote:
> I might just avoid guc.c entirely and directly set
> namespace_search_path and baseSearchPathValid=false. The main thing we
> lose there is the GUC stack (AtEOXact_GUC()), but there's already a
> PG_TRY/PG_FINALLY block in fmgr_security_definer, so we can use that to
> change it back safely. (I think? I need to convince myself that it's
> safe to skip the work in guc.c, at least for the special case of
> search_path, and that it won't be too fragile.)

I suspect that dodging the GUC stack machinery is not a very good
idea. The timing of when TRY/CATCH blocks are called is different from
when subtransactions are aborted, and that distinction has messed me
up more than once when trying to code various things. Specifically,
changes that happen in a CATCH block happen before we actually reach
Abort(Sub)Transaction, which sort of makes it sound like you'd be
reverting back to the old value too early. The GUC stack is also
pretty complicated because of the SET/SAVE and LOCAL/non-LOCAL stuff.
I can't say there isn't a way to make something like what you're
talking about here work, but I bet it will be difficult to get all of
the corner cases right, and I suspect that trying to speed up the
existing mechanism is a better plan than trying to go around it.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Extending SMgrRelation lifetimes

2023-08-15 Thread Heikki Linnakangas

On 14/08/2023 05:41, Thomas Munro wrote:

The new idea is to overload smgrrelease(it) so that it also clears the
owner, which means that AtEOXact_SMgr() will eventually smgrclose(it),
unless it is re-owned by a relation before then.  That choice stems
from the complete lack of information available via sinval in the case
of an overflow.  We must (1) close all descriptors because any file
might have been unlinked, (2) keep all pointers valid and yet (3) not
leak dropped smgr objects forever.  In this patch, smgrreleaseall()
achieves those goals.


Makes sense.

Some of the smgrrelease() calls could perhaps still be smgrclose(). If you 
can guarantee that no-one is holding the SMgrRelation, it's still OK to 
call smgrclose(). There's little gain from closing earlier, though.



Proof-of-concept patch attached.  Are there holes in this scheme?
Better ideas welcome.  In terms of spelling, another option would be
to change the behaviour of smgrclose() to work as described, ie it
would call smgrrelease() and then also disown, so we don't have to
change most of those callers, and then add a new function
smgrdestroy() for the few places that truly need it.  Or something
like that.


If you change smgrclose() to do what smgrrelease() does now, then it 
will apply automatically to extensions.


If an extension is currently using smgropen()/smgrclose() correctly, 
this patch alone won't make it wrong, so it's not very critical for 
extensions to adopt the change. However, if after this we consider it OK 
to hold a pointer to SMgrRelation for longer, and start doing that in 
the backend, then extensions need to be adapted too.



While studying this I noticed a minor thinko in smgrrelease() in
15+16, so here's a fix for that also.  I haven't figured out a
sequence that makes anything bad happen, but we should really
invalidate smgr_targblock when a relfilenode is reused, since it might
point past the end.  This becomes more obvious once smgrrelease() is
used for truncation, as proposed here.


+1. You can move the smgr_targblock clearing out of the loop, though.

--
Heikki Linnakangas
Neon (https://neon.tech)





Re: Using defines for protocol characters

2023-08-15 Thread Nathan Bossart
On Tue, Aug 15, 2023 at 02:46:24PM +0900, Tatsuo Ishii wrote:
> Is it possible to put the new define stuff into protocol.h then let
> pqcomm.h include protocol.h? This makes Pgpool-II and other middle
> ware/drivers written in C easier to use the defines so that they only
> include protocol.h to use the defines.

It is possible, of course, but are there any reasons such programs couldn't
include pqcomm.h?  It looks relatively inexpensive to me.  That being said,
I'm fine with the approach you describe if the folks in this thread agree.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com




Re: [PATCH] Add function to_oct

2023-08-15 Thread Nathan Bossart
On Tue, Aug 15, 2023 at 07:58:17AM +0200, Vik Fearing wrote:
> On 8/15/23 06:11, Nathan Bossart wrote:
>> If there are no objections, I'd like to commit this patch soon.
> 
> I just took a look at this (and the rest of the thread).  I am a little bit
> disappointed that we don't have a generic base conversion function, but even
> if we did I think these specialized functions would still be useful.
> 
> No objection from me.

Thanks for taking a look.  I don't mean for this to preclude a generic base
conversion function that would supersede the functions added by this patch.
However, I didn't want to hold up $SUBJECT because of something that may or
may not happen after lots of discussion.  If it does happen within the v17
development cycle, we could probably just remove to_oct/binary.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com




Re: pgbench - adding pl/pgsql versions of tests

2023-08-15 Thread Nathan Bossart
On Tue, Aug 15, 2023 at 09:46:59AM +0200, Fabien COELHO wrote:
> I'm unclear about what variety of scripts that could be provided given the
> tables made available with pgbench. ISTM that other scenari would involve
> both an initialization and associated scripts, and any proposal would be
> bared because it would open the door to anything.

Why's that?  I'm not aware of any project policy that prohibits such
enhancements to pgbench.  It might take some effort to gather consensus on
a proposal like this, but IMHO that doesn't mean we shouldn't try.  If the
prevailing wisdom is that we shouldn't add more built-in scripts because
there is an existing way to provide custom ones, then it's not clear that
we should proceed with $SUBJECT, anyway.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com




Re: Replace known_assigned_xids_lck by memory barrier

2023-08-15 Thread Nathan Bossart
On Tue, Aug 15, 2023 at 12:29:24PM +0200, Michail Nikolaev wrote:
>> What sort of benefits do you see from this patch? It might be worthwhile
>> in itself to remove spinlocks when possible, but IME it's much easier to
>> justify such changes when there is a tangible benefit we can point to.
> 
> Oh, it is not an easy question :)
> 
> The answer, probably, looks like this:
> 1) performance benefits of spin lock acquire removing in
> KnownAssignedXidsGetOldestXmin and KnownAssignedXidsSearch
> 2) it is closing 13-year-old tech depth
> 
> But in reality, it is not easy to measure performance improvement
> consistently for this change.

Okay.  Elsewhere, it seems like folks are fine with patches that reduce
shared memory space via atomics or barriers even if there's no immediate
benefit [0], so I think it's fine to proceed.

>> Are the assignments in question guaranteed to be atomic? IIUC we assume
>> that aligned 4-byte loads/stores are atomic, so we should be okay as long
>> as we aren't handling anything larger.
> 
> Yes, 4-byte assignments are atomic; locking is used to ensure memory
> write ordering in this place.

Yeah, it looks like both the values that are protected by
known_assigned_xids_lck are integers, so this should be okay.  One
remaining question I have is whether it is okay if we see an updated value
for one of the head/tail variables but not the other.  It looks like the
tail variable is only updated with ProcArrayLock held exclusively, which
IIUC wouldn't prevent such mismatches even today, since we use a separate
spinlock for reading them in some cases.

[0] https://postgr.es/m/20230524214958.mt6f5xokpumvnrio%40awork3.anarazel.de

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com




Converting sql anywhere to postgres

2023-08-15 Thread Russell Rose | Passfield Data Systems
Hi there

I am trying to convert a SQL Anywhere database to Postgres. Within SQL Anywhere 
a field can have a default value of 'last user'. This means that when you 
perform an update on a table, if the field is not explicitly set then the 
current user is used. So for instance, if I have a field called mod_user in a 
table and I do an update on the table without setting mod_user, then SQL 
Anywhere sets the field to current_user. I have tried to replicate this using a 
Postgres trigger in the BEFORE UPDATE. However, if I do not set the value then 
it automatically picks up the value that was already in the field. Is there a 
way to tell the difference between me setting the value to the same as the 
previous value and Postgres automatically picking it up?

If the field myfield contains the word 'me', can I tell the difference between:
Update table1 set field1='something',myfield='me'
And
Update table1 set field1='something'
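
A BEFORE UPDATE row trigger only sees the OLD and NEW row images, so the two statements above are indistinguishable to it. A common workaround (a sketch only, reusing the column names from the question) is to reserve NULL — or a column DEFAULT of NULL — to mean "not explicitly set" and let the trigger fill it in:

```sql
CREATE FUNCTION set_mod_user() RETURNS trigger AS $$
BEGIN
    -- NULL means "not explicitly set"; fill in the current user
    IF NEW.mod_user IS NULL THEN
        NEW.mod_user := current_user;
    END IF;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER t_set_mod_user
    BEFORE INSERT OR UPDATE ON table1
    FOR EACH ROW EXECUTE FUNCTION set_mod_user();

-- Callers wanting the automatic value then write (with DEFAULT NULL on the column):
-- UPDATE table1 SET field1 = 'something', mod_user = DEFAULT;
```

This changes the calling convention slightly (updates must pass NULL/DEFAULT rather than simply omitting the column), which is the price of Postgres not exposing "column was mentioned in the SET list" to triggers.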




Re: Avoid a potential unstable test case: xmlmap.sql

2023-08-15 Thread Alexander Lakhin

Hi Andy,

15.08.2023 14:09, Andy Fan wrote:


Hi:

In the test case of xmlmap.sql, we have the query below under schema_to_xml.



Please look at the bug #18014:
https://www.postgresql.org/message-id/flat/18014-28c81cb79d44295d%40postgresql.org
There were other aspects of the xmlmap test failure discussed in that thread as 
well.

Best regards,
Alexander




Re: BUG #17946: LC_MONETARY & DO LANGUAGE plperl - BUG

2023-08-15 Thread Heikki Linnakangas

On 01/08/2023 16:48, Joe Conway wrote:

Any further comments on the posted patch[1]? I would like to apply/push
this prior to the beta and minor releases next week.


I'm not sure about the placement of the uselocale() calls. In 
plperl_spi_exec(), for example, I think we should switch to the global 
locale right after the check_spi_usage_allowed() call. Otherwise, if an 
error happens in BeginInternalSubTransaction() or in pg_verifymbstr(), 
the error would be processed with the perl locale. Maybe that's 
harmless, error processing hardly cares about LC_MONETARY, but seems 
wrong in principle.


Hmm, come to think of it, if BeginInternalSubTransaction() throws an 
error, we just jump out of the perl interpreter? That doesn't seem cool. 
But that's not new with this patch.


If I'm reading correctly, compile_plperl_function() calls 
select_perl_context(), which calls plperl_trusted_init(), which calls 
uselocale(). So it leaves locale set to the perl locale. Who sets it back?


How about adding a small wrapper around eval_pl() that sets and unsets 
the locale(), just when we enter the interpreter? It's easier to see 
that we are doing the calls in right places, if we make them as close as 
possible to entering/exiting the interpreter. Are there other functions 
in addition to eval_pl() that need to be called with the perl locale?



/*
 * plperl_xact_callback --- cleanup at main-transaction end.
 */
static void
plperl_xact_callback(XactEvent event, void *arg)
{
/* ensure global locale is the current locale */
if (uselocale((locale_t) 0) != LC_GLOBAL_LOCALE)
perl_locale_obj = uselocale(LC_GLOBAL_LOCALE);
}


So the assumption is that the if current locale is not LC_GLOBAL_LOCALE, 
then it was the perl locale. Seems true today, but this could confusion 
if anything else calls uselocale(). In particular, if another PL 
implementation copies this, and you use plperl and the other PL at the 
same time, they would get mixed up. I think we need another "bool 
perl_locale_obj_in_use" variable to track explicitly whether the perl 
locale is currently active.


If we are careful to put the uselocale() calls in the right places so 
that we never ereport() while in perl locale, this callback isn't 
needed. Maybe it's still a good idea, though, to be extra sure that 
things get reset to a sane state if something unexpected happens.


If multiple interpreters are used, is the single perl_locale_obj 
variable still enough? Each interpreter can have their own locale I believe.


PS. please pgindent

--
Heikki Linnakangas
Neon (https://neon.tech)





Re: postgres_fdw: wrong results with self join + enable_nestloop off

2023-08-15 Thread Önder Kalacı
Hi Etsuro, all

The commit[1] seems to break some queries in Citus[2], which is an
extension which relies on set_join_pathlist_hook.

Although the comment says /* Finally, give extensions a chance to
manipulate the path list. */, we use it to extract lots of information
about the joins and do the planning based on that information.

Now,  for some joins where consider_join_pushdown=false, we cannot get the
information that we used to get, which prevents doing distributed planning
for certain queries.

We wonder if it is possible to allow extensions to access the join info
under all circumstances, as it used to be? Basically, removing the
additional check:

diff --git a/src/backend/optimizer/path/joinpath.c
b/src/backend/optimizer/path/joinpath.c
index 03b3185984..080e76cbe9 100644
--- a/src/backend/optimizer/path/joinpath.c
+++ b/src/backend/optimizer/path/joinpath.c
@@ -349,8 +349,7 @@ add_paths_to_joinrel(PlannerInfo *root,
/*
 * 6. Finally, give extensions a chance to manipulate the path list.
 */
-   if (set_join_pathlist_hook &&
-   consider_join_pushdown)
+   if (set_join_pathlist_hook)
set_join_pathlist_hook(root, joinrel, outerrel, innerrel,
   jointype, &extra);
 }



Thanks,
Onder

[1]:
https://github.com/postgres/postgres/commit/b0e390e6d1d68b92e9983840941f8f6d9e083fe0
[2]: https://github.com/citusdata/citus/issues/7119


Etsuro Fujita , 15 Ağu 2023 Sal, 11:05 tarihinde
şunu yazdı:

> On Tue, Aug 8, 2023 at 6:30 PM Richard Guo  wrote:
> > On Tue, Aug 8, 2023 at 4:40 PM Etsuro Fujita 
> wrote:
> >> I modified the code a bit further to use an if-test to avoid a useless
> >> function call, and added/tweaked comments and docs further.  Attached
> >> is a new version of the patch.  I am planning to commit this, if there
> >> are no objections.
>
> > +1 to the v4 patch.  It looks good to me.
>
> Pushed after some copy-and-paste editing of comments/documents.
>
> Thanks!
>
> Best regards,
> Etsuro Fujita
>
>
>


Re: [PoC] Improve dead tuple storage for lazy vacuum

2023-08-15 Thread John Naylor
On Tue, Aug 15, 2023 at 9:34 AM Masahiko Sawada 
wrote:

> BTW cfbot reported that some regression tests failed due to OOM. I've
> attached the patch to fix it.

Seems worth doing now rather than later, so added this and squashed most of
the rest together. I wonder if that test uses too much memory in general.
Maybe using the full uint64 is too much.

> On Mon, Aug 14, 2023 at 8:05 PM John Naylor
>  wrote:

> > This is possibly faster than v38-0010, but looking like not worth the
complexity, assuming the other way avoids the bug going forward.
>
> I prefer 0010 but is it worth testing with other compilers such as clang?

Okay, keeping 0010 with a comment, and leaving out 0011 for now. Clang is
aggressive about unrolling loops, so may be worth looking globally at some
point.

> > v38-0012-Use-branch-free-coding-to-skip-new-element-index.patch

> > Within noise level of v38-0011, but it's small and simple, so I like
it, at least for small arrays.
>
> Agreed.

Keeping 0012 and not 0013.

--
John Naylor
EDB: http://www.enterprisedb.com


v39-ART.tar.gz
Description: application/gzip


Test case for parameterized remote path in postgres_fdw

2023-08-15 Thread Etsuro Fujita
Hi,

While working on the join pushdown issue, I noticed this bit in commit
e4106b252:

--- parameterized remote path
+-- parameterized remote path for foreign table
 EXPLAIN (VERBOSE, COSTS false)
-  SELECT * FROM ft2 a, ft2 b WHERE a.c1 = 47 AND b.c1 = a.c2;
+  SELECT * FROM "S 1"."T 1" a, ft2 b WHERE a."C 1" = 47 AND b.c1 = a.c2;
 SELECT * FROM ft2 a, ft2 b WHERE a.c1 = 47 AND b.c1 = a.c2;
+

The first statement was modified to test the intended behavior, but
the second one was not.  The second one as-is performs a foreign join:

EXPLAIN (VERBOSE, COSTS OFF)
SELECT * FROM ft2 a, ft2 b WHERE a.c1 = 47 AND b.c1 = a.c2;

 QUERY PLAN
---
 Foreign Scan
   Output: a.c1, a.c2, a.c3, a.c4, a.c5, a.c6, a.c7, a.c8, b.c1, b.c2,
b.c3, b.c4, b.c5, b.c6, b.c7, b.c8
   Relations: (public.ft2 a) INNER JOIN (public.ft2 b)
   Remote SQL: SELECT r1."C 1", r1.c2, r1.c3, r1.c4, r1.c5, r1.c6,
r1.c7, r1.c8, r2."C 1", r2.c2, r2.c3, r2.c4, r2.c5, r2.c6, r2.c7,
r2.c8 FROM ("S 1"."T 1" r1 INNER JOIN "S 1"."T 1" r2 ON (((r1.c2 =
r2."C 1")) AND ((r1."C 1" = 47))))
(4 rows)

So we should have modified the second one as well?  Attached is a
small patch for that.

Best regards,
Etsuro Fujita


postgres-fdw-parameterized-path-test-case.patch
Description: Binary data


Re: Avoid a potential unstable test case: xmlmap.sql

2023-08-15 Thread Andy Fan
I overlooked the fact that even in the bitmap index scan's lossy mode, the
recheck is still executed before the qual, so a bitmap index scan is OK in
this case.

 Sort
   Output: oid, relname
   Sort Key: pg_class.relname
   ->  Bitmap Heap Scan on pg_catalog.pg_class
 Output: oid, relname
 Recheck Cond: (pg_class.relnamespace = '28601'::oid)
 Filter: (has_table_privilege(pg_class.oid, 'SELECT'::text) AND
(pg_class.relkind = ANY ('{r,m,v}'::"char"[])))
 ->  Bitmap Index Scan on pg_class_relname_nsp_index
   Index Cond: (pg_class.relnamespace = '28601'::oid)

v2 attached.

-- 
Best Regards
Andy Fan


v2-0001-Avoid-a-potential-unstable-testcase.patch
Description: Binary data


Avoid a potential unstable test case: xmlmap.sql

2023-08-15 Thread Andy Fan
Hi:

In the test case of  xmlmap.sql, we have the query below
under schema_to_xml.

explain (costs off, verbose)
SELECT oid FROM pg_catalog.pg_class
WHERE  relnamespace = 28601
AND relkind IN ('r','m','v')
AND pg_catalog.has_table_privilege (oid, 'SELECT')
ORDER BY relname;

If the query is using SeqScan, the execution order of the quals is:

has_table_privilege(pg_class.oid, 'SELECT'::text) AND
(pg_class.relnamespace = '28601'::oid) AND (pg_class.relkind = ANY
('{r,m,v}'::"char"[]))

based on the current cost settings and algorithm.  With this plan,
has_table_privilege(pg_class.oid, 'SELECT'::text) may be executed
against all the relations (not just the given namespace), so if a
tuple in pg_class is scanned and the relation is dropped before
has_table_privilege is called, we will get the error:

ERROR:  relation with OID xxx does not exist

To overcome this, we disable seqscan, so that only an index scan on
relnamespace is possible, and relnamespace = '28601'::oid will be
filtered first before calling has_table_privilege.  In this test case,
we are sure that the relations belonging to the current namespace will
never be dropped, so no error is possible.  Here is the plan for
reference:

Seq Scan:

 Sort
   Output: oid, relname
   Sort Key: pg_class.relname
   ->  Seq Scan on pg_catalog.pg_class
 Output: oid, relname
 Filter: (has_table_privilege(pg_class.oid, 'SELECT'::text) AND
(pg_class.relnamespace = '28601'::oid) AND (pg_class.relkind = ANY
('{r,m,v}'::"char"[])))

With enable_seqscan set to off:

QUERY PLAN

--
 Index Scan using pg_class_relname_nsp_index on pg_catalog.pg_class
   Output: oid, relname
   Index Cond: (pg_class.relnamespace = '28601'::oid)
   Filter: (has_table_privilege(pg_class.oid, 'SELECT'::text) AND
(pg_class.relkind = ANY ('{r,m,v}'::"char"[])))
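To make the ordering hazard concrete, here is a tiny standalone model (the
names and row layout are made up for illustration): with the index path, the
namespace condition filters rows before the privilege check ever sees a
concurrently dropped relation.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins: a pg_class row with its namespace, and a privilege
 * check that fails hard if handed a dropped relation, mimicking the
 * "relation with OID xxx does not exist" error. */
typedef struct
{
	int		ns;
	bool	dropped;
} Row;

static bool
check_priv(const Row *r)
{
	assert(!r->dropped);	/* would be the OID-does-not-exist error */
	return true;
}

/* Index-scan order: index condition (namespace) first, filter second */
static int
scan_namespace(const Row *rows, int n, int ns)
{
	int		nmatch = 0;

	for (int i = 0; i < n; i++)
		if (rows[i].ns == ns && check_priv(&rows[i]))
			nmatch++;
	return nmatch;
}
```

Reversing the two conditions in the if() models the seqscan ordering and
would trip the assertion on the dropped row.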

Patch is attached.

-- 
Best Regards
Andy Fan


v1-0001-Avoid-a-potential-unstable-testcase.patch
Description: Binary data


Re: Replace known_assigned_xids_lck by memory barrier

2023-08-15 Thread Michail Nikolaev
Hello, Nathan.

> What sort of benefits do you see from this patch? It might be worthwhile
> in itself to remove spinlocks when possible, but IME it's much easier to
> justify such changes when there is a tangible benefit we can point to.

Oh, it is not an easy question :)

The answer, probably, looks like this:
1) the performance benefit of removing the spinlock acquisition in
KnownAssignedXidsGetOldestXmin and KnownAssignedXidsSearch
2) it closes 13-year-old tech debt

But in reality, it is not easy to measure performance improvement
consistently for this change.

> Are the assignments in question guaranteed to be atomic? IIUC we assume
> that aligned 4-byte loads/stores are atomic, so we should be okay as long
> as we aren't handling anything larger.

Yes, 4-byte assignments are atomic; the locking is used in this place
only to ensure memory write ordering.

> This use of pg_write_barrier() looks correct to me, but don't we need
> corresponding read barriers wherever we obtain the pointers? FWIW I tend
> to review src/backend/storage/lmgr/README.barrier in its entirety whenever
> I deal with this stuff.

Oh, yeah, you're right! [1]
I'll prepare an updated version of the patch soon. I don't know why I was
assuming pg_write_barrier is enough (⊙_⊙')


[1]: 
https://github.com/postgres/postgres/blob/master/src/backend/storage/lmgr/README.barrier#L125
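For reference, the pairing rule from README.barrier that the missing read
barriers would violate can be sketched standalone, using C11 fences as
stand-ins for pg_write_barrier()/pg_read_barrier() (the names below are
illustrative, not the actual KnownAssignedXids code):

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Illustrative shared state: the data, and the flag published last */
static uint32_t xid_slot;
static atomic_int slot_published;

/* Writer: store the data, then a write barrier, then publish */
static void
publish_xid(uint32_t xid)
{
	xid_slot = xid;
	atomic_thread_fence(memory_order_release);	/* pg_write_barrier() */
	atomic_store_explicit(&slot_published, 1, memory_order_relaxed);
}

/* Reader: observe the publish flag, then a read barrier, then the data.
 * Without the acquire fence, the load of xid_slot may be reordered
 * before the load of slot_published on weakly ordered hardware. */
static uint32_t
read_xid(void)
{
	while (!atomic_load_explicit(&slot_published, memory_order_relaxed))
		;
	atomic_thread_fence(memory_order_acquire);	/* pg_read_barrier() */
	return xid_slot;
}
```

Each write barrier on the producing side needs a matching read barrier on
the consuming side; a release fence alone does not constrain the reader.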




Re: cataloguing NOT NULL constraints

2023-08-15 Thread Alvaro Herrera
On 2023-Aug-15, Dean Rasheed wrote:

> I think perhaps for ALTER TABLE INHERIT, it should check that the
> child has a NOT NULL constraint, and error out if not. That's the
> current behaviour, and also matches other constraints types (e.g.,
> CHECK constraints).

Yeah, I reached the same conclusion yesterday while trying it out, so
that's what I implemented.  I'll post later today.

> More generally though, I'm worried that this is starting to get very
> complicated. I wonder if there might be a different, simpler approach.
> One vague idea is to have a new attribute on the column that counts
> the number of constraints (local and inherited PK and NOT NULL
> constraints) that make the column not null.

Hmm.  I grant that this is different, but I don't see that it is
simpler.

> Something else I noticed when reading the SQL standard is that a
> user-defined CHECK (col IS NOT NULL) constraint should be recognised
> by the system as also making the column not null (setting its
> "nullability characteristic" to "known not nullable").

I agree with this view actually, but I've refrained from implementing
it(*) because our SQL-standards people have advised against it.  Insider
knowledge?  I don't know.  I think this is a comparatively smaller
consideration though, and we can adjust for it afterwards.

(*) Rather: at some point I removed the implementation of that from the
patch.

> I'm also wondering whether creating a pg_constraint entry for *every*
> not-nullable column is actually going too far. If we were to
> distinguish between "defined as NOT NULL" and being not null as a
> result of one or more constraints, in the way that the standard seems
> to suggest, perhaps the former (likely to be much more common) could
> simply be a new attribute stored on the column. I think we actually
> only need to create pg_constraint entries if a constraint name or any
> additional constraint properties such as NOT VALID are specified. That
> would lead to far fewer new constraints, less catalog bloat, and less
> noise in the \d output.

There is a problem if we do this, though, which is that we cannot use
the constraints for the things that we want them for -- for example,
remove_useless_groupby_columns() would like to use unique constraints,
not just primary keys; but it depends on the NOT NULL rows being there
for invalidation reasons (namely: if the NOT NULL constraint is dropped,
we need to be able to replan.  Without catalog rows, we don't have a
mechanism to let that happen).

If we don't add all those redundant catalog rows, then this is all for
naught.

-- 
Álvaro HerreraBreisgau, Deutschland  —  https://www.EnterpriseDB.com/
Bob [Floyd] used to say that he was planning to get a Ph.D. by the "green
stamp method," namely by saving envelopes addressed to him as 'Dr. Floyd'.
After collecting 500 such letters, he mused, a university somewhere in
Arizona would probably grant him a degree.  (Don Knuth)




Re: cataloguing NOT NULL constraints

2023-08-15 Thread Dean Rasheed
On Fri, 11 Aug 2023 at 14:54, Alvaro Herrera  wrote:
>
> Right, in the end I got around to that point of view.  I abandoned the
> idea of adding these dependency links, and I'm back at relying on the
> coninhcount/conislocal markers.  But there were a couple of bugs in the
> accounting for that, so I've fixed some of those, but it's not yet
> complete:
>
> - ALTER TABLE parent ADD PRIMARY KEY
>   needs to create NOT NULL constraints in children.  I added this, but
>   I'm not yet sure it works correctly (for example, if a child already
>   has a NOT NULL constraint, we need to bump its inhcount, but we
>   don't.)
> - ALTER TABLE parent ADD PRIMARY KEY USING index
>   Not sure if this is just as above or needs separate handling
> - ALTER TABLE DROP PRIMARY KEY
>   needs to decrement inhcount or drop the constraint if there are no
>   other sources for that constraint to exist.  I've adjusted the drop
>   constraint code to do this.
> - ALTER TABLE INHERIT
>   needs to create a constraint on the new child, if parent has PK. Not
>   implemented
> - ALTER TABLE NO INHERIT
>   needs to delink any constraints (decrement inhcount, possibly drop
>   the constraint).
>

I think perhaps for ALTER TABLE INHERIT, it should check that the
child has a NOT NULL constraint, and error out if not. That's the
current behaviour, and also matches other constraints types (e.g.,
CHECK constraints).

More generally though, I'm worried that this is starting to get very
complicated. I wonder if there might be a different, simpler approach.
One vague idea is to have a new attribute on the column that counts
the number of constraints (local and inherited PK and NOT NULL
constraints) that make the column not null.

Something else I noticed when reading the SQL standard is that a
user-defined CHECK (col IS NOT NULL) constraint should be recognised
by the system as also making the column not null (setting its
"nullability characteristic" to "known not nullable"). I think that's
more than just an artefact of how they say NOT NULL constraints should
be implemented, because the effect of such a CHECK constraint should
be exposed in the "columns" view of the information schema -- the
value of "is_nullable" should be "NO" if the column is "known not
nullable".

In this sense, the standard does allow multiple not null constraints
on a column, independently of whether the column is "defined as NOT
NULL". My understanding of the standard is that ALTER COLUMN ...
SET/DROP NOT NULL change whether or not the column is "defined as NOT
NULL", and manage a single system-generated constraint, but there may
be any number of other user-defined constraints that also make the
column "known not nullable", and they need to be tracked in some way.

I'm also wondering whether creating a pg_constraint entry for *every*
not-nullable column is actually going too far. If we were to
distinguish between "defined as NOT NULL" and being not null as a
result of one or more constraints, in the way that the standard seems
to suggest, perhaps the former (likely to be much more common) could
simply be a new attribute stored on the column. I think we actually
only need to create pg_constraint entries if a constraint name or any
additional constraint properties such as NOT VALID are specified. That
would lead to far fewer new constraints, less catalog bloat, and less
noise in the \d output.

Regards,
Dean




Re: pgbench: allow to exit immediately when any client is aborted

2023-08-15 Thread Fabien COELHO



About pgbench exit on abort v4:

Patch applies cleanly, compiles, local make check ok, doc looks ok.

This looks ok to me.

--
Fabien.




Re: pg_logical_emit_message() misses a XLogFlush()

2023-08-15 Thread Tomas Vondra



On 8/15/23 08:38, Michael Paquier wrote:
> Hi all,
> 
> While playing with pg_logical_emit_message() and WAL replay, I have
> noticed that LogLogicalMessage() inserts a record but forgets to make
> sure that the record has been flushed.  So, for example, if the system
> crashes the message inserted can get lost.
> 
> I was writing some TAP tests for it for the sake of a bug, and I have
> found this the current behavior annoying because one cannot really
> rely on it when emulating crashes.
> 
> This has been introduced in 3fe3511 (from 2016), and there is no
> mention of that on the original thread that led to this commit:
> https://www.postgresql.org/message-id/flat/5685F999.6010202%402ndquadrant.com
> 
> This could be an issue for anybody using LogLogicalMessage() out of
> core, as well, because it would mean some records lost.  So, perhaps
> this should be treated as a bug, sufficient for a backpatch?
> 

Shouldn't the flush be done only for non-transactional messages? The
transactional case will be flushed by regular commit flush.
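A toy model of that distinction (the names are illustrative, not the real
xlog API): a record is durable only once the flush pointer has passed it,
and only the non-transactional path needs its own flush, because the
transactional one rides on the commit record.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy WAL: monotonically increasing insert and flush positions */
typedef struct
{
	int		insert_lsn;
	int		flush_lsn;
} Wal;

/* Emit a logical message; flush immediately only if non-transactional */
static int
log_message(Wal *wal, bool transactional)
{
	int		lsn = ++wal->insert_lsn;

	if (!transactional)
		wal->flush_lsn = lsn;	/* the XLogFlush() under discussion */
	return lsn;
}

/* Commit flushes everything inserted so far */
static void
commit(Wal *wal)
{
	wal->flush_lsn = wal->insert_lsn;
}
```

A transactional message inserted but not yet committed sits above the flush
pointer and would be lost in a crash, which is exactly why flushing it
eagerly buys nothing.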

regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: LLVM 16 (opaque pointers)

2023-08-15 Thread Fabien COELHO



[...] Further changes are already needed for their "main" branch (LLVM 
17-to-be), so this won't quite be enough to shut seawasp up.


For information, the physical server which was hosting my 2 bf animals 
(seawasp and moonjelly) has given up rebooting after a power cut a few 
weeks/months ago, and I have not set up a replacement (yet).


--
Fabien.




Generating memory trace from Postgres backend process

2023-08-15 Thread Muneeb Anwar
Hi,
Has anyone tried generating a dynamic memory trace of a backend Postgres
process while it's running a query?

I want to characterize the memory access pattern of the Postgres database
engine when it's running any given query. The usual way to do this would be
to attach a dynamic instrumentation tool like DynamoRIO or Intel Pin to the
backend process running the query (?). How could I find the exact backend
process to attach to? Also, if there's any prior experience with generating
such memory traces, it would be helpful to know what tools were used.

best regards,
Muneeb Khan


Re: remaining sql/json patches

2023-08-15 Thread jian he
Hi.
in v11, json_query:
+The returned data_type has the
same semantics
+as for constructor functions like json_objectagg;
+the default returned type is text.
+The ON EMPTY clause specifies the behavior if the
+path_expression yields no value at all; the
+default when ON ERROR is not specified is
to return a
+null value.

Shouldn't the default returned type be jsonb?  Also, in the second-to-last
quoted line above, shouldn't "ON ERROR" read "ON EMPTY"?
Other than that, the doc looks good.




Re: New WAL record to detect the checkpoint redo location

2023-08-15 Thread Dilip Kumar
On Fri, Jul 14, 2023 at 8:46 PM Andres Freund  wrote:
>
> Hi,
>
> As I think I mentioned before, I like this idea. However, I don't like the
> implementation too much.

Thanks for looking into it.


> This might work right now, but doesn't really seem maintainable. Nor do I like
> adding branches into this code a whole lot.

Okay.  Now I have moved the XLogInsert for the special record outside
the WALInsertLock, so we don't need this special handling here.

> > @@ -6597,6 +6612,32 @@ CreateCheckPoint(int flags)
>
> I think the commentary above this function would need a fair bit of
> revising...
>
> >*/
> >   RedoRecPtr = XLogCtl->Insert.RedoRecPtr = checkPoint.redo;
> >
> > + /*
> > +  * Insert a special purpose CHECKPOINT_REDO record as the first 
> > record at
> > +  * checkpoint redo lsn.  Although we have the checkpoint record that
> > +  * contains the exact redo lsn, there might have been some other 
> > records
> > +  * those got inserted between the redo lsn and the actual checkpoint
> > +  * record.  So when processing the wal, we cannot rely on the 
> > checkpoint
> > +  * record if we want to stop at the checkpoint-redo LSN.
> > +  *
> > +  * This special record, however, is not required when we doing a 
> > shutdown
> > +  * checkpoint, as there will be no concurrent wal insertions during 
> > that
> > +  * time.  So, the shutdown checkpoint LSN will be the same as
> > +  * checkpoint-redo LSN.
> > +  *
> > +  * This record is guaranteed to be the first record at checkpoint 
> > redo lsn
> > +  * because we are inserting this while holding the exclusive wal 
> > insertion
> > +  * lock.
> > +  */
> > + if (!shutdown)
> > + {
> > + int dummy = 0;
> > +
> > + XLogBeginInsert();
> > + XLogRegisterData((char *) &dummy, sizeof(dummy));
> > + recptr = XLogInsert(RM_XLOG_ID, XLOG_CHECKPOINT_REDO);
> > + }
>
> It seems to me that we should be able to do better than this.
>
> I suspect we might be able to get rid of the need for exclusive inserts
> here. If we get rid of that, we could determine the redo location based on the
> LSN determined by the XLogInsert().

Yeah, good idea.  Actually, we can do this insert outside of the
exclusive insert lock and set the LSN of this insert as the
checkpoint.redo location.  So now we do not need to compute
checkpoint.redo based on the current insertion point; we can directly
use the LSN of this record.  I have modified this, and I have also
moved some other code that is not required to be inside the WAL
insertion lock.

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com


v2-0001-New-WAL-record-for-checkpoint-redo-location.patch
Description: Binary data


Re: postgres_fdw: wrong results with self join + enable_nestloop off

2023-08-15 Thread Etsuro Fujita
On Tue, Aug 8, 2023 at 6:30 PM Richard Guo  wrote:
> On Tue, Aug 8, 2023 at 4:40 PM Etsuro Fujita  wrote:
>> I modified the code a bit further to use an if-test to avoid a useless
>> function call, and added/tweaked comments and docs further.  Attached
>> is a new version of the patch.  I am planning to commit this, if there
>> are no objections.

> +1 to the v4 patch.  It looks good to me.

Pushed after some copy-and-paste editing of comments/documents.

Thanks!

Best regards,
Etsuro Fujita




Re: pgbench - adding pl/pgsql versions of tests

2023-08-15 Thread Fabien COELHO



Hello Nathan,


1. so I don't have to create the script and function manually each
time I want to test mainly the database (instead of the
client-database system)

2. so that new users of PostgreSQL can easily see how much better OLTP
workloads perform when packaged up as a server-side function


I'm not sure we should add micro-optimized versions of the existing scripts
to pgbench.  Your point about demonstrating the benefits of server-side
functions seems reasonable, but it also feels a bit like artificially
improving pgbench numbers.  I think I'd rather see some more variety in the
built-in scripts so that folks can more easily test a wider range of common
workloads.  Perhaps this could include a test that is focused on
server-side functions.


ISTM that your argument suggests keeping the tpcb-like PL/pgSQL version.
It is the more beneficial one anyway, as it merges 4-5 commands into one 
call, so it demonstrates the effect of investing in this kind of approach.


I'm unclear about what variety of scripts could be provided given the 
tables made available with pgbench. ISTM that other scenarios would involve 
both an initialization step and associated scripts, and any proposal would 
be barred because it would open the door to anything.



In any case, it looks like there is unaddressed feedback for this patch, so
I'm marking it as "waiting on author."


Indeed.

--
Fabien.




Re: Extract numeric filed in JSONB more effectively

2023-08-15 Thread Pavel Stehule
út 15. 8. 2023 v 9:05 odesílatel Andy Fan  napsal:

>
>
>> a) effectiveness. The ending performance should be similar like your
>> current patch, but without necessity to use planner support API.
>>
>
> So the cost is we need to create a new & different framework.
>

yes, and it can be less work and code than, for example, the introduction
of "anycompatible".


>
>>
> b) because you can write only var := j->'f', and plpgsql forces cast
>> function execution, but not via planner.
>>
>
> var a := 1 needs going with planner,  IIUC,  same with j->'f'.
>

I was wrong: the planner is fully involved, but the executor work is reduced.



>
> c) nothing else. It should not to require to modify cast function
>> definitions
>>
>
> If you look at how the planner support function works,  that is
> pretty simple,  just modify the prosupport attribute. I'm not sure
> this should be called an issue or avoiding it can be described
> as a benefit.
>
> I don't think the current case is as bad as the other ones like
> users needing to modify their queries or type-safety system
> being broken. So personally I'm not willing to creating some
> thing new & heavy. However I'm open to see what others say.
>

ok

regards

Pavel


>
> --
> Best Regards
> Andy Fan
>


Re: Extract numeric filed in JSONB more effectively

2023-08-15 Thread Andy Fan
>
> a) effectiveness. The ending performance should be similar like your
> current patch, but without necessity to use planner support API.
>

So the cost is we need to create a new & different framework.

>
>
b) because you can write only var := j->'f', and plpgsql forces cast
> function execution, but not via planner.
>

var a := 1 needs to go through the planner, IIUC, same with j->'f'.

c) nothing else. It should not to require to modify cast function
> definitions
>

If you look at how the planner support function works, it is
pretty simple: just modify the prosupport attribute. I'm not sure
this should be called an issue, or that avoiding it can be described
as a benefit.

I don't think the current case is as bad as the other ones, like
users needing to modify their queries or the type-safety system
being broken. So personally I'm not willing to create something
new & heavy. However, I'm open to seeing what others say.

-- 
Best Regards
Andy Fan


Re: [PATCH] Add function to_oct

2023-08-15 Thread John Naylor
On Tue, Jul 25, 2023 at 05:16:56PM -0700, Nathan Bossart wrote:
> [v4]

If we're not going to have a general SQL conversion function, here are some
comments on the current patch.

+static char *convert_to_base(uint64 value, int base);

Not needed if the definition is above the callers.

+ * Workhorse for to_binary, to_oct, and to_hex.  Note that base must be
+ * either 2, 8, or 16.

Why wouldn't it work with any base <= 16?

- *ptr = '\0';
+ Assert(base == 2 || base == 8 || base == 16);

+ *ptr = '\0';

Spurious whitespace change?

- char buf[32]; /* bigger than needed, but reasonable */
+ char   *buf = palloc(sizeof(uint64) * BITS_PER_BYTE + 1);

Why is this no longer allocated on the stack? Maybe needs a comment about
the size calculation.

+static char *
+convert_to_base(uint64 value, int base)

On my machine this now requires a function call and a DIV instruction, even
though the patch claims not to support anything but a power of two. It's
tiny enough to declare it inline so the compiler can specialize for each
call site.
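For what it's worth, here is a rough standalone sketch (not the patch's
actual code) of a power-of-two-only version that replaces the DIV with a
shift and mask; declared inline, the compiler can constant-fold the shift
at each call site:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Sketch: convert value to text in base 2, 8, or 16 using shifts and
 * masks instead of division.  buf must have room for 65 bytes (64
 * binary digits plus the terminator); the returned pointer points into
 * buf at the first digit. */
static inline char *
convert_to_base(uint64_t value, int base, char *buf)
{
	const char *digits = "0123456789abcdef";
	int			shift = (base == 16) ? 4 : (base == 8) ? 3 : 1;
	char	   *ptr = buf + 65;

	assert(base == 2 || base == 8 || base == 16);
	*--ptr = '\0';
	do
	{
		*--ptr = digits[value & (base - 1)];
		value >>= shift;
	} while (value > 0);

	return ptr;
}
```

For example, convert_to_base(255, 16, buf) yields "ff" and
convert_to_base(255, 8, buf) yields "377"; the base - 1 mask is what makes
this work only for power-of-two bases.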

+{ oid => '5101', descr => 'convert int4 number to binary',

Needs to be over 8000.

--
John Naylor
EDB: http://www.enterprisedb.com


Re: Extract numeric filed in JSONB more effectively

2023-08-15 Thread Pavel Stehule
út 15. 8. 2023 v 8:04 odesílatel Andy Fan  napsal:

>
>> My idea of an ideal solution is the introduction of the possibility to
>> use "any" pseudotype as return type with possibility to set default return
>> type. Now, "any" is allowed only for arguments. The planner can set the
>> expected type when it knows it, or can use the default type.
>>
>> so for extraction of jsonb field we can use FUNCTION
>> jsonb_extract_field(jsonb, text) RETURNS "any" DEFAULT jsonb
>>
>
>
Is this an existing framework or do you want to create something new?
>

This should be created


>
>> if we call SELECT jsonb_extract_field(..., 'x') -> then it returns jsonb,
>> if we use SELECT jsonb_extract_field('...', 'x')::date, then it returns date
>>
>
> If so, what is the difference from the current  jsonb->'f'   and
> (jsonb->'f' )::date?
>

a) effectiveness. The resulting performance should be similar to your
current patch, but without the necessity of using the planner support API.

b) more generic usage. For example, the expressions in plpgsql are executed
a little bit differently than SQL queries. So there the optimization from
your patch probably would not work, because you can write only var :=
j->'f', and plpgsql forces cast function execution, but not via the planner.

c) nothing else. It should not require modifying cast function
definitions



>> With this possibility we don't need to touch to cast functions, and we
>> can simply implement similar functions for other non atomic types.
>>
>
> What do you mean by "atomic type" here?   If you want to introduce some
> new framework,  I think we need a very clear benefit.
>

Atomic types (scalar types like int, varchar, date); nonatomic types -
array, composite, xml, jsonb, hstore, or arrays of composite types.



>
> --
> Best Regards
> Andy Fan
>


Re: A Question about InvokeObjectPostAlterHook

2023-08-15 Thread Michael Paquier
On Tue, Apr 18, 2023 at 01:34:00PM +0900, Michael Paquier wrote:
> Note that the development of PostgreSQL 16 has just finished, so now
> may not be the best moment to add these extra OAT calls, but these
> could be added in 17~ for sure at the beginning of July once the next
> development cycle begins.

The OAT hooks are added in ALTER TABLE for the following subcommands:
- { ENABLE | DISABLE | [NO] FORCE } ROW LEVEL SECURITY
- { ENABLE | DISABLE } TRIGGER
- { ENABLE | DISABLE } RULE

> Attached would be what I think would be required to add OATs for RLS,
> triggers and rules, for example.  There are much more of these at
> quick glance, still that's one step in providing more checks.  Perhaps
> you'd like to expand this patch with more ALTER TABLE subcommands
> covered?

Now that we are in the middle of the development cycle of 17~, it is
time to come back to this one (it was registered in the CF, but I did
not come back to it).  Would there be any objections if I apply this
patch with its tests?  This would cover most of the ground requested
by Legs at the start of this thread.

(The patch had one diff because of a namespace lookup not happening
anymore, so rebased.)
--
Michael
From b5bf282bdf68689670979846a963b2f11ae6d077 Mon Sep 17 00:00:00 2001
From: Michael Paquier 
Date: Tue, 15 Aug 2023 15:47:30 +0900
Subject: [PATCH v2] Add OAT calls for more flavors of ALTER TABLE

---
 src/backend/commands/tablecmds.c  |  12 ++
 src/test/modules/test_oat_hooks/Makefile  |   2 +-
 .../test_oat_hooks/expected/alter_table.out   | 163 ++
 src/test/modules/test_oat_hooks/meson.build   |   1 +
 .../test_oat_hooks/sql/alter_table.sql|  48 ++
 5 files changed, 225 insertions(+), 1 deletion(-)
 create mode 100644 src/test/modules/test_oat_hooks/expected/alter_table.out
 create mode 100644 src/test/modules/test_oat_hooks/sql/alter_table.sql

diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 727f151750..f77de4e7c9 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -14843,6 +14843,9 @@ ATExecEnableDisableTrigger(Relation rel, const char *trigname,
 	EnableDisableTrigger(rel, trigname, InvalidOid,
 		 fires_when, skip_system, recurse,
 		 lockmode);
+
+	InvokeObjectPostAlterHook(RelationRelationId,
+			  RelationGetRelid(rel), 0);
 }
 
 /*
@@ -14855,6 +14858,9 @@ ATExecEnableDisableRule(Relation rel, const char *rulename,
 		char fires_when, LOCKMODE lockmode)
 {
 	EnableDisableRule(rel, rulename, fires_when);
+
+	InvokeObjectPostAlterHook(RelationRelationId,
+			  RelationGetRelid(rel), 0);
 }
 
 /*
@@ -16134,6 +16140,9 @@ ATExecSetRowSecurity(Relation rel, bool rls)
 	((Form_pg_class) GETSTRUCT(tuple))->relrowsecurity = rls;
 	CatalogTupleUpdate(pg_class, &tuple->t_self, tuple);
 
+	InvokeObjectPostAlterHook(RelationRelationId,
+			  RelationGetRelid(rel), 0);
+
 	table_close(pg_class, RowExclusiveLock);
 	heap_freetuple(tuple);
 }
@@ -16160,6 +16169,9 @@ ATExecForceNoForceRowSecurity(Relation rel, bool force_rls)
 	((Form_pg_class) GETSTRUCT(tuple))->relforcerowsecurity = force_rls;
 	CatalogTupleUpdate(pg_class, &tuple->t_self, tuple);
 
+	InvokeObjectPostAlterHook(RelationRelationId,
+			  RelationGetRelid(rel), 0);
+
 	table_close(pg_class, RowExclusiveLock);
 	heap_freetuple(tuple);
 }
diff --git a/src/test/modules/test_oat_hooks/Makefile b/src/test/modules/test_oat_hooks/Makefile
index 2b5d2687f8..dcaf3a7727 100644
--- a/src/test/modules/test_oat_hooks/Makefile
+++ b/src/test/modules/test_oat_hooks/Makefile
@@ -6,7 +6,7 @@ OBJS = \
 	test_oat_hooks.o
 PGFILEDESC = "test_oat_hooks - example use of object access hooks"
 
-REGRESS = test_oat_hooks
+REGRESS = test_oat_hooks alter_table
 
 # disable installcheck for now
 NO_INSTALLCHECK = 1
diff --git a/src/test/modules/test_oat_hooks/expected/alter_table.out b/src/test/modules/test_oat_hooks/expected/alter_table.out
new file mode 100644
index 00..19adb28ffb
--- /dev/null
+++ b/src/test/modules/test_oat_hooks/expected/alter_table.out
@@ -0,0 +1,163 @@
+--
+-- OAT checks for ALTER TABLE
+--
+-- This test script fails if debug_discard_caches is enabled, because cache
+-- flushes cause extra calls of the OAT hook in recomputeNamespacePath,
+-- resulting in more NOTICE messages than are in the expected output.
+SET debug_discard_caches = 0;
+LOAD 'test_oat_hooks';
+SET test_oat_hooks.audit = true;
+NOTICE:  in object_access_hook_str: superuser attempting alter (subId=0x1000, set) [test_oat_hooks.audit]
+NOTICE:  in object_access_hook_str: superuser finished alter (subId=0x1000, set) [test_oat_hooks.audit]
+NOTICE:  in process utility: superuser finished SET
+CREATE SCHEMA test_oat_schema;
+NOTICE:  in process utility: superuser attempting CREATE SCHEMA
+NOTICE:  in object access: superuser attempting create (subId=0x0) [explicit]
+NOTICE:  in object access: superuser finished create (subId=0x0) [explicit]
+NOTICE:  in process utility: 

pg_logical_emit_message() misses a XLogFlush()

2023-08-15 Thread Michael Paquier
Hi all,

While playing with pg_logical_emit_message() and WAL replay, I have
noticed that LogLogicalMessage() inserts a record but forgets to make
sure that the record has been flushed.  So, for example, if the system
crashes the message inserted can get lost.

I was writing some TAP tests for it for the sake of a bug, and I found
the current behavior annoying because one cannot really rely on it when
emulating crashes.
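
The failure mode is simple: XLogInsert() only places the record in the WAL
buffers, and without a subsequent XLogFlush() a crash can discard it. Here is
a toy sketch of that distinction (Python, purely illustrative; this is not
PostgreSQL's actual WAL machinery, just a model of insert-vs-flush durability):

```python
# Toy model of the WAL insert/flush distinction (illustrative only).
# Records appended with insert() live in an in-memory buffer until
# flush() moves them to "stable storage". A crash discards the buffer,
# so records inserted but never flushed are lost.

class ToyWAL:
    def __init__(self):
        self.buffer = []   # records inserted but not yet durable
        self.disk = []     # records that survived a flush

    def insert(self, record):
        """Like XLogInsert: append to the in-memory buffer, return a toy LSN."""
        self.buffer.append(record)
        return len(self.disk) + len(self.buffer)

    def flush(self):
        """Like XLogFlush: make everything inserted so far durable."""
        self.disk.extend(self.buffer)
        self.buffer.clear()

    def crash(self):
        """Simulate a crash: the unflushed buffer is gone."""
        self.buffer.clear()

wal = ToyWAL()
wal.insert("message-1")              # inserted, never flushed
wal.crash()
lost = "message-1" not in wal.disk   # the record is gone

wal.insert("message-2")
wal.flush()                          # flush right after insert, as proposed
wal.crash()
kept = "message-2" in wal.disk       # the record survived
```

In this model, flushing immediately after inserting is exactly what the
attached patch does for LogLogicalMessage().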

This has been introduced in 3fe3511 (from 2016), and there is no
mention of that on the original thread that led to this commit:
https://www.postgresql.org/message-id/flat/5685F999.6010202%402ndquadrant.com

This could be an issue for anybody using LogLogicalMessage() out of
core, as well, because it would mean losing some records.  So, perhaps
this should be treated as a bug, sufficient for a backpatch?

Thoughts?
--
Michael
diff --git a/src/backend/replication/logical/message.c b/src/backend/replication/logical/message.c
index c5de14afc6..cebe6c5d05 100644
--- a/src/backend/replication/logical/message.c
+++ b/src/backend/replication/logical/message.c
@@ -47,6 +47,7 @@ LogLogicalMessage(const char *prefix, const char *message, size_t size,
   bool transactional)
 {
 	xl_logical_message xlrec;
+	XLogRecPtr	lsn;
 
 	/*
 	 * Force xid to be allocated if we're emitting a transactional message.
@@ -71,7 +72,11 @@ LogLogicalMessage(const char *prefix, const char *message, size_t size,
 	/* allow origin filtering */
 	XLogSetRecordFlags(XLOG_INCLUDE_ORIGIN);
 
-	return XLogInsert(RM_LOGICALMSG_ID, XLOG_LOGICAL_MESSAGE);
+	lsn = XLogInsert(RM_LOGICALMSG_ID, XLOG_LOGICAL_MESSAGE);
+
+	/* Make sure that the message hits disk before leaving */
+	XLogFlush(lsn);
+	return lsn;
 }
 
 /*




Re: proposal: jsonb_populate_array

2023-08-15 Thread Pavel Stehule
On Tue, 15 Aug 2023 at 8:04, Vik Fearing wrote:

> On 8/15/23 07:53, Pavel Stehule wrote:
> >> On Tue, 15 Aug 2023 at 7:48, Vik Fearing wrote:
> >
> >> On 8/14/23 15:37, Pavel Stehule wrote:
> >>> On Mon, 14 Aug 2023 at 15:09, Erik Rijkers wrote:
> >>>
> >>> I think so this can be +/- 40 lines of C code
> >>
> >> It seems to me like a good candidate for an extension.
> >
> > Unfortunately, these small extensions have zero chance to be available
> for
> > users that use some cloud postgres.
>
> Then those people can use the Standard SQL syntax.  I am strongly
> against polluting PostgreSQL because of what third party vendors do and
> do not allow on their platforms.
>

ok


> --
> Vik Fearing
>
>


Re: proposal: jsonb_populate_array

2023-08-15 Thread Vik Fearing

On 8/15/23 07:53, Pavel Stehule wrote:

On Tue, 15 Aug 2023 at 7:48, Vik Fearing wrote:


On 8/14/23 15:37, Pavel Stehule wrote:

On Mon, 14 Aug 2023 at 15:09, Erik Rijkers wrote:

I think so this can be +/- 40 lines of C code


It seems to me like a good candidate for an extension.


Unfortunately, these small extensions have zero chance to be available for
users that use some cloud postgres.


Then those people can use the Standard SQL syntax.  I am strongly 
against polluting PostgreSQL because of what third party vendors do and 
do not allow on their platforms.

--
Vik Fearing





Re: Extract numeric filed in JSONB more effectively

2023-08-15 Thread Andy Fan
>
>
> My idea of an ideal solution is the introduction of the possibility to use
> "any" pseudotype as return type with possibility to set default return
> type. Now, "any" is allowed only for arguments. The planner can set the
> expected type when it knows it, or can use the default type.
>
> so for extraction of jsonb field we can use FUNCTION
> jsonb_extract_field(jsonb, text) RETURNS "any" DEFAULT jsonb
>

Is this an existing framework or do you want to create something new?

>
> if we call SELECT jsonb_extract_field(..., 'x') -> then it returns jsonb,
> if we use SELECT jsonb_extract_field('...', 'x')::date, then it returns date
>

If so, what is the difference from the current jsonb->'f' and
(jsonb->'f')::date?
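
The two shapes being compared can be sketched like this (a Python toy model,
not PostgreSQL internals; `jsonb_extract_field` here stands for the
hypothetical function proposed upthread). The generic path materializes an
intermediate jsonb-typed value and then casts it, while a typed extraction
whose return type the planner fixes in advance produces the target type
directly:

```python
# Toy illustration of the two extraction paths under discussion.
# Path 1 mirrors (jsonb->'f')::date: build an intermediate generic
# value, then cast it. Path 2 mirrors a hypothetical typed extraction
# where the target type is known up front, so no intermediate datum
# is constructed.

from datetime import date

doc = {"f": "2023-08-15"}
conversions = {"intermediate": 0, "direct": 0}

def extract_generic(obj, key):
    """Like jsonb -> 'f': returns a jsonb-ish wrapper around the value."""
    conversions["intermediate"] += 1   # an intermediate datum is built
    return {"jsonb": obj[key]}

def cast_to_date(wrapped):
    """Like ::date applied to a jsonb value."""
    return date.fromisoformat(wrapped["jsonb"])

def extract_typed(obj, key, parse):
    """Hypothetical typed extraction: produce the target type directly."""
    conversions["direct"] += 1
    return parse(obj[key])

# Path 1: two steps, one intermediate value
d1 = cast_to_date(extract_generic(doc, "f"))
# Path 2: one step, value produced in the target type directly
d2 = extract_typed(doc, "f", date.fromisoformat)
```

Both paths yield the same result; the difference the proposal targets is the
intermediate datum built on path 1.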

>
> With this possibility we don't need to touch to cast functions, and we can
> simply implement similar functions for other non atomic types.
>

What do you mean by "atomic type" here?  If you want to introduce some new
framework, I think we need a very clear benefit.

-- 
Best Regards
Andy Fan


Re: ECPG Semantic Analysis

2023-08-15 Thread Michael Meskes
Hi,

sorry for the double reply, but it seems my mail client's default is
now to send to the list only when hitting group reply. Not good, sigh.

> I have a modified version of ECPG, to which I gave the ability to
> do semantic analysis of SQL statements. Where i can share it or with
> whom can I discuss it?

Feel free to send it to me.

Thanks,
Michael
-- 
Michael Meskes
Michael at Fam-Meskes dot De
Michael at Meskes dot (De|Com|Net|Org)
Meskes at (Debian|Postgresql) dot Org




Re: proposal: jsonb_populate_array

2023-08-15 Thread Pavel Stehule
On Tue, 15 Aug 2023 at 5:12, jian he wrote:

> \df jsonb_populate_record
>                                        List of functions
>    Schema   |         Name          | Result data type | Argument data types | Type
> ------------+-----------------------+------------------+---------------------+------
>  pg_catalog | jsonb_populate_record | anyelement       | anyelement, jsonb   | func
> (1 row)
>
> manual:
> > anyelement  Indicates that a function accepts any data type.
> > For the “simple” family of polymorphic types, the matching and deduction
> rules work like this:
> > Each position (either argument or return value) declared as anyelement
> is allowed to have any specific actual data type, but in any given call
> they must all be the same actual type.
>
> So jsonb_populate_record signature can handle cases like
> jsonb_populate_record(anyarray, jsonb)? obviously this is a cast, it
> may fail.
> also if input is anyarray, so the output anyarray will have the same
> base type as input anyarray.
>

It fails (as expected; otherwise it would be strange for a function with
"record" in its name to work on arrays):

 (2023-08-15 07:57:40) postgres=# select
jsonb_populate_record(null::varchar[], '[1,2,3]');
ERROR:  first argument of jsonb_populate_record must be a row type

regards

Pavel