Re: proposal: psql: show current user in prompt

2024-01-26 Thread Pavel Stehule
On Sat, 27 Jan 2024 at 00:04, Jelte Fennema-Nio wrote:

> On Fri, 26 Jan 2024 at 21:35, Pavel Stehule 
> wrote:
> > I see the possibility of disabling reporting as potentially dangerous.
> > Without encoding reporting you can break psql. So it should be limited
> > just to a few values whose behaviour is known.
>
> I agree that client_encoding is a GUC that likely all clients would
> want to request reporting for, so I can see the argument for always
> sending it. But I wouldn't call it dangerous for a client to be able
> to disable reporting for it. Ultimately the client is the one that
> decides. A client might just as well completely ignore the reported
> value.
>

Until now, it has not been possible to disable reporting, so clients can
expect reporting to work.

Do you have a use case where disabling one of the GUCs reported by default
makes sense?

My patch doesn't protect these GUCs, and now I think that is a mistake.


>
> >> While I agree it's a little bit less friendly, I think you're
> >> overestimating the difficulty of using my proposed approach. Most
> >> importantly there's no need to parse the current GUC value. A client
> >> always knows what variables it wants to have reported. So anytime that
> >> changes the client can simply regenerate the full list of gucs that it
> >> wants to report and send that. So something similar to the following
> >> pseudo code (using += for string concatenation):
> >
> >
> > I disagree with this - I can imagine some proxies add some of their own
> > reported GUCs and the client can know nothing about them.
>
> I've definitely thought about this case, since it's the main case I
> care about as maintainer of PgBouncer. And a client wouldn't need to
> know about the extra GUCs that the proxy requires for the proxy to
> work correctly. A proxy can quite simply handle this itself in the
> following manner: Whenever a client sends a ParameterSet for
> _pq_.report_parameters, the proxy could forward to the server after
> prepending its own extra GUCs at the front. The proxy wouldn't even
> need to parse the list from the client to be able to do that. An even
> better behaving proxy, should parse the list of GUCs though and would
> only forward the ParameterStatus messages that it receives from the
> server if the client requested ParameterStatus updates for them.
>

Yes, during connection establishment you can easily extend the list of
custom reported GUCs.

But for use cases like the psql prompt, I need to enable and disable
reporting, and this use case should be independent of the "protected"
connection-related GUC reporting.

For example, when I disable %N, psql can disable reporting of "role" and stop
showing the role in the prompt. But when "role" is necessary for pgbouncer,
then surely that disabling of reporting should be ignored; a user setting a
prompt should not be able to break the communication. And it can be ignored
without any issue, and without any performance cost, because "role" is still
needed by the pgbouncer instance used for the connection. Without the
described behaviour we should not implement control over reporting in psql,
because there could be a lot of unpleasant side effects depending on whether
the user sets or unsets a custom prompt or some other future feature.


Re: cleanup patches for incremental backup

2024-01-26 Thread Alexander Lakhin

Hello Robert,

26.01.2024 21:37, Robert Haas wrote:

On Fri, Jan 26, 2024 at 12:39 PM Nathan Bossart
  wrote:

On Fri, Jan 26, 2024 at 11:04:37AM -0500, Robert Haas wrote:

Here is v2 with that addition.

Looks reasonable.

Thanks for the report & review. I have committed that version.


While trying to reproduce the 002_blocks test failure, I've encountered
another anomaly (or two):
make -s check -C src/bin/pg_walsummary/ PROVE_TESTS="t/002*" 
PROVE_FLAGS="--timer"
# +++ tap check in src/bin/pg_walsummary +++
[05:40:38] t/002_blocks.pl .. # poll_query_until timed out executing this query:
# SELECT EXISTS (
# SELECT * from pg_available_wal_summaries()
# WHERE tli = 0 AND end_lsn > '0/0'
# )
#
# expecting this output:
# t
# last actual query output:
# f
# with stderr:
[05:40:38] t/002_blocks.pl .. ok   266739 ms ( 0.00 usr  0.01 sys + 17.51 cusr 
26.79 csys = 44.31 CPU)
[05:45:05]
All tests successful.
Files=1, Tests=3, 267 wallclock secs ( 0.02 usr  0.02 sys + 17.51 cusr 26.79 
csys = 44.34 CPU)
Result: PASS

It looks like the test may call pg_get_wal_summarizer_state() while
WalSummarizerCtl->initialized is still false, i.e. before the first
GetOldestUnsummarizedLSN() call.
I could reproduce the issue easily (within 10 test runs) with
pg_usleep(10L);
added inside WalSummarizerMain() just below:
sigprocmask(SIG_SETMASK, , NULL);

But the fact that the test passes despite the timeout makes me wonder
whether some test should fail when such a timeout occurs.

Best regards,
Alexander

Re: [ psql - review request ] review request for \d+ tablename, \d+ indexname indenting

2024-01-26 Thread vignesh C
On Wed, 22 Nov 2023 at 21:54, Daniel Verite  wrote:
>
> Shlok Kyal wrote:
>
> > > The error was corrected and a new diff file was created.
> > > The diff file was created based on 16 RC1.
> > > We confirmed that 5 places where errors occurred when performing
> > > make check were changed to ok.
>
> Reviewing the patch, I see these two problems in the current version
> (File: psql-slashDplus-partition-indentation.diff, Date: 2023-09-19 00:19:34)
>
> * There are changes in the regression tests that do not concern this
> feature and should not be there.
>
> For instance this hunk:
>
> --- a/src/test/regress/expected/foreign_data.out
> +++ b/src/test/regress/expected/foreign_data.out
> @@ -742,8 +742,6 @@ COMMENT ON COLUMN ft1.c1 IS 'ft1.c1';
>  Check constraints:
>  "ft1_c2_check" CHECK (c2 <> ''::text)
>  "ft1_c3_check" CHECK (c3 >= '01-01-1994'::date AND c3 <=
> '01-31-1994'::date)
> -Not-null constraints:
> -"ft1_c1_not_null" NOT NULL "c1"
>  Server: s0
>  FDW options: (delimiter ',', quote '"', "be quoted" 'value')
>
> It seems to undo a test for a recent feature adding "Not-null
> constraints" to \d, which suggests that you've been running tests
> against and older version than the source tree you're diffing
> against. These should be the same version, and also the latest
> one (git HEAD) or as close as possible to the latest when the
> patch is submitted.
>
> * The new query with \d on partitioned tables does not work with
>   Postgres servers 12 or 13:
>
>
> postgres=# CREATE TABLE measurement (
> city_id int not null,
> logdate date not null,
> peaktempint,
> unitsales   int
> ) PARTITION BY RANGE (logdate);
>
> postgres=# \d measurement
> ERROR:  syntax error at or near "."
> LINE 2: ...  0 AS level,c.relkind, false AS i.inhdetach...
>
>
> Setting the CommitFest status to WoA.

I have changed the status of the CommitFest entry to "Returned with
Feedback" as Shlok's and Daniel's suggestions have not been handled. Feel
free to address them and add a new commitfest entry for the same.

Regards,
Vignesh




Re: MultiXact\SLRU buffers configuration

2024-01-26 Thread Andrey Borodin



> On 20 Jan 2024, at 08:31, vignesh C  wrote:
> 
> On Mon, 9 Jan 2023 at 09:49, Andrey Borodin  wrote:
>> 
>> On Tue, Jan 3, 2023 at 5:02 AM vignesh C  wrote:
>>> does not apply on top of HEAD as in [1], please post a rebased patch:
>>> 
>> Thanks! Here's the rebase.
> 
> I'm seeing that there has been no activity in this thread for more
> than 1 year now, I'm planning to close this in the current commitfest
> unless someone is planning to take it forward.

Hi Vignesh,

Thanks for the ping! The most important parts of this patch set are discussed
in [0]. If that patchset is committed, I'll withdraw the entry for this thread
from the commitfest.
There's a version of the Multixact-specific optimizations [1], but I hope they
will not be necessary with the effective caches developed in [0]. It seems to
me that the most important part of those optimizations is removing sleeps under
the SLRU lock on standby [2] by Kyotaro Horiguchi. But given that the cache
optimizations took 4 years to get close to commit, I'm not sure we will get
this optimization any time soon...

Thanks!

Best regards, Andrey Borodin.


[0] 
https://www.postgresql.org/message-id/flat/CAFiTN-vzDvNz%3DExGXz6gdyjtzGixKSqs0mKHMmaQ8sOSEFZ33A%40mail.gmail.com
[1] 
https://www.postgresql.org/message-id/flat/2ECE132B-C042-4489-930E-DBC5D0DAB84A%40yandex-team.ru#5f7e7022647be9eeecfc2ae75d765500
[2] 
https://www.postgresql.org/message-id/flat/20200515.090333.24867479329066911.horikyota.ntt%40gmail.com#855f8bb7205890579a363d2344b4484d



Re: A performance issue with Memoize

2024-01-26 Thread Alexander Lakhin

Hello,

27.01.2024 00:09, Tom Lane wrote:

David Rowley  writes:

On Sat, 27 Jan 2024 at 09:41, Tom Lane  wrote:

drongo and fairywren are consistently failing the test case added
by this commit.  I'm not quite sure why the behavior of Memoize
would be platform-specific when we're dealing with integers,
but ...

Maybe snprintf(buf, "%.*f", 0, 5.0 / 2.0); results in "3" on those
rather than "2"?
Looking at the code in fmtfloat(), we fallback on the built-in snprintf.

Maybe ... I don't have a better theory.


FWIW, I've found where this behaviour is documented:
https://learn.microsoft.com/en-us/cpp/c-runtime-library/reference/sprintf-sprintf-l-swprintf-swprintf-l-swprintf-l?view=msvc-170

(I've remembered a case with test/sql/partition_prune from 2020, where
sprintf on Windows worked the other way.)
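
To make the suspected difference concrete, here is a tiny standalone program
(an editorial sketch, not part of the patch or the buildfarm logs). glibc
rounds the tie to even and prints "2"; a C runtime that rounds ties away from
zero would print "3", which would match the platform-specific test output:

#include <stdio.h>

int
main(void)
{
    char    buf[32];

    /* 5.0 / 2.0 is exactly 2.5, so "%.0f" must break a tie */
    snprintf(buf, sizeof(buf), "%.*f", 0, 5.0 / 2.0);
    printf("%s\n", buf);    /* "2" with round-half-to-even, "3" otherwise */
    return 0;
}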


Best regards,
Alexander




Re: Synchronizing slots from primary to standby

2024-01-26 Thread Amit Kapila
On Thu, Jan 25, 2024 at 6:42 PM Zhijie Hou (Fujitsu)
 wrote:
>
> Here is the V69 patch set which includes the following changes.
>
> V69-0001, V69-0002
>

Few minor comments on v69-0001
1. In libpqrcv_create_slot(), I see we are using two types of syntaxes
based on 'use_new_options_syntax' (aka server_version >= 15) whereas
this new 'failover' option doesn't follow that. What is the reason for
that? I thought it is because older versions anyway won't support this
option. However, I guess we should follow the syntax of the old server
and let it error out. BTW, did you test this patch with old server
versions (say < 15 and >= 15) by directly using replication commands?
If so, what is the behavior in each case?

2.
  }
-
+ if (failover)
+ appendStringInfoString(, "FAILOVER, ");

Spurious line removal. Also, to follow a coding pattern similar to
nearby code, let's have one empty line after handling of failover.

3.
+/* ALTER_REPLICATION_SLOT slot */
+alter_replication_slot:
+ K_ALTER_REPLICATION_SLOT IDENT '(' generic_option_list ')'

I think it would be better if we follow the create style by specifying
syntax in comments as that can make the code easier to understand
after future extensions to this command if any. See
create_replication_slot:
/* CREATE_REPLICATION_SLOT slot [TEMPORARY] PHYSICAL [options] */
K_CREATE_REPLICATION_SLOT IDENT opt_temporary K_PHYSICAL create_slot_options

-- 
With Regards,
Amit Kapila.




Re: Reducing connection overhead in pg_upgrade compat check phase

2024-01-26 Thread vignesh C
On Fri, 27 Oct 2023 at 18:50, Daniel Gustafsson  wrote:
>
> Attached is a v10 rebase of this patch which had undergone significant bitrot
> due to recent changes in the pg_upgrade check phase.  This brings in the
> changes into the proposed structure without changes to queries, with no
> additional changes to the proposed functionality.
>
> Testing with a completely empty v11 cluster fresh from initdb as the old
> cluster shows a significant speedup (averaged over multiple runs, adjusted for
> outliers):
>
> patched:  53.59ms (52.78ms, 52.49ms, 55.49ms)
> master : 125.87ms (125.23 ms, 125.67ms, 126.67ms)
>
> Using a similarly empty cluster from master as the old cluster shows a smaller
> speedup, which is expected since many checks only run for older versions:
>
> patched: 33.36ms (32.82ms, 33.78ms, 33.47ms)
> master : 44.87ms (44.73ms, 44.90ms 44.99ms)
>
> The latter case is still pretty interesting IMO since it can speed up testing
> where every millisecond gained matters.

CFBot shows that the patch does not apply anymore as in [1]:
=== Applying patches on top of PostgreSQL commit ID
55627ba2d334ce98e1f5916354c46472d414bda6 ===
=== applying patch
./v10-0001-pg_upgrade-run-all-data-type-checks-per-connecti.patch
patching file src/bin/pg_upgrade/check.c
Hunk #2 FAILED at 24.
...
1 out of 7 hunks FAILED -- saving rejects to file src/bin/pg_upgrade/check.c.rej

Please post an updated version for the same.

[1] - http://cfbot.cputube.org/patch_46_4200.log

Regards,
Vignesh




Re: Improving btree performance through specializing by key shape, take 2

2024-01-26 Thread vignesh C
On Mon, 30 Oct 2023 at 21:50, Matthias van de Meent
 wrote:
>
> On Thu, 26 Oct 2023 at 00:36, Peter Geoghegan  wrote:
> > Most of the work with something like
> > this is the analysis of the trade-offs, not writing code. There are
> > all kinds of trade-offs that you could make with something like this,
> > and the prospect of doing that myself is kind of daunting. Ideally
> > you'd have made a significant start on that at this point.
>
> I believe I'd already made most trade-offs clear earlier in the
> threads, along with rationales for the changes in behaviour. But here
> goes again:
>
> _bt_compare currently uses index_getattr() on each attribute of the
> key. index_getattr() is O(n) for the n-th attribute if the index tuple
> has any null or non-attcacheoff attributes in front of the current
> one. Thus, _bt_compare costs O(n^2) work with n=the number of
> attributes, which can cost several % of performance, but is very very
> bad in extreme cases, and doesO(n) calls to opclass-supplied compare
> operations.
>
> To solve most of the O(n) compare operations, we can optimize
> _bt_compare to only compare "interesting" attributes, i.e. we can
> apply "dynamic prefix truncation". This is implemented by patch 0001.
> This is further enhanced with 0002, where we skip the compare
> operations if the HIKEY is the same as the right separator of the
> downlink we followed (due to our page split code, this case is
> extremely likely).
>
> However, the above only changes the attribute indexing code in
> _bt_compare to O(n) for at most about 76% of the index tuples on the
> page (1 - (2 / log2(max_tuples_per_page))), while the other on average
> 20+% of the compare operations still have to deal with the O(n^2)
> total complexity of index_getattr.
> To fix this O(n^2) issue (the issue this thread was originally created
> for) the approach I implemented originally is to not use index_getattr
> but an "attribute iterator" that incrementally extracts the next
> attribute, while keeping track of the current offset into the tuple,
> so each next attribute would be O(1). That is implemented in the last
> patches of the patchset.
>
> This attribute iterator approach has an issue: It doesn't perform very
> well for indexes that make full use of attcacheoff. The bookkeeping
> for attribute iteration proved to be much more expensive than just
> reading attcacheoff from memory. This is why the latter patches
> (patchset 14 0003+) adapt the btree code to generate different paths
> for different "shapes" of key index attributes, to allow the current
> attcacheoff code to keep its performance, but to get higher
> performance for indexes where the attcacheoff optimization can not be
> applied. In passing, it also specializes the code for single-attribute
> indexes, so that they don't have to manage branching code, increasing
> their performance, too.
>
> TLDR:
> The specialization in 0003+ is applied because index_getattr is good
> when attcacheoff applies, but very bad when it doesn't. Attribute
> iteration is worse than index_getattr when attcacheoff applies, but is
> significantly better when attcacheoff does not work. By specializing
> we get the best of both worlds.
>
> The 0001 and 0002 optimizations were added later to further remove
> unneeded calls to the btree attribute compare functions, thus further
> reducing the total time spent in _bt_compare.
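
[Editor's illustration of the "dynamic prefix truncation" idea described
above; this is a small self-contained sketch, not the patch's actual nbtree
code, and it uses fixed-width integer key columns instead of real index
tuples. The point is that any tuple strictly between the current binary
search bounds must share with the search key the leading attributes already
known to be equal against both bounds, so those attributes can be skipped.]

#include <stdio.h>

#define NATTS 3

/* one "index tuple": NATTS integer key columns */
typedef struct { int atts[NATTS]; } Tuple;

/*
 * Compare key vs. tuple starting at attribute "skip"; report how many
 * further attributes compared equal before a difference was found.
 */
static int
cmp_from(const int *key, const Tuple *tup, int skip, int *n_equal)
{
    *n_equal = 0;
    for (int i = skip; i < NATTS; i++)
    {
        if (key[i] != tup->atts[i])
            return (key[i] < tup->atts[i]) ? -1 : 1;
        (*n_equal)++;
    }
    return 0;
}

/* Find the first tuple >= key, skipping attributes known equal from bounds. */
static int
binsrch_prefix(const int *key, const Tuple *tuples, int ntuples)
{
    int     low = 0, high = ntuples;
    int     low_prefix = 0, high_prefix = 0;

    while (low < high)
    {
        int     mid = low + (high - low) / 2;
        int     skip = (low_prefix < high_prefix) ? low_prefix : high_prefix;
        int     matched;
        int     r = cmp_from(key, &tuples[mid], skip, &matched);

        if (r > 0)
        {
            low = mid + 1;
            low_prefix = skip + matched;    /* key matches new low bound here */
        }
        else
        {
            high = mid;
            high_prefix = skip + matched;   /* key matches new high bound here */
        }
    }
    return low;
}

int
main(void)
{
    Tuple   page[] = {{{1, 1, 1}}, {{1, 1, 5}}, {{1, 2, 3}}, {{2, 0, 0}}};
    int     key[NATTS] = {1, 2, 3};

    printf("offset = %d\n", binsrch_prefix(key, page, 4));   /* prints 2 */
    return 0;
}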
>
> Anyway.
>
> PFA v14 of the patchset. v13's 0001 is now split in two, containing
> prefix truncation in 0001, and 0002 containing the downlink's right
> separator/HIKEY optimization.
>
> Performance numbers (data attached):
> 0001 has significant gains in multi-column indexes with shared
> prefixes, where the prefix columns are expensive to compare, but
> otherwise doesn't have much impact.
> 0002 further improves performance across the board, but again mostly
> for indexes with expensive compare operations.
> 0007 sees performance improvements almost across the board, with only
> the 'ul' and 'tnt' indexes getting some worse results than master (but
> still better average results),
>
> All patches applied, per-index average performance improvements on 15
> runs range from 3% to 290% across the board for INSERT benchmarks, and
> -2.83 to 370% for REINDEX.
>
> Configured with autoconf: config.log:
> > It was created by PostgreSQL configure 17devel, which was
> > generated by GNU Autoconf 2.69.  Invocation command line was
> >
> >   $ ./configure --enable-tap-tests --enable-depend --with-lz4 --with-zstd 
> > COPT=-ggdb -O3 --prefix=/home/matthias/projects/postgresql/pg_install 
> > --no-create --no-recursion
>
> Benchmark was done on 1m random rows of the pp-complete dataset, as
> found on UK Gov's S3 bucket [0]: using a parallel and threaded
> downloader is preferred because the throughput is measured in kBps per
> client.
>
> I'll do a few runs on the full dataset of 29M rows soon too, but
> master's performance is so bad for the 'worstcase' index that I can't
> finish its 

Re: User functions for building SCRAM secrets

2024-01-26 Thread vignesh C
On Sat, 2 Dec 2023 at 12:22, John Naylor  wrote:
>
> On Wed, Mar 22, 2023 at 9:06 PM Jonathan S. Katz  wrote:
> >
> > Maybe I'm not conveying the problem this is solving -- I'm happy to go
> > one more round trying to make it clearer -- but if this is not clear,
> > it'd be good to at develop an alternative approach to this before
> > withdrawing the patch.
>
> This thread had some lively discussion, but it doesn't seem to have
> converged towards consensus, and hasn't had activity since April. That
> being the case, maybe it's time to withdraw and reconsider the
> approach later?

I have changed the status of this commitfest entry to "Returned with
Feedback" as nobody has pursued the discussion to a conclusion. Feel
free to discuss this further and, once it is in better shape, add a new
entry to take it forward.

Regards,
Vignesh




Re: psql's FETCH_COUNT (cursor) is not being respected for CTEs

2024-01-26 Thread vignesh C
On Tue, 2 Jan 2024 at 20:28, Daniel Verite  wrote:
>
>   Hi,
>
> PFA a rebased version.

CFBot shows that the patch does not apply anymore as in [1]:
=== Applying patches on top of PostgreSQL commit ID
a3a836fb5e51183eae624d43225279306c2285b8 ===
=== applying patch
./v5-0001-Implement-retrieval-of-results-in-chunks-with-lib.patch
patching file doc/src/sgml/libpq.sgml
...
patching file src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
...
patching file src/interfaces/libpq/exports.txt
Hunk #1 FAILED at 191.
1 out of 1 hunk FAILED -- saving rejects to file
src/interfaces/libpq/exports.txt.rej

Please post an updated version for the same.

[1] - http://cfbot.cputube.org/patch_46_4233.log

Regards,
Vignesh




Re: Support run-time partition pruning for hash join

2024-01-26 Thread vignesh C
On Tue, 7 Nov 2023 at 13:25, Richard Guo  wrote:
>
>
> On Mon, Nov 6, 2023 at 11:00 PM Alexander Lakhin  wrote:
>>
>> Please look at a warning and an assertion failure triggered by the
>> following script:
>> set parallel_setup_cost = 0;
>> set parallel_tuple_cost = 0;
>> set min_parallel_table_scan_size = '1kB';
>>
>> create table t1 (i int) partition by range (i);
>> create table t1_1 partition of t1 for values from (1) to (2);
>> create table t1_2 partition of t1 for values from (2) to (3);
>> insert into t1 values (1), (2);
>>
>> create table t2(i int);
>> insert into t2 values (1), (2);
>> analyze t1, t2;
>>
>> select * from t1 right join t2 on t1.i = t2.i;
>>
>> 2023-11-06 14:11:37.398 UTC|law|regression|6548f419.392cf5|WARNING:  Join 
>> partition pruning $0 has not been performed yet.
>> TRAP: failed Assert("node->as_prune_state"), File: "nodeAppend.c", Line: 
>> 846, PID: 3747061
>
>
> Thanks for the report!  I failed to take care of the parallel-hashjoin
> case, and I have to admit that it's not clear to me yet how we should do
> join partition pruning in that case.
>
> For now I think it's better to just avoid performing join partition
> pruning for parallel hashjoin, so that the patch doesn't become too
> complex for review.  We can always extend it in the future.
>
> I have done that in v5.  Thanks for testing!

CFBot shows that the patch does not apply anymore as in [1]:
=== Applying patches on top of PostgreSQL commit ID
924d046dcf55887c98a1628675a30f4b0eebe556 ===
=== applying patch
./v5-0001-Support-run-time-partition-pruning-for-hash-join.patch
...
patching file src/include/nodes/plannodes.h
...
patching file src/include/optimizer/cost.h
Hunk #1 FAILED at 211.
1 out of 1 hunk FAILED -- saving rejects to file
src/include/optimizer/cost.h.rej

Please post an updated version for the same.

[1] - http://cfbot.cputube.org/patch_46_4512.log

Regards,
Vignesh




Re: [PATCH] Add additional extended protocol commands to psql: \parse and \bindx

2024-01-26 Thread vignesh C
On Fri, 19 Jan 2024 at 10:50, Michael Paquier  wrote:
>
> On Thu, Jan 18, 2024 at 09:25:16AM +0100, Anthonin Bonnefoy wrote:
> > From what I can tell, there's no change in the behaviour. All paths
> > would eventually go through HandleSlashCmds's cleaning logic. This is
> > also mentioned in ignore_slash_options's comment.
>
> Yeah, I can confirm that.  I would be really tempted to backpatch that
> because that's a bug: we have to call ignore_slash_options() for
> inactive branches when a command parses options with OT_NORMAL.  Now,
> I cannot break things, either.
>
> > Done. patch 1 adds ignore_slash_options to bind. patch 2 adds the new
> > \bindx, \close and \parse commands.
>
> 0001 has been applied on HEAD.

Since the 0001 patch has been applied, sending only 0002 as v5-0001 so
that CFBot can apply and run.

Regards,
Vignesh
From fed9e757c793815251fe41069cc8f01a5f8aacb8 Mon Sep 17 00:00:00 2001
From: Anthonin Bonnefoy 
Date: Thu, 18 Jan 2024 08:46:33 +0100
Subject: [PATCH v5] psql: Add support for prepared stmt with extended protocol

Currently, only unnamed prepared statement is supported by psql with the
\bind command so it's not possible to test named statement creation and
execution through extended protocol.

This commit introduces three additional commands: \parse, \bindx and
\close.
\parse creates a prepared statement using extended protocol.
\bindx binds and execute an existing prepared statement using extended
protocol.
\close closes an existing prepared statement using extended protocol.
---
 doc/src/sgml/ref/psql-ref.sgml |  86 +
 src/bin/psql/command.c | 115 +
 src/bin/psql/common.c  |  28 ++-
 src/bin/psql/help.c|   4 +
 src/bin/psql/settings.h|   6 ++
 src/bin/psql/tab-complete.c|   6 +-
 src/test/regress/expected/psql.out |  49 
 src/test/regress/sql/psql.sql  |  27 +++
 8 files changed, 316 insertions(+), 5 deletions(-)

diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml
index cc7d797159..b0f383ab2a 100644
--- a/doc/src/sgml/ref/psql-ref.sgml
+++ b/doc/src/sgml/ref/psql-ref.sgml
@@ -916,6 +916,35 @@ INSERT INTO tbl1 VALUES ($1, $2) \bind 'first value' 'second value' \g

   
 
+  
+   \bindx prepared_statement_name [ parameter ] ... 
+
+   
+
+\bindx is equivalent to \bind,
+ except that it takes the name of an explicit existing prepared
+ statement as first parameter.
+
+
+
+ Example:
+
+INSERT INTO tbls1 VALUES ($1, $2) \parse stmt1
+\bindx stmt1 'first value' 'second value' \g
+
+
+
+
+ This command causes the extended query protocol (see ) to be used, unlike normal
+ psql operation, which uses the simple
+ query protocol. So this command can be useful to test the extended
+ query protocol from psql.
+
+
+   
+  
+
   
 \c or \connect [ -reuse-previous=on|off ] [ dbname [ username ] [ host ] [ port ] | conninfo ]
 
@@ -1037,6 +1066,35 @@ INSERT INTO tbl1 VALUES ($1, $2) \bind 'first value' 'second value' \g
 
   
 
+  
+   \close [ prepared_statement_name ] 
+
+   
+
+Closes the specified prepared statement. Without argument, closes the unnamed statement.
+If no prepared statement exists with this name, the operation is a no-op.
+
+
+
+ Example:
+
+SELECT $1 \parse stmt1
+\close stmt1
+
+
+
+
+ This command causes the extended query protocol (see ) to be used, unlike normal
+ psql operation, which uses the simple
+ query protocol. So this command can be useful to test the extended
+ query protocol from psql.
+
+
+   
+  
+
+
   
 \copy { table [ ( column_list ) ] }
 from
@@ -2779,6 +2837,34 @@ lo_import 152801
 
   
 
+  
+\parse prepared_statement_name
+
+
+Creates a prepared statement from the current query buffer
+
+
+
+ Example:
+
+SELECT $1 \parse stmt1
+
+
+
+
+ This command causes the extended query protocol (see ) to be used, unlike normal
+ psql operation, which uses the simple
+ query protocol. A 
+ message will be issued by this command so it can be useful to
+ test the extended query protocol from psql. This command affects
+ only the next query executed; all subsequent queries will use the
+ simple query protocol by default.
+
+
+
+  
+
   
 \password [ username ]
 
diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c
index 5c906e4806..07d58ec66f 100644
--- a/src/bin/psql/command.c
+++ b/src/bin/psql/command.c
@@ -64,10 +64,14 @@ static backslashResult exec_command(const char *cmd,
 	PQExpBuffer 

Re: Some revises in adding sorting path

2024-01-26 Thread vignesh C
On Mon, 17 Jul 2023 at 14:25, Richard Guo  wrote:
>
>
> On Wed, Mar 29, 2023 at 4:00 AM David Rowley  wrote:
>>
>> If you write some tests for this code, it will be useful to prove that
>> it actually does something, and also that it does not break again in
>> the future. I don't really want to just blindly copy the pattern used
>> in 3c6fc5820 for creating incremental sort paths if it's not useful
>> here. It would be good to see tests that make an Incremental Sort path
>> using the code you're changing.
>
>
> Thanks for the suggestion.  I've managed to come up with a query that
> gets the codes being changed here to perform an incremental sort.
>
> set min_parallel_index_scan_size to 0;
> set enable_seqscan to off;
>
> Without making those parallel paths:
>
> explain (costs off)
> select * from tenk1 where four = 2 order by four, hundred, 
> parallel_safe_volatile(thousand);
>   QUERY PLAN
> --
>  Incremental Sort
>Sort Key: hundred, (parallel_safe_volatile(thousand))
>Presorted Key: hundred
>->  Gather Merge
>  Workers Planned: 3
>  ->  Parallel Index Scan using tenk1_hundred on tenk1
>Filter: (four = 2)
> (7 rows)
>
> and with those parallel paths:
>
> explain (costs off)
> select * from tenk1 where four = 2 order by four, hundred, 
> parallel_safe_volatile(thousand);
>   QUERY PLAN
> ---
>  Gather Merge
>Workers Planned: 3
>->  Incremental Sort
>  Sort Key: hundred, (parallel_safe_volatile(thousand))
>  Presorted Key: hundred
>  ->  Parallel Index Scan using tenk1_hundred on tenk1
>Filter: (four = 2)
> (7 rows)
>
> I've added two tests for the code changes in create_ordered_paths in the
> new patch.
>
>>
>> Same for the 0003 patch.
>
>
> For the code changes in gather_grouping_paths, I've managed to come up
> with a query that makes an explicit Sort atop cheapest partial path.
>
> explain (costs off)
> select count(*) from tenk1 group by twenty, parallel_safe_volatile(two);
>  QUERY PLAN
> 
>  Finalize GroupAggregate
>Group Key: twenty, (parallel_safe_volatile(two))
>->  Gather Merge
>  Workers Planned: 4
>  ->  Sort
>Sort Key: twenty, (parallel_safe_volatile(two))
>->  Partial HashAggregate
>  Group Key: twenty, parallel_safe_volatile(two)
>  ->  Parallel Seq Scan on tenk1
> (9 rows)
>
> Without this logic the plan would look like:
>
> explain (costs off)
> select count(*) from tenk1 group by twenty, parallel_safe_volatile(two);
>  QUERY PLAN
> 
>  Finalize GroupAggregate
>Group Key: twenty, (parallel_safe_volatile(two))
>->  Sort
>  Sort Key: twenty, (parallel_safe_volatile(two))
>  ->  Gather
>Workers Planned: 4
>->  Partial HashAggregate
>  Group Key: twenty, parallel_safe_volatile(two)
>  ->  Parallel Seq Scan on tenk1
> (9 rows)
>
> This test is also added in the new patch.
>
> But I did not find a query that makes an incremental sort in this case.
> After trying for a while it seems to me that we do not need to consider
> incremental sort in this case, because for a partial path of a grouped
> or partially grouped relation, it is either unordered (HashAggregate or
> Append), or it has been ordered by the group_pathkeys (GroupAggregate).
> It seems there is no case that we'd have a partial path that is
> partially sorted.
>
> So update the patches to v2.

CFBot shows that the patch does not apply anymore as in [1]:
=== Applying patches on top of PostgreSQL commit ID
f2bf8fb04886e3ea82e7f7f86696ac78e06b7e60 ===
...
=== applying patch
./v2-0002-Revise-how-we-sort-partial-paths-in-gather_grouping_paths.patch
patching file src/backend/optimizer/plan/planner.c
Hunk #1 succeeded at 7289 (offset -91 lines).
Hunk #2 FAILED at 7411.
1 out of 2 hunks FAILED -- saving rejects to file
src/backend/optimizer/plan/planner.c.rej

Please post an updated version for the same.

[1] - http://cfbot.cputube.org/patch_46_4119.log

Regards,
Vignesh




Re: Separate memory contexts for relcache and catcache

2024-01-26 Thread vignesh C
On Wed, 3 Jan 2024 at 16:56, Melih Mutlu  wrote:
>
> Hi,
>
> On Mon, 4 Dec 2023 at 07:59, torikoshia wrote:
>>
>> Hi,
>>
>> I also think this change would be helpful.
>>
>> I imagine you're working on the Andres's comments and you already notice
>> this, but v1 patch cannot be applied to HEAD.
>> For the convenience of other reviewers, I marked it 'Waiting on Author'.
>
>
> Thanks for letting me know. I rebased the patch. PFA new version.

CFBot shows that the patch does not apply anymore as in [1]:
=== Applying patches on top of PostgreSQL commit ID
729439607ad210dbb446e31754e8627d7e3f7dda ===
=== applying patch
./v2-0001-Separate-memory-contexts-for-relcache-and-catcach.patch
patching file src/backend/utils/cache/catcache.c
...
Hunk #8 FAILED at 1933.
Hunk #9 succeeded at 2253 (offset 84 lines).
1 out of 9 hunks FAILED -- saving rejects to file
src/backend/utils/cache/catcache.c.rej

Please post an updated version for the same.

[1] - http://cfbot.cputube.org/patch_46_4554.log

Regards,
Vignesh




Re: POC, WIP: OR-clause support for indexes

2024-01-26 Thread vignesh C
On Tue, 5 Dec 2023 at 16:25, Andrei Lepikhov  wrote:
>
> Here is fresh version with the pg_dump.pl regex fixed. Now it must pass
> buildfarm.
>
> Under development:
> 1. Explanation of the general idea in comments (Robert's note)
> 2. Issue with hiding some optimizations (Alexander's note and example
> with overlapping clauses on two partial indexes)

CFBot shows that the patch does not apply anymore as in [1]:
=== Applying patches on top of PostgreSQL commit ID
6ce071f6b04d3fc836f436fa08108a6d11e2 ===
=== applying patch ./v14-1-0001-Transform-OR-clause-to-ANY-expressions.patch

patching file src/test/regress/expected/sysviews.out
Hunk #1 succeeded at 124 (offset 1 line).
Hunk #2 FAILED at 134.
1 out of 2 hunks FAILED -- saving rejects to file
src/test/regress/expected/sysviews.out.rej

Please post an updated version for the same.

[1] - http://cfbot.cputube.org/patch_46_4450.log

Regards,
Vignesh




Re: Reducing memory consumed by RestrictInfo list translations in partitionwise join planning

2024-01-26 Thread vignesh C
On Tue, 31 Oct 2023 at 10:48, Ashutosh Bapat
 wrote:
>
> On Thu, Sep 14, 2023 at 4:39 PM Ashutosh Bapat
>  wrote:
> >
> > The patch set is thus
> > 0001 - patch used to measure memory used during planning
> >
> > 0002 - Patch to free intermediate Relids computed by
> > adjust_child_relids_multilevel(). I didn't test memory consumption for
> > multi-level partitioning. But this is clear improvement. In that
> > function we free the AppendInfos array which as many pointers long as
> > the number of relids. So it doesn't make sense not to free the Relids
> > which can be {largest relid}/8 bytes long at least.
> >
> > 0003 - patch to save and reuse commuted RestrictInfo. This patch by
> > itself shows a small memory saving (3%) in the query below where the
> > same clause is commuted twice. The query does not contain any
> > partitioned tables.
> > create table t2 (a int primary key, b int, c int);
> > create index t2_a_b on t2(a, b);
> > select * from t2 where 10 = a
> > Memory used without patch: 13696 bytes
> > Memory used with patch: 13264 bytes
> >
> > 0004 - Patch which implements the hash table of hash table described
> > above and also code to avoid repeated RestrictInfo list translations.
> >
> > I will add this patchset to next commitfest.
> >
> > --
> > Best Wishes,
> > Ashutosh Bapat
>
> PFA rebased patches. Nothing changes in 0002, 0003 and 0004. 0001 is
> the squashed version of the latest patch set at
> https://www.postgresql.org/message-id/CAExHW5sCJX7696sF-OnugAiaXS=Ag95=-m1csrjcmyyj8pd...@mail.gmail.com.

CFBot shows that the patch does not apply anymore as in [1]:
=== Applying patches on top of PostgreSQL commit ID
7014c9a4bba2d1b67d60687afb5b2091c1d07f73 ===
=== applying patch
./0001-Report-memory-used-for-planning-a-query-in--20231031.patch
...
patching file src/test/regress/expected/explain.out
Hunk #5 FAILED at 290.
Hunk #6 succeeded at 545 (offset 4 lines).
1 out of 6 hunks FAILED -- saving rejects to file
src/test/regress/expected/explain.out.rej
patching file src/tools/pgindent/typedefs.list
Hunk #1 succeeded at 1562 (offset 18 lines).

Please post an updated version for the same.

[1] - http://cfbot.cputube.org/patch_46_4564.log

Regards,
Vignesh




Re: dblink query interruptibility

2024-01-26 Thread Noah Misch
On Thu, Jan 25, 2024 at 12:28:43PM +0900, Fujii Masao wrote:
> On Thu, Jan 25, 2024 at 5:45 AM Noah Misch  wrote:
> > On Thu, Jan 25, 2024 at 04:23:39AM +0900, Fujii Masao wrote:
> > > I found that this patch was committed at d3c5f37dd5 and changed the
> > > error message in postgres_fdw slightly. Here's an example:
> > >
> > > #1. Begin a new transaction.
> > > #2. Execute a query accessing to a foreign table, like SELECT * FROM
> > > 
> > > #3. Terminate the *remote* session corresponding to the foreign table.
> > > #4. Commit the transaction, and then currently the following error
> > > message is output.
> > >
> > > ERROR:  FATAL:  terminating connection due to administrator command
> > > server closed the connection unexpectedly
> > > This probably means the server terminated abnormally
> > > before or while processing the request.
> > >invalid socket
> > >
> > > Previously, before commit d3c5f37dd5, the error message at #4 did not
> > > include "invalid socket." Now, after the commit, it does.

Other clients have witnessed the extra "invalid socket" message:
https://dba.stackexchange.com/questions/335081/how-to-investigate-intermittent-postgres-connection-errors
https://stackoverflow.com/questions/77781358/pgbackrest-backup-error-with-exit-code-57
https://github.com/timescale/timescaledb/issues/102

> > > + /* Consume whatever data is available from the socket */
> > > + if (PQconsumeInput(conn) == 0)
> > > + {
> > > + /* trouble; expect PQgetResult() to return NULL */
> > > + break;
> > > + }
> > > + }
> > > +
> > > + /* Now we can collect and return the next PGresult */
> > > + return PQgetResult(conn);
> > >
> > > This code appears to cause the change. When the remote session ends,
> > > PQconsumeInput() returns 0 and marks conn->socket as invalid.
> > > Subsequent PQgetResult() calls pqWait(), detecting the invalid socket
> > > and appending "invalid socket" to the error message.

What do you think of making PQconsumeInput() set PGASYNC_READY and
CONNECTION_BAD in this case?  Since libpq appended "server closed the
connection unexpectedly", it knows those indicators are correct.  That way,
PQgetResult() won't issue a pointless pqWait() call.

> > > I think the "invalid socket" message is unsuitable in this scenario,
> > > and PQgetResult() should not be called after PQconsumeInput() returns
> > > 0. Thought?
> >
> > The documentation is absolute about the necessity of PQgetResult():
> 
> The documentation looks unclear to me regarding what should be done
> when PQconsumeInput() returns 0. So I'm not sure if PQgetResult()
> must be called even in that case.

I agree PQconsumeInput() docs don't specify how to interpret it returning 0.

> As far as I read some functions like libpqrcv_PQgetResult() that use
> PQconsumeInput(), it appears that they basically report the error message
> using PQerrorMessage(), without calling PQgetResult(),
> when PQconsumeInput() returns 0.

libpqrcv_PQgetResult() is part of walreceiver, where any ERROR becomes FATAL.
Hence, it can't hurt anything by eagerly skipping to ERROR.  I designed
libpqsrv_exec() to mimic PQexec() as closely as possible, so it would be a
drop-in replacement for arbitrary callers.  Ideally, accepting interrupts
would be the only caller-visible difference.

I know of three ways PQconsumeInput() can return 0, along with my untested
estimates of how they work:

a. Protocol violation.  handleSyncLoss() sets PGASYNC_READY and
   CONNECTION_BAD.  PQgetResult() is optional.

b. Connection broken.  PQgetResult() is optional.

c. ENOMEM.  PGASYNC_ and CONNECTION_ status don't change.  Applications choose
   among (c1) free memory and retry, (c2) close the connection, or (c3) call
   PQgetResult() to break protocol sync and set PGASYNC_IDLE:

Comparing PQconsumeInput() with the PQgetResult() block under "while
(conn->asyncStatus == PGASYNC_BUSY)", there's a key difference that
PQgetResult() sets PGASYNC_IDLE on most errors, including ENOMEM.  That
prevents PQexec() subroutine PQexecFinish() from busy looping on ENOMEM, but I
suspect that also silently breaks protocol sync.  While we could change it
from (c3) to (c2) by dropping the connection via handleSyncLoss() or
equivalent, I'm not confident about that being better.

libpqsrv_exec() implements (c3) by way of calling PQgetResult() after
PQconsumeInput() fails.  If PQisBusy(), the same ENOMEM typically will repeat,
yielding (c3).  If memory became available in that brief time, PQgetResult()
may instead block.  That blocking is unwanted but unimportant.
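
[For readers following along, here is a minimal caller-side sketch, an
illustration only and not the actual libpqsrv_exec() or walreceiver code, of
the alternative handling discussed above: report a PQconsumeInput() failure
via PQerrorMessage() rather than falling through to PQgetResult() on a
connection whose socket may already be invalid. Whether that is appropriate
for a PQexec()-style replacement is exactly the question in this thread.]

#include <stdio.h>
#include <sys/select.h>
#include "libpq-fe.h"

static PGresult *
collect_result(PGconn *conn)
{
    while (PQisBusy(conn))
    {
        int     sock = PQsocket(conn);
        fd_set  readfds;

        if (sock < 0)
        {
            fprintf(stderr, "lost connection: %s", PQerrorMessage(conn));
            return NULL;
        }

        FD_ZERO(&readfds);
        FD_SET(sock, &readfds);
        if (select(sock + 1, &readfds, NULL, NULL, NULL) < 0)
            return NULL;        /* interrupted; let the caller decide */

        /* consume whatever arrived; on failure report and bail out */
        if (PQconsumeInput(conn) == 0)
        {
            fprintf(stderr, "%s", PQerrorMessage(conn));
            return NULL;        /* deliberately no PQgetResult() here */
        }
    }
    return PQgetResult(conn);
}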




Re: Request for comment on setting binary format output per session

2024-01-26 Thread vignesh C
On Mon, 31 Jul 2023 at 21:58, Dave Cramer  wrote:
>
>
> Dave Cramer
>
>
> On Mon, 10 Jul 2023 at 03:56, Daniel Gustafsson  wrote:
>>
>> > On 25 Apr 2023, at 16:47, Dave Cramer  wrote:
>>
>> > Patch attached with comments removed
>>
>> This patch no longer applies, please submit a rebased version on top of HEAD.
>
>
> Rebased see attached

CFBot shows that the patch does not apply anymore as in [1]:
=== Applying patches on top of PostgreSQL commit ID
fba2112b1569fd001a9e54dfdd73fd3cb8f16140 ===
=== applying patch ./0001-Created-protocol.h.patch
patching file src/backend/access/common/printsimple.c
Hunk #1 succeeded at 22 with fuzz 2 (offset 1 line).
Hunk #2 FAILED at 33.
Hunk #3 FAILED at 66.
2 out of 3 hunks FAILED -- saving rejects to file
src/backend/access/common/printsimple.c.rej
patching file src/backend/access/transam/parallel.c
Hunk #1 succeeded at 34 (offset 1 line).
Hunk #2 FAILED at 1128.
Hunk #3 FAILED at 1138.
Hunk #4 FAILED at 1184.
Hunk #5 succeeded at 1205 (offset 4 lines).
Hunk #6 FAILED at 1218.
Hunk #7 FAILED at 1373.
Hunk #8 FAILED at 1551.
6 out of 8 hunks FAILED -- saving rejects to file
src/backend/access/transam/parallel.c.rej

Please post an updated version for the same.

[1] - http://cfbot.cputube.org/patch_46_3777.log

Regards,
Vignesh




Re: Wrong rows estimations with joins of CTEs slows queries by more than factor 500

2024-01-26 Thread vignesh C
On Mon, 8 Jan 2024 at 22:21, Tom Lane  wrote:
>
> Richard Guo  writes:
> > On Sun, Jan 7, 2024 at 6:41 AM Tom Lane  wrote:
> >> Thanks for the report!  I guess we need something like the attached.
>
> > +1.
>
> Pushed, thanks for looking at it.

I have changed the status of the commitfest entry to "Committed" as I
noticed the patch has already been committed.

Regards,
Vignesh




Re: [CAUTION!! freemail] Re: Partial aggregates pushdown

2024-01-26 Thread vignesh C
On Thu, 7 Dec 2023 at 05:41, fujii.y...@df.mitsubishielectric.co.jp
 wrote:
>
> Hi Mr.Haas.
>
> > -Original Message-
> > From: Robert Haas 
> > Sent: Wednesday, December 6, 2023 10:25 PM
> > On Wed, Dec 6, 2023 at 3:41 AM fujii.y...@df.mitsubishielectric.co.jp
> >  wrote:
> > > Are you concerned about the hassle and potential human errors of
> > > manually adding new partial aggregation functions, rather than the 
> > > catalog becoming bloated?
> >
> > I'm concerned about both.
> Understood. Thank you for your response.
>
> > > The process of creating partial aggregation functions from aggregation
> > > functions can be automated, so I believe this issue can be resolved.
> > > However, automating it may increase the size of the patch even more, so 
> > > overall, approach#2 might be better.
> > > To implement approach #2, it would be necessary to investigate how much 
> > > additional code is required.
> >
> > Yes. Unfortunately I fear that there is quite a lot of work left to do here 
> > in order to produce a committable feature. To me it
> > seems necessary to conduct an investigation of approach #2. If the result 
> > of that investigation is that nothing major
> > stands in the way of approach #2, then I think we should adopt it, which is 
> > more work. In addition, the problems around
> > transmitting serialized bytea blobs between machines that can't be assumed 
> > to fully trust each other will need to be
> > addressed in some way, which seems like it will require a good deal of 
> > design work, forming some kind of consensus, and
> > then implementation work to follow. In addition to that there may be some 
> > small problems that need to be solved at a
> > detail level, such as the HAVING issue. I think the last category won't be 
> > too hard to sort out, but that still leaves two really
> > major areas to address.
> Yes, I agree with you. It is clear that further investigation and discussion 
> are still needed.
> I would be grateful if we can resolve this issue gradually. I would also like 
> to continue the discussion if possible in the future.

Thanks for all the efforts on this patch. I have changed the status of
the commitfest entry to "Returned with Feedback" as there is still
some work to get this patch out. Feel free to continue the discussion
and add a new entry when the patch is in a reviewable shape.

Regards,
Vignesh




Re: [PoC/RFC] Multiple passwords, interval expirations

2024-01-26 Thread vignesh C
On Tue, 10 Oct 2023 at 16:37, Gurjeet Singh  wrote:
>
> > On Mon, Oct 9, 2023 at 2:31 AM Gurjeet Singh  wrote:
> > >
> > > Next steps:
> > > - Break the patch into a series of smaller patches.
> > > - Add TAP tests (test the ability to actually login with these passwords)
> > > - Add/update documentation
> > > - Add more regression tests
>
> Please see attached the v4 of the patchset that introduces the notion
> of named passwords slots, namely 'first' and 'second' passwords, and
> allows users to address each of these passwords separately for the
> purposes of adding, dropping, or assigning expiration times.
>
> Apart from the changes described by each patch's commit title, one
> significant change since v3 is that now (included in v4-0002...patch)
> it is not allowed for a role to have a mix of a types of passwords.
> When adding a password, the patch ensures that the password being
> added uses the same hashing algorithm (md5 or scram-sha-256) as the
> existing password, if any.  Having all passwords of the same type
> helps the server pick the corresponding authentication method during
> connection attempt.
>
> The v3 patch also had a few bugs that were exposed by cfbot's
> automatic run. All those bugs have now been fixed, and the latest run
> on the v4 branch [1] on my private Git repo shows a clean run [1].
>
> The list of patches, and their commit titles are as follows:
>
> > v4-0001-...patch Add new columns to pg_authid
> > v4-0002-...patch Update password verification infrastructure to handle two 
> > passwords
> > v4-0003-...patch Added SQL support for ALTER ROLE to manage two passwords
> > v4-0004-...patch Updated pg_dumpall to support exporting a role's second 
> > password
> > v4-0005-...patch Update system views pg_roles and pg_shadow
> > v4-0006-...patch Updated pg_authid catalog documentation
> > v4-0007-...patch Updated psql's describe-roles meta-command
> > v4-0008-...patch Added documentation for ALTER ROLE command
> > v4-0009-...patch Added TAP tests to prove that a role can use two passwords 
> > to login
> > v4-0010-...patch pgindent run
> > v4-0011-...patch Run pgperltidy on files changed by this patchset
>
> Running pgperltidy updated many perl files unrelated to this patch, so
> in the last patch I chose to include only the one perl file that is
> affected by this patchset.

CFBot shows that the patch does not apply anymore as in [1]:
=== Applying patches on top of PostgreSQL commit ID
4d969b2f85e1fd00e860366f101fd3e3160aab41 ===
=== applying patch
./v4-0002-Update-password-verification-infrastructure-to-ha.patch
...
patching file src/backend/libpq/auth.c
Hunk #4 FAILED at 828.
Hunk #5 succeeded at 886 (offset -2 lines).
Hunk #6 succeeded at 907 (offset -2 lines).
1 out of 6 hunks FAILED -- saving rejects to file src/backend/libpq/auth.c.rej

Please post an updated version for the same.

[1] - http://cfbot.cputube.org/patch_46_4432.log

Regards,
Vignesh




Re: Opportunistically pruning page before update

2024-01-26 Thread James Coleman
On Tue, Jan 23, 2024 at 2:46 AM Dilip Kumar  wrote:
>
> On Tue, Jan 23, 2024 at 7:18 AM James Coleman  wrote:
> >
> > On Mon, Jan 22, 2024 at 8:21 PM James Coleman  wrote:
> > >
> > > See rebased patch attached.
> >
> > I just realized I left a change in during the rebase that wasn't necessary.
> >
> > v4 attached.
>
> I have noticed that you perform the opportunistic pruning after we have
> decided that the updated tuple cannot fit in the current page, and then
> the pruning is performed on the new target page.  Why don't we first
> perform the pruning on the existing page of the tuple itself?  Or is this
> already being done before this patch?  I could not find such existing
> pruning, hence the question, because such pruning could convert many
> non-HOT updates into HOT updates, right?

First off I noticed that I accidentally sent a different version of
the patch I'd originally worked on. Here's the one from the proper
branch. It's still similar, but I want to make sure the right one is
being reviewed.

I'm working on a demo case for updates (to go along with the insert
case I sent earlier) to test out your question, and I'll reply when I
have that.

Regards,
James Coleman


v5-0004-log-when-pruning-2.patch
Description: Binary data


v5-0002-log-when-pruning.patch
Description: Binary data


v5-0003-Opportunistically-prune-to-avoid-building-a-new-p.patch
Description: Binary data


v5-0001-Allow-getting-lock-before-calling-heap_page_prune.patch
Description: Binary data


Re: Bug: The "directory" control parameter does not work

2024-01-26 Thread David E. Wheeler
On Jan 26, 2024, at 5:05 PM, David E. Wheeler  wrote:

> I made a simpler test case here:
> 
>  https://github.com/theory/test-extension-directory
> 
> Note the difference between actual and expected output.

Bah! Need to also set `MODULEDIR = extension/click` and then it works.

D





Re: [EXTERNAL] Re: Add non-blocking version of PQcancel

2024-01-26 Thread Jelte Fennema-Nio
On Fri, 26 Jan 2024 at 18:19, Alvaro Herrera  wrote:
> Yeah, I see that point of view as well.  I like the end result; the
> additional protos in libpq-int.h don't bother me.  Does anybody else
> wants to share their opinion on it?  If none, then I'd consider going
> ahead with this version.

To be clear, I'm +1 on the new file structure (although if people feel
strongly against it, I don't care enough to make a big deal out of
it).

@Alvaro did you have any other comments on the contents of the patch btw?




Re: proposal: psql: show current user in prompt

2024-01-26 Thread Jelte Fennema-Nio
On Fri, 26 Jan 2024 at 21:35, Pavel Stehule  wrote:
> I see the possibility of disabling reporting as potentially dangerous.  Without
> encoding reporting you can break psql. So it should be limited to just a few
> values whose behaviour is known.

I agree that client_encoding is a GUC that likely all clients would
want to request reporting for, so I can see the argument for always
sending it. But I wouldn't call it dangerous for a client to be able
to disable reporting for it. Ultimately the client is the one that
decides. A client might just as well completely ignore the reported
value.

>> While I agree it's a little bit less friendly, I think you're
>> overestimating the difficulty of using my proposed approach. Most
>> importantly there's no need to parse the current GUC value. A client
>> always knows what variables it wants to have reported. So anytime that
>> changes the client can simply regenerate the full list of gucs that it
>> wants to report and send that. So something similar to the following
>> pseudo code (using += for string concatenation):
>
>
> I disagree with this - I can imagine some proxies add some of their own
> reported GUCs and the client can know nothing about them.

I've definitely thought about this case, since it's the main case I
care about as maintainer of PgBouncer. And a client wouldn't need to
know about the extra GUCs that the proxy requires for the proxy to
work correctly. A proxy can quite simply handle this itself in the
following manner: Whenever a client sends a ParameterSet for
_pq_.report_parameters, the proxy could forward to the server after
prepending its own extra GUCs at the front. The proxy wouldn't even
need to parse the list from the client to be able to do that. An even
better behaving proxy, should parse the list of GUCs though and would
only forward the ParameterStatus messages that it receives from the
server if the client requested ParameterStatus updates for them.
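
[As a rough illustration of the "prepend and forward" strategy described
above; this is purely an editorial sketch. The _pq_.report_parameters
parameter, the ParameterSet message, and the GUC names below come from this
proposal and are not an existing API.]

#include <stdio.h>

/* GUCs the proxy itself needs reported, regardless of what clients ask for */
static const char *proxy_gucs = "client_encoding,role";

/* stand-in for forwarding a ParameterSet for _pq_.report_parameters upstream */
static void
forward_report_parameters(const char *value)
{
    printf("ParameterSet: _pq_.report_parameters = \"%s\"\n", value);
}

/*
 * Prepend the proxy's own GUCs to whatever list the client requested and
 * forward the result; the client's list never needs to be parsed.
 */
static void
handle_client_report_parameters(const char *client_value)
{
    char    combined[1024];

    snprintf(combined, sizeof(combined), "%s,%s", proxy_gucs, client_value);
    forward_report_parameters(combined);
}

int
main(void)
{
    handle_client_report_parameters("application_name,search_path");
    return 0;
}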




Re: Bug: The "directory" control parameter does not work

2024-01-26 Thread David E. Wheeler
On Jan 26, 2024, at 4:40 PM, David E. Wheeler  wrote:

> But it doesn’t seem to work. I tried this experiment:

I made a simpler test case here:

  https://github.com/theory/test-extension-directory

Note the difference between actual and expected output.

Best,

David






Bug: The "directory" control parameter does not work

2024-01-26 Thread David E. Wheeler
Hackers,

I wanted to try to customize the subdirectory for an extension. The docs[1] say:

> directory (string)
> 
> The directory containing the extension's SQL script file(s). Unless an 
> absolute path is given, the name is relative to the installation's SHAREDIR 
> directory. The default behavior is equivalent to specifying directory = 
> 'extension'.


  [1]: https://www.postgresql.org/docs/16/extend-extensions.html

But it doesn’t seem to work. I tried this experiment:

❯ pg_config --version
PostgreSQL 16.1

❯ git clone g...@github.com:theory/kv-pair.git
❯ cd kv-pair
❯ echo directory = 'blah' >> pair.control
❯ cat pair.control
# pair extension
comment = 'A key/value pair data type'
default_version = '0.1.2'
module_pathname = '$libdir/pair'
relocatable = true
directory = blah

❯ make install
cp sql/pair.sql sql/pair--0.1.2.sql
/opt/homebrew/bin/gmkdir -p '/Users/david/.pgenv/pgsql-16.1/share/extension'
/opt/homebrew/bin/gmkdir -p '/Users/david/.pgenv/pgsql-16.1/share/extension'
/opt/homebrew/bin/gmkdir -p '/Users/david/.pgenv/pgsql-16.1/share/doc/extension'
/opt/homebrew/bin/ginstall -c -m 644 .//pair.control 
'/Users/david/.pgenv/pgsql-16.1/share/extension/'
/opt/homebrew/bin/ginstall -c -m 644 .//sql/pair--unpackaged--0.1.2.sql  
'/Users/david/.pgenv/pgsql-16.1/share/extension/'
/opt/homebrew/bin/ginstall -c -m 644 .//doc/pair.md 
'/Users/david/.pgenv/pgsql-16.1/share/doc/extension/'

❯ find ~/.pgenv/pgsql-16.1/ -name '*pair*.*'
/Users/david/.pgenv/pgsql-16.1//include/server/lib/pairingheap.h
/Users/david/.pgenv/pgsql-16.1//share/extension/pair--unpackaged--0.1.2.sql
/Users/david/.pgenv/pgsql-16.1//share/extension/pair.control
/Users/david/.pgenv/pgsql-16.1//share/doc/extension/pair.md

I see two problems:

1. pair--0.1.2.sql is not installed. It does install without `directory` set.
2. pair--unpackaged--0.1.2.sql is installed, but not in a `blah` directory. 
Shouldn’t it be?

Am I missing something?

Best,

David





Re: make dist using git archive

2024-01-26 Thread Eli Schwartz
Hello, meson developer here.


On 1/23/24 4:30 AM, Peter Eisentraut wrote:
> On 22.01.24 21:04, Tristan Partin wrote:
>> I am not really following why we can't use the builtin Meson dist
>> command. The only difference from my testing is it doesn't use a
>> --prefix argument.
> 
> Here are some problems I have identified:
> 
> 1. meson dist internally runs gzip without the -n option.  That makes
> the tar.gz archive include a timestamp, which in turn makes it not
> reproducible.


Well, it uses python tarfile which uses python gzip support under the
hood, but yes, that is true, python tarfile doesn't expose this tunable.


> 2. Because gzip includes a platform indicator in the archive, the
> produced tar.gz archive is not reproducible across platforms.  (I don't
> know if gzip has an option to avoid that.  git archive uses an internal
> gzip implementation that handles this.)


This appears to be https://github.com/python/cpython/issues/112346


> 3. Meson does not support tar.bz2 archives.


Simple enough to add, but I'm a bit surprised as usually people seem to
want either gzip for portability or xz for efficient compression.


> 4. Meson uses git archive internally, but then unpacks and repacks the
> archive, which loses the ability to use git get-tar-commit-id.


What do you use this for? IMO a more robust way to track the commit used
is to use gitattributes export-subst to write a `.git_archival.txt` file
containing the commit sha1 and other info -- this can be read even after
the file is extracted, which means it can also be used to bake the ID
into the built binaries e.g. as part of --version output.


> 5. I have found that the tar archives created by meson and git archive
> include the files in different orders.  I suspect that the Python
> tarfile module introduces some either randomness or platform dependency.


Different orders is meaningless, the question is whether the order is
internally consistent. Python uses sorted() to guarantee a stable order,
which may be a different algorithm than the one git-archive uses to
guarantee a stable order. But the order should be stable and that is
what matters.


> 6. meson dist is also slower because of the additional work.


I'm amenable to skipping the extraction/recombination of subprojects and
running of dist scripts in the event that neither exist, as Tristan
offered to do, but...


> 7. meson dist produces .sha256sum files but we have called them .sha256.
>  (This is obviously trivial, but it is something that would need to be
> dealt with somehow nonetheless.)
> 
> Most or all of these issues are fixable, either upstream in Meson or by
> adjusting our own requirements.  But for now this route would have some
> significant disadvantages.


Overall I feel like much of this is about requiring dist tarballs to be
byte-identical to other dist tarballs, although reproducible builds is
mainly about artifacts, not sources, and for sources it doesn't
generally matter unless the sources are ephemeral and generated
on-demand (in which case it is indeed very important to produce the same
tarball each time). A tarball is usually generated once, signed, and
uploaded to release hosting. Meson already guarantees the contents are
strictly based on the built tag.


-- 
Eli Schwartz




Re: A performance issue with Memoize

2024-01-26 Thread Tom Lane
David Rowley  writes:
> On Sat, 27 Jan 2024 at 09:41, Tom Lane  wrote:
>> drongo and fairywren are consistently failing the test case added
>> by this commit.  I'm not quite sure why the behavior of Memoize
>> would be platform-specific when we're dealing with integers,
>> but ...

> Maybe snprintf(buf, "%.*f", 0, 5.0 / 2.0); results in "3" on those
> rather than "2"?
> Looking at the code in fmtfloat(), we fall back on the built-in snprintf.

Maybe ... I don't have a better theory.

> I can try changing the unique1 < 5 to unique1 < 4 to see if that's more stable.

Worth a try.

regards, tom lane




Re: A performance issue with Memoize

2024-01-26 Thread David Rowley
On Sat, 27 Jan 2024 at 09:41, Tom Lane  wrote:
>
> David Rowley  writes:
> > I've adjusted the comments to what you mentioned and also leaned out
> > the pretty expensive test case to something that'll run much faster
> > and pushed the result.
>
> drongo and fairywren are consistently failing the test case added
> by this commit.  I'm not quite sure why the behavior of Memoize
> would be platform-specific when we're dealing with integers,
> but ...

Maybe snprintf(buf, "%.*f", 0, 5.0 / 2.0); results in "3" on those
rather than "2"?

Looking at the code in fmtfloat(), we fall back on the built-in snprintf.
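
For reference, a minimal standalone check of that (just a sketch, untested on
those animals) would be:

    #include <stdio.h>

    int
    main(void)
    {
        char    buf[32];

        /* precision 0, value exactly halfway between 2 and 3 */
        snprintf(buf, sizeof(buf), "%.*f", 0, 5.0 / 2.0);
        printf("%s\n", buf);
        return 0;
    }

I'd expect it to print "2" where snprintf rounds halves to even and "3" where
it rounds halves away from zero.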

I can try changing the unique1 < 5 to unique1 < 4 to see if that's more stable.

David




Re: Functions to return random numbers in a given range

2024-01-26 Thread David Zhang

Thank you for the patch.

I applied this patch manually to the master branch, resolving a conflict 
in `numeric.h`. It successfully passed both `make check` and `make 
check-world`.



Best regards,

David





Re: A performance issue with Memoize

2024-01-26 Thread Tom Lane
David Rowley  writes:
> I've adjusted the comments to what you mentioned and also leaned out
> the pretty expensive test case to something that'll run much faster
> and pushed the result.

drongo and fairywren are consistently failing the test case added
by this commit.  I'm not quite sure why the behavior of Memoize
would be platform-specific when we're dealing with integers,
but ...

regards, tom lane




Re: proposal: psql: show current user in prompt

2024-01-26 Thread Pavel Stehule
On Fri, Jan 26, 2024 at 11:40, Jelte Fennema-Nio wrote:

> On Thu, 25 Jan 2024 at 21:43, Pavel Stehule 
> wrote:
> > 2. using GUC for all reported GUC looks not too readable. Maybe it
> should be used just for customized reporting, not for default
>
> I thought about this too, because indeed the default list is quite
> long. But I decided against it because it seemed strange that you
> cannot disable reporting for the options that are currently reporting
> by default. Especially since the current default essentially boils
> down to: "whatever the psql client needed"
>

I see the possibility of disabling reporting as potentially dangerous.  Without
reporting of the encoding you can break psql. So it should be limited to just a
few values whose behavior is known.


> > 3. Another issue of your proposal is less friendly enabling disabling
> reporting of specific GUC. Maintaining a list needs more work than just
> enabling and disabling one specific GUC.
> > I think this is the main disadvantage of your proposal. In my proposal I
> don't need to know the state of any GUC. Just I send PQlinkParameterStatus
> or PQunlinkParameterStatus. With your proposal, I need to read
> _pq_.report_parameters, parse it, and modify, and send it back. This
> doesn't look too practical.
>
> While I agree it's a little bit less friendly, I think you're
> overestimating the difficulty of using my proposed approach. Most
> importantly there's no need to parse the current GUC value. A client
> always knows what variables it wants to have reported. So anytime that
> changes the client can simply regenerate the full list of gucs that it
> wants to report and send that. So something similar to the following
> pseudo code (using += for string concatenation):
>

I disagree with this - I can imagine some proxies adding their own reported GUCs
that the client knows nothing about.


>
> char *report_parameters = "server_version,client_encoding"
> if (show_user_in_prompt)
>report_parameters += ",user"
> if (show_search_path_in_prompt)
>report_parameters += ",search_path"
> PQsetParameter("_pq_.report_parameters", report_parameters)
>

It is simple only when the default value of report_parameters is constant;
when it is not, it can be fragile


>
> > Personally I prefer usage of a more generic API than my
> PQlinkParameterStatus and PQunlinkParameterStatus. You talked about it with
> Robert If I remember.
> >
> > Can be nice if I can write just
> >
> > /* similar princip is used inside ncurses */
> > set_report_guc_message_no = PQgetMessageNo("set_report_guc");
> > /* the result can be processed by server and by all proxies on the line
> */
> >
> > if (set_report_guc_message_no == -1)
> >   fprintf(stderr, "feature is not supported");
> > result = PQsendMessage(set_report_guc_message, "user");
> > if (result == -1)
> >   fprintf(stderr, "some error ...");
> >
> > With some API like this it can be easier to do some small protocol
> enhancement. Maybe this is overengineering. Enhancing protocol is not too
> common, and with usage PQsendTypedCommand enhancing protocol is less work
> too.
>
> Honestly, I don't see a benefit to this over protocol extension
> parameters using the ParameterSet message. Afaict this is also
> essentially a key + value message type. And protocol extension
> parameters have the benefit that they are already an established thing
> in the protocol, and that they can easily be integrated with the
> current GUC system.
>

It was an idea.

Personally, I like the idea that I described in the mail to Peter: using a
dedicated connection-related GUC for the immutably reported GUCs, and a
dedicated protocol function for enabling and disabling reporting of the
other GUCs. It looks (to me) like the most robust solution.

Regards

Pavel


Re: psql: add \create_function command

2024-01-26 Thread Adam S
idea: what about custom functions for (each) IDE, which call psql -c
"CREATE FUNCTION ..." when the user saves the file?  (it would be easy to
prototype for emacs...)
(obviously, this isn't a core feature...)


On Fri, Jan 26, 2024 at 3:19 PM Pavel Stehule 
wrote:

>
>
> On Fri, Jan 26, 2024 at 21:17, Tom Lane wrote:
>
>> Pavel Stehule  writes:
>> > I don't know, maybe I have a problem with the described use case. I
>> cannot
>> > imagine holding the body and head of PL routines in different places
>> and I
>> > don't understand the necessity to join it.
>>
>> It seems a little weird to me too, and I would vote against accepting
>> \create_function as described because I think too few people would
>> want to use it.  However, the idea of an easy way to pull in a file
>> and convert it to a SQL literal seems like it has many applications.
>>
>
> +1
>
> Pavel
>
>
>> regards, tom lane
>>
>


Re: psql: add \create_function command

2024-01-26 Thread Pavel Stehule
On Fri, Jan 26, 2024 at 21:17, Tom Lane wrote:

> Pavel Stehule  writes:
> > I don't know, maybe I have a problem with the described use case. I
> cannot
> > imagine holding the body and head of PL routines in different places and
> I
> > don't understand the necessity to join it.
>
> It seems a little weird to me too, and I would vote against accepting
> \create_function as described because I think too few people would
> want to use it.  However, the idea of an easy way to pull in a file
> and convert it to a SQL literal seems like it has many applications.
>

+1

Pavel


> regards, tom lane
>


Re: psql: add \create_function command

2024-01-26 Thread Tom Lane
Pavel Stehule  writes:
> I don't know, maybe I have a problem with the described use case. I cannot
> imagine holding the body and head of PL routines in different places and I
> don't understand the necessity to join it.

It seems a little weird to me too, and I would vote against accepting
\create_function as described because I think too few people would
want to use it.  However, the idea of an easy way to pull in a file
and convert it to a SQL literal seems like it has many applications.

regards, tom lane




Re: psql: add \create_function command

2024-01-26 Thread Pavel Stehule
On Fri, Jan 26, 2024 at 21:04, Tom Lane wrote:

> Pavel Stehule  writes:
> > but why do you need to do it in psql? - you can prepare the content outside
> > and execute it just like echo "CREATE FUNCTION " | psql
>
> The bit that's probably hard if you're trying to do this in a shell
> script is "quote this data as a SQL string literal".  psql can get
> that right even in the face of encoding considerations,
> standard_conforming_strings, etc.  Not sure you can build a
> fully bulletproof solution outside.
>

I don't know, maybe I have a problem with the described use case. I cannot
imagine holding the body and head of PL routines in different places and I
don't understand the necessity to join it.

On the other hand, a few years ago (if I remember well) I proposed something
like `:{file}`. I don't remember the exact syntax. But it was not finished, and
then I wrote

https://github.com/okbob/pgimportdoc

The possibility of some simple import of external data would be nice




> regards, tom lane
>


Re: psql: add \create_function command

2024-01-26 Thread Tom Lane
Pavel Stehule  writes:
> but why do you need to do it in psql? - you can prepare the content outside
> and execute it just like echo "CREATE FUNCTION " | psql

The bit that's probably hard if you're trying to do this in a shell
script is "quote this data as a SQL string literal".  psql can get
that right even in the face of encoding considerations,
standard_conforming_strings, etc.  Not sure you can build a
fully bulletproof solution outside.

regards, tom lane




Re: Add minimal C example and SQL registration example for custom table access methods.

2024-01-26 Thread Fabrízio de Royes Mello
On Wed, Nov 15, 2023 at 8:29 PM Roberto Mello 
wrote:
>
> Suggestion:
>
> In the C example you added you mention in the comment:
>
> +  /* Methods from TableAmRoutine omitted from example, but all
> + non-optional ones must be provided here. */
>
> Perhaps you could provide a "see " to point the reader finding your
example where he could find these non-optional methods he must provide?
>
> Nitpicking a little: your patch appears to change more lines than it
does, because it added line breaks earlier in the lines. I would generally
avoid that unless there's good reason to do so.

Hey folks,

There is a previous patch [1] around the same topic. What about joining
efforts on pointing these documentation changes to the proposed test module?

[1] https://commitfest.postgresql.org/46/4588/

-- 
Fabrízio de Royes Mello


Re: Add minimal C example and SQL registration example for custom table access methods.

2024-01-26 Thread Tristen Raab
The following review has been posted through the commitfest application:
make installcheck-world:  tested, passed
Implements feature:   not tested
Spec compliant:   not tested
Documentation:        tested, passed

Hello,

I've reviewed your patch and it applies correctly and the documentation builds 
without any error. The built documentation also looks good with no formatting 
errors. It's always great to see more examples when reading through 
documentation so I think this patch is a good addition.

thanks,

---
Tristen Raab
Highgo Software Canada
www.highgo.ca

Re: psql: add \create_function command

2024-01-26 Thread Tom Lane
walt...@technowledgy.de writes:
> Pavel Stehule:
>> looks a little bit obscure - why do you need to do it from psql? And how 
>> frequently do you do it?

> I store all my SQL code in git and use "psql -e" to "bundle" it into an 
> extension, which is then deployed to production.

> The code is spread over many files, which include other files via \ir. 

That reminds me: if we do either \file_read or :{file}, we should
define relative paths as working like \ir, that is it's relative
to the current script's directory when we're reading from a script.
This is almost always the behavior you want, and the principal
functional problem with the `cat ...` solution is that it doesn't
work that way.
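
For concreteness (file names invented):

    -- /project/sql/install.sql, run as: psql -f /project/sql/install.sql
    -- the next include is resolved against /project/sql/, the script's directory:
    \ir functions/foo.sql
    -- whereas the backtick command runs in psql's current working directory:
    \set body `cat functions/foo.py`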

regards, tom lane




Re: psql: add \create_function command

2024-01-26 Thread Pavel Stehule
On Fri, Jan 26, 2024 at 20:45,  wrote:

> Pavel Stehule:
> > looks a little bit obscure - why do you need to do it from psql? And how
> > frequently do you do it?
>
> I store all my SQL code in git and use "psql -e" to "bundle" it into an
> extension, which is then deployed to production.
>

this is a good way


>
> The code is spread over many files, which include other files via \ir.
> Sometimes you need to include other types of files, though - for example
> code in other languages as Steve mentioned, but I have also had cases
> for yaml files, markdown templates, even binary assets which should
> still be considered "code" and not data.
>
> So anything in that direction would help.
>

but why do you need to do it in psql? - you can prepare the content outside and
execute it just like echo "CREATE FUNCTION " | psql


Re: psql: add \create_function command

2024-01-26 Thread David G. Johnston
On Fri, Jan 26, 2024 at 12:23 PM Tom Lane  wrote:

>
> \set fbody `cat source_file.txt`
> CREATE FUNCTION foo() RETURNS whatever AS :'fbody' LANGUAGE ...;
>
> and maybe we should say that that's sufficient.


I really don't have a problem with, and kind of prefer, using psql variables
this way, but I feel much more comfortable not having to invoke a shell.


>   It's a bit
> klugy though.  One level of improvement could be to get rid
> of the dependency on "cat" by inventing a backslash command
> to read a file into a variable:
>
> \file_read fbody source_file.txt
>

This I would use to reliably read external json text files into a psql
variable so that I could use jsonb_to_recordset(:var) on the contents.
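
For example, something along these lines (\file_read is of course only the
hypothetical command from upthread, and this assumes the file holds a JSON
array of objects with matching keys):

    \file_read events ./events.json
    SELECT *
    FROM jsonb_to_recordset(:'events'::jsonb) AS t(id int, payload text);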


> (\file_write to go the other way seems potentially useful too.)
>

The nearby discussions regarding trying to produce json into files would
support this claim.


> Or we could cut out the intermediate variable altogether
> by inventing something that works like :'...' but reads
> from a file not a variable.  That might be too specialized
> though, and I'm not sure about good syntax for it either.
> Maybe like
>
> CREATE FUNCTION foo() RETURNS whatever AS :{source_file.txt} LANGUAGE ...;
>
>
IMO, not enough improvement to be had over letting psql variables act as
the intermediary to justify the effort.

David J.


Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-01-26 Thread Bharath Rupireddy
On Thu, Jan 11, 2024 at 10:48 AM Bharath Rupireddy
 wrote:
>
> Hi,
>
> Replication slots in postgres will prevent removal of required
> resources when there is no connection using them (inactive). This
> consumes storage because neither required WAL nor required rows from
> the user tables/system catalogs can be removed by VACUUM as long as
> they are required by a replication slot. In extreme cases this could
> cause the transaction ID wraparound.
>
> Currently postgres has the ability to invalidate inactive replication
> slots based on the amount of WAL (set via max_slot_wal_keep_size GUC)
> that will be needed for the slots in case they become active. However,
> the wraparound issue isn't effectively covered by
> max_slot_wal_keep_size - one can't tell postgres to invalidate a
> replication slot if it is blocking VACUUM. Also, it is often tricky to
> choose a default value for max_slot_wal_keep_size, because the amount
> of WAL that gets generated and allocated storage for the database can
> vary.
>
> Therefore, it is often easy for developers to do the following:
> a) set an XID age (age of slot's xmin or catalog_xmin) of say 1 or 1.5
> billion, after which the slots get invalidated.
> b) set a timeout of say 1 or 2 or 3 days, after which the inactive
> slots get invalidated.
>
> To implement (a), postgres needs a new GUC called max_slot_xid_age.
> The checkpointer then invalidates all the slots whose xmin (the oldest
> transaction that this slot needs the database to retain) or
> catalog_xmin (the oldest transaction affecting the system catalogs
> that this slot needs the database to retain) has reached the age
> specified by this setting.
>
> To implement (b), first postgres needs to track the replication slot
> metrics like the time at which the slot became inactive (inactive_at
> timestamptz) and the total number of times the slot became inactive in
> its lifetime (inactive_count numeric) in ReplicationSlotPersistentData
> structure. And, then it needs a new timeout GUC called
> inactive_replication_slot_timeout. Whenever a slot becomes inactive,
> the current timestamp and inactive count are stored in
> ReplicationSlotPersistentData structure and persisted to disk. The
> checkpointer then invalidates all the slots that are lying inactive
> for about inactive_replication_slot_timeout duration starting from
> inactive_at.
>
> In addition to implementing (b), these two new metrics enable
> developers to improve their monitoring tools as the metrics are
> exposed via pg_replication_slots system view. For instance, one can
> build a monitoring tool that signals when replication slots are lying
> inactive for a day or so using inactive_at metric, and/or when a
> replication slot is becoming inactive too frequently using inactive_at
> metric.
>
> I’m attaching the v1 patch set as described below:
> 0001 - Tracks invalidation_reason in pg_replication_slots. This is
> needed because slots now have multiple reasons for slot invalidation.
> 0002 - Tracks inactive replication slot information inactive_at and
> inactive_timeout.
> 0003 - Adds inactive_timeout based replication slot invalidation.
> 0004 - Adds XID based replication slot invalidation.
>
> Thoughts?

Needed a rebase due to c393308b. Please find the attached v2 patch set.
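
For illustration, with the patch set applied a monitoring query could be as
simple as the following (inactive_at, inactive_count and invalidation_reason
are the columns proposed here, not something in any released server; the
threshold is just an example):

    SELECT slot_name, invalidation_reason, inactive_at, inactive_count
    FROM pg_replication_slots
    WHERE inactive_at < now() - interval '1 day';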

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
From 26c5b7762abc8fd92df0376ec68c81d7064891fb Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy 
Date: Fri, 26 Jan 2024 18:11:01 +
Subject: [PATCH v2] Track invalidation_reason in pg_replication_slots

Currently the reason for replication slot invalidation is not
tracked in pg_replication_slots. A recent commit 007693f2a added
conflict_reason to show the reasons for slot invalidation, but
only for logical slots. This commit renames conflict_reason to
invalidation_reason, and adds the support to show invalidation
reasons for both physical and logical slots.
---
 doc/src/sgml/system-views.sgml| 11 +++---
 src/backend/catalog/system_views.sql  |  2 +-
 src/backend/replication/slotfuncs.c   | 37 ---
 src/bin/pg_upgrade/info.c |  4 +-
 src/include/catalog/pg_proc.dat   |  2 +-
 .../t/035_standby_logical_decoding.pl | 32 
 src/test/regress/expected/rules.out   |  4 +-
 7 files changed, 44 insertions(+), 48 deletions(-)

diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index dd468b31ea..c61312793c 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -2525,13 +2525,14 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
 
  
   
-   conflict_reason text
+   invalidation_reason text
   
   
-   The reason for the logical slot's conflict with recovery. It is always
-   NULL for physical slots, as well as for logical slots which are not
-   invalidated. The 

Re: make dist using git archive

2024-01-26 Thread Tristan Partin

On Fri Jan 26, 2024 at 12:28 AM CST, Peter Eisentraut wrote:

On 25.01.24 17:25, Tristan Partin wrote:
> For what it's worth, I run Meson 1.3, and the behavior of generating the 
> tarballs even though it is a dirty tree still occurred. In the new patch 
> you seem to say it was fixed in 0.60.


The problem I'm referring to is that before 0.60, alias_target cannot 
depend on run_target (only "build target").  This is AFAICT not 
documented and might not have been an intentional change, but you can 
trace it in the meson source code, and it shows in the PostgreSQL CI. 
That's also why for the above bzip2 issue I have to use custom_target in 
place of your run_target.


https://github.com/mesonbuild/meson/pull/12783

Thanks for finding these issues.

--
Tristan Partin
Neon (https://neon.tech)




Re: psql: add \create_function command

2024-01-26 Thread walther

Pavel Stehule:
looks a little bit obscure - why do you need to do it from psql? And how 
frequently do you do it?


I store all my SQL code in git and use "psql -e" to "bundle" it into an 
extension, which is then deployed to production.


The code is spread over many files, which include other files via \ir. 
Sometimes you need to include other types of files, though - for example 
code in other languages as Steve mentioned, but I have also had cases 
for yaml files, markdown templates, even binary assets which should 
still be considered "code" and not data.


So anything in that direction would help.




Re: psql: add \create_function command

2024-01-26 Thread walther

Tom Lane:

Or we could cut out the intermediate variable altogether
by inventing something that works like :'...' but reads
from a file not a variable.  That might be too specialized
though, and I'm not sure about good syntax for it either.
Maybe like

CREATE FUNCTION foo() RETURNS whatever AS :{source_file.txt} LANGUAGE ...;


That would indeed be very useful! I would immediately use this in a lot 
of places.





Re: Improve WALRead() to suck data directly from WAL buffers when possible

2024-01-26 Thread Jeff Davis
On Fri, 2024-01-26 at 19:31 +0530, Bharath Rupireddy wrote:
> Are you suggesting to error out instead of returning 0?

We'd do neither of those things, because no caller should actually call
it while RecoveryInProgress() or on a different timeline.

>  How about
> returning a negative value instead of just 0 or returning true/false
> just like WALRead?

All of these things are functionally equivalent -- the same thing is
happening at the end. This is just a discussion about API style and how
that will interact with hypothetical callers that don't exist today.
And it can also be easily changed later, so we aren't stuck with
whatever decision happens here.

> 
> Imagine, implementing an extension (may be for fun or learning or
> educational or production purposes) to read unflushed WAL directly
> from WAL buffers using XLogReadFromBuffers as page_read callback with
> xlogreader facility.

That makes sense, I didn't realize you intended to use this from an
extension. I'm fine considering that as a separate patch that could
potentially be committed soon after this one.

I'd like some more details, but can I please just commit the basic
functionality now-ish?

> Tried to keep wal_writer quiet with wal_writer_delay=1ms and
> wal_writer_flush_after = 1GB to not to flush WAL in the background.
> Also, disabled autovacuum, and set checkpoint_timeout to a higher
> value. All of this is done to generate minimal WAL so that WAL
> buffers
> don't get overwritten. Do you see any problems with it?

Maybe check it against pg_current_wal_lsn(), and see if the Write
pointer moved ahead? Perhaps even have a (limited) loop that tries
again to catch it at the right time?
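
Something like this, before and after generating the test WAL (just a sketch
of the idea):

    SELECT pg_current_wal_lsn()        AS write_lsn,
           pg_current_wal_insert_lsn() AS insert_lsn;

If write_lsn has moved past the region the test expects to still be unflushed,
the run could be retried instead of failing.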

> Can the WAL summarizer ever read the WAL on current TLI? I'm not so
> sure about it, I haven't explored it in detail.

Let's just not call XLogReadFromBuffers from there.

Regards,
Jeff Davis





Re: psql: add \create_function command

2024-01-26 Thread Tom Lane
Pavel Stehule  writes:
> On Fri, Jan 26, 2024 at 19:41, Steve Chavez wrote:
>> To solve the above issue, this patch adds a psql command to create a
>> function and obtain its body from another file. It is used as:
>> \create_function from ./data/max.py max(int,int) returns int LANGUAGE
>> plpython3u

> looks a little bit obscure - why do you need to do it from psql? And how
> frequently do you do it?
> I think this is a fix in the wrong place - you should fix the linters, not psql
> - moreover, without the header you cannot do correct linting

It feels wrong to me too.  I'm not sure where is a better place to
implement something like this though.  We can't support it server-side
because of permissions issues, so if there's to be any merging of
files it has to happen on the client side.

It strikes me though that thinking about this in terms of CREATE
FUNCTION is thinking too small.  ISTM that the requirement of
"grab the content of a file, quote it as a string literal, and
embed it into a SQL command" exists elsewhere.  For one thing
there's CREATE PROCEDURE, but I've needed this occasionally
just as a way of feeding data into SELECT, INSERT, etc.

Now, you can do it today:

\set fbody `cat source_file.txt`
CREATE FUNCTION foo() RETURNS whatever AS :'fbody' LANGUAGE ...;

and maybe we should say that that's sufficient.  It's a bit
klugy though.  One level of improvement could be to get rid
of the dependency on "cat" by inventing a backslash command
to read a file into a variable:

\file_read fbody source_file.txt
CREATE FUNCTION foo() RETURNS whatever AS :'fbody' LANGUAGE ...;

(\file_write to go the other way seems potentially useful too.)

Or we could cut out the intermediate variable altogether
by inventing something that works like :'...' but reads
from a file not a variable.  That might be too specialized
though, and I'm not sure about good syntax for it either.
Maybe like

CREATE FUNCTION foo() RETURNS whatever AS :{source_file.txt} LANGUAGE ...;

regards, tom lane




Re: Patch: Improve Boolean Predicate JSON Path Docs

2024-01-26 Thread David E. Wheeler
On Jan 25, 2024, at 11:03, Tom Lane  wrote:

> I changed the preceding para to say "... check expressions are
> required in ...", which I thought was sufficient to cover that.
> Also, the tabular description of the operator tells you not to do it.

Yeah, that’s good. I was perhaps leaning into being over-explicit after it took 
me a while to even figure out that there was a difference, let alone where
it matters.

>> What do you think of also dropping the article from all the references to 
>> “the strict mode” or “the lax mode”, to make them “strict mode” and “lax 
>> mode”, respectively?
> 
> Certainly most of 'em don't need it.  I'll make it so.

Nice, thanks!

David





Re: psql: add \create_function command

2024-01-26 Thread Pavel Stehule
Hi

On Fri, Jan 26, 2024 at 19:41, Steve Chavez wrote:

> Hello hackers,
>
> Currently a function definition must include its body inline. Because of
> this, when storing function definitions in files, linters and syntax
> highlighters for non-SQL languages (python, perl, tcl, etc) won't work. An
> example can be seen on:
>
>
> https://github.com/postgres/postgres/blob/5eafacd2797dc0b04a0bde25fbf26bf79903e7c2/src/pl/plpython/sql/plpython_test.sql#L15-L24
>
> To solve the above issue, this patch adds a psql command to create a
> function and obtain its body from another file. It is used as:
>
> \create_function from ./data/max.py max(int,int) returns int LANGUAGE
> plpython3u
>
> Its design is similar to the `\copy` command, which is a frontend version
> of the COPY statement.
>
> This patch is at an initial stage but includes tests with plpython3u,
> pltcl, plperl and tab completion.
>
> Any feedback is welcomed.
>

looks a little bit obscure - why do you need to do it from psql? And how
frequently do you do it?

I think this is a fix in the wrong place - you should fix the linters, not psql
- moreover, without the header you cannot do correct linting

Regards

Pavel



>
> Best regards,
> Steve Chavez
>


psql: add \create_function command

2024-01-26 Thread Steve Chavez
Hello hackers,

Currently a function definition must include its body inline. Because of
this, when storing function definitions in files, linters and syntax
highlighters for non-SQL languages (python, perl, tcl, etc) won't work. An
example can be seen on:

https://github.com/postgres/postgres/blob/5eafacd2797dc0b04a0bde25fbf26bf79903e7c2/src/pl/plpython/sql/plpython_test.sql#L15-L24

To solve the above issue, this patch adds a psql command to create a
function and obtain its body from another file. It is used as:

\create_function from ./data/max.py max(int,int) returns int LANGUAGE
plpython3u

Its design is similar to the `\copy` command, which is a frontend version
of the COPY statement.

This patch is at an initial stage but includes tests with plpython3u,
pltcl, plperl and tab completion.

Any feedback is welcomed.

Best regards,
Steve Chavez
From 98f505ba81e8ef317d2a9b764348b523346d7f24 Mon Sep 17 00:00:00 2001
From: steve-chavez 
Date: Mon, 8 Jan 2024 18:57:26 -0500
Subject: [PATCH] psql: add \create_function command

Currently a function definition must include its body inline.
Because of this, when storing function definitions in files,
linters and syntax highlighters for non-SQL languages
(python, perl, tcl, etc) won't work.

This patch adds a psql command to create a function and obtain its body
from another file. It is used as:

\create_function from ./data/max.py max(int,int) returns int LANGUAGE plpython3u

Its design is similar to the `\copy` command, which is a frontend
version of the COPY statement.

Includes tests with plpython3u, pltcl, plperl and tab completion.
---
 src/bin/psql/Makefile |   1 +
 src/bin/psql/command.c|  26 
 src/bin/psql/create_function.c| 128 ++
 src/bin/psql/create_function.h|  15 ++
 src/bin/psql/meson.build  |   1 +
 src/bin/psql/tab-complete.c   |  17 ++-
 src/pl/plperl/data/max.pl |   2 +
 src/pl/plperl/expected/plperl.out |   7 +
 src/pl/plperl/sql/plperl.sql  |   4 +
 src/pl/plpython/data/max.py   |   3 +
 src/pl/plpython/expected/plpython_test.out|   7 +
 src/pl/plpython/sql/plpython_test.sql |   4 +
 src/pl/tcl/data/max.tcl   |   2 +
 src/pl/tcl/expected/pltcl_setup.out   |   7 +
 src/pl/tcl/sql/pltcl_setup.sql|   4 +
 src/test/regress/data/max.sql |   1 +
 .../regress/expected/create_function_sql.out  |  10 +-
 src/test/regress/sql/create_function_sql.sql  |   4 +
 18 files changed, 241 insertions(+), 2 deletions(-)
 create mode 100644 src/bin/psql/create_function.c
 create mode 100644 src/bin/psql/create_function.h
 create mode 100644 src/pl/plperl/data/max.pl
 create mode 100644 src/pl/plpython/data/max.py
 create mode 100644 src/pl/tcl/data/max.tcl
 create mode 100644 src/test/regress/data/max.sql

diff --git a/src/bin/psql/Makefile b/src/bin/psql/Makefile
index 374c4c3ab8..285291b8ab 100644
--- a/src/bin/psql/Makefile
+++ b/src/bin/psql/Makefile
@@ -29,6 +29,7 @@ OBJS = \
 	command.o \
 	common.o \
 	copy.o \
+	create_function.o \
 	crosstabview.o \
 	describe.o \
 	help.o \
diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c
index 5c906e4806..d2c5799ed0 100644
--- a/src/bin/psql/command.c
+++ b/src/bin/psql/command.c
@@ -30,6 +30,7 @@
 #include "common/logging.h"
 #include "common/string.h"
 #include "copy.h"
+#include "create_function.h"
 #include "crosstabview.h"
 #include "describe.h"
 #include "fe_utils/cancel.h"
@@ -71,6 +72,7 @@ static backslashResult exec_command_cd(PsqlScanState scan_state, bool active_bra
 static backslashResult exec_command_conninfo(PsqlScanState scan_state, bool active_branch);
 static backslashResult exec_command_copy(PsqlScanState scan_state, bool active_branch);
 static backslashResult exec_command_copyright(PsqlScanState scan_state, bool active_branch);
+static backslashResult exec_command_create_function(PsqlScanState scan_state, bool active_branch);
 static backslashResult exec_command_crosstabview(PsqlScanState scan_state, bool active_branch);
 static backslashResult exec_command_d(PsqlScanState scan_state, bool active_branch,
 	  const char *cmd);
@@ -323,6 +325,8 @@ exec_command(const char *cmd,
 		status = exec_command_copy(scan_state, active_branch);
 	else if (strcmp(cmd, "copyright") == 0)
 		status = exec_command_copyright(scan_state, active_branch);
+	else if (strcmp(cmd, "create_function") == 0)
+		status = exec_command_create_function(scan_state, active_branch);
 	else if (strcmp(cmd, "crosstabview") == 0)
 		status = exec_command_crosstabview(scan_state, active_branch);
 	else if (cmd[0] == 'd')
@@ -720,6 +724,28 @@ exec_command_copyright(PsqlScanState scan_state, bool active_branch)
 	return PSQL_CMD_SKIP_LINE;
 }
 
+/*
+ * \create_function -- create a function obtaining its body from a file
+ */
+static backslashResult

Re: cleanup patches for incremental backup

2024-01-26 Thread Robert Haas
On Fri, Jan 26, 2024 at 12:39 PM Nathan Bossart
 wrote:
> On Fri, Jan 26, 2024 at 11:04:37AM -0500, Robert Haas wrote:
> > Here is v2 with that addition.
>
> Looks reasonable.

Thanks for the report & review. I have committed that version.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Printing backtrace of postgres processes

2024-01-26 Thread Bharath Rupireddy
On Fri, Jan 26, 2024 at 4:11 PM Alvaro Herrera  wrote:
>

Thanks for reviewing.

> > +You can get the file name and line number from the logged details by 
> > using
> > +gdb/addr2line in linux platforms (users must ensure gdb/addr2line is
> > +already installed).
>
> This doesn't read great.  I mean, what is gdb/addr2line?  I think you
> mean either GDB or addr2line; this could be clearer.

Wrapped them in  tag and reworded the comment a bit.

> >   * internal backtrace support functions in the backtrace.  This requires 
> > that
> > - * this and related functions are not inlined.
> > + * this and related functions are not inlined. If the edata pointer is 
> > valid,
> > + * backtrace information will be set in edata.
> >   */
> > -static void
> > +void
> >  set_backtrace(ErrorData *edata, int num_skip)
> >  {
> >   StringInfoData errtrace;
>
> This seems like a terrible API choice, and the comment change is no
> good.  I suggest you need to create a function that deals only with a
> StringInfo, maybe
>   append_backtrace(StringInfo buf, int num_skip)
> which is used by set_backtrace to print the backtrace in
> edata->backtrace, and a new function log_backtrace() that does the
> ereport(LOG_SERVER_ONLY) thing.

You probably were looking at v21, the above change isn't there in
versions after that. Can you please review the latest version v26
attached here?

We might want this patch extended to the newly added walsummarizer
process, which I'll do in the next version.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
From c00bc969b7e55dff92cdd5b7fef355dc2bc98f8f Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy 
Date: Fri, 26 Jan 2024 15:51:02 +
Subject: [PATCH v26] Move sending multiplexed-SIGUSR1 signal code to a
 function

Add a new function hosting the common code for sending
multiplexed-SIGUSR1 signal to a backend process. This function
will also be used as-is by an upcoming commit reducing the code
duplication.
---
 src/backend/storage/ipc/procarray.c | 60 +
 src/backend/utils/adt/mcxtfuncs.c   | 49 ++-
 src/include/storage/procarray.h |  1 +
 3 files changed, 64 insertions(+), 46 deletions(-)

diff --git a/src/backend/storage/ipc/procarray.c b/src/backend/storage/ipc/procarray.c
index ee2d7f8585..f6465529e7 100644
--- a/src/backend/storage/ipc/procarray.c
+++ b/src/backend/storage/ipc/procarray.c
@@ -3097,6 +3097,66 @@ HaveVirtualXIDsDelayingChkpt(VirtualTransactionId *vxids, int nvxids, int type)
 	return result;
 }
 
+/*
+ * SendProcSignalBackendOrAuxproc -- check if the process with given pid is a
+ * backend or an auxiliary process and send it the SIGUSR1 signal for a given
+ * reason.
+ *
+ * Returns true if sending the signal was successful, false otherwise.
+ */
+bool
+SendProcSignalBackendOrAuxproc(int pid, ProcSignalReason reason)
+{
+	PGPROC	   *proc;
+	BackendId	backendId = InvalidBackendId;
+
+	proc = BackendPidGetProc(pid);
+
+	/*
+	 * See if the process with given pid is a backend or an auxiliary process.
+	 *
+	 * If the given process is a backend, use its backend id in
+	 * SendProcSignal() later to speed up the operation. Otherwise, don't do
+	 * that because auxiliary processes (except the startup process) don't
+	 * have a valid backend id.
+	 */
+	if (proc != NULL)
+		backendId = proc->backendId;
+	else
+		proc = AuxiliaryPidGetProc(pid);
+
+	/*
+	 * BackendPidGetProc() and AuxiliaryPidGetProc() return NULL if the pid
+	 * isn't valid; but by the time we reach kill(), a process for which we
+	 * get a valid proc here might have terminated on its own.  There's no way
+	 * to acquire a lock on an arbitrary process to prevent that. But since
+	 * this mechanism is usually used to debug a backend or an auxiliary
+	 * process running and consuming lots of memory or a long running process,
+	 * that it might end on its own first and its memory contexts are not
+	 * logged or backtrace not logged is not a problem.
+	 */
+	if (proc == NULL)
+	{
+		/*
+		 * This is just a warning so a loop-through-resultset will not abort
+		 * if one backend terminated on its own during the run.
+		 */
+		ereport(WARNING,
+(errmsg("PID %d is not a PostgreSQL server process", pid)));
+		return false;
+	}
+
+	if (SendProcSignal(pid, reason, backendId) < 0)
+	{
+		/* Again, just a warning to allow loops */
+		ereport(WARNING,
+(errmsg("could not send signal to process %d: %m", pid)));
+		return false;
+	}
+
+	return true;
+}
+
 /*
  * BackendPidGetProc -- get a backend's PGPROC given its PID
  *
diff --git a/src/backend/utils/adt/mcxtfuncs.c b/src/backend/utils/adt/mcxtfuncs.c
index 4708d73f5f..f7b4f8dac1 100644
--- a/src/backend/utils/adt/mcxtfuncs.c
+++ b/src/backend/utils/adt/mcxtfuncs.c
@@ -145,51 +145,8 @@ Datum
 pg_log_backend_memory_contexts(PG_FUNCTION_ARGS)
 {
 	int			pid = PG_GETARG_INT32(0);
-	PGPROC	   *proc;
-	BackendId	backendId = 

Re: Hide exposed impl detail of wchar.c

2024-01-26 Thread Tom Lane
Nathan Bossart  writes:
> I see that I was planning on back-patching this to v16, but since
> is_valid_ascii() was introduced in v15, I'm wondering if it'd be better to
> back-patch it there so that is_valid_ascii() lives in the same file for all
> versions where it exists.  Thoughts?

Yeah, if we're going to back-patch at all, that probably makes sense.

regards, tom lane




Re: [PATCH] Add sortsupport for range types and btree_gist

2024-01-26 Thread Bernd Helmle
On Friday, Jan 26, 2024 at 18:31 +0530, vignesh C wrote:
> CFBot shows that the patch does not apply anymore as in [1]:
> === Applying patches on top of PostgreSQL commit ID

I've started working on it and am planning to submit a polished patch for
the upcoming CF.






Re: SLRU optimization - configurable buffer pool and partitioning the SLRU lock

2024-01-26 Thread Andrey Borodin



> On 26 Jan 2024, at 22:38, Alvaro Herrera  wrote:
> 
> This is OK because in the
> default compilation each file only has 32 segments, so that requires
> only 32 lwlocks held at once while the file is being deleted.

Do we somehow account for different subsystems not accumulating
MAX_SIMUL_LWLOCKS together?
E.g. GiST during a split can combine 75 locks, and somehow commit_ts will be
deactivated by this backend at the same moment and add 32 more locks :)
I understand that this sounds fantastic - these subsystems do not interfere. But
it is fantastic only until something like that actually happens.
If possible, I'd prefer one lock at a time, and maybe sometimes two or three with
some guarantees that this is safe.
So, from my POV the first solution that you proposed seems much better to me.

Thanks for working on this!


Best regard, Andrey Borodin.



Re: Hide exposed impl detail of wchar.c

2024-01-26 Thread Nathan Bossart
On Thu, Jan 04, 2024 at 04:43:29PM -0600, Nathan Bossart wrote:
> On Mon, Nov 20, 2023 at 10:39:43PM -0600, Nathan Bossart wrote:
>> Alright.  The next minor release isn't until February, so I'll let this one
>> sit a little while longer in case anyone objects to back-patching something
>> like this [0].
>> 
>> [0] https://postgr.es/m/attachment/152305/move_is_valid_ascii_v2.patch
> 
> Barring objections, I plan to commit this and back-patch it to v16 in the
> next few days.

Apologies for the delay.  We're getting close to the February release, so I
should probably take care of this one soon...

I see that I was planning on back-patching this to v16, but since
is_valid_ascii() was introduced in v15, I'm wondering if it'd be better to
back-patch it there so that is_valid_ascii() lives in the same file for all
versions where it exists.  Thoughts?

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com




Re: Emit fewer vacuum records by reaping removable tuples during pruning

2024-01-26 Thread Peter Geoghegan
On Fri, Jan 26, 2024 at 11:44 AM Robert Haas  wrote:
> Thanks to you for the patches, and to Peter for participating in the
> discussion which, IMHO, was very helpful in clarifying things.

Glad I could help.

-- 
Peter Geoghegan




Re: cleanup patches for incremental backup

2024-01-26 Thread Nathan Bossart
On Fri, Jan 26, 2024 at 11:04:37AM -0500, Robert Haas wrote:
> Here is v2 with that addition.

Looks reasonable.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com




Re: SLRU optimization - configurable buffer pool and partitioning the SLRU lock

2024-01-26 Thread Alvaro Herrera
I've continued to review this and decided that I don't like the mess
this patch proposes in order to support pg_commit_ts's deletion of all
files.  (Yes, I know that I was the one that proposed this idea. It's
still a bad one).  I'd like to change that code by removing the limit
that we can only have 128 bank locks in a SLRU.  To recap, the reason we
did this is that commit_ts sometimes wants to delete all files while
running (DeactivateCommitTs), and for this it needs to acquire all bank
locks.  Since going above the MAX_SIMUL_LWLOCKS limit is disallowed, we
added the SLRU limit making multiple banks share lwlocks.

I propose two alternative solutions:

1. The easiest is to have DeactivateCommitTs continue to hold
CommitTsLock until the end, including while SlruScanDirectory does its
thing.  This sounds terrible, but considering that this code only runs
when the module is being deactivated, I don't think it's really all that
bad in practice.  I mean, if you deactivate the commit_ts module and
then try to read commit timestamp data, you deserve to wait for a few
seconds just as a punishment for your stupidity.  AFAICT the cases where
anything is done in the module mostly check without locking that
commitTsActive is set, so we're not slowing down any critical
operations.  Also, if you don't like to be delayed for a couple of
seconds, just don't deactivate the module.

2. If we want some refinement, the other idea is to change
SlruScanDirCbDeleteAll (the callback that SlruScanDirectory uses in this
case) so that it acquires the bank lock appropriate for all the slots
used by the file that's going to be deleted.  This is OK because in the
default compilation each file only has 32 segments, so that requires
only 32 lwlocks held at once while the file is being deleted.  I think
we don't need to bother with this, but it's an option if we see the
above as unworkable for whatever reason.


The only other user of SlruScanDirCbDeleteAll is async.c (the LISTEN/
NOTIFY code), and what that does is delete all the files at server
start, where nothing is running concurrently anyway.  So whatever we do
for commit_ts, it won't affect async.c.


So, if we do away with the SLRU_MAX_BANKLOCKS idea, then we're assured
one LWLock per bank (instead of sharing some lwlocks to multiple banks),
and that makes the code simpler to reason about.

-- 
Álvaro Herrera   48°01'N 7°57'E  —  https://www.EnterpriseDB.com/
"In fact, the basic problem with Perl 5's subroutines is that they're not
crufty enough, so the cruft leaks out into user-defined code instead, by
the Conservation of Cruft Principle."  (Larry Wall, Apocalypse 6)




Re: [EXTERNAL] Re: Add non-blocking version of PQcancel

2024-01-26 Thread Alvaro Herrera
On 2024-Jan-26, Jelte Fennema-Nio wrote:

> Okay I tried doing that. I think the end result is indeed quite nice,
> having all the cancellation related functions together in a file. But
> it did require making a bunch of static functions in fe-connect
> extern, and adding them to libpq-int.h. On one hand that seems fine to
> me, on the other maybe that indicates that this cancellation logic
> makes sense to be in the same file as the other connection functions
> (in a sense, connecting is all that a cancel request does).

Yeah, I see that point of view as well.  I like the end result; the
additional protos in libpq-int.h don't bother me.  Does anybody else
wants to share their opinion on it?  If none, then I'd consider going
ahead with this version.

-- 
Álvaro Herrera   48°01'N 7°57'E  —  https://www.EnterpriseDB.com/
"We’ve narrowed the problem down to the customer’s pants being in a situation
 of vigorous combustion" (Robert Haas, Postgres expert extraordinaire)




Re: Emit fewer vacuum records by reaping removable tuples during pruning

2024-01-26 Thread Melanie Plageman
On Fri, Jan 26, 2024 at 11:44 AM Robert Haas  wrote:
>
> On Thu, Jan 25, 2024 at 12:25 PM Robert Haas  wrote:
> > I think you're probably correct. I just didn't realize what was meant.
>
> I tweaked your v12 based on this discussion and committed the result.
>
> Thanks to you for the patches, and to Peter for participating in the
> discussion which, IMHO, was very helpful in clarifying things.

Thanks! I've marked the CF entry as committed.

- Melanie




Re: [EXTERNAL] Re: Add non-blocking version of PQcancel

2024-01-26 Thread Jelte Fennema-Nio
On Fri, 26 Jan 2024 at 13:11, Alvaro Herrera  wrote:
> I wonder, would it make sense to put all these new functions in a
> separate file fe-cancel.c?


Okay I tried doing that. I think the end result is indeed quite nice,
having all the cancellation related functions together in a file. But
it did require making a bunch of static functions in fe-connect
extern, and adding them to libpq-int.h. On one hand that seems fine to
me, on the other maybe that indicates that this cancellation logic
makes sense to be in the same file as the other connection functions
(in a sense, connecting is all that a cancel request does).


v26-0005-Start-using-new-libpq-cancel-APIs.patch
Description: Binary data


v26-0002-libpq-Add-pq_release_conn_hosts-function.patch
Description: Binary data


v26-0003-libpq-Change-some-static-functions-to-extern.patch
Description: Binary data


v26-0001-libpq-Move-cancellation-related-functions-to-fe-.patch
Description: Binary data


v26-0004-Add-non-blocking-version-of-PQcancel.patch
Description: Binary data


Re: Supporting MERGE on updatable views

2024-01-26 Thread Alvaro Herrera
Thanks for working on this.  The patch looks well finished.  I didn't
try to run it, though.  I gave it a read and found nothing to complain
about, just these two pretty minor comments:

On 2024-Jan-26, Dean Rasheed wrote:

> diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
> new file mode 100644
> index 13a9b7d..5a99e4a
> --- a/src/backend/executor/execMain.c
> +++ b/src/backend/executor/execMain.c
> @@ -1021,11 +1021,13 @@ InitPlan(QueryDesc *queryDesc, int eflag
>   * CheckValidRowMarkRel.
>   */
>  void
> -CheckValidResultRel(ResultRelInfo *resultRelInfo, CmdType operation)
> +CheckValidResultRel(ResultRelInfo *resultRelInfo, CmdType operation,
> + List *mergeActions)
>  {

I suggest to add commentary explaining the new argument of this function.

> @@ -1080,6 +1083,51 @@ CheckValidResultRel(ResultRelInfo *resul
>   
> RelationGetRelationName(resultRel)),
>errhint("To 
> enable deleting from the view, provide an INSTEAD OF DELETE trigger or an 
> unconditional ON DELETE DO INSTEAD rule.")));
>   break;
> + case CMD_MERGE:
> +
> + /*
> +  * Must have a suitable INSTEAD OF 
> trigger for each MERGE
> +  * action.  Note that the error hints 
> here differ from
> +  * above, since MERGE doesn't support 
> rules.
> +  */
> + foreach(lc, mergeActions)
> + {
> + MergeAction *action = 
> (MergeAction *) lfirst(lc);
> +
> + switch (action->commandType)
> + {
> + case CMD_INSERT:
> + if (!trigDesc 
> || !trigDesc->trig_insert_instead_row)
> + 
> ereport(ERROR,
> + 
> (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
> + 
>  errmsg("cannot insert into view \"%s\"",
> + 
> RelationGetRelationName(resultRel)),
> + 
>  errhint("To enable inserting into the view using MERGE, provide an 
> INSTEAD OF INSERT trigger.")));

Looking at the comments in ereport_view_not_updatable and the code
coverage reports, it appears that these checks here are dead code?  If
so, maybe it would be better to turn them into elog() calls?  I don't
mean to touch the existing code, just that I don't see that it makes
sense to introduce the new ones as ereport().

-- 
Álvaro Herrera   48°01'N 7°57'E  —  https://www.EnterpriseDB.com/
"La persona que no quería pecar / estaba obligada a sentarse
 en duras y empinadas sillas/ desprovistas, por cierto
 de blandos atenuantes"  (Patricio Vogel)




Re: pg_upgrade failing for 200+ million Large Objects

2024-01-26 Thread Tom Lane
vignesh C  writes:
> CFBot shows that the patch does not apply anymore as in [1]:
> === Applying patches on top of PostgreSQL commit ID
> 46a0cd4cefb4d9b462d8cc4df5e7ecdd190bea92 ===
> === applying patch ./v9-005-parallel_pg_restore.patch
> patching file src/bin/pg_upgrade/pg_upgrade.c
> Hunk #3 FAILED at 650.
> 1 out of 3 hunks FAILED -- saving rejects to file
> src/bin/pg_upgrade/pg_upgrade.c.rej

That's because v9-005 was posted by itself.  But I don't think
we should use it anyway.

Here's 0001-0004 again, updated to current HEAD (only line numbers
changed) and with Nathan's suggestion to define some macros for
the magic constants.

regards, tom lane

From c48b11547c6eb95ab217dddc047da5378042452c Mon Sep 17 00:00:00 2001
From: Tom Lane 
Date: Fri, 26 Jan 2024 11:10:00 -0500
Subject: [PATCH v10 1/4] Some small preliminaries for pg_dump changes.

Centralize management of the lo_buf used to hold data while restoring
blobs.  The code previously had each format handler create lo_buf,
which seems rather pointless given that the format handlers all make
it the same way.  Moreover, the format handlers never use lo_buf
directly, making this setup a failure from a separation-of-concerns
standpoint.  Let's move the responsibility into pg_backup_archiver.c,
which is the only module concerned with lo_buf.  The main reason to do
this now is that it allows a centralized fix for the soon-to-be-false
assumption that we never restore blobs in parallel.

Also, get rid of dead code in DropLOIfExists: it's been a long time
since we had any need to be able to restore to a pre-9.0 server.
---
 src/bin/pg_dump/pg_backup_archiver.c  |  9 +
 src/bin/pg_dump/pg_backup_custom.c|  7 ---
 src/bin/pg_dump/pg_backup_db.c| 27 +--
 src/bin/pg_dump/pg_backup_directory.c |  6 --
 src/bin/pg_dump/pg_backup_null.c  |  4 
 src/bin/pg_dump/pg_backup_tar.c   |  4 
 6 files changed, 14 insertions(+), 43 deletions(-)

diff --git a/src/bin/pg_dump/pg_backup_archiver.c b/src/bin/pg_dump/pg_backup_archiver.c
index 256d1e35a4..26c2c684c8 100644
--- a/src/bin/pg_dump/pg_backup_archiver.c
+++ b/src/bin/pg_dump/pg_backup_archiver.c
@@ -1343,6 +1343,12 @@ StartRestoreLO(ArchiveHandle *AH, Oid oid, bool drop)
 	AH->loCount++;
 
 	/* Initialize the LO Buffer */
+	if (AH->lo_buf == NULL)
+	{
+		/* First time through (in this process) so allocate the buffer */
+		AH->lo_buf_size = LOBBUFSIZE;
+		AH->lo_buf = (void *) pg_malloc(LOBBUFSIZE);
+	}
 	AH->lo_buf_used = 0;
 
 	pg_log_info("restoring large object with OID %u", oid);
@@ -4748,6 +4754,9 @@ CloneArchive(ArchiveHandle *AH)
 	/* clone has its own error count, too */
 	clone->public.n_errors = 0;
 
+	/* clones should not share lo_buf */
+	clone->lo_buf = NULL;
+
 	/*
 	 * Connect our new clone object to the database, using the same connection
 	 * parameters used for the original connection.
diff --git a/src/bin/pg_dump/pg_backup_custom.c b/src/bin/pg_dump/pg_backup_custom.c
index b576b29924..7c6ac89dd4 100644
--- a/src/bin/pg_dump/pg_backup_custom.c
+++ b/src/bin/pg_dump/pg_backup_custom.c
@@ -140,10 +140,6 @@ InitArchiveFmt_Custom(ArchiveHandle *AH)
 	ctx = (lclContext *) pg_malloc0(sizeof(lclContext));
 	AH->formatData = (void *) ctx;
 
-	/* Initialize LO buffering */
-	AH->lo_buf_size = LOBBUFSIZE;
-	AH->lo_buf = (void *) pg_malloc(LOBBUFSIZE);
-
 	/*
 	 * Now open the file
 	 */
@@ -902,9 +898,6 @@ _Clone(ArchiveHandle *AH)
 	 * share knowledge about where the data blocks are across threads.
 	 * _PrintTocData has to be careful about the order of operations on that
 	 * state, though.
-	 *
-	 * Note: we do not make a local lo_buf because we expect at most one BLOBS
-	 * entry per archive, so no parallelism is possible.
 	 */
 }
 
diff --git a/src/bin/pg_dump/pg_backup_db.c b/src/bin/pg_dump/pg_backup_db.c
index f766b65059..b297ca049d 100644
--- a/src/bin/pg_dump/pg_backup_db.c
+++ b/src/bin/pg_dump/pg_backup_db.c
@@ -544,26 +544,9 @@ CommitTransaction(Archive *AHX)
 void
 DropLOIfExists(ArchiveHandle *AH, Oid oid)
 {
-	/*
-	 * If we are not restoring to a direct database connection, we have to
-	 * guess about how to detect whether the LO exists.  Assume new-style.
-	 */
-	if (AH->connection == NULL ||
-		PQserverVersion(AH->connection) >= 9)
-	{
-		ahprintf(AH,
- "SELECT pg_catalog.lo_unlink(oid) "
- "FROM pg_catalog.pg_largeobject_metadata "
- "WHERE oid = '%u';\n",
- oid);
-	}
-	else
-	{
-		/* Restoring to pre-9.0 server, so do it the old way */
-		ahprintf(AH,
- "SELECT CASE WHEN EXISTS("
- "SELECT 1 FROM pg_catalog.pg_largeobject WHERE loid = '%u'"
- ") THEN pg_catalog.lo_unlink('%u') END;\n",
- oid, oid);
-	}
+	ahprintf(AH,
+			 "SELECT pg_catalog.lo_unlink(oid) "
+			 "FROM pg_catalog.pg_largeobject_metadata "
+			 "WHERE oid = '%u';\n",
+			 oid);
 }
diff --git a/src/bin/pg_dump/pg_backup_directory.c b/src/bin/pg_dump/pg_backup_directory.c
index 

Re: Emit fewer vacuum records by reaping removable tuples during pruning

2024-01-26 Thread Robert Haas
On Thu, Jan 25, 2024 at 12:25 PM Robert Haas  wrote:
> I think you're probably correct. I just didn't realize what was meant.

I tweaked your v12 based on this discussion and committed the result.

Thanks to you for the patches, and to Peter for participating in the
discussion which, IMHO, was very helpful in clarifying things.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Add SPLIT PARTITION/MERGE PARTITIONS commands

2024-01-26 Thread Alvaro Herrera
On 2024-Jan-26, Alvaro Herrera wrote:

> On 2024-Jan-26, vignesh C wrote:
> 
> > Please post an updated version for the same.
> 
> Here's a rebase.  I only fixed the conflicts, didn't review.

Hmm, but I got the attached regression.diffs with it.  I didn't
investigate further, but it looks like the recent changes to replication
identity for partitioned tables have broken the regression tests.

-- 
Álvaro Herrera   48°01'N 7°57'E  —  https://www.EnterpriseDB.com/
"This is what I like so much about PostgreSQL.  Most of the surprises
are of the "oh wow!  That's cool" Not the "oh shit!" kind.  :)"
Scott Marlowe, http://archives.postgresql.org/pgsql-admin/2008-10/msg00152.php
diff -U3 /pgsql/source/master/src/test/regress/expected/partition_split.out 
/home/alvherre/Code/pgsql-build/master/src/test/regress/results/partition_split.out
--- /pgsql/source/master/src/test/regress/expected/partition_split.out  
2024-01-26 14:57:39.549730792 +0100
+++ 
/home/alvherre/Code/pgsql-build/master/src/test/regress/results/partition_split.out
 2024-01-26 15:22:15.007059433 +0100
@@ -777,8 +777,12 @@
 -- Create new partition with identity-column:
 CREATE TABLE salesmans2_5(salesman_id INT GENERATED ALWAYS AS IDENTITY PRIMARY 
KEY, salesman_name VARCHAR(30));
 ALTER TABLE salesmans ATTACH PARTITION salesmans2_5 FOR VALUES FROM (2) TO (5);
+ERROR:  table "salesmans2_5" being attached contains an identity column 
"salesman_id"
+DETAIL:  The new partition may not contain an identity column.
 INSERT INTO salesmans (salesman_name) VALUES ('Poirot');
 INSERT INTO salesmans (salesman_name) VALUES ('Ivanov');
+ERROR:  no partition of relation "salesmans" found for row
+DETAIL:  Partition key of the failing row contains (salesman_id) = (2).
 SELECT attname, attidentity, attgenerated FROM pg_attribute WHERE attnum > 0 
AND attrelid = 'salesmans'::regclass::oid;
 attname| attidentity | attgenerated 
 ---+-+--
@@ -789,7 +793,7 @@
 SELECT attname, attidentity, attgenerated FROM pg_attribute WHERE attnum > 0 
AND attrelid = 'salesmans1_2'::regclass::oid;
 attname| attidentity | attgenerated 
 ---+-+--
- salesman_id   | | 
+ salesman_id   | a   | 
  salesman_name | | 
 (2 rows)
 
@@ -805,8 +809,13 @@
   (PARTITION salesmans2_3 FOR VALUES FROM (2) TO (3),
PARTITION salesmans3_4 FOR VALUES FROM (3) TO (4),
PARTITION salesmans4_5 FOR VALUES FROM (4) TO (5));
+ERROR:  partition bound for relation "salesmans2_5" is null
 INSERT INTO salesmans (salesman_name) VALUES ('May');
+ERROR:  no partition of relation "salesmans" found for row
+DETAIL:  Partition key of the failing row contains (salesman_id) = (3).
 INSERT INTO salesmans (salesman_name) VALUES ('Ford');
+ERROR:  no partition of relation "salesmans" found for row
+DETAIL:  Partition key of the failing row contains (salesman_id) = (4).
 SELECT * FROM salesmans1_2;
  salesman_id | salesman_name 
 -+---
@@ -814,23 +823,17 @@
 (1 row)
 
 SELECT * FROM salesmans2_3;
- salesman_id | salesman_name 
--+---
-   2 | Ivanov
-(1 row)
-
+ERROR:  relation "salesmans2_3" does not exist
+LINE 1: SELECT * FROM salesmans2_3;
+  ^
 SELECT * FROM salesmans3_4;
- salesman_id | salesman_name 
--+---
-   3 | May
-(1 row)
-
+ERROR:  relation "salesmans3_4" does not exist
+LINE 1: SELECT * FROM salesmans3_4;
+  ^
 SELECT * FROM salesmans4_5;
- salesman_id | salesman_name 
--+---
-   4 | Ford
-(1 row)
-
+ERROR:  relation "salesmans4_5" does not exist
+LINE 1: SELECT * FROM salesmans4_5;
+  ^
 SELECT attname, attidentity, attgenerated FROM pg_attribute WHERE attnum > 0 
AND attrelid = 'salesmans'::regclass::oid;
 attname| attidentity | attgenerated 
 ---+-+--
@@ -841,32 +844,23 @@
 SELECT attname, attidentity, attgenerated FROM pg_attribute WHERE attnum > 0 
AND attrelid = 'salesmans1_2'::regclass::oid;
 attname| attidentity | attgenerated 
 ---+-+--
- salesman_id   | | 
+ salesman_id   | a   | 
  salesman_name | | 
 (2 rows)
 
 -- New partitions have identity-columns:
 SELECT attname, attidentity, attgenerated FROM pg_attribute WHERE attnum > 0 
AND attrelid = 'salesmans2_3'::regclass::oid;
-attname| attidentity | attgenerated 
+-+--
- salesman_id   | a   | 
- salesman_name | | 
-(2 rows)
-
+ERROR:  relation "salesmans2_3" does not exist
+LINE 1: ...FROM pg_attribute WHERE attnum > 0 AND attrelid = 'salesmans...
+ ^
 SELECT attname, attidentity, attgenerated FROM pg_attribute WHERE attnum > 0 
AND attrelid = 'salesmans3_4'::regclass::oid;
-attname| 

Re: cleanup patches for incremental backup

2024-01-26 Thread Robert Haas
On Thu, Jan 25, 2024 at 11:08 AM Nathan Bossart
 wrote:
> On Thu, Jan 25, 2024 at 10:06:41AM -0500, Robert Haas wrote:
> > On Wed, Jan 24, 2024 at 2:39 PM Nathan Bossart  
> > wrote:
> >> That seems like a reasonable starting point.  Even if it doesn't help
> >> determine the root cause, it should at least help rule out concurrent
> >> summary removal.
> >
> > Here is a patch for that.
>
> LGTM.  The only thing I might add is the cutoff_time in that LOG.

Here is v2 with that addition.

-- 
Robert Haas
EDB: http://www.enterprisedb.com


v2-0001-Temporary-patch-to-help-debug-pg_walsummary-test-.patch
Description: Binary data


Re: MERGE ... WHEN NOT MATCHED BY SOURCE

2024-01-26 Thread Alvaro Herrera
On 2024-Jan-26, Dean Rasheed wrote:

> I think it has had a decent amount of review and all the review
> comments have been addressed. I'm not quite sure from Alvaro's last
> comment whether he was implying that he thought it was ready for
> commit.

Well, firstly this is clearly a feature we want to have, even though
it's non-standard, because people use it and other implementations have
it.  (Eh, so maybe somebody should be talking to the SQL standard
committee about it).  As for code quality, I didn't do a comprehensive
review, but I think it is quite reasonable.  Therefore, my inclination
would be to get it committed soonish, and celebrate it widely so that
people can test it soon and complain if they see something they don't
like.

I have to say that I find the idea of booting patches as Returned with
Feedback just because of inactivity (as opposed to unresponsive authors)
rather wrong-headed, and I wish we didn't do it.

-- 
Álvaro Herrera PostgreSQL Developer  —  https://www.EnterpriseDB.com/




Re: MERGE ... WHEN NOT MATCHED BY SOURCE

2024-01-26 Thread Dean Rasheed
On Fri, 26 Jan 2024 at 14:59, vignesh C  wrote:
>
> CFBot shows that the patch does not apply anymore as in [1]:
>

Rebased version attached.

Regards,
Dean
diff --git a/doc/src/sgml/mvcc.sgml b/doc/src/sgml/mvcc.sgml
new file mode 100644
index f8f83d4..6ef0c2b
--- a/doc/src/sgml/mvcc.sgml
+++ b/doc/src/sgml/mvcc.sgml
@@ -396,8 +396,8 @@
 originally matched appears later in the list of actions.
 On the other hand, if the row is concurrently updated or deleted so
 that the join condition fails, then MERGE will
-evaluate the condition's NOT MATCHED actions next,
-and execute the first one that succeeds.
+evaluate the condition's NOT MATCHED [BY TARGET]
+actions next, and execute the first one that succeeds.
 If MERGE attempts an INSERT
 and a unique index is present and a duplicate row is concurrently
 inserted, then a uniqueness violation error is raised;
diff --git a/doc/src/sgml/ref/merge.sgml b/doc/src/sgml/ref/merge.sgml
new file mode 100644
index 655f7dc..f421716
--- a/doc/src/sgml/ref/merge.sgml
+++ b/doc/src/sgml/ref/merge.sgml
@@ -33,7 +33,8 @@ USING dat
 and when_clause is:
 
 { WHEN MATCHED [ AND condition ] THEN { merge_update | merge_delete | DO NOTHING } |
-  WHEN NOT MATCHED [ AND condition ] THEN { merge_insert | DO NOTHING } }
+  WHEN NOT MATCHED BY SOURCE [ AND condition ] THEN { merge_update | merge_delete | DO NOTHING } |
+  WHEN NOT MATCHED [BY TARGET] [ AND condition ] THEN { merge_insert | DO NOTHING } }
 
 and merge_insert is:
 
@@ -70,7 +71,9 @@ DELETE
from data_source to
target_table_name
producing zero or more candidate change rows.  For each candidate change
-   row, the status of MATCHED or NOT MATCHED
+   row, the status of MATCHED,
+   NOT MATCHED BY SOURCE,
+   or NOT MATCHED [BY TARGET]
is set just once, after which WHEN clauses are evaluated
in the order specified.  For each candidate change row, the first clause to
evaluate as true is executed.  No more than one WHEN
@@ -226,16 +229,37 @@ DELETE
   At least one WHEN clause is required.
  
  
+  The WHEN clause may specify WHEN MATCHED,
+  WHEN NOT MATCHED BY SOURCE, or
+  WHEN NOT MATCHED [BY TARGET].
+  Note that the SQL standard only defines
+  WHEN MATCHED and WHEN NOT MATCHED
+  (which is defined to mean no matching target row).
+  WHEN NOT MATCHED BY SOURCE is an extension to the
+  SQL standard, as is the option to append
+  BY TARGET to WHEN NOT MATCHED, to
+  make its meaning more explicit.
+ 
+ 
   If the WHEN clause specifies WHEN MATCHED
-  and the candidate change row matches a row in the
+  and the candidate change row matches a row in the source to a row in the
   target_table_name,
   the WHEN clause is executed if the
   condition is
   absent or it evaluates to true.
  
  
-  Conversely, if the WHEN clause specifies
-  WHEN NOT MATCHED
+  If the WHEN clause specifies
+  WHEN NOT MATCHED BY SOURCE and the candidate change
+  row represents a row in the
+  target_table_name that does
+  not match a source row, the WHEN clause is executed
+  if the condition is
+  absent or it evaluates to true.
+ 
+ 
+  If the WHEN clause specifies
+  WHEN NOT MATCHED [BY TARGET]
   and the candidate change row does not match a row in the
   target_table_name,
   the WHEN clause is executed if the
@@ -257,7 +281,10 @@ DELETE
  
   A condition on a WHEN MATCHED clause can refer to columns
   in both the source and the target relations. A condition on a
-  WHEN NOT MATCHED clause can only refer to columns from
+  WHEN NOT MATCHED BY SOURCE clause can only refer to
+  columns from the target relation, since by definition there is no matching
+  source row. A condition on a WHEN NOT MATCHED [BY TARGET]
+  clause can only refer to columns from
   the source relation, since by definition there is no matching target row.
   Only the system attributes from the target table are accessible.
  
@@ -382,8 +409,10 @@ DELETE
   WHEN MATCHED clause, the expression can use values
   from the original row in the target table, and values from the
   data_source row.
-  If used in a WHEN NOT MATCHED clause, the
-  expression can use values from the data_source.
+  If used in a WHEN NOT MATCHED BY SOURCE clause, the
+  expression can only use values from the original row in the target table.
+  If used in a WHEN NOT MATCHED [BY TARGET] clause, the
+  expression can only use values from the data_source.
  
 

@@ -452,8 +481,9 @@ MERGE tot

 
  
-  Evaluate whether each row is MATCHED or
-  NOT MATCHED.
+  Evaluate whether each row is MATCHED,
+  NOT MATCHED BY SOURCE, or
+  NOT MATCHED [BY TARGET].
  
 
 
@@ -528,7 +558,8 @@ MERGE tot
   
If a 

Re: MERGE ... WHEN NOT MATCHED BY SOURCE

2024-01-26 Thread Dean Rasheed
On Mon, 22 Jan 2024 at 02:10, Peter Smith  wrote:
>
> Hi, this patch was marked in CF as "Needs Review" [1], but there has
> been no activity on this thread for 6+ months.
>
> Is anything else planned? Can you post something to elicit more
> interest in the latest patch? Otherwise, if nothing happens then the
> CF entry will be closed ("Returned with feedback") at the end of this
> CF.
>

I think it has had a decent amount of review and all the review
comments have been addressed. I'm not quite sure from Alvaro's last
comment whether he was implying that he thought it was ready for
commit.

Looking back through the thread, the general sentiment seems to be in
favour of adding this feature, and I still think it's worth doing, but
I haven't managed to find much time to progress it recently.
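
For anyone catching up on the thread, the proposed clause set boils down to
something like the sketch below (table and column names are invented; only
the BY SOURCE / BY TARGET spellings come from the patch):

    MERGE INTO inventory AS t
    USING stock_feed AS s ON t.item_id = s.item_id
    WHEN MATCHED THEN
        UPDATE SET qty = s.qty     -- row exists on both sides
    WHEN NOT MATCHED BY SOURCE THEN
        DELETE                     -- target row with no matching source row
    WHEN NOT MATCHED BY TARGET THEN
        INSERT (item_id, qty) VALUES (s.item_id, s.qty);  -- source row only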

Regards,
Dean




Re: Add new error_action COPY ON_ERROR "log"

2024-01-26 Thread David G. Johnston
On Thu, Jan 25, 2024 at 9:42 AM torikoshia 
wrote:

> Hi,
>
> As described in 9e2d870119, COPY ON_ERROR is expected to have more
> "error_action".
> (Note that option name was changed by b725b7eec)
>
> I'd like to have a new option "log", which skips soft errors and logs
> information that should have resulted in errors to PostgreSQL log.
>
>
Seems like an easy win but largely unhelpful in the typical case.  I
suppose ETL routines using this feature may be running on their machine
under root or "postgres", but in a system where they are not, this very
useful information is inaccessible to them.  I suppose the DBA could set up
an extractor to send these specific log lines elsewhere, but that seems like
enough hassle to disfavor this approach and favor one that can place the
soft error data and feedback into user-specified tables in the same
database.  Setting up temporary or unlogged tables is probably going to be
a more acceptable methodology than trying to get at the log files.

David J.


Re: POC: GROUP BY optimization

2024-01-26 Thread Robert Haas
On Fri, Jan 26, 2024 at 10:38 AM Tom Lane  wrote:
> Sadly, that's not a small task:
>
> * We'd need to put effort into assigning more realistic procost
> values --- preferably across the board, not just comparison functions.
> As long as all the comparison functions have procost 1.0, you're
> just flying blind.
>
> * As you mentioned, there'd need to be some accounting for the
> likely size of varlena inputs, and especially whether they might
> be toasted.
>
> * cost_sort knows nothing of the low-level sort algorithm improvements
> we've made in recent years, such as abbreviated keys.
>
> That's a lot of work, and I think it has to be done before we try
> to build infrastructure on top, not afterwards.

OK, that makes sense to me. Thanks.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: POC: GROUP BY optimization

2024-01-26 Thread Tom Lane
Robert Haas  writes:
> On Tue, Dec 26, 2023 at 10:23 PM Tom Lane  wrote:
>> I think it's a fool's errand to even try to separate different sort
>> column orderings by cost.  We simply do not have sufficiently accurate
>> cost information.  The previous patch in this thread got reverted because
>> of that (well, also some implementation issues, but mostly that), and
>> nothing has happened to make me think that another try will fare any
>> better.

> I'm late to the party, but I'd like to better understand what's being
> argued here.

What I am saying is that we don't have sufficiently accurate cost
information to support the sort of logic that got committed and
reverted before.  I did not mean to imply that it's not possible
to have such info, only that it is not present today.  IOW, what
I'm saying is that if you want to write code that tries to make
a cost-based preference of one sorting over another, you *first*
need to put in a bunch of legwork to create more accurate cost
numbers.  Trying to make such logic depend on the numbers we have
today is just going to result in garbage in, garbage out.

Sadly, that's not a small task:

* We'd need to put effort into assigning more realistic procost
values --- preferably across the board, not just comparison functions.
As long as all the comparison functions have procost 1.0, you're
just flying blind.

* As you mentioned, there'd need to be some accounting for the
likely size of varlena inputs, and especially whether they might
be toasted.

* cost_sort knows nothing of the low-level sort algorithm improvements
we've made in recent years, such as abbreviated keys.

That's a lot of work, and I think it has to be done before we try
to build infrastructure on top, not afterwards.

regards, tom lane




Re: Add new COPY option REJECT_LIMIT

2024-01-26 Thread David G. Johnston
On Fri, Jan 26, 2024 at 2:49 AM torikoshia 
wrote:

> Hi,
>
> 9e2d870 enabled the COPY command to skip soft errors, and I think we can
> add another option which specifies the maximum tolerable number of soft
> errors.
>
> I remember this was discussed in [1], and feel it would be useful when
> loading 'dirty' data but there is a limit to how dirty it can be.
>
> Attached a patch for this.
>
> What do you think?
>
>
I'm opposed to adding this particular feature.

When implementing this kind of business rule I'd need the option to specify
a percentage, not just an absolute value.

I would focus on trying to put the data required to make this kind of
determination into a place where applications implementing such business
rules and monitoring can readily get at it.  Something like an "ERRORS TO"
and maybe a corresponding "STATS TO" option, where a table can be specified
for the system to place the problematic data and statistics about the copy
itself, would serve that purpose.
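
For reference, the quoted proposal would be used roughly as in this sketch
(the REJECT_LIMIT spelling comes from the proposal; the table name, file
name, threshold, and the combination with ON_ERROR are assumptions):

    -- tolerate up to 10 rows hitting soft errors, then fail the whole load
    COPY measurements FROM '/tmp/dirty.csv'
        WITH (FORMAT csv, ON_ERROR ignore, REJECT_LIMIT 10);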

David J.


Re: Memory consumed by child SpecialJoinInfo in partitionwise join planning

2024-01-26 Thread vignesh C
On Wed, 4 Oct 2023 at 04:02, Ashutosh Bapat
 wrote:
>
> On Fri, Sep 29, 2023 at 8:36 AM Amit Langote  wrote:
> > IOW, something
> > like the following would have sufficed:
> >
> > @@ -1735,6 +1735,10 @@ build_child_join_sjinfo(PlannerInfo *root,
> > SpecialJoinInfo *parent_sjinfo,
> >  /*
> >   * free_child_sjinfo_members
> >   * Free memory consumed by members of a child SpecialJoinInfo.
> > + *
> > + * Only members that are translated copies of their counterpart in the 
> > parent
> > + * SpecialJoinInfo are freed here.  However, members that could be 
> > referenced
> > + * elsewhere are not freed.
> >   */
> >  static void
> >  free_child_sjinfo_members(SpecialJoinInfo *child_sjinfo)
> > @@ -1755,10 +1759,7 @@ free_child_sjinfo_members(SpecialJoinInfo 
> > *child_sjinfo)
> > bms_free(child_sjinfo->syn_lefthand);
> > bms_free(child_sjinfo->syn_righthand);
> >
> > -   /*
> > -* But the list of operator OIDs and the list of expressions may be
> > -* referenced somewhere else. Do not free those.
> > -*/
> > +   /* semi_rhs_exprs may be referenced, so don't free. */
> >  }
>
> Works for me. PFA patchset with these changes. I have still left the
> changes addressing your comments as a separate patch for easier
> review.

CFBot shows that the patch does not apply anymore as in [1]:

=== Applying patches on top of PostgreSQL commit ID
55627ba2d334ce98e1f5916354c46472d414bda6 ===
=== applying patch
./0001-Report-memory-used-for-planning-a-query-in--20231003.patch
...
Hunk #1 succeeded at 89 (offset -3 lines).
patching file src/test/regress/expected/explain.out
Hunk #5 FAILED at 285.
Hunk #6 succeeded at 540 (offset 4 lines).
1 out of 6 hunks FAILED -- saving rejects to file
src/test/regress/expected/explain.out.rej

Please post an updated version for the same.

[1] - http://cfbot.cputube.org/patch_46_4500.log

Regards,
Vignesh




Re: POC: GROUP BY optimization

2024-01-26 Thread Robert Haas
On Tue, Dec 26, 2023 at 10:23 PM Tom Lane  wrote:
> I think it's a fool's errand to even try to separate different sort
> column orderings by cost.  We simply do not have sufficiently accurate
> cost information.  The previous patch in this thread got reverted because
> of that (well, also some implementation issues, but mostly that), and
> nothing has happened to make me think that another try will fare any
> better.

I'm late to the party, but I'd like to better understand what's being
argued here. If you're saying that, for some particular planner
problem, we should prefer a solution that doesn't need to know about
the relative cost of various sorts over one that does, I agree, for
exactly the reason that you state: our knowledge of sort costs won't
be reliable, and we will make mistakes. That's true in lots of
situations, not just related to sorts,
because estimation is a hard problem. Heuristics not based on cost are
going to be, in many cases, more accurate than heuristics based on
cost. They're also often cheaper, since they often let us reject
possible approaches very early, without all the bother of a cost
comparison.

But if you're saying that it's utterly impossible to know whether
sorting text will be cheaper or more expensive than sorting 4-byte
integers, and that if a particular problem can be solved only by
knowing which one is cheaper we should just give up, then I disagree.
In the absence of any other information, it must be right, at the very
least, to bank on varlena data types being more expensive to sort than
fixed-length data types. How much more expensive is hard to know,
because toasted blobs are going to be more expensive to sort than
short varlenas. But even before you reach the comparison function, a
pass-by-value datum has a significantly lower access cost than a
pass-by-reference datum. The fact that the pass-by-reference value
might be huge only compounds the problem.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Change COPY ... ON_ERROR ignore to ON_ERROR ignore_row

2024-01-26 Thread David G. Johnston
Hi,

The option choice of "ignore" in the COPY ON_ERROR clause seems overly
generic.  There would seem to be two relevant ways to ignore bad column
input data - drop the entire row or just set the column value to null.  I
can see us wanting to provide the set to null option and in any case having
the option name be explicit that it ignores the row seems like a good idea.
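
A hypothetical illustration of the two behaviors (neither ignore_row nor
set_null is committed syntax; both option names are placeholders):

    CREATE TABLE t (a int, b int);

    -- input line "1,oops" with a row-level option: the whole row is
    -- discarded and t stays empty
    COPY t FROM STDIN WITH (FORMAT csv, ON_ERROR ignore_row);

    -- input line "1,oops" with a column-level option: the row is kept
    -- as (1, NULL)
    COPY t FROM STDIN WITH (FORMAT csv, ON_ERROR set_null);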

David J.


Re: Small fix on COPY ON_ERROR document

2024-01-26 Thread David G. Johnston
On Fri, Jan 26, 2024 at 2:30 AM Yugo NAGATA  wrote:

> On Fri, 26 Jan 2024 00:00:57 -0700
> "David G. Johnston"  wrote:
>
> > I will need to make this tweak and probably a couple others to my own
> > suggestions in 12 hours or so.
> >
>
>
And here is my v2.

Notably I choose to introduce the verbiage "soft error" and then define in
the ON_ERROR clause the specific soft error that matters here - "invalid
input syntax".

I also note the log message behavior when ignore mode is chosen.  I haven't
confirmed that it is accurate, but that is readily tweaked if this approach
is approved.

David J.
From 2d656fe0a69ea1349472d88794c18f8a2e2d37e9 Mon Sep 17 00:00:00 2001
From: "David G. Johnston" 
Date: Thu, 25 Jan 2024 23:35:44 -0700
Subject: [PATCH] v2 Improve ON_ERROR verbiage in COPY documentation

---
 doc/src/sgml/ref/copy.sgml | 48 +++---
 1 file changed, 29 insertions(+), 19 deletions(-)

diff --git a/doc/src/sgml/ref/copy.sgml b/doc/src/sgml/ref/copy.sgml
index 21a5c4a052..38ce0b0a4c 100644
--- a/doc/src/sgml/ref/copy.sgml
+++ b/doc/src/sgml/ref/copy.sgml
@@ -90,6 +90,13 @@ COPY { table_name [ ( pg_stat_progress_copy view. See
  for details.
   
+
+  
+By default, COPY will fail if it encounters an error
+during processing. For use cases where a best-effort attempt at loading
+the entire file is desired, the ON_ERROR clause can
+be used to specify some other behavior.
+  
  
 
  
@@ -378,17 +385,23 @@ COPY { table_name [ ( ON_ERROR
 
  
-  Specifies which 
-  error_action to perform when there is malformed data in the input.
-  Currently, only stop (default) and ignore
-  values are supported.
-  If the stop value is specified,
-  COPY stops operation at the first error.
-  If the ignore value is specified,
-  COPY skips malformed data and continues copying data.
-  The option is allowed only in COPY FROM.
-  Only stop value is allowed when
-  using binary format.
+  Specifies how to behave when encountering a soft error.
+  An error_action value of
+  stop means fail the command, while
+  ignore means discard the input row and continue with the next one.
+  The default is stop.
+ 
+ 
+  The ignore option is applicable only for COPY FROM
+  when the FORMAT is text or csv.
+ 
+ 
+  The only relevant soft error is "invalid input syntax", which manifests when attempting
+  to create a column value from the text input.
+ 
+ 
+  An INFO level context message containing the ignored row count is
+  emitted at the end of the COPY FROM if at least one row was discarded.
  
 

@@ -576,15 +589,12 @@ COPY count

 

-COPY stops operation at the first error when
-ON_ERROR is not specified. This
-should not lead to problems in the event of a COPY
-TO, but the target table will already have received
-earlier rows in a COPY FROM. These rows will not
-be visible or accessible, but they still occupy disk space. This might
+The COPY FROM command physically inserts input rows
+into the table as it progresses.  If the command fails these rows are
+left in a deleted state, still occupying disk space. This might
 amount to a considerable amount of wasted disk space if the failure
-happened well into a large copy operation. You might wish to invoke
-VACUUM to recover the wasted space.
+happened well into a large copy operation. VACUUM
+should be used to recover the wasted space.

 

-- 
2.34.1



Re: A failure in t/038_save_logical_slots_shutdown.pl

2024-01-26 Thread Bharath Rupireddy
On Thu, Jan 25, 2024 at 5:07 PM Amit Kapila  wrote:
>
> On Thu, Jan 25, 2024 at 4:27 PM Bharath Rupireddy
>  wrote:
> >
> > Thanks. I'll wait a while and then add it to CF to not lose it in the wild.
> >
>
> Feel free to add it to CF. However, I do plan to look at it in the
> next few days.

Thanks. CF entry is here https://commitfest.postgresql.org/47/4796/.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: Supporting MERGE on updatable views

2024-01-26 Thread vignesh C
On Mon, 30 Oct 2023 at 15:04, Dean Rasheed  wrote:
>
> On Sun, 29 Oct 2023 at 17:17, Dean Rasheed  wrote:
> >
> > On Sat, 28 Oct 2023 at 09:35, jian he  wrote:
> > >
> > > Do we need to add 
> >
> > We don't want to include MERGE in that sentence, because MERGE isn't
> > supported on views or tables with rules, but I guess we could add
> > another sentence after that one, to make that clear.
> >
>
> Here's an updated patch doing that, plus another couple of minor
> updates to that page.
>
> I also noticed that the code to detect rules ought to ignore disabled
> rules, so I've updated it to do so, and added a new regression test to
> cover that case.
>
> Arguably that's a pre-existing bug, so the fix could be extracted and
> applied separately, but I'm not sure that it's worth the effort.

CFBot shows that the patch does not apply anymore as in [1]:

=== Applying patches on top of PostgreSQL commit ID
55627ba2d334ce98e1f5916354c46472d414bda6 ===
=== applying patch ./support-merge-into-view-v10.patch

patching file src/backend/executor/nodeModifyTable.c
...
Hunk #7 FAILED at 2914.
...
1 out of 15 hunks FAILED -- saving rejects to file
src/backend/executor/nodeModifyTable.c.rej

Please post an updated version for the same.

[1] - http://cfbot.cputube.org/patch_46_4076.log

Regards,
Vignesh




Re: MERGE ... WHEN NOT MATCHED BY SOURCE

2024-01-26 Thread vignesh C
On Sat, 1 Jul 2023 at 18:04, Dean Rasheed  wrote:
>
> On Tue, 21 Mar 2023 at 12:26, Alvaro Herrera  wrote:
> >
> > On 2023-Mar-21, Dean Rasheed wrote:
> >
> > > Looking at it with fresh eyes though, I realise that I could have just 
> > > written
> > >
> > > action->qual = make_and_qual((Node *) ntest, action->qual);
> > >
> > > which is equivalent, but more concise.
> >
> > Nice.
> >
> > I have no further observations about this patch.
> >
>
> Looking at this one afresh, it seems that the change to make Vars
> outer-join aware broke it -- the Var in the qual to test whether the
> source row is null needs to be marked as nullable by the join added by
> transform_MERGE_to_join(). That's something that needs to be done in
> transform_MERGE_to_join(), so it makes more sense to add the new qual
> there rather than in transformMergeStmt().
>
> Also, now that MERGE has ruleutils support, it's clear that adding the
> qual in transformMergeStmt() isn't right anyway, since it would then
> appear in the deparsed output.
>
> So attached is an updated patch doing that, which seems neater all
> round, since adding the qual is closely related to the join-type
> choice, which is now a decision taken entirely in
> transform_MERGE_to_join(). This requires a new "mergeSourceRelation"
> field on the Query structure, but as before, it does away with the
> "mergeUseOuterJoin" field.
>
> I've also updated the ruleutils support. In the absence of any WHEN
> NOT MATCHED BY SOURCE actions, this will output not-matched actions
> simply as "WHEN NOT MATCHED" for backwards compatibility, and to be
> SQL-standard-compliant. If there are any WHEN NOT MATCHED BY SOURCE
> actions though, I think it's preferable to output explicit "BY SOURCE"
> and "BY TARGET" qualifiers for all not-matched actions, to make the
> meaning clearer.

CFBot shows that the patch does not apply anymore as in [1]:
=== Applying patches on top of PostgreSQL commit ID
f2bf8fb04886e3ea82e7f7f86696ac78e06b7e60 ===
=== applying patch ./support-merge-when-not-matched-by-source-v8.patch
...
patching file doc/src/sgml/ref/merge.sgml
Hunk #5 FAILED at 409.
Hunk #9 FAILED at 673.
2 out of 9 hunks FAILED -- saving rejects to file
doc/src/sgml/ref/merge.sgml.rej
..
patching file src/include/nodes/parsenodes.h
Hunk #1 succeeded at 175 (offset -8 lines).
Hunk #2 succeeded at 1657 (offset -6 lines).
Hunk #3 succeeded at 1674 (offset -6 lines).
Hunk #4 FAILED at 1696.
1 out of 4 hunks FAILED -- saving rejects to file
src/include/nodes/parsenodes.h.rej

Please post an updated version for the same.

[1] - http://cfbot.cputube.org/patch_46_4092.log

Regards,
Vignesh




Re: Remove unused fields in ReorderBufferTupleBuf

2024-01-26 Thread reid . thompson
On Fri, 2024-01-26 at 13:51 +0900, Masahiko Sawada wrote:
> On Fri, Jan 26, 2024 at 1:24 PM  wrote:
> > 
> > On Thu, 2024-01-25 at 16:11 -0500, reid.thomp...@crunchydata.com wrote:
> > > 
> > > I walked through v6 and didn't note any issues.
> 
> Thank you for reviewing the patch!
> 
Happy to.
> 
> I'm not sure these changes are really beneficial. They contribute to
> improving neither readability nor performance IMO.
> 
> Regards,
> 
I thought that may be the case, but wanted to ask to be sure. Thank you
for the followup.

Reid




Re: Transaction timeout

2024-01-26 Thread Japin Li


On Fri, 26 Jan 2024 at 14:44, Andrey M. Borodin  wrote:
>> On 22 Jan 2024, at 11:23, Peter Smith  wrote:
>>
>> Hi, This patch has a CF status of "Needs Review" [1], but it seems
>> there was a CFbot test failure last time it was run [2]. Please have a
>> look and post an updated version if necessary.
> Thanks Peter!
>

Thanks for updating the patch.  Here are some comments for v24.

+   
+Terminate any session that spans longer than the specified amount of
+time in transaction. The limit applies both to explicit transactions
+(started with BEGIN) and to implicitly started
+transaction corresponding to single statement. But this limit is not
+applied to prepared transactions.
+If this value is specified without units, it is taken as milliseconds.
+A value of zero (the default) disables the timeout.
+   
The sentence "But this limit is not applied to prepared transactions" is 
redundant,
since we have a paragraph to describe this later.

+
+   
+If transaction_timeout is shorter than
+idle_in_transaction_session_timeout or 
statement_timeout
+transaction_timeout will invalidate longer timeout.
+   
+

Since we already try to disable the longer timeouts, should we also disable
them when they are equal?
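
As a quick sketch of the interaction being documented (timeout values are
invented; the FATAL message is the one quoted later in this mail):

    SET statement_timeout = '10s';
    SET transaction_timeout = '2s';
    BEGIN;
    SELECT pg_sleep(5);
    -- FATAL:  terminating connection due to transaction timeout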

+
+   
+Prepared transactions are not subject for this timeout.
+   

Maybe wrap this with  is a good idea.

> I’ve inspected CI fails and they were caused by two different problems:
> 1. It’s unsafe for the isolation tester to await transaction_timeout within a 
> query. Usually it gets
> FATAL: terminating connection due to transaction timeout
> But if VM is a bit slow it can get occasional
> PQconsumeInput failed: server closed the connection unexpectedly
> So, currently all tests use “passive waiting”, in a session that will not 
> timeout.
>
> 2. In some cases pg_sleep(0.1) were sleeping up to 200 ms. That was making s7 
> and s8 fail, because they rely on this margin.

I'm curious why this happened.

> I’ve separated these tests into a different test, timeouts-long, and increased 
> the margin to 300ms. Now the tests take a horrible 2431 ms. Moreover I’m afraid 
> that on the buildfarm we may have much randomly-slower machines, so this test 
> might be excluded.
> This test checks COMMIT AND CHAIN and flow of small queries (Nik’s case).
>
> Also I’ve verified that every "enable_timeout_after(TRANSACTION_TIMEOUT)” and 
> “disable_timeout(TRANSACTION_TIMEOUT)” is necessary and found that case of 
> aborting "idle in transaction (aborted)” is not covered by tests. I’m not 
> sure we need a test for this.

I see there is a test about idle_in_transaction_timeout and transaction_timeout.

Both of them only check that the session is terminated, but don't check the
reason, so we cannot distinguish why they were terminated.  Right?

> Japin, Junwang, what do you think?

However, checking the reason on the timed-out session may cause regression test
failures (as you pointed out in 1), so I don't strongly insist on it.

--
Best regards,
Japin Li.




Re: Lockless queue of waiters in LWLock

2024-01-26 Thread vignesh C
On Sat, 20 Jan 2024 at 07:28, vignesh C  wrote:
>
> On Sat, 26 Nov 2022 at 00:24, Pavel Borisov  wrote:
> >
> > Hi, hackers!
> > In the measurements above in the thread, I've been using LIFO wake
> > queue in a primary lockless patch (and it was attached as the previous
> > versions of a patch) and an "inverted wake queue" (in fact FIFO) as
> > the alternative benchmarking option. I think using the latter is more
> > fair and natural and provided they show no difference in the speed,
> > I'd make the main patch using it (attached as v6). No other changes
> > from v5, though.
>
> There has not been much interest in this as the thread has been idle
> for more than a year now. I'm not sure if we should take it forward or
> not. I would prefer to close this in the current commitfest unless
> someone wants to take it further.

I have returned this patch in the commitfest as nobody has shown any
interest in pursuing it. Feel free to add a new entry when someone
wants to work on this more actively.

Regards,
Vignesh




Re: PoC: prefetching index leaf pages (for inserts)

2024-01-26 Thread vignesh C
On Mon, 6 Nov 2023 at 22:36, Tomas Vondra  wrote:
>
> Seems cfbot was not entirely happy about the patch, for two reasons:
>
> 1) enable_insert_prefetching definition was inconsistent (different
> boot/default values, missing in .conf and so on)
>
> 2) stupid bug in execReplication, inserting index entries twice
>
> The attached v3 should fix all of that, I believe.
>
>
> As for the path forward, I think the prefetching is demonstrably
> beneficial. There are cases where it can't help or even harms
> performance. I think the success depends on three areas:
>
> (a) reducing the costs of the prefetching - For example right now we
> build the index tuples twice (once for prefetch, once for the insert),
> but maybe there's a way to do that only once? There are also predicate
> indexes, and so on.
>
> (b) being smarter about when to prefetch - For example if we only have
> one "prefetchable" index, it's somewhat pointless to prefetch (for
> single-row cases). And so on.
>
> (c) not prefetching when already cached - This is somewhat related to
> the previous case, but perhaps it'd be cheaper to first check if the
> data is already cached. For shared buffers it should not be difficult,
> for page cache we could use preadv2 with RWF_NOWAIT flag. The question
> is if this is cheap enough to be cheaper than just doing posix_fadvise
> (which however only deals with shared buffers).

CFBot shows that the patch does not apply anymore as in [1]:
=== Applying patches on top of PostgreSQL commit ID
7014c9a4bba2d1b67d60687afb5b2091c1d07f73 ===
=== applying patch ./0001-insert-prefetch-v3.patch
patching file src/backend/access/brin/brin.c
Hunk #1 FAILED at 117.
1 out of 1 hunk FAILED -- saving rejects to file
src/backend/access/brin/brin.c.rej
patching file src/backend/access/gin/ginutil.c
Hunk #1 FAILED at 64.
1 out of 1 hunk FAILED -- saving rejects to file
src/backend/access/gin/ginutil.c.rej
patching file src/backend/access/gist/gist.c
Hunk #1 FAILED at 86.
1 out of 1 hunk FAILED -- saving rejects to file
src/backend/access/gist/gist.c.rej
patching file src/backend/access/hash/hash.c
Hunk #1 FAILED at 83.
1 out of 1 hunk FAILED -- saving rejects to file
src/backend/access/hash/hash.c.rej

Please post an updated version for the same.

[1] - http://cfbot.cputube.org/patch_46_4622.log

Regards,
Vignesh




Re: pg_upgrade failing for 200+ million Large Objects

2024-01-26 Thread vignesh C
On Tue, 2 Jan 2024 at 23:03, Kumar, Sachin  wrote:
>
> > On 11/12/2023, 01:43, "Tom Lane"  > > wrote:
>
> > I had initially supposed that in a parallel restore we could
> > have child workers also commit after every N TOC items, but was
> > soon disabused of that idea. After a worker processes a TOC
> > item, any dependent items (such as index builds) might get
> > dispatched to some other worker, which had better be able to
> > see the results of the first worker's step. So at least in
> > this implementation, we disable the multi-command-per-COMMIT
> > behavior during the parallel part of the restore. Maybe that
> > could be improved in future, but it seems like it'd add a
> > lot more complexity, and it wouldn't make life any better for
> > pg_upgrade (which doesn't use parallel pg_restore, and seems
> > unlikely to want to in future).
>
> I was not able to find the email thread which details why we are not using
> parallel pg_restore for pg_upgrade. IMHO most customers will have a single
> large database, and not using parallel restore will make pg_upgrade slow.
>
> I am attaching a patch which enables parallel pg_restore for the DATA and
> POST-DATA parts of the dump. It will push down the --jobs value to pg_restore
> and will restore databases sequentially.

CFBot shows that the patch does not apply anymore as in [1]:
=== Applying patches on top of PostgreSQL commit ID
46a0cd4cefb4d9b462d8cc4df5e7ecdd190bea92 ===
=== applying patch ./v9-005-parallel_pg_restore.patch
patching file src/bin/pg_upgrade/pg_upgrade.c
Hunk #3 FAILED at 650.
1 out of 3 hunks FAILED -- saving rejects to file
src/bin/pg_upgrade/pg_upgrade.c.rej

Please post an updated version for the same.

[1] - http://cfbot.cputube.org/patch_46_4713.log

Regards,
Vignesh




RE: Current Connection Information

2024-01-26 Thread Maiquel Grassi
Hi Aleksander,

>>I assume you wanted to reply to the mailing list and add me to cc:
>>(aka "Reply to All") but sent the e-mail off-list by mistake, so
>>quoting it here:

Yes, thanks for that.

>>IMO it's worth trying submitting the patch, if your time permits it of course.

I've been spending a little time thinking about this.

Regards,
Maiquel.


Re: logical decoding and replication of sequences, take 2

2024-01-26 Thread Robert Haas
On Wed, Jan 24, 2024 at 12:46 PM Tomas Vondra
 wrote:
> I did try to explain how this works (and why) in a couple places:
>
> 1) the commit message
> 2) reorderbuffer header comment
> 3) ReorderBufferSequenceIsTransactional comment (and nearby)
>
> It's possible this does not meet your expectations, ofc. Maybe there
> should be a separate README for this - I haven't found anything like
> that for logical decoding in general, which is why I did (1)-(3).

I read over these and I do think they answer a bunch of questions, but
I don't think they answer all of the questions.

Suppose T1 creates a sequence and commits. Then T2 calls nextval().
Then T3 drops the sequence. According to the commit message, T2's
change will be "replayed immediately after decoding". But it's
essential to replay T2's change after we replay T1 and before we
replay T3, and the comments don't explain why that's guaranteed.
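
Spelled out as SQL, the scenario is (the sequence name is invented; T1/T2/T3
as above):

    -- T1:
    BEGIN;
    CREATE SEQUENCE s;
    COMMIT;               -- transactional change, decoded at commit
    -- T2:
    SELECT nextval('s');  -- "non-transactional" change, replayed right after decoding
    -- T3:
    DROP SEQUENCE s;      -- must not be replayed before T2's nextval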

The answer might be "locks". If we always replay a transaction
immediately when we see its commit record, then in the example above
we're fine, because the commit record for the transaction that creates
the sequence must precede the nextval() call, since the sequence won't
be visible until the transaction commits, and also because T1 holds a
lock on it at that point sufficient to hedge out nextval. And the
nextval record must precede the point where T3 takes an exclusive lock
on the sequence.

Note, however, that this chain of reasoning critically depends on us
never delaying application of a transaction. If we might reach T1's
commit record and say "hey, let's hold on to this for a bit and replay
it after we've decoded some more," everything immediately breaks,
unless we also delay application of T2's non-transactional update in
such a way that it's still guaranteed to happen after T1. I wonder if
this kind of situation would be a problem for a future parallel-apply
feature. It wouldn't work, for example, to hand T1 and T3 off (in that
order) to a separate apply process but handle T2's "non-transactional"
message directly, because it might handle that message before the
application of T1 got completed.

This also seems to depend on every transactional operation that might
affect a future non-transactional operation holding a lock that would
conflict with that non-transactional operation. For example, if ALTER
SEQUENCE .. RESTART WITH didn't take a strong lock on the sequence,
then you could have: T1 does nextval, T2 does ALTER SEQUENCE RESTART
WITH, T1 does nextval again, T1 commits, T2 commits. It's unclear what
the semantics of that would be -- would T1's second nextval() see the
sequence restart, or what? But if the effect of T1's second nextval
does depend in some way on the ALTER SEQUENCE operation which precedes
it in the WAL stream, then we might have some trouble here, because
both nextvals precede the commit of T2. Fortunately, this sequence of
events is foreclosed by locking.

But I did find one somewhat-similar case in which that's not so.

S1: create table withseq (a bigint generated always as identity);
S1: begin;
S2: select nextval('withseq_a_seq');
S1: alter table withseq set unlogged;
S2: select nextval('withseq_a_seq');

I think this is a bug in the code that supports owned sequences rather
than a problem that this patch should have to do something about. When
a sequence is flipped between logged and unlogged directly, we take a
stronger lock than we do here when it's done in this indirect way.
Also, I'm not quite sure if it would pose a problem for sequence
decoding anyway: it changes the relfilenode, but not the value. But
this is the *kind* of problem that could make the approach unsafe:
supposedly transactional changes being interleaved with supposedly
non-transctional changes, in such a way that the non-transactional
changes might get applied at the wrong time relative to the
transactional changes.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: POC: GROUP BY optimization

2024-01-26 Thread vignesh C
On Thu, 25 Jan 2024 at 01:15, Alexander Korotkov  wrote:
>
> On Wed, Jan 24, 2024 at 7:38 PM Nathan Bossart  
> wrote:
> > A recent buildfarm failure [0] seems to indicate a name collision with the
> > "abc" index in the aggregates.sql test and the "abc" table in
> > namespace.sql.
> >
> > [0] 
> > https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=piculet=2024-01-24%2014%3A05%3A14
>
> Thank you for catching this.  Fixed.

Since the patch has been committed, I have marked this entry as committed.

Regards,
Vignesh




Re: [PATCH] Add native windows on arm64 support

2024-01-26 Thread Dave Cramer
On Fri, 26 Jan 2024 at 07:36, Andrew Dunstan  wrote:

>
> On 2024-01-25 Th 20:32, Michael Paquier wrote:
> > On Thu, Jan 25, 2024 at 04:52:30PM -0500, Dave Cramer wrote:
> >> On Thu, 25 Jan 2024 at 16:32, Andrew Dunstan 
> wrote:
> >>> On 2024-01-25 Th 16:17, Dave Cramer wrote:
> >>> Yeah, I think the default Developer Command Prompt for VS2022 is set up
> >>> for x86 builds. AIUI you should start by executing "vcvarsall
> x64_arm64".
> >> Yup, now I'm in the same state you are
> > Wait a minute here.  Based on [1], x64_arm64 means you can use an x64
> > host and you'll be able to produce ARM64 builds, but these will not
> > be able to run on the host where they were built.  How much of the
> > patch posted upthread is required to produce such builds?  Basically
> > everything from it, I guess, so as build dependencies can be
> > satisfied?
> >
> > [1]:
> https://learn.microsoft.com/en-us/cpp/build/building-on-the-command-line?view=msvc-170
>
>
> If you look at the table here, x86 and x64 are the only supported host
> architectures. But that's OK, the x64 binaries will run on arm64 (W11
> ARM64 has x64 emulation built in). If that didn't work, Dave and I would
> not have got as far as we have. But you want the x64_arm64 argument to
> vcvarsall so you will get ARM64 output.
>

I've rebuilt it using x64_arm64 with the attached (very naive) patch,
and I still get an x64 binary :(

>
>


lock.patch
Description: Binary data


Re: [PoC] Improve dead tuple storage for lazy vacuum

2024-01-26 Thread Masahiko Sawada
On Wed, Jan 24, 2024 at 3:42 PM John Naylor  wrote:
>
> On Tue, Jan 23, 2024 at 10:58 AM Masahiko Sawada  
> wrote:
> >
> > The new patches probably need to be polished but the VacDeadItemInfo
> > idea looks good to me.
>
> That idea looks good to me, too. Since you already likely know what
> you'd like to polish, I don't have much to say except for a few
> questions below. I also did a quick sweep through every patch, so some
> of these comments are unrelated to recent changes:

Thank you!

>
> v55-0003:
>
> +size_t
> +dsa_get_total_size(dsa_area *area)
> +{
> + size_t size;
> +
> + LWLockAcquire(DSA_AREA_LOCK(area), LW_SHARED);
> + size = area->control->total_segment_size;
> + LWLockRelease(DSA_AREA_LOCK(area));
>
> I looked and found dsa.c doesn't already use shared locks in HEAD,
> even dsa_dump. Not sure why that is...

Oh, the dsa_dump part seems to be a bug. But I'll keep it consistent
with the others.

>
> +/*
> + * Calculate the slab blocksize so that we can allocate at least 32 chunks
> + * from the block.
> + */
> +#define RT_SLAB_BLOCK_SIZE(size) \
> + Max((SLAB_DEFAULT_BLOCK_SIZE / (size)) * (size), (size) * 32)
>
> The first parameter seems to be trying to make the block size exact,
> but that's not right, because of the chunk header, and maybe
> alignment. If the default block size is big enough to waste only a
> tiny amount of space, let's just use that as-is.

Agreed.

> Also, I think all
> block sizes in the code base have been a power of two, but I'm not
> sure how much that matters.

Did you mean all slab block sizes we use in radixtree.h?

>
> +#ifdef RT_SHMEM
> + fprintf(stderr, "  [%d] chunk %x slot " DSA_POINTER_FORMAT "\n",
> + i, n4->chunks[i], n4->children[i]);
> +#else
> + fprintf(stderr, "  [%d] chunk %x slot %p\n",
> + i, n4->chunks[i], n4->children[i]);
> +#endif
>
> Maybe we could invent a child pointer format, so we only #ifdef in one place.

Will change.

>
> --- /dev/null
> +++ b/src/test/modules/test_radixtree/meson.build
> @@ -0,0 +1,35 @@
> +# FIXME: prevent install during main install, but not during test :/
>
> Can you look into this?

Okay, I'll look at it.

>
> test_radixtree.c:
>
> +/*
> + * XXX: should we expose and use RT_SIZE_CLASS and RT_SIZE_CLASS_INFO?
> + */
> +static int rt_node_class_fanouts[] = {
> + 4, /* RT_CLASS_3 */
> + 15, /* RT_CLASS_32_MIN */
> + 32, /* RT_CLASS_32_MAX */
> + 125, /* RT_CLASS_125 */
> + 256 /* RT_CLASS_256 */
> +};
>
> These numbers have been wrong a long time, too, but only matters for
> figuring out where it went wrong when something is broken. And for the
> XXX, instead of trying to use the largest number that should fit (it's
> obviously not testing that the expected node can actually hold that
> number anyway), it seems we can just use a "big enough" number to
> cause growing into the desired size class.
>
> As far as cleaning up the tests, I always wondered why these didn't
> use EXPECT_TRUE, EXPECT_FALSE, etc. as in Andres's prototype where
> convenient, and leave comments above the tests. That seemed like
> a good idea to me -- was there a reason to have hand-written branches
> and elog messages everywhere?

The current test is based on test_integerset. I agree that we can
improve it by using EXPECT_TRUE etc.

>
> --- a/src/tools/pginclude/cpluspluscheck
> +++ b/src/tools/pginclude/cpluspluscheck
> @@ -101,6 +101,12 @@ do
>   test "$f" = src/include/nodes/nodetags.h && continue
>   test "$f" = src/backend/nodes/nodetags.h && continue
>
> + # radixtree_*_impl.h cannot be included standalone: they are just
> code fragments.
> + test "$f" = src/include/lib/radixtree_delete_impl.h && continue
> + test "$f" = src/include/lib/radixtree_insert_impl.h && continue
> + test "$f" = src/include/lib/radixtree_iter_impl.h && continue
> + test "$f" = src/include/lib/radixtree_search_impl.h && continue
>
> Ha! I'd forgotten about these -- they're long outdated.

Will remove.

>
> v55-0005:
>
> - * The radix tree is locked in shared mode during the iteration, so
> - * RT_END_ITERATE needs to be called when finished to release the lock.
> + * The caller needs to acquire a lock in shared mode during the iteration
> + * if necessary.
>
> "need if necessary" is maybe better phrased as "is the caller's 
> responsibility"

Will fix.

>
> + /*
> + * We can rely on DSA_AREA_LOCK to get the total amount of DSA memory.
> + */
>   total = dsa_get_total_size(tree->dsa);
>
> Maybe better to have a header comment for RT_MEMORY_USAGE that the
> caller doesn't need to take a lock.

Will fix.

>
> v55-0006:
>
> "WIP: Not built, since some benchmarks have broken" -- I'll work on
> this when I re-run some benchmarks.

Thanks!

>
> v55-0007:
>
> + * Internally, a tid is encoded as a pair of 64-bit key and 64-bit value,
> + * and stored in the radix tree.
>
> This hasn't been true for a few months now, and I thought we fixed
> this in some earlier version?

Yeah, I'll fix it.

>
> + * TODO: The caller must be certain that no other backend will attempt to
> 

Re: Improve WALRead() to suck data directly from WAL buffers when possible

2024-01-26 Thread Bharath Rupireddy
On Fri, Jan 26, 2024 at 8:31 AM Jeff Davis  wrote:
>
> > PSA v20 patch set.
>
> 0001 is very close. I have the following suggestions:
>
>   * Don't just return zero. If the caller is doing something we don't
> expect, we want to fix the caller. I understand you'd like this to be
> more like a transparent optimization, and we may do that later, but I
> don't think it's a good idea to do that now.

+if (RecoveryInProgress() ||
+tli != GetWALInsertionTimeLine())
+return ntotal;
+
+Assert(!XLogRecPtrIsInvalid(startptr));

Are you suggesting erroring out instead of returning 0? If yes, I
disagree with it, because a failure to read due to unmet preconditions
doesn't necessarily have to be an error. If we error out, the
immediate failure we see is in the src/bin/psql TAP test for calling
XLogReadFromBuffers when the server is in recovery. How about
returning a negative value instead of just 0, or returning true/false
just like WALRead?

>   * There's currently no use for reading LSNs between Write and Insert,
> so remove the WaitXLogInsertionsToFinish() code path. That also means
> we don't need the extra emitLog parameter, so we can remove that. When
> we have a use case, we can bring it all back.

I disagree with this. I don't see anything wrong with
XLogReadFromBuffers having the capability to wait for in-progress
insertions to finish. In fact, it makes the function near-complete.
Imagine implementing an extension (maybe for fun, learning, or
production purposes) to read unflushed WAL directly from WAL buffers,
using XLogReadFromBuffers as the page_read callback with the
xlogreader facility. AFAICT, I don't see a problem with the
WaitXLogInsertionsToFinish logic in XLogReadFromBuffers.

FWIW, one important aspect of XLogReadFromBuffers is its ability to
read the unflushed WAL from WAL buffers. Also, see a note from Andres
here 
https://www.postgresql.org/message-id/20231109205836.zjoawdrn4q77yemv%40awork3.anarazel.de.

> If you agree, I can just make those adjustments (and do some final
> checking) and commit 0001. Otherwise let me know what you think.

Thanks. Please see my responses above.

> 0002: How does the test control whether the data requested is before
> the Flush pointer, the Write pointer, or the Insert pointer? What if
> the walwriter comes in and moves one of those pointers before the next
> statement is executed?

I tried to keep the wal_writer quiet with wal_writer_delay=1ms and
wal_writer_flush_after = 1GB so as not to flush WAL in the background.
I also disabled autovacuum and set checkpoint_timeout to a higher
value. All of this is done to generate minimal WAL so that WAL buffers
don't get overwritten. Do you see any problems with it?

> Also, do you think a test module is required for
> the basic functionality in 0001, or only when we start doing more
> complex things like reading past the Flush pointer?

With WaitXLogInsertionsToFinish in XLogReadFromBuffers, we already have
that capability. Having a separate test module ensures the code is
tested properly.

As far as the test is concerned, it verifies 2 cases:
1. Check if WAL is successfully read from WAL buffers. For this, the
test generates minimal WAL and reads from WAL buffers from the start
LSN = current insert LSN captured before the WAL generation.
2. Check with a WAL that doesn't yet exist. For this, the test reads
from WAL buffers from the start LSN = current flush LSN+16MB (a
randomly chosen higher value).
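
Expressed in SQL, the two starting points are roughly as follows (a sketch;
the second line relies on the pg_lsn + numeric operator available in recent
releases):

    SELECT pg_current_wal_insert_lsn();                    -- case 1: captured before generating WAL
    SELECT pg_current_wal_flush_lsn() + 16 * 1024 * 1024;  -- case 2: an LSN past any existing WAL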

> 0003: can you explain why this is useful for wal summarizer to read
> from the buffers?

Can the WAL summarizer ever read the WAL on the current TLI? I'm not so
sure about it; I haven't explored it in detail.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: Add new error_action COPY ON_ERROR "log"

2024-01-26 Thread jian he
On Fri, Jan 26, 2024 at 12:42 AM torikoshia  wrote:
>
> Hi,
>
> As described in 9e2d870119, COPY ON_ERROR is expected to have more
> "error_action".
> (Note that option name was changed by b725b7eec)
>
> I'd like to have a new option "log", which skips soft errors and logs
> information that should have resulted in errors to PostgreSQL log.
>
> I think this option has some advantages like below:
>
> 1) We can know which line of the input data was not loaded and the
> reason.
>
>Example:
>
>=# copy t1 from stdin with (on_error log);
>Enter data to be copied followed by a newline.
>End with a backslash and a period on a line by itself, or an EOF
> signal.
>>> 1
>>> 2
>>> 3
>>> z
>>> \.
>LOG:  invalid input syntax for type integer: "z"
>NOTICE:  1 row was skipped due to data type incompatibility
>COPY 3
>
>=# \! tail data/log/postgresql*.log
>LOG:  22P02: invalid input syntax for type integer: "z"
>CONTEXT:  COPY t1, line 4, column i: "z"
>LOCATION:  pg_strtoint32_safe, numutils.c:620
>STATEMENT:  copy t1 from stdin with (on_error log);
>
>
> 2) Easier maintenance than storing error information in tables or
> proprietary log files.
> For example, in case a large number of soft errors occur, some
> mechanisms are needed to prevent an infinite increase in the size of the
> destination data, but we can leave it to PostgreSQL's log rotation.
>
>
> Attached a patch.
> This basically comes from previous discussion[1] which did both "ignore"
> and "log" soft error.
>
> As shown in the example above, the log output to the client does not
> contain CONTEXT, so I'm a little concerned that the client cannot see which
> line of the input data had a problem without looking at the server log.
>
>
> What do you think?
>

I doubt the following part:
  If the log value is specified,
  COPY behaves the same as
ignore, except that
  it logs information that should have resulted in errors to
PostgreSQL log at
  INFO level.

I think it does something like this:
when an error happens, cstate->escontext->error_data->elevel will be ERROR;
you manually change cstate->escontext->error_data->elevel to LOG,
then you call ThrowErrorData.

But then it's not related to `INFO level`, is it?
My log_min_messages is the default, warning.




Re: tablecmds.c/MergeAttributes() cleanup

2024-01-26 Thread Peter Eisentraut

On 24.01.24 07:27, Ashutosh Bapat wrote:

While working on identity support and now while looking at the
compression problem you referred to, I found MergeAttribute() to be
hard to read. It's hard to follow high level logic in that function
since the function is not modular. I took a stab at modularising a
part of it. Attached is the resulting patch series.

0001 is your patch as is
0002 is pgindent fix and also fixing what I think is a typo/thinko
from 0001. If you are fine with the changes, 0002 should be merged
into 0003.
0003 separates the part of code merging a child attribute to the
corresponding inherited attribute into its own function.
0004 does the same for code merging inherited attributes incrementally.

I have kept 0003 and 0004 separate in case we pick one and not the
other. But they can be committed as a single commit.


I have committed all this.  These are great improvements.

(One little change I made to your 0003 and 0004 patches is that I kept 
the check whether the new column matches an existing one by name in 
MergeAttributes().  I found that pushing that down made the logic in 
MergeAttributes() too hard to follow.  But it's pretty much the same.)






Re: Confine vacuum skip logic to lazy_scan_skip

2024-01-26 Thread vignesh C
On Fri, 12 Jan 2024 at 05:12, Melanie Plageman
 wrote:
>
> v3 attached
>
> On Thu, Jan 4, 2024 at 3:23 PM Andres Freund  wrote:
> >
> > Hi,
> >
> > On 2024-01-02 12:36:18 -0500, Melanie Plageman wrote:
> > > Subject: [PATCH v2 1/6] lazy_scan_skip remove unnecessary local var 
> > > rel_pages
> > > Subject: [PATCH v2 2/6] lazy_scan_skip remove unneeded local var
> > >  nskippable_blocks
> >
> > I think these may lead to worse code - the compiler has to reload
> > vacrel->rel_pages/next_unskippable_block for every loop iteration, because 
> > it
> > can't guarantee that they're not changed within one of the external 
> > functions
> > called in the loop body.
>
> I buy that for 0001 but 0002 is still using local variables.
> nskippable_blocks was just another variable to keep track of even
> though we could already get that info from local variables
> next_unskippable_block and next_block.
>
> In light of this comment, I've refactored 0003/0004 (0002 and 0003 in
> this version [v3]) to use local variables in the loop as well. I had
> started using the members of the VacSkipState which I introduced.
>
> > > Subject: [PATCH v2 3/6] Add lazy_scan_skip unskippable state
> > >
> > > Future commits will remove all skipping logic from lazy_scan_heap() and
> > > confine it to lazy_scan_skip(). To make those commits more clear, first
> > > introduce the struct, VacSkipState, which will maintain the variables
> > > needed to skip ranges less than SKIP_PAGES_THRESHOLD.
> >
> > Why not add this to LVRelState, possibly as a struct embedded within it?
>
> Done in attached.
>
> > > From 335faad5948b2bec3b83c2db809bb9161d373dcb Mon Sep 17 00:00:00 2001
> > > From: Melanie Plageman 
> > > Date: Sat, 30 Dec 2023 16:59:27 -0500
> > > Subject: [PATCH v2 4/6] Confine vacuum skip logic to lazy_scan_skip
> >
> > > By always calling lazy_scan_skip() -- instead of only when we have 
> > > reached the
> > > next unskippable block, we no longer need the skipping_current_range 
> > > variable.
> > > lazy_scan_heap() no longer needs to manage the skipped range -- checking 
> > > if we
> > > reached the end in order to then call lazy_scan_skip(). And 
> > > lazy_scan_skip()
> > > can derive the visibility status of a block from whether or not we are in 
> > > a
> > > skippable range -- that is, whether or not the next_block is equal to the 
> > > next
> > > unskippable block.
> >
> > I wonder if it should be renamed as part of this - the name is somewhat
> > confusing now (and perhaps before)? lazy_scan_get_next_block() or such?
>
> Why stop there! I've removed lazy and called it
> heap_vac_scan_get_next_block() -- a little long, but...
>
> > > + while (true)
> > >   {
> > >   Buffer  buf;
> > >   Page    page;
> > > - bool    all_visible_according_to_vm;
> > >   LVPagePruneState prunestate;
> > >
> > > - if (blkno == vacskip.next_unskippable_block)
> > > - {
> > > - /*
> > > -  * Can't skip this page safely.  Must scan the 
> > > page.  But
> > > -  * determine the next skippable range after the 
> > > page first.
> > > -  */
> > > - all_visible_according_to_vm = 
> > > vacskip.next_unskippable_allvis;
> > > - lazy_scan_skip(vacrel, &vacskip, blkno + 1);
> > > -
> > > - Assert(vacskip.next_unskippable_block >= blkno + 1);
> > > - }
> > > - else
> > > - {
> > > - /* Last page always scanned (may need to set 
> > > nonempty_pages) */
> > > - Assert(blkno < rel_pages - 1);
> > > -
> > > - if (vacskip.skipping_current_range)
> > > - continue;
> > > + blkno = lazy_scan_skip(vacrel, &vacskip, blkno + 1,
> > > +                        &all_visible_according_to_vm);
> > >
> > > - /* Current range is too small to skip -- just scan 
> > > the page */
> > > - all_visible_according_to_vm = true;
> > > - }
> > > + if (blkno == InvalidBlockNumber)
> > > + break;
> > >
> > >   vacrel->scanned_pages++;
> > >
> >
> > I don't like that we still do determination about the next block outside of
> > lazy_scan_skip() and have duplicated exit conditions between 
> > lazy_scan_skip()
> > and lazy_scan_heap().
> >
> > I'd probably change the interface to something like
> >
> > while (lazy_scan_get_next_block(vacrel, &blkno))
> > {
> > ...
> > }
>
> I've done this. I do now find the parameter names a bit confusing.
> There is next_block (which is the "next block in line" and is an input
> parameter) and blkno, which is an output parameter with the next block
> that should actually be processed. Maybe it's okay?
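
A sketch of the calling convention being discussed (the function name comes
from the message above; the exact parameter list is illustrative):
next_block is the input, the "next block in line", and blkno is the output
block that should actually be processed.

    BlockNumber next_block = 0;     /* input: next block in line */
    BlockNumber blkno;              /* output: block to actually process */
    bool        all_visible;

    while (heap_vac_scan_get_next_block(vacrel, next_block,
                                        &blkno, &all_visible))
    {
        /* ... prune/vacuum block blkno ... */
        next_block = blkno + 1;
    }
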
>
> > > From b6603e35147c4bbe3337280222e6243524b0110e Mon Sep 17 

Re: [PATCH] Compression dictionaries for JSONB

2024-01-26 Thread vignesh C
On Wed, 17 Jan 2024 at 19:52, Aleksander Alekseev
 wrote:
>
> Hi Shubham,
>
> > > > 8272749e added a few more arguments to CastCreate(). Here is the 
> > > > rebased patch.
> > >
> > > After merging afbfc029 [1] the patch needed a rebase. PFA v10.
> > >
> > > The patch is still in a PoC state and this is exactly why comments and
> > > suggestions from the community are most welcome! Particularly I would
> > > like to know:
> > >
> > > 1. Would you call it a wanted feature considering the existence of
> > > Pluggable TOASTer patchset which (besides other things) tries to
> > > introduce type-aware TOASTers for EXTERNAL attributes? I know what
> > > Simon's [2] and Nikita's latest answers were, and I know my personal
> > > opinion on this [3][4], but I would like to hear from the rest of the
> > > community.
> > >
> > > 2. How should we make sure a dictionary will not consume all the
> > > available memory? Limiting the amount of dictionary entries to pow(2,
> > > 16) and having dictionary versions seems to work OK for ZSON. However
> > > it was pointed out that this may be an unwanted limitation for the
> > > in-core implementation.
> > >
> > > [1]: 
> > > https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=c727f511;hp=afbfc02983f86c4d71825efa6befd547fe81a926
> > > [2]: 
> > > https://www.postgresql.org/message-id/CANbhV-HpCF852WcZuU0wyh1jMU4p6XLbV6rCRkZpnpeKQ9OenQ%40mail.gmail.com
> > > [3]: 
> > > https://www.postgresql.org/message-id/CAJ7c6TN-N3%3DPSykmOjmW1EAf9YyyHFDHEznX-5VORsWUvVN-5w%40mail.gmail.com
> > > [4]: 
> > > https://www.postgresql.org/message-id/CAJ7c6TO2XTTk3cu5w6ePHfhYQkoNpw7u1jeqHf%3DGwn%2BoWci8eA%40mail.gmail.com
> >
> > I tried to apply the patch but it is failing on HEAD. It is giving
> > the following error:
>
> Yes, it has been failing for a while now. Until we reach agreement regarding
> questions (1) and (2), I personally don't see the point in submitting
> rebased patches. We can continue the discussion, but marking the CF entry
> as RwF for now would be helpful.

Thanks. I have updated the status to "Returned with feedback". Feel
free to create a new entry after we agree on the approach to take it
forward.

Regards,
Vignesh




Re: fix stats_fetch_consistency value in postgresql.conf.sample

2024-01-26 Thread vignesh C
On Sat, 20 Jan 2024 at 08:39, Michael Paquier  wrote:
>
> On Sat, Jan 20, 2024 at 07:59:22AM +0530, vignesh C wrote:
> > I'm seeing that there has been no activity in this thread for more
> > than 8 months, I'm planning to close this in the current commitfest
> > unless someone is planning to take it forward.
>
> Thanks, that seems right to me.

Thanks, I have updated the commitfest entry to "Returned with
feedback". Feel free to start a new entry when someone wants to pursue
it more actively.

Regards,
Vignesh




Re: [PATCH] Add support function for containment operators

2024-01-26 Thread vignesh C
On Sun, 21 Jan 2024 at 00:31, Tom Lane  wrote:
>
> jian he  writes:
> > Now I see your point. If the transformed plan is right, the whole
> > added code should be fine, but keeping the textrange_supp-related test
> > is still a good idea, since we don't have SUBTYPE_OPCLASS-related SQL
> > tests.
>
> Yeah, it's a little harder to make a table-less test for that case.
> I thought about using current_user or the like as a stable comparison
> value, but that introduces some doubt about what the collation would
> be.  That test seems cheap enough as-is, since it's handling only a
> tiny amount of data.
>
> Committed.

I have updated the commitfest entry to "Committed", as the patch has been committed.

Regards,
Vignesh



