Re: Handle TEMP_CONFIG for pg_regress style tests in pg_regress.c

2023-02-18 Thread Peter Eisentraut

On 18.02.23 21:26, Andres Freund wrote:

When building with meson, TEMP_CONFIG is supported for TAP tests, but doesn't
do anything for regress/isolation.

The reason for that is that meson's (and ninja's) architecture is to separate
"build setup" from the "build/test/whatever" stage, moving dynamism (and more
costly operations) to the "setup" phase.

In this case the implication is that the command line for the test isn't
re-computed dynamically. But pg_regress doesn't look at TEMP_CONFIG, it just
has a --temp-config=... parameter, that src/Makefile.global.in dynamically
adds if TEMP_CONFIG is set.

In contrast to that, TEMP_CONFIG support for tap tests is implemented in
Cluster.pm, and thus works transparently.

My inclination is to move TEMP_CONFIG support from the Makefile to
pg_regress.c. That way it's consistent across the build tools and isn't
duplicated. pg_regress already looks at a bunch of temporary variables
(e.g. PG_REGRESS_SOCK_DIR, PG_TEST_USE_UNIX_SOCKETS), so this isn't really
breaking new ground.


I'm having a hard time understanding what TEMP_CONFIG is for.  It 
appears that the intention is to allow injecting arbitrary configuration 
into the tests?  In that case, I think your proposal makes sense.  But I 
don't see this documented, so who knows what it is actually used for.






Re: shoud be get_extension_schema visible?

2023-02-18 Thread Pavel Stehule
Hi


On Fri, Feb 17, 2023 at 6:45, Pavel Stehule wrote:

> Hi
>
> Several times I have needed to get an extension's assigned namespace. There is
> already a ready-made function for this, get_extension_schema, but it is static.
>
> I need to find a function with a known name, but a possibly unknown
> schema, from a known extension.
>

Here is a patch

Regards

Pavel


>
> Regards
>
> Pavel
>
>
>
commit a0bcf2b8261c330e228268939c0381cd2d14eec2
Author: ok...@github.com 
Date:   Sun Feb 19 06:38:51 2023 +0100

initial

diff --git a/src/backend/commands/extension.c b/src/backend/commands/extension.c
index b1509cc505..02ff4a9a7f 100644
--- a/src/backend/commands/extension.c
+++ b/src/backend/commands/extension.c
@@ -222,7 +222,7 @@ get_extension_name(Oid ext_oid)
  *
  * Returns InvalidOid if no such extension.
  */
-static Oid
+Oid
 get_extension_schema(Oid ext_oid)
 {
 	Oid			result;
diff --git a/src/include/commands/extension.h b/src/include/commands/extension.h
index 9f47504491..7e98e37b50 100644
--- a/src/include/commands/extension.h
+++ b/src/include/commands/extension.h
@@ -47,6 +47,7 @@ extern ObjectAddress ExecAlterExtensionContentsStmt(AlterExtensionContentsStmt *
 
 extern Oid	get_extension_oid(const char *extname, bool missing_ok);
 extern char *get_extension_name(Oid ext_oid);
+extern Oid get_extension_schema(Oid ext_oid);
 extern bool extension_file_exists(const char *extensionName);
 
 extern ObjectAddress AlterExtensionNamespace(const char *extensionName, const char *newschema,


"out of relcache_callback_list slots" after multiple calls to pg_logical_slot_get_binary_changes

2023-02-18 Thread shiy.f...@fujitsu.com
Hi hackers,

After multiple calls to the function pg_logical_slot_get_binary_changes() in a
single client backend (the output plugin of the slot is pgoutput), I got the
following error:

client backend FATAL:  out of relcache_callback_list slots
client backend CONTEXT:  slot "testslot", output plugin "pgoutput", in the 
startup callback
client backend STATEMENT:  SELECT data FROM 
pg_logical_slot_get_binary_changes('testslot', NULL, NULL, 'proto_version', 
'3', 'streaming', 'off', 'publication_names', 'pub');
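
Roughly, the problem can be reproduced in a single session like this (a sketch
only; the option values match the statement above, and it assumes nothing else
in the backend has registered relcache callbacks):

CREATE PUBLICATION pub FOR ALL TABLES;
SELECT pg_create_logical_replication_slot('testslot', 'pgoutput');

DO $$
BEGIN
    -- MAX_RELCACHE_CALLBACKS is 10, so a dozen or so calls are enough
    FOR i IN 1 .. 20 LOOP
        PERFORM count(*) FROM pg_logical_slot_get_binary_changes('testslot',
            NULL, NULL, 'proto_version', '3', 'streaming', 'off',
            'publication_names', 'pub');
    END LOOP;
END
$$;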

I tried to look into it and found that it's because every time the function
(pg_logical_slot_get_binary_changes) is called, a relcache callback and syscache
callbacks are registered when initializing pgoutput (see pgoutput_startup() and
init_rel_sync_cache()), but they are not unregistered when it shuts down. So,
after multiple calls to the function, MAX_RELCACHE_CALLBACKS is exceeded. This
is mentioned in the following comment.

/*
 * We can get here if the plugin was used in SQL interface as the
 * RelSchemaSyncCache is destroyed when the decoding finishes, but there
 * is no way to unregister the relcache invalidation callback.
 */
if (RelationSyncCache == NULL)
return;

Could we fix it by adding two new functions to unregister the relcache callback
and syscache callbacks? I tried to do so in the attached patch.

Regards,
Shi Yu


v1-0001-Unregister-syscache-callback-and-relcache-callbac.patch
Description:  v1-0001-Unregister-syscache-callback-and-relcache-callbac.patch


Re: Support logical replication of DDLs

2023-02-18 Thread Jonathan S. Katz

On 2/17/23 4:15 AM, Amit Kapila wrote:

On Fri, Feb 17, 2023 at 1:13 AM Jonathan S. Katz  wrote:


On 2/16/23 2:38 PM, Alvaro Herrera wrote:

On 2023-Feb-16, Jonathan S. Katz wrote:


On 2/16/23 12:53 PM, Alvaro Herrera wrote:



I don't think this is the fault of logical replication.  Consider that
for the backend server, the function source code is just an opaque
string that is given to the plpgsql engine to interpret.  So there's no
way for the logical DDL replication engine to turn this into runnable
code if the table name is not qualified.


Sure, that's fair. That said, the example above would fall under a "typical
use case", i.e. I'm replicating functions that call tables without schema
qualification. This is pretty common, and as logical replication becomes
used for more types of workloads (e.g. high availability), we'll definitely
see this.


Hmm, I think you're saying that replay should turn check_function_bodies
off, and I think I agree with that.


Yes, exactly. +1



But will that be sufficient? I guess such functions can give errors at
a later stage when invoked at DML or another DDL time. Consider the
following example:

Pub:
CREATE PUBLICATION pub FOR ALL TABLES with (ddl = 'all');

Sub:
(Set check_function_bodies = off in postgresql.conf)
CREATE SUBSCRIPTION sub CONNECTION 'dbname=postgres' PUBLICATION pub;

Pub:
CREATE FUNCTION t1(a int) RETURNS int AS $$
select a+1;
$$ LANGUAGE sql;
CREATE FUNCTION t(a int) RETURNS int AS $$
select t1(a);
$$ LANGUAGE sql;
CREATE TABLE tbl1 (a int primary key, b text);
create index idx on tbl1(t(a));

insert into tbl1 values (1,1); -- This insert on publisher causes an
error on the subscriber. Check subscriber Logs (ERROR:  function
t1(integer) does not exist at character 9.)


I did reproduce this as is. I also reproduced this when I rewrote the 
function in PL/pgSQL.


I also did an experiment using PL/v8[1] where I rewrote the functions 
above and did two tests: one via SPI, the other via PL/v8's ability to 
find and call a function[2]. In both cases, the INSERT statement failed 
citing the inability to find the function. The calls did work when I 
schema-qualified.


However, when I converted the SQL-only functions to use the SQL standard 
syntax (BEGIN ATOMIC), I did not get this error and was able to 
successfully use this index with the table on both publisher and 
subscriber. I believe this is due to the generated function body having 
all of the schema qualifications in it.



This happens because of the function used in the index expression.
This is not the only case, though: the replication can even fail during
DDL replication when a function like the one above is IMMUTABLE and used as
follows: ALTER TABLE tbl ADD COLUMN d int DEFAULT t(1);

Normally, it is recommended that users fix such errors by
schema-qualifying the affected names. See commits 11da97024a and
582edc369c.


I'm very familiar with those CVEs, but even though these are our 
recommended best practices, there is still a lot of code that does not 
schema-qualify the names of functions (including many of our own examples ;)


If we're going to say "You can use logical replication to replicate 
functions, but you have to ensure you've schema-qualified any function 
calls within them," I think that will prevent people from being able to 
use this feature, particularly on existing applications.


I guess there's a connection I'm missing here. For the failing examples 
above, I looked at the pg_proc entries on both the publisher and the 
subscriber and they're identical. I'm not understanding why creating and 
executing the functions works on the publisher but not on the 
subscriber. What additional info would the subscriber need to be able to 
successfully run these functions? Would we need to pass in some 
additional context, e.g. what the search_path was at the time the 
publisher created the function?


Thanks,

Jonathan

[1] https://plv8.github.io/
[2] https://plv8.github.io/#-code-plv8-find_function-code-




Re: Ignoring BRIN for HOT updates (was: -udpates seems broken)

2023-02-18 Thread Matthias van de Meent
On Thu, 16 Jun 2022 at 15:05, Tomas Vondra
 wrote:
>
> I've pushed the revert. Let's try again for PG16.

As we discussed in person at the developer meeting, here's a patch to
try again for PG16.

It combines the committed patches with my fix, and adds some
additional comments and polish. I am confident the code is correct,
but not that it is clean (see the commit message of the patch for
details).

Kind regards,

Matthias van de Meent

PS. I'm adding this to the commitfest

Original patch thread:
https://www.postgresql.org/message-id/flat/CAFp7QwpMRGcDAQumN7onN9HjrJ3u4X3ZRXdGFT0K5G2JWvnbWg%40mail.gmail.com

Other relevant thread:
https://www.postgresql.org/message-id/flat/CA%2BTgmoZOgdoAFH9HatRwuydOZkMdyPi%3D97rNhsu%3DhQBBYs%2BgXQ%40mail.gmail.com
From 402a07d45b9aae70f8a01edcce059eaa13783360 Mon Sep 17 00:00:00 2001
From: Tomas Vondra 
Date: Tue, 30 Nov 2021 19:15:14 +0100
Subject: [PATCH v1] Ignore BRIN indexes when checking for HOT updates

When determining whether an index update may be skipped by using HOT, we
can ignore attributes indexed by block summarizing indexes without
references to individual tuples that need to be cleaned up.

This also removes rd_indexattr list, and replaces it with rd_attrsvalid
flag. The list was not used anywhere, and a simple flag is sufficient.

A new type TU_UpdateIndexes is invented to provide a signal to the executor
to determine which indexes to update - no indexes, all indexes, or only
the summarizing indexes.

One otherwise unused bit in the heap tuple header is (ab)used to signal
that the HOT update would still update at least one summarizing index.
The bit is cleared immediately

Original patch by Josef Simanek, various fixes and improvements by
Tomas Vondra and me.

Authors: Josef Simanek, Tomas Vondra, Matthias van de Meent
Reviewed-by: Tomas Vondra, Alvaro Herrera
---
 doc/src/sgml/indexam.sgml |  13 +++
 src/backend/access/brin/brin.c|   1 +
 src/backend/access/gin/ginutil.c  |   1 +
 src/backend/access/gist/gist.c|   1 +
 src/backend/access/hash/hash.c|   1 +
 src/backend/access/heap/heapam.c  |  15 ++-
 src/backend/access/heap/heapam_handler.c  |  21 +++-
 src/backend/access/nbtree/nbtree.c|   1 +
 src/backend/access/spgist/spgutils.c  |   1 +
 src/backend/access/table/tableam.c|   2 +-
 src/backend/catalog/index.c   |   9 +-
 src/backend/catalog/indexing.c|  29 -
 src/backend/commands/copyfrom.c   |   5 +-
 src/backend/commands/indexcmds.c  |  10 +-
 src/backend/executor/execIndexing.c   |  37 --
 src/backend/executor/execReplication.c|   9 +-
 src/backend/executor/nodeModifyTable.c|  13 ++-
 src/backend/nodes/makefuncs.c |   4 +-
 src/backend/utils/cache/relcache.c|  62 +++---
 src/include/access/amapi.h|   2 +
 src/include/access/htup_details.h |  29 +
 src/include/access/tableam.h  |  19 ++-
 src/include/executor/executor.h   |   3 +-
 src/include/nodes/execnodes.h |   2 +
 src/include/nodes/makefuncs.h |   4 +-
 src/include/utils/rel.h   |   4 +-
 src/include/utils/relcache.h  |   5 +-
 .../modules/dummy_index_am/dummy_index_am.c   |   1 +
 src/test/regress/expected/stats.out   | 110 ++
 src/test/regress/sql/stats.sql|  82 -
 30 files changed, 431 insertions(+), 65 deletions(-)

diff --git a/doc/src/sgml/indexam.sgml b/doc/src/sgml/indexam.sgml
index 4f83970c85..897419ec95 100644
--- a/doc/src/sgml/indexam.sgml
+++ b/doc/src/sgml/indexam.sgml
@@ -127,6 +127,9 @@ typedef struct IndexAmRoutine
 boolamcaninclude;
 /* does AM use maintenance_work_mem? */
 boolamusemaintenanceworkmem;
+/* does AM summarize tuples, with at least all tuples in the block
+ * summarized in one summary */
+boolamsummarizing;
 /* OR of parallel vacuum flags */
 uint8   amparallelvacuumoptions;
 /* type of data stored in index, or InvalidOid if variable */
@@ -247,6 +250,16 @@ typedef struct IndexAmRoutine
null, independently of amoptionalkey.
   
 
+  
+   The amsummarizing flag indicates whether the
+   access method summarizes the indexed tuples, with summarizing granularity
+   of at least per block.
+   Access methods that do not point to individual tuples, but to block ranges
+   (like BRIN), may allow the HOT optimization
+   to continue. This does not apply to attributes referenced in index
+   predicates; an update of such an attribute always disables HOT.
+  
+
  
 
  
diff --git a/src/backend/access/brin/brin.c b/src/backend/access/brin/brin.c
index de1427a1e0..c4bdd0e7b0 100644
--- a/src/backend/access/brin/brin.c
+++ b/src/backend/access/brin/brin.c
@@ -109,6 +109,7 @@ brinhandler(PG_FUNCTION_ARGS)
 

Re: pg_upgrade and logical replication

2023-02-18 Thread Julien Rouhaud
On Sat, Feb 18, 2023 at 04:12:52PM +0530, Amit Kapila wrote:
> On Sat, Feb 18, 2023 at 11:21 AM Julien Rouhaud  wrote:
> >
> > > Now, I think it would be a bit tricky if the user already has a
> > > publication defined with FOR ALL TABLES. In that case, we probably
> > > need some way to specify FOR ALL TABLES EXCEPT (list of tables) which
> > > we currently don't have.
> >
> > Yes, and note that I rely on FOR ALL TABLES for my original physical to
> > logical use case.
> >
>
> Okay, but if we would have functionality like EXCEPT (list of tables),
> one could do ALTER PUBLICATION .. before doing REFRESH on the
> subscriber-side.

Honestly I'm not a huge fan of this approach.  It feels hacky to have such a
feature, and doesn't even solve the problem on its own as you still lose
records when reactivating the subscription unless you also provide an ALTER
SUBSCRIPTION ENABLE WITH (refresh = true, copy_data = false), which will
probably require different defaults than the rest of the ALTER SUBSCRIPTION
subcommands that handle a refresh.

> > > > Indeed, but it's barely saying "It is then up to the user to
> > > > reactivate the subscriptions in a suitable way" and "It might also
> > > > be appropriate to truncate the target tables before initiating a new
> > > > full table copy".  As I mentioned, I don't think there's a suitable
> > > > way to reactivate the subscription, at least if you don't want to
> > > > miss some records, so truncating all target tables is the only fully
> > > > safe way to proceed.  It seems quite silly to have to do so just
> > > > because pg_upgrade doesn't retain the list of relations per
> > > > subscription.
> > > >
> > >
> > > I also don't know if there is any other safe way for newly added
> > > tables apart from the above suggestion to create separate publications
> > > but that can work only in specific cases.
> >
> > I might be missing something, but what could go wrong if pg_upgrade
> > could emit a bunch of commands like:
> >
> > ALTER SUBSCRIPTION subname ADD RELATION relid STATE 'x' LSN 'X/Y';
> >
>
> How will we know the STATE and LSN of each relation?

In the pg_subscription_rel catalog of the upgraded server?  I didn't look in
detail at how that information is updated, but I'm assuming that if logical
replication survives a database restart it shouldn't be a problem to also
fully dump it during pg_upgrade.
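
For illustration, something along these lines shows where that information
lives (not necessarily the query pg_upgrade would use):

SELECT s.subname, sr.srrelid::regclass AS relation,
       sr.srsubstate AS state, sr.srsublsn AS lsn
FROM pg_subscription_rel sr
JOIN pg_subscription s ON s.oid = sr.srsubid;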

> But even if we
> know that, what is the guarantee that the publisher side has still
> retained the corresponding slots?

No guarantee, but if you're just doing a pg_upgrade of a logical replica, why
would you drop the replication slot?  In any case the warning you mentioned in
the pg_dump documentation would still apply and you would have to re-enable it as
needed; the only difference is that you would actually be able to keep your
logical replication after a pg_upgrade if you need to.  If you dropped the
replication slot on the publisher side, then simply remove the publications on
the upgraded node too, or create a new one, exactly as you would do with the
current pg_upgrade workflow.




Re: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher

2023-02-18 Thread Peter Smith
On Fri, Feb 17, 2023 at 5:57 PM Peter Smith  wrote:
>
> FYI, I accidentally left this (v23) patch's TAP test
> t/032_subscribe_use_index.pl still lurking even after removing all
> other parts of this patch.
>
> In this scenario, the t/032 test gets stuck (build of latest HEAD)
>
> IIUC the patch is only meant to affect performance, so I expected this
> 032 test to work regardless of whether the rest of the patch is
> applied.
>
> Anyway, it hangs every time for me. I didn't dig into the cause, but if
> it requires patched code for this new test to pass, I thought it
> indicated something wrong either with the test or something more
> sinister the new test has exposed. Maybe I am mistaken.
>

Sorry, probably the above was a false alarm. After a long time
(minutes) the stuck test did eventually time out with:

t/032_subscribe_use_index.pl ... # poll_query_until timed out
executing this query:
# select (idx_scan = 1) from pg_stat_all_indexes where indexrelname =
'test_replica_id_full_idx';
# expecting this output:
# t
# last actual query output:
# f
# with stderr:
t/032_subscribe_use_index.pl ... Dubious, test returned 29 (wstat
7424, 0x1d00)

--
Kind Regards,
Peter Smith.
Fujitsu Australia




Re: Weird failure with latches in curculio on v15

2023-02-18 Thread Andres Freund
Hi,

On 2023-02-18 15:51:06 +0530, Robert Haas wrote:
> On Thu, Feb 16, 2023 at 10:02 PM Andres Freund  wrote:
> > But there's nothing inherent in that. We know for certain which files we're
> > going to archive. And we don't need to work one-by-one. The archiver could
> > just start multiple subprocesses at the same time.
> 
> But what if it doesn't want to start multiple processes, just
> multiplex within a single process?

To me that seems even simpler? Nothing but the archiver is supposed to create
.done files, and nothing is supposed to remove .ready files without the archiver
having created the .done files.  So the archiver process can scan
archive_status until it's done or until N archives have been collected, and
then process them at once?  Only the creation of the .done files would be
serial, but I don't think that's commonly a problem (and could be optimized as
well, by creating multiple files and then fsyncing them in a second pass,
avoiding N filesystem journal flushes).

Maybe I am misunderstanding what you see as the problem?

Greetings,

Andres Freund




Re: occasional valgrind reports for handle_sig_alarm on 32-bit ARM

2023-02-18 Thread Andres Freund
Hi,

On 2023-02-18 13:56:38 +0100, Tomas Vondra wrote:
> or (somewhat weird)
> 
> ==23734== Use of uninitialised value of size 4
> ==23734==at 0x88DDC8: handle_sig_alarm (timeout.c:457)
> ==23734==by 0x: ???
> ==23734==  Uninitialised value was created by a stack allocation
> ==23734==at 0x64CE2C: EndCommand (dest.c:167)
> ==23734==
> {
>
>Memcheck:Value4
>fun:handle_sig_alarm
>obj:*
> }

I'd try using valgrind's --vgdb-error=1, and inspecting the state.

I assume this is without specifying --read-var-info=yes? Might be worth
trying, sometimes the increased detail can be really helpful.


It's certainly interesting that the error happens in timeout.c:457 - currently
that's the end of the function. And dest.c:167 is the entry of EndCommand().

Perhaps there's some confusion around the state of the stack? The fact that it
looks like the function epilogue of handle_sig_alarm() uses an uninitialized
variable created by the function prologue of EndCommand() does seem to suggest
something like that.

It'd be interesting to see the exact instruction triggering the failure +
surroundings.


> It might be a valgrind issue and/or false positive, but I don't think
> I've seen such failures before, so I'm wondering if this might be due to
> some recent changes?

Have you run 32bit arm valgrind before? It'd not surprise me if there are some
32bit arm issues in valgrind, libc, or such.

Greetings,

Andres Freund




Re: Reducing connection overhead in pg_upgrade compat check phase

2023-02-18 Thread Justin Pryzby
On Fri, Feb 17, 2023 at 10:44:49PM +0100, Daniel Gustafsson wrote:
> When adding a check to pg_upgrade a while back I noticed in a profile that the
> cluster compatibility check phase spends a lot of time in connectToServer.
> Some of this can be attributed to data type checks, which each run serially,
> in turn connecting to each database to run the check, and this seemed like a
> place where we can do better.

 src/bin/pg_upgrade/check.c  | 371 +++---
 src/bin/pg_upgrade/pg_upgrade.h |  28 ++-
 src/bin/pg_upgrade/version.c| 394 ++--
 3 files changed, 373 insertions(+), 420 deletions(-)

And saves 50 LOC.

The stated goal of the patch is to reduce overhead.  But it only updates
a couple functions, and there are (I think) nine functions which loop
around all DBs.  If you want to reduce the overhead, I assumed you'd
cache the DB connection for all tests ... but then I tried it, and first
ran into max_connections, and then ran into EMFILE.  Which is probably
enough to kill my idea.

But maybe the existing patch could be phrased in terms of moving all the
per-db checks from functions to data structures (which has its own
merits).  Then, there could be a single loop around DBs which executes
all the functions.  The test runner can also test the major version and
handle the textfile output.

However (as Nathan mentioned) what's currently done shows *all* the
problems of a given type - if there were 9 DBs with 99 relations with
OIDs, it'd show all of them at once.  It'd be a big step backwards to
only show problems for the first problematic DB.

But maybe that's an another opportunity to do better.  Right now, if I
run pg_upgrade, it'll show all the failing objects, but only for first
check that fails.  After fixing them, it might tell me about a 2nd
failing check.  I've never run into multiple types of failing checks,
but I do know that needing to re-run pg-upgrade is annoying (see
3c0471b5f).

You talked about improving the two data types tests, which aren't
conditional on a maximum server version.  The minimal improvement you'll
get is when only those two checks are run (like on a developer upgrade
v16=>v16).  But when more checks are run during a production upgrade
like v13=>v16, you'd see a larger gain.

I fooled around with that idea in the attached patch.  I have no
particular interest in optimizing --check for large numbers of DBs, so
I'm not planning to pursue it further, but maybe it'll be useful to you.

About your original patch:

+static DataTypesUsageChecks data_types_usage_checks[] = {
+   /*
+* Look for composite types that were made during initdb *or* belong to
+* information_schema; that's important in case information_schema was
+* dropped and reloaded.
+*
+* The cutoff OID here should match the source cluster's value of
+* FirstNormalObjectId.  We hardcode it rather than using that C #define
+* because, if that #define is ever changed, our own version's value is
+* NOT what to use.  Eventually we may need a test on the source cluster's
+* version to select the correct value.
+*/
+   {"Checking for system-defined composite types in user tables",
+"tables_using_composite.txt",

I think this might be cleaner using "named initializer" struct
initialization, rather than a comma-separated list (whatever that's
called).

Maybe instead of putting all checks into an array of
DataTypesUsageChecks, they should be defined in separate arrays, and
then an array defined with the list of checks?

+* If the check failed, terminate the umbrella status and print
+* the specific status line of the check to indicate which it was
+* before terminating with the detailed error message.
+*/
+   if (found)
+   {
+   PQfinish(conn);
 
-   base_query = psprintf("SELECT '%s'::pg_catalog.regtype AS oid",
-                         type_name);
+   report_status(PG_REPORT, "failed");
+   prep_status("%s", cur_check->status);
+   pg_log(PG_REPORT, "fatal");
+   pg_fatal("%s%s", cur_check->fatal_check, output_path);
+   }

I think this loses the message localization/translation that currently
exists.  It could be written like prep_status(cur_check->status) or
prep_status("%s", _(cur_check->status)).  And _(cur_check->fatal_check). 

-- 
Justin
>From 18f406c16e5ebeaaf4a24c5b5a57a8358a91afb4 Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Fri, 17 Feb 2023 19:51:42 -0600
Subject: [PATCH] wip: pg_upgrade data structure

---
 src/bin/pg_upgrade/check.c  | 929 ++--
 src/bin/pg_upgrade/pg_upgrade.h |  10 +-
 

Handle TEMP_CONFIG for pg_regress style tests in pg_regress.c

2023-02-18 Thread Andres Freund
Hi,

When building with meson, TEMP_CONFIG is supported for TAP tests, but doesn't
do anything for regress/isolation.

The reason for that is that meson's (and ninja's) architecture is to separate
"build setup" from the "build/test/whatever" stage, moving dynamism (and more
costly operations) to the "setup" phase.

In this case the implication is that the command line for the test isn't
re-computed dynamically. But pg_regress doesn't look at TEMP_CONFIG, it just
has a --temp-config=... parameter, that src/Makefile.global.in dynamically
adds if TEMP_CONFIG is set.

In contrast to that, TEMP_CONFIG support for tap tests is implemented in
Cluster.pm, and thus works transparently.

My inclination is to move TEMP_CONFIG support from the Makefile to
pg_regress.c. That way it's consistent across the build tools and isn't
duplicated. pg_regress already looks at a bunch of temporary variables
(e.g. PG_REGRESS_SOCK_DIR, PG_TEST_USE_UNIX_SOCKETS), so this isn't really
breaking new ground.

It can be implemented differently, e.g. by adding the parameter dynamically in
the wrapper around pg_regress, but I don't see an advantage in that.

Patch attached.

Greetings,

Andres Freund
>From 22d931aa350698e63e61c449d76b9ac2e3ee37b0 Mon Sep 17 00:00:00 2001
From: Andres Freund 
Date: Sat, 18 Feb 2023 12:20:09 -0800
Subject: [PATCH v1 1/2] Move TEMP_CONFIG into pg_regress.c

That way it works for meson as well, and gets rid of a bit of duplicated
handling of TEMP_CONFIG.
---
 src/interfaces/ecpg/test/Makefile | 6 +++---
 src/test/regress/pg_regress.c | 8 
 src/Makefile.global.in| 8 
 src/tools/msvc/vcregress.pl   | 5 -
 4 files changed, 11 insertions(+), 16 deletions(-)

diff --git a/src/interfaces/ecpg/test/Makefile b/src/interfaces/ecpg/test/Makefile
index d7a7d1d1ca5..ce4b7237b66 100644
--- a/src/interfaces/ecpg/test/Makefile
+++ b/src/interfaces/ecpg/test/Makefile
@@ -71,11 +71,11 @@ REGRESS_OPTS =  --expecteddir=$(srcdir) \
   $(EXTRA_REGRESS_OPTS)
 
 check: all
-	$(with_temp_install) ./pg_regress $(REGRESS_OPTS) --temp-instance=./tmp_check $(TEMP_CONF) --bindir= $(pg_regress_locale_flags) $(THREAD) --schedule=$(srcdir)/ecpg_schedule sql/twophase
+	$(with_temp_install) ./pg_regress $(REGRESS_OPTS) --temp-instance=./tmp_check --bindir= $(pg_regress_locale_flags) $(THREAD) --schedule=$(srcdir)/ecpg_schedule sql/twophase
 
 # Connect to the server using TCP, and add a TCP-specific test.
 checktcp: all | temp-install
-	$(with_temp_install) ./pg_regress $(REGRESS_OPTS) --temp-instance=./tmp_check $(TEMP_CONF) --bindir= $(pg_regress_locale_flags) $(THREAD) --schedule=$(srcdir)/ecpg_schedule --host=localhost sql/twophase connect/test1
+	$(with_temp_install) ./pg_regress $(REGRESS_OPTS) --temp-instance=./tmp_check --bindir= $(pg_regress_locale_flags) $(THREAD) --schedule=$(srcdir)/ecpg_schedule --host=localhost sql/twophase connect/test1
 
 installcheck: all
 	./pg_regress $(REGRESS_OPTS) --bindir='$(bindir)' $(pg_regress_locale_flags) $(THREAD) --schedule=$(srcdir)/ecpg_schedule
@@ -89,4 +89,4 @@ installcheck-prepared-txns: all
 	./pg_regress $(REGRESS_OPTS) --bindir='$(bindir)' $(pg_regress_locale_flags) $(THREAD) --schedule=$(srcdir)/ecpg_schedule sql/twophase
 
 check-prepared-txns: all | temp-install
-	$(with_temp_install) ./pg_regress $(REGRESS_OPTS) --temp-instance=./tmp_check $(TEMP_CONF) --bindir= $(pg_regress_locale_flags) $(THREAD) --schedule=$(srcdir)/ecpg_schedule sql/twophase
+	$(with_temp_install) ./pg_regress $(REGRESS_OPTS) --temp-instance=./tmp_check --bindir= $(pg_regress_locale_flags) $(THREAD) --schedule=$(srcdir)/ecpg_schedule sql/twophase
diff --git a/src/test/regress/pg_regress.c b/src/test/regress/pg_regress.c
index 6cd5998b9d7..759bf00d981 100644
--- a/src/test/regress/pg_regress.c
+++ b/src/test/regress/pg_regress.c
@@ -2032,6 +2032,14 @@ regression_main(int argc, char *argv[],
 	if (!use_unix_sockets)
 		hostname = "localhost";
 
+	/*
+	 * If the TEMP_CONFIG environment variable is set, use it. Give higher
+	 * precedence to an explicitly specified --temp-config by adding it later
+	 * to the list, as later config assignments override earlier ones.
+	 */
+	if (getenv("TEMP_CONFIG"))
+		add_stringlist_item(&temp_configs, getenv("TEMP_CONFIG"));
+
 	/*
 	 * We call the initialization function here because that way we can set
 	 * default parameters and let them be overwritten by the commandline.
diff --git a/src/Makefile.global.in b/src/Makefile.global.in
index fb3e197fc06..24656a1c74c 100644
--- a/src/Makefile.global.in
+++ b/src/Makefile.global.in
@@ -655,12 +655,6 @@ ifdef NO_LOCALE
 NOLOCALE += --no-locale
 endif
 
-# file with extra config for temp build
-TEMP_CONF =
-ifdef TEMP_CONFIG
-TEMP_CONF += --temp-config=$(TEMP_CONFIG)
-endif
-
 pg_regress_locale_flags = $(if $(ENCODING),--encoding=$(ENCODING)) $(NOLOCALE)
 pg_regress_clean_files = results/ regression.diffs regression.out tmp_check/ tmp_check_iso/ log/ output_iso/
 
@@ -671,7 +665,6 @@ pg_regress_check = 

Re: windows CI failing PMSignalState->PMChildFlags[slot] == PM_CHILD_ASSIGNED

2023-02-18 Thread Andres Freund
Hi,

On 2023-02-18 18:00:00 +0300, Alexander Lakhin wrote:
> 18.02.2023 04:06, Andres Freund wrote:
> > On 2023-02-18 13:27:04 +1300, Thomas Munro wrote:
> > How can a process whose crash we were notified of, and that has already
> > executed SQL statements, end up in MarkPostmasterChildActive()?
>
> Maybe it's just that the backend started for the money test got
> the same PID (5948) that the backend for the name test had?

I somehow mashed name and money into one test in my head... So forget what I
wrote.

That doesn't really explain the assertion though.


It's too bad that the configuration we use doesn't include
log_connections/log_disconnections. If nothing else, it makes it a lot easier
to identify problems like that. We actually do try to configure it for CI, but
it currently doesn't work for pg_regress style tests with meson. Need to fix
that. Starting a thread.



One thing that made me very suspicious when reading related code is this
remark:

bool
ReleasePostmasterChildSlot(int slot)
...
/*
 * Note: the slot state might already be unused, because the logic in
 * postmaster.c is such that this might get called twice when a child
 * crashes.  So we don't try to Assert anything about the state.
 */

That seems fragile, and potentially racy. If we somehow end up starting
another backend in between the two ReleasePostmasterChildSlot() calls, we
can end up marking a slot that newly has a process associated with it as
inactive. Once the slot has been released the first time, it can be assigned
again.


ISTM that it's not a good idea that we use PM_CHILD_ASSIGNED to signal both
that a slot has not been used yet and that it's not in use anymore. I think
that makes it quite a bit harder to find state management issues.



> A simple script that I've found [1] shows that PIDs are reused rather often
> (for me, approximately every 300 process starts in Windows 10 H2), but maybe
> under some circumstances (many concurrent processes?) PIDs can recur often
> enough to trigger that behavior.

It's definitely very aggressive in reusing pids - and it seems to
intentionally do work to keep pids small. I wonder if it'd be worth trying to
exercise this path aggressively by configuring a very low max pid on linux, in
an EXEC_BACKEND build.

Greetings,

Andres Freund




Re: PATCH: Using BRIN indexes for sorted output

2023-02-18 Thread Tomas Vondra
On 2/18/23 19:51, Justin Pryzby wrote:
> Are (any of) these patches targetting v16 ?
> 

Probably not. Maybe if there's more feedback / scrutiny, but I'm not
sure one commitfest is enough to polish the patch (especially
considering I haven't done much on the costing yet).

> typos:
> ar we - we are?
> morestly - mostly
> interstect - intersect
> 
>> + * XXX We don't sort the bins, so just do binary sort. For large number of 
>> values
>> + * this might be an issue, for small number of values a linear search is 
>> fine.
> 
> "binary sort" is wrong?
> 
>> + * only half of there ranges, thus 1/2. This can be extended to randomly
> 
> half of *these* ranges ?
> 

Thanks, I'll fix those.

>> From 7b3307c27b35ece119feab4891f03749250e454b Mon Sep 17 00:00:00 2001
>> From: Tomas Vondra 
>> Date: Mon, 17 Oct 2022 18:39:28 +0200
>> Subject: [PATCH 01/11] Allow index AMs to build and use custom statistics
> 
> I think the idea can also apply to btree - currently, correlation is
> considered to be a property of a column, but not an index.  But that
> fails to distinguish between a freshly built index, and an index with
> out of order heap references, which can cause an index scan to be a lot
> more expensive.
> 
> I implemented per-index correlation stats way back when:
> https://www.postgresql.org/message-id/flat/20160524173914.GA11880%40telsasoft.com
> 
> See also:
> https://www.postgresql.org/message-id/14438.1512499...@sss.pgh.pa.us
> 
> With my old test case:
> 
> Index scan is 3x slower than bitmap scan, but index scan is costed as
> being cheaper:
> 
> postgres=# explain analyze SELECT * FROM t WHERE i>11 AND i<55;
>  Index Scan using t_i_idx on t  (cost=0.43..21153.74 rows=130912 width=8) 
> (actual time=0.107..222.737 rows=128914 loops=1)
> 
> postgres=# SET enable_indexscan =no;
> postgres=# explain analyze SELECT * FROM t WHERE i>11 AND i<55;
>  Bitmap Heap Scan on t  (cost=2834.28..26895.96 rows=130912 width=8) (actual 
> time=16.830..69.860 rows=128914 loops=1)
> 
> If it's clustered, then the index scan is almost twice as fast, and the
> costs are more consistent with the associated time.  The planner assumes
> that the indexes are freshly built...
> 
> postgres=# CLUSTER t USING t_i_idx ;
> postgres=# explain analyze SELECT * FROM t WHERE i>11 AND i<55;
>  Index Scan using t_i_idx on t  (cost=0.43..20121.74 rows=130912 width=8) 
> (actual time=0.084..117.549 rows=128914 loops=1)
> 

Yeah, the concept of indexam statistics certainly applies to other index
types, and for btree we might collect information about correlation etc.
I haven't looked at the 2017 patch, but it seems reasonable.

regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: PATCH: Using BRIN indexes for sorted output

2023-02-18 Thread Justin Pryzby
Are (any of) these patches targetting v16 ?

typos:
ar we - we are?
morestly - mostly
interstect - intersect

> + * XXX We don't sort the bins, so just do binary sort. For large number of 
> values
> + * this might be an issue, for small number of values a linear search is 
> fine.

"binary sort" is wrong?

> + * only half of there ranges, thus 1/2. This can be extended to randomly

half of *these* ranges ?

> From 7b3307c27b35ece119feab4891f03749250e454b Mon Sep 17 00:00:00 2001
> From: Tomas Vondra 
> Date: Mon, 17 Oct 2022 18:39:28 +0200
> Subject: [PATCH 01/11] Allow index AMs to build and use custom statistics

I think the idea can also apply to btree - currently, correlation is
considered to be a property of a column, but not an index.  But that
fails to distinguish between a freshly built index, and an index with
out of order heap references, which can cause an index scan to be a lot
more expensive.

I implemented per-index correlation stats way back when:
https://www.postgresql.org/message-id/flat/20160524173914.GA11880%40telsasoft.com

See also:
https://www.postgresql.org/message-id/14438.1512499...@sss.pgh.pa.us

With my old test case:

Index scan is 3x slower than bitmap scan, but index scan is costed as
being cheaper:

postgres=# explain analyze SELECT * FROM t WHERE i>11 AND i<55;
 Index Scan using t_i_idx on t  (cost=0.43..21153.74 rows=130912 width=8) 
(actual time=0.107..222.737 rows=128914 loops=1)

postgres=# SET enable_indexscan =no;
postgres=# explain analyze SELECT * FROM t WHERE i>11 AND i<55;
 Bitmap Heap Scan on t  (cost=2834.28..26895.96 rows=130912 width=8) (actual 
time=16.830..69.860 rows=128914 loops=1)

If it's clustered, then the index scan is almost twice as fast, and the
costs are more consistent with the associated time.  The planner assumes
that the indexes are freshly built...

postgres=# CLUSTER t USING t_i_idx ;
postgres=# explain analyze SELECT * FROM t WHERE i>11 AND i<55;
 Index Scan using t_i_idx on t  (cost=0.43..20121.74 rows=130912 width=8) 
(actual time=0.084..117.549 rows=128914 loops=1)

-- 
Justin




Re: [PATCH] Add pretty-printed XML output option

2023-02-18 Thread Peter Eisentraut

On 17.02.23 23:24, Nikolay Samokhvalov wrote:


My idea was to follow the SQL standard (part 14, SQL/XML); 
unfortunately, there is no free version, but there are drafts at 
http://www.wiscorp.com/SQLStandards.html.


<XML serialize> ::= XMLSERIALIZE <left paren> [ <document or content> ]
    <XML value expression> AS <data type> [ <XML serialize version> ]
    [ <XML declaration option> ]
    [ <XML serialize indent> ] <right paren>

<XML serialize indent> ::= [ NO ] INDENT


Good find.  It would be better to use this standard syntax.
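
i.e., usage would presumably end up looking something like this (hypothetical,
since the INDENT clause isn't implemented yet):

SELECT xmlserialize(DOCUMENT '<foo><bar>42</bar></foo>'::xml AS text INDENT);
SELECT xmlserialize(CONTENT '<foo>42</foo>'::xml AS text NO INDENT);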




Re: windows CI failing PMSignalState->PMChildFlags[slot] == PM_CHILD_ASSIGNED

2023-02-18 Thread Alexander Lakhin

Hello,
18.02.2023 04:06, Andres Freund wrote:

Hi,

On 2023-02-18 13:27:04 +1300, Thomas Munro wrote:

I still have no theory for how this condition was reached despite a
lot of time thinking about it and searching for more clues.  As far as
I can tell, the recent improvements to postmaster's signal and event
handling shouldn't be related: the state management and logic was
unchanged.

Yea, it's all very odd.

If you look at the log:

2023-02-08 00:53:20.175 GMT client backend[5948] pg_regress/name DETAIL:  No valid 
identifier after ".".
2023-02-08 00:53:20.175 GMT client backend[5948] pg_regress/name STATEMENT:  
SELECT parse_ident('xxx.1020');
...
TRAP: failed Assert("PMSignalState->PMChildFlags[slot] == PM_CHILD_ASSIGNED"), File: 
"../src/backend/storage/ipc/pmsignal.c", Line: 329, PID: 5948
abort() has been called
...
2023-02-08 00:53:27.420 GMT postmaster[872] LOG:  server process (PID 5948) was 
terminated by exception 0xC354
2023-02-08 00:53:27.420 GMT postmaster[872] HINT:  See C include file 
"ntstatus.h" for a description of the hexadecimal value.
2023-02-08 00:53:27.420 GMT postmaster[872] LOG:  terminating any other active 
server processes
2023-02-08 00:53:27.434 GMT postmaster[872] LOG:  all server processes 
terminated; reinitializing


and that it's indeed the money test that failed:
  money... FAILED (test process exited with exit 
code 2) 7337 ms

it's very hard to understand how this stack can come to be:

0085`f03ffa40 7ff6`fd89faa8 ucrtbased!abort(void)+0x5a 
[minkernel\crts\ucrt\src\appcrt\startup\abort.cpp @ 77]
0085`f03ffa80 7ff6`fd6474dc postgres!ExceptionalCondition(
char * conditionName = 0x7ff6`fdd03ca8 
"PMSignalState->PMChildFlags[slot] == PM_CHILD_ASSIGNED",
char * fileName = 0x7ff6`fdd03c80 
"../src/backend/storage/ipc/pmsignal.c",
int lineNumber = 0n329)+0x78 
[c:\cirrus\src\backend\utils\error\assert.c @ 67]
0085`f03ffac0 7ff6`fd676eff 
postgres!MarkPostmasterChildActive(void)+0x7c 
[c:\cirrus\src\backend\storage\ipc\pmsignal.c @ 329]
0085`f03ffb00 7ff6`fd59aa3a postgres!InitProcess(void)+0x2ef 
[c:\cirrus\src\backend\storage\lmgr\proc.c @ 375]
0085`f03ffb60 7ff6`fd467689 postgres!SubPostmasterMain(
int argc = 0n3,
char ** argv = 0x01c6`f3814e80)+0x33a 
[c:\cirrus\src\backend\postmaster\postmaster.c @ 4962]
0085`f03ffd90 7ff6`fda0e1c9 postgres!main(
int argc = 0n3,
char ** argv = 0x01c6`f3814e80)+0x2f9 
[c:\cirrus\src\backend\main\main.c @ 192]

How can a process whose crash we were notified of, and that has already executed SQL
statements, end up in MarkPostmasterChildActive()?

Maybe it's just that the backend started for the money test got
the same PID (5948) that the backend for the name test had?
A simple script that I've found [1] shows that PIDs are reused rather often
(for me, approximately every 300 process starts in Windows 10 H2), but maybe
under some circumstances (many concurrent processes?) PIDs can recur often
enough to trigger that behavior.

[1] https://superuser.com/questions/636497/does-windows-7-reuse-process-ids




Re: windows CI failing PMSignalState->PMChildFlags[slot] == PM_CHILD_ASSIGNED

2023-02-18 Thread Justin Pryzby
On Sat, Feb 18, 2023 at 01:27:04PM +1300, Thomas Munro wrote:
> (cfbot previously wasn't interested in the past at all;
> I'd need to get my hands on the commit IDs for earlier testing but I
> can't figure out how to get those out of Cirrus or Github -- anyone
> know how?).

I wish I knew - my only suggestion is to scrape it out of "git reflog",
but that only works if you configured it to save a huge reflog or saved
the historic output of "git reflog".  Or if you don't change the branch
often, which I imagine doesn't hold true here.

-- 
Justin




occasional valgrind reports for handle_sig_alarm on 32-bit ARM

2023-02-18 Thread Tomas Vondra
Hi,

I've been running a lot of valgrind tests on 32-bit arm recently, and
from time to time I get a failure in handle_sig_alarm like this:

==13605== Use of uninitialised value of size 4
==13605==at 0x88DA98: handle_sig_alarm (timeout.c:457)
==13605==by 0x: ???
==13605==  Uninitialised value was created by a heap allocation
==13605==at 0x8A0374: MemoryContextAllocExtended (mcxt.c:1149)
==13605==by 0x86A187: DynaHashAlloc (dynahash.c:292)
==13605==by 0x86CB07: element_alloc (dynahash.c:1715)
==13605==by 0x86A9E7: hash_create (dynahash.c:611)
==13605==by 0x8A1CE3: EnablePortalManager (portalmem.c:122)
==13605==by 0x8716CF: InitPostgres (postinit.c:806)
==13605==by 0x653F63: PostgresMain (postgres.c:4141)
==13605==by 0x5651CB: BackendRun (postmaster.c:4461)
==13605==by 0x564A43: BackendStartup (postmaster.c:4189)
==13605==by 0x560663: ServerLoop (postmaster.c:1779)
==13605==by 0x55FE27: PostmasterMain (postmaster.c:1463)
==13605==by 0x4107F3: main (main.c:200)
==13605==
{
   
   Memcheck:Value4
   fun:handle_sig_alarm
   obj:*
}

or (somewhat weird)

==23734== Use of uninitialised value of size 4
==23734==at 0x88DDC8: handle_sig_alarm (timeout.c:457)
==23734==by 0x: ???
==23734==  Uninitialised value was created by a stack allocation
==23734==at 0x64CE2C: EndCommand (dest.c:167)
==23734==
{
   
   Memcheck:Value4
   fun:handle_sig_alarm
   obj:*
}

It might be a valgrind issue and/or false positive, but I don't think
I've seen such failures before, so I'm wondering if this might be due to
some recent changes?

It's pretty rare, as it depends on the timing of the signal being just
"right" (I wonder if there's a way to increase the frequency).


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: pg_upgrade and logical replication

2023-02-18 Thread Amit Kapila
On Sat, Feb 18, 2023 at 11:21 AM Julien Rouhaud  wrote:
>
> On Sat, Feb 18, 2023 at 09:31:30AM +0530, Amit Kapila wrote:
> > On Fri, Feb 17, 2023 at 9:05 PM Julien Rouhaud  wrote:
> > >
> > > I'm concerned about people not coming from physical replication.  If you
> > > just had some "normal" logical replication, you can't assume that you
> > > already have all the data from the upstream subscription.  If it was
> > > modified and a non empty table is added, you might need to copy the data
> > > of part of the tables and keep replicating for the rest.  It's hard to be
> > > sure from a user point of view, and even if you knew you have no way to
> > > express it.
> > >
> >
> > Can't the user create a separate publication for such newly added
> > tables and a corresponding new subscription on the downstream node?
>
> Yes that seems like a safe way to go, but it relies on users being very 
> careful
> if they don't want to get corrupted logical standby, and I think it's
> impossible to run any check to make sure that the subscription is adequate?
>

I can't think of any straightforward way, but one can probably take a
dump of the data on both nodes using pg_dump and then compare it.

> > Now, I think it would be a bit tricky if the user already has a
> > publication defined with FOR ALL TABLES. In that case, we probably
> > need some way to specify FOR ALL TABLES EXCEPT (list of tables) which
> > we currently don't have.
>
> Yes, and note that I rely on FOR ALL TABLES for my original physical to
> logical use case.
>

Okay, but if we would have functionality like EXCEPT (list of tables),
one could do ALTER PUBLICATION .. before doing REFRESH on the
subscriber-side.

> > >
> > > Indeed, but it's barely saying "It is then up to the user to reactivate
> > > the subscriptions in a suitable way" and "It might also be appropriate to
> > > truncate the target tables before initiating a new full table copy".  As I
> > > mentioned, I don't think there's a suitable way to reactivate the
> > > subscription, at least if you don't want to miss some records, so
> > > truncating all target tables is the only fully safe way to proceed.  It
> > > seems quite silly to have to do so just because pg_upgrade doesn't retain
> > > the list of relations per subscription.
> > >
> >
> > I also don't know if there is any other safe way for newly added
> > tables apart from the above suggestion to create separate publications
> > but that can work only in specific cases.
>
> I might be missing something, but what could go wrong if pg_upgrade could emit
> a bunch of commands like:
>
> ALTER SUBSCRIPTION subname ADD RELATION relid STATE 'x' LSN 'X/Y';
>

How will we know the STATE and LSN of each relation? But even if we
know that, what is the guarantee that the publisher side has still
retained the corresponding slots?

-- 
With Regards,
Amit Kapila.




Re: Weird failure with latches in curculio on v15

2023-02-18 Thread Robert Haas
On Thu, Feb 16, 2023 at 10:02 PM Andres Freund  wrote:
> But there's nothing inherent in that. We know for certain which files we're
> going to archive. And we don't need to work one-by-one. The archiver could
> just start multiple subprocesses at the same time.

But what if it doesn't want to start multiple processes, just
multiplex within a single process?

> What I was trying to point out was that the work a "restorer" process has to
> do is more speculative, because we don't know when we'll promote, whether
> we'll follow a timeline increase, whether the to-be-restored WAL already
> exists. That's solvable, but a bunch of the relevant work ought to be solved
> in core core code, instead of just in archive modules.

Yep, I can see that there are some things to figure out there, and I
agree that they should be figured out in the core code.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Weird failure with latches in curculio on v15

2023-02-18 Thread Robert Haas
On Thu, Feb 16, 2023 at 10:28 PM Nathan Bossart
 wrote:
> > Hmm. So in this design, the archiver doesn't really do the archiving
> > any more, because the interface makes that impossible. It has to use a
> > separate background worker process for that, full stop.
> >
> > I don't think that's a good design. It's fine if some people want to
> > implement it that way, but it shouldn't be forced by the interface.
>
> I don't think it would force you to use a background worker, but if you
> wanted to, the tools would be available.  At least, that is the intent.

I'm 100% amenable to somebody demonstrating how that is super easy,
barely an inconvenience. But I think we would need to see some code
showing at least what the API is going to look like, and ideally a
sample implementation, in order for me to be convinced of that. What I
suspect is that if somebody tries to do that they are going to find
that the core API has to be quite opinionated about how the archive
module has to do things, which I think is not what we want. But if
that turns out to be false, cool!

-- 
Robert Haas
EDB: http://www.enterprisedb.com




questions about possible enhancing protocol of communication between psql and pager

2023-02-18 Thread Pavel Stehule
Hi

I am starting to think about the next generation of pspg (
https://github.com/okbob/pspg).

Now, the communication between psql and pspg is very simple. psql reads all
the data, formats it into the pretty table format, and sends it through a pipe
to pspg. pspg stores all the data and tries to detect the header lines, which
are used to identify the column widths. The information about the number of
header rows and about the column widths is used for printing fixed or movable
data.

It is working surprisingly well, but there are limits.

1. In some cases it can be slow - you can try \x and select * from pg_proc.
The formatted table can be full of spaces, and the formatting can be slow, and
so can passing it through the pipe. The difference is 2 versus 20 seconds.

2. It cannot work when FETCH_COUNT is nonzero.

Passing data to the pager in CSV format can be very significantly faster.
Processing CSV data can be much more robust than processing the tabular format,
which depends on a lot of pset settings. Unfortunately, psql doesn't send all
the information in CSV: there is no information about the data types used, and
there is no title. There is also no info about the formatting settings wanted
from psql - so the user's comfort is less than it could be.

It would be nice (from my perspective) if pspg could read some metadata about the
result. The question is - how to do it?

There are three possibilities:

a) psql sends some control data through the pipe. Currently we use only a text
protocol, but we can use some ASCII control chars, so it is not a problem to
detect the start of the header, the start of the data, and possibly the end of
the data.

b) psql can send data as it does now, but before starting the pager it can fill
in some environment variables. The pager can read these variables - like
PSQL_PAGER_SETTING, PSQL_PAGER_DATADESC, ...

c) we can introduce a new custom format (it can be named "pspg") - it can be
based on CSV or TSV, where the first part is a data description, followed by
the data, and it can be ended by some special flag.

What do you think about the described possibilities?

regards

Pavel


Re: Move defaults toward ICU in 16?

2023-02-18 Thread Robert Haas
On Fri, Feb 17, 2023 at 10:32 PM Jeff Davis  wrote:
> Thinking about this more, it's not clear to me if this would be in
> scope for pg_upgrade or not. If pg_upgrade is fixing up the new cluster
> rather than checking for compatibility, why doesn't it just take over
> and do the initdb for the new cluster itself? That would be less
> confusing for users, and avoid some weirdness (like, if you drop the
> database "postgres" on the original, why does it reappear after an
> upgrade?).
>
> Someone might want to do something interesting to the new cluster
> before the upgrade, but it's not clear from the docs what would be both
> useful and safe.

I agree with all of this. I think it would be fantastic if pg_upgrade
did the initdb itself. It would be simple to make this optional
behavior, too: if the destination directory does not exist or is
empty, initdb into it, otherwise skip that. That might be too
automagical, so we could add a --no-initdb option. If not
specified, the destination directory must either not exist or be
empty; else it must exist and look like a data directory.

I completely concur with the idea that doing something with the new
cluster before the upgrade is weird, and I don't think we should
encourage people to do it. Nevertheless, as the threads to which
Justin linked probably say, I'm not sure that it's a good idea to
completely slam the door shut on that option. If we did want to move
in that direction, though, having pg_upgrade do the initdb would be an
excellent first step. We could at some later time decide to remove the
--no-initdb option; or maybe we'll decide that it's good to keep it
for emergencies, which is my present bias. In any event, the resulting
system would be more usable and less error-prone than what we have
today.

-- 
Robert Haas
EDB: http://www.enterprisedb.com