Re: WIN32 pg_import_system_collations

2022-01-24 Thread Dmitry Koval

Hi Juan José,

I tested this feature a bit and have a small doubt about this block:

+/*
+ * Windows will use hyphens between language and territory, where POSIX
+ * uses an underscore. Simply make it POSIX looking.
+ */
+ hyphen = strchr(localebuf, '-');
+ if (hyphen)
+*hyphen = '_';

After this block modified collation name is used in function

GetNLSVersionEx(COMPARE_STRING, wide_collcollate, &version)

(see win32_read_locale() -> CollationFromLocale() -> CollationCreate()
call). Is it correct to use (wide_collcollate = "en_NZ") instead of
(wide_collcollate = "en-NZ") in GetNLSVersionEx() function?

1) Documentation [1], [2], quote:
If it is a neutral locale for which the script is significant,
the pattern is 

Foreign join search stops on the first try

2022-01-24 Thread Alexander Pyhalov

Hi.

I was investigating why a foreign join was not happening and found the 
following issue. When we search for a foreign join path, we stop on the 
first try. For example, in the case of
A JOIN B JOIN C, where all of them are foreign tables on the same server, 
we first estimate the A JOIN B, A JOIN C and B JOIN C costs. Then we 
select one of them (let's say A JOIN B) and estimate the (A JOIN B) JOIN C 
cost. At this stage we fill in joinrel->fdw_private in joinrel (A, B, 
C). Now we could look at A JOIN (B JOIN C), but we don't, as
in contrib/postgres_fdw/postgres_fdw.c:5962 (in 
postgresGetForeignJoinPaths()) we have:


/*
 * Skip if this join combination has been considered already.
 */
if (joinrel->fdw_private)
return;

This can lead to a situation where the A JOIN B JOIN C cost is not correctly 
estimated (as we don't look at other join orders) and the foreign join is 
not pushed down when it would be efficient.


The attached patch tries to fix this. Now we collect different join paths 
for joinrels, if the join is possible at all. However, as we don't know 
which path is really selected, we can't precisely estimate fpinfo costs.
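
To illustrate the direction (this is only a sketch of the idea, not the
actual patch; pushdown_safe is the existing PgFdwRelationInfo field):

/*
 * Sketch: rather than skipping any joinrel that already has fdw_private,
 * bail out only when a previous attempt proved the join unsafe to push
 * down, and otherwise keep considering further join orders.
 */
if (joinrel->fdw_private)
{
    PgFdwRelationInfo *fpinfo = (PgFdwRelationInfo *) joinrel->fdw_private;

    if (!fpinfo->pushdown_safe)
        return;     /* known to be unsafe, nothing more to add */
    /* otherwise fall through and build a path for this join order too */
}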


The following example shows a case where we fail to push down a foreign 
join due to the mentioned issue.


create extension postgres_fdw;

DO $d$
BEGIN
EXECUTE $$CREATE SERVER loopback FOREIGN DATA WRAPPER 
postgres_fdw

OPTIONS (dbname '$$||current_database()||$$',
 port '$$||current_setting('port')||$$'
)$$;
END
$d$;

create user MAPPING FOR PUBLIC SERVER loopback ;

CREATE SCHEMA test;

CREATE TABLE test.district (
d_id smallint NOT NULL PRIMARY KEY,
d_next_o_id integer NOT NULL,
d_name varchar(100) NOT NULL
);

CREATE TABLE test.stock (
s_i_id integer NOT NULL PRIMARY KEY,
s_quantity smallint NOT NULL,
s_data varchar(100) NOT NULL
);

CREATE TABLE test.order_line (
ol_o_id integer NOT NULL PRIMARY KEY,
ol_i_id integer NOT NULL,
ol_d_id smallint NOT NULL
);

import foreign schema test from server loopback into public ;

INSERT INTO district SELECT i, 2 + 20*i  , 'test' FROM 
generate_series(1,10) i;
INSERT INTO stock SELECT i, round(random()*100) + 10, 'test data' FROM 
generate_series(1,1) i;
INSERT INTO order_line SELECT i, i%1, i%10   FROM 
generate_series(1,3) i;


analyze test.order_line, test.district ,test.stock;
analyze order_line, district ,stock;


Without patch:

explain analyze verbose SELECT * FROM order_line, stock, district WHERE 
ol_d_id = 1 AND d_id = 1 AND (ol_o_id < d_next_o_id) AND ol_o_id >= 
(d_next_o_id - 20) AND s_i_id = ol_i_id AND s_quantity < 11;
 
                                                          QUERY PLAN
----------------------------------------------------------------------------------------------------------------------
 Hash Join  (cost=382.31..966.86 rows=2 width=37) (actual 
time=5.459..5.466 rows=0 loops=1)
   Output: order_line.ol_o_id, order_line.ol_i_id, order_line.ol_d_id, 
stock.s_i_id, stock.s_quantity, stock.s_data, district.d_id, 
district.d_next_o_id, district.d_name

   Hash Cond: (order_line.ol_i_id = stock.s_i_id)
   ->  Foreign Scan  (cost=100.00..683.29 rows=333 width=21) (actual 
time=2.773..2.775 rows=2 loops=1)
 Output: order_line.ol_o_id, order_line.ol_i_id, 
order_line.ol_d_id, district.d_id, district.d_next_o_id, district.d_name

 Relations: (public.order_line) INNER JOIN (public.district)
 Remote SQL: SELECT r1.ol_o_id, r1.ol_i_id, r1.ol_d_id, r3.d_id, 
r3.d_next_o_id, r3.d_name FROM (test.order_line r1 INNER JOIN 
test.district r3 ON (((r1.ol_o_id < r3.d_next_o_id)) AND ((r1.ol_o_id >= 
(r3.d_next_o_id - 20))) AND ((r3.d_id = 1)) AND ((r1.ol_d_id = 1
   ->  Hash  (cost=281.42..281.42 rows=71 width=16) (actual 
time=2.665..2.665 rows=48 loops=1)

 Output: stock.s_i_id, stock.s_quantity, stock.s_data
 Buckets: 1024  Batches: 1  Memory Usage: 11kB
 ->  Foreign Scan on public.stock  (cost=100.00..281.42 rows=71 
width=16) (actual time=2.625..2.631 rows=48 loops=1)

   Output: stock.s_i_id, stock.s_quantity, stock.s_data
   Remote SQL: SELECT s_i_id, s_quantity, s_data FROM 
test.stock WHERE ((s_quantity < 11))

 Planning Time: 1.812 ms
 Execution Time: 8.534 ms
(15 rows)

With patch:

explain analyze verbose SELECT * FROM order_line, stock, district WHERE 
ol_d_id = 1 AND d_id = 1 AND (ol_o_id < d_next_o_id) AND ol_o_id >= 
(d_next_o_id - 20) AND s_i_id = ol_i_id AND s_quantity < 11;
 
 
  QUERY PLAN



Re: GUC flags

2022-01-24 Thread Michael Paquier
On Mon, Jan 24, 2022 at 07:07:29PM -0600, Justin Pryzby wrote:
> I think you'll find that this is how it's done elsewhere in postgres.
> In the frontend, see appendPQExpBufferChar and appendPGArray and 3e6e86abc.
> On the backend, see: git grep -F "'{'" |grep -w appendStringInfoChar

Yeah, I was not careful enough to look at the other uses of TEXTARRAYOID,
and there is one in the same area, as of config_enum_get_options().
At least things are consistent this way.

> I updated the patch with a regex to accommodate GUCs without '=', as needed
> since f47ed79cc8.

Okay.  While looking at your proposal, I was thinking that we had
better include the array with the flags by default in pg_settings, and
not just pg_show_all_settings().

+SELECT lower(name) FROM pg_settings_flags WHERE NOT not_in_sample EXCEPT
+SELECT regexp_replace(ln, '^#?([_[:alpha:]]+) (= .*|[^ ]*$)', '\1') AS guc
+FROM (SELECT regexp_split_to_table(pg_read_file('postgresql.conf'),
'\n') AS ln) conf

Tests reading postgresql.conf would break on instances started with a
custom config_file provided on the command line, no?  You could change
the patch to use the value provided by the GUC instead, but I am not
convinced that we need that at all, even if check_guc does so.

Regarding the tests, I am not sure that we need to be this
extensive.  We could take it slow, and I am also wondering if this
could not cause some issues with GUCs loaded via
shared_preload_libraries if we are too picky about the requirements,
as this could cause installcheck failures.

The following things have been issues recently, though, and they look
sensible enough to have checks for: 
- GUC_NOT_IN_SAMPLE with developer options.
- Query-tuning parameters with GUC_EXPLAIN, and we'd better add some
comments in the test to explain why there are exceptions like
default_statistics_target.
- Preset parameters marked as runtime-computed.
- NO_SHOW_ALL and NOT_IN_SAMPLE.

Thanks,
--
Michael




Re: Fix BUG #17335: Duplicate result rows in Gather node

2022-01-24 Thread David Rowley
On Tue, 25 Jan 2022 at 17:35, Yura Sokolov  wrote:
> And another attempt to fix tests volatility.

FWIW, I had not really seen the point in adding a test for this.   I
did however see a point in it with your original patch. It seemed
useful there to verify that Gather and GatherMerge did what we
expected with 1 worker.

David




Support escape sequence for cluster_name in postgres_fdw.application_name

2022-01-24 Thread Fujii Masao

Hi,

Commit 6e0cb3dec1 allowed postgres_fdw.application_name to include the escape 
sequences %a (application name), %d (database name), %u (user name) and %p 
(pid). In addition to them, I'd like to support an escape sequence (e.g., %C) 
for the cluster name there. This escape sequence is helpful for investigating 
where each remote transaction came from. Thoughts?

Patch attached.

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION

diff --git a/contrib/postgres_fdw/option.c b/contrib/postgres_fdw/option.c
index fc3ce6a53a..4120c21d09 100644
--- a/contrib/postgres_fdw/option.c
+++ b/contrib/postgres_fdw/option.c
@@ -489,6 +489,9 @@ process_pgfdw_appname(const char *appname)
 			case 'a':
 				appendStringInfoString(&buf, application_name);
 				break;
+			case 'C':
+				appendStringInfoString(&buf, cluster_name);
+				break;
 			case 'd':
 				appendStringInfoString(&buf, MyProcPort->database_name);
 				break;
diff --git a/doc/src/sgml/postgres-fdw.sgml b/doc/src/sgml/postgres-fdw.sgml
index 2bb31f1125..d7f545958c 100644
--- a/doc/src/sgml/postgres-fdw.sgml
+++ b/doc/src/sgml/postgres-fdw.sgml
@@ -983,6 +983,13 @@ postgres=# SELECT postgres_fdw_disconnect_all();
      <entry>%a</entry>
      <entry>Application name on local server</entry>
     </row>
+    <row>
+     <entry>%C</entry>
+     <entry>
+      Cluster name in local server
+      (see <xref linkend="guc-cluster-name"/> for details)
+     </entry>
+    </row>
     <row>
      <entry>%u</entry>
      <entry>User name on local server</entry>
diff --git a/src/include/utils/guc.h b/src/include/utils/guc.h
index 6bb81707b0..f1bfe79feb 100644
--- a/src/include/utils/guc.h
+++ b/src/include/utils/guc.h
@@ -271,7 +271,7 @@ extern int  temp_file_limit;
 
 extern int num_temp_buffers;
 
-extern char *cluster_name;
+extern PGDLLIMPORT char *cluster_name;
 extern PGDLLIMPORT char *ConfigFileName;
 extern char *HbaFileName;
 extern char *IdentFileName;


Re: CREATEROLE and role ownership hierarchies

2022-01-24 Thread Fujii Masao




On 2022/01/25 8:18, Mark Dilger wrote:
> On Jan 24, 2022, at 2:21 PM, Stephen Frost wrote:
>> Superuser is a problem specifically because it gives people access to do
>> absolutely anything, both for security and safety concerns. Disallowing a
>> way to curtail that same risk when it comes to role ownership invites
>> exactly those same problems.
>
> Before the patch, users with CREATEROLE can do mischief.  After the patch,
> users with CREATEROLE can do mischief.  The difference is that the mischief
> that can be done after the patch is a proper subset of the mischief that
> can be done before the patch.  (Counter-examples highly welcome.)
>
> Specifically, I claim that before the patch, non-superuser "bob" with
> CREATEROLE can interfere with *any* non-superuser.  After the patch,
> non-superuser "bob" with CREATEROLE can interfere with *some*
> non-superusers; specifically, with non-superusers he created himself, or
> which have had ownership transferred to him.
>
> Restricting the scope of bob's mischief is a huge win, in my view.

+1

One of "mischiefs" I'm thinking problematic is that users with CREATEROLE can 
give any predefined role that they don't have, to other users including themselves. For 
example, users with CREATEROLE can give pg_execute_server_program to themselves and run 
any OS commands by COPY PROGRAM. This would be an issue when providing something like 
PostgreSQL cloud service that wants to prevent end users from running OS commands but 
allow them to create/drop roles. Does the proposed patch fix also this issue?

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION




Design of pg_stat_subscription_workers vs pgstats

2022-01-24 Thread Andres Freund
Hi,

I was looking at the shared memory stats patch again, the rebase of which
collided fairly heavily with the addition of pg_stat_subscription_workers.

I'm concerned about the design of pg_stat_subscription_workers. The view was
introduced in


commit 8d74fc96db5fd547e077bf9bf4c3b67f821d71cd
Author: Amit Kapila 
Date:   2021-11-30 08:54:30 +0530

Add a view to show the stats of subscription workers.

This commit adds a new system view pg_stat_subscription_workers, that
shows information about any errors which occur during the application of
logical replication changes as well as during performing initial table
synchronization. The subscription statistics entries are removed when the
corresponding subscription is removed.

It also adds an SQL function pg_stat_reset_subscription_worker() to reset
single subscription errors.

The contents of this view can be used by an upcoming patch that skips the
particular transaction that conflicts with the existing data on the
subscriber.

This view can be extended in the future to track other xact related
statistics like the number of xacts committed/aborted for subscription
workers.

Author: Masahiko Sawada
Reviewed-by: Greg Nancarrow, Hou Zhijie, Tang Haiying, Vignesh C, Dilip 
Kumar, Takamichi Osumi, Amit Kapila
Discussion: 
https://postgr.es/m/CAD21AoDeScrsHhLyEPYqN3sydg6PxAPVBboK=30xjfuvihn...@mail.gmail.com


I tried to skim-read the discussion leading to its introduction, but it's
extraordinarily long: 474 messages in [1], 131 messages in [2], as well as a
few other associated threads.


From the commit message alone I am concerned that this appears to be intended
to be used to store important state in pgstats, for which pgstats is
fundamentally unsuitable (pgstat can lose state during normal operation,
always loses state during crash restarts, and the state can be reset).

I don't really understand the name "pg_stat_subscription_workers" - what
workers are stats kept about, exactly? The columns don't obviously refer to a
single worker or such. From the contents it should be named
pg_stat_subscription_table_stats or such. But no, that's not quite right,
because apply errors are stored per-subscription, while the initial sync
stuff is per-(subscription, table).

The pgstat entries are quite wide (292 bytes), because of the error message
stored. That's nearly twice the size of PgStat_StatTabEntry. And as far as I
can tell, once there was an error, we'll just keep the stats entry around
until the subscription is dropped.  And that includes stats for long dropped
tables, as far as I can see - except that they're hidden from view, due to a
join to pg_subscription_rel.


To me this looks like it's using pgstat as an extremely poor IPC mechanism.


Why isn't this just storing data in pg_subscription_rel?

Greetings,

Andres Freund

[1] 
https://www.postgresql.org/message-id/CAD21AoDeScrsHhLyEPYqN3sydg6PxAPVBboK%3D30xJfUVihNZDA%40mail.gmail.com
[2] 
https://postgr.es/m/OSBPR01MB48887CA8F40C8D984A6DC00CED199%40OSBPR01MB4888.jpnprd01.prod.outlook.com




Re: Printing backtrace of postgres processes

2022-01-24 Thread vignesh C
On Mon, Jan 24, 2022 at 1:05 PM torikoshia  wrote:
>
> On 2022-01-14 19:48, Bharath Rupireddy wrote:
> > On Sat, Nov 20, 2021 at 11:50 AM Bharath Rupireddy
> >  wrote:
> >>
> >> On Fri, Nov 19, 2021 at 4:07 PM vignesh C  wrote:
> >> > The Attached v15 patch has the fixes for the same.
> >>
> >> Thanks. The v15 patch LGTM and the cf bot is happy hence marking it as
> >> RfC.
> >
> > The patch was not applying because of the recent commit [1]. I took
> > this opportunity and tried a bunch of things without modifying the
> > core logic of the pg_log_backtrace feature that Vignesh has worked on.
> >
> > I did following -  moved the duplicate code to a new function
> > CheckPostgresProcessId which can be used by pg_log_backtrace,
> > pg_log_memory_contexts, pg_signal_backend and pg_log_query_plan ([2]),
>
> Thanks for refactoring!
> I'm going to use it for pg_log_query_plan after this patch is merged.
>
> > modified the code comments, docs and tests to be more in sync with the
> > commit [1], moved two of ProcessLogBacktraceInterrupt calls (archiver
> > and wal writer) to their respective interrupt handlers. Here's the v16
> > version that I've come up with.
>
> I have some minor comments.
>
> > +
> > +You can get the file name and line number from the logged details
> > by using
> > +gdb/addr2line in linux platforms (users must ensure gdb/addr2line
> > is
> > +already installed).
> > +
> > +1)  "info line *address" from gdb on postgres executable. For example:
> > +gdb ./postgres
> > +(gdb) info line *0x71c25d
> > +Line 378 of "execMain.c" starts at address 0x71c25d
> > standard_ExecutorRun+470
> > and ends at 0x71c263
> > standard_ExecutorRun+476.
> > +OR
> > +2) Using "addr2line -e postgres address", For example:
> > +addr2line -e ./postgres 0x71c25d
> > +/home/postgresdba/src/backend/executor/execMain.c:378
> > +
> > +   
> > +
>
> Isn't it better to remove lines 1) and 2) from <synopsis>?
> I just glanced at the existing sgml, but <synopsis> seems to
> contain only code.

Modified

> > + * CheckPostgresProcessId -- check if the process with given pid is a
> > backend
> > + * or an auxiliary process.
> > + *
> > +
> > + */
>
> Isn't the 4th line needless?

Modified

> BTW, when I saw the name of this function, I thought it just checks if
> the specified pid is PostgreSQL process or not.
> Since it returns the pointer to the PGPROC or BackendId of the PID, it
> might be kind to write comments about it.

Modified

Thanks for the comments, attached v17 patch has the fix for the same.

Regards,
Vignesh
From 5859e2c34f73206cf48a40d9e429b02f923b7081 Mon Sep 17 00:00:00 2001
From: Vigneshwaran C 
Date: Tue, 25 Jan 2022 08:21:22 +0530
Subject: [PATCH v17] Add function to log the backtrace of the specified
 postgres process.

This commit adds pg_log_backtrace() function that requests to log
the backtrace of the specified backend or auxiliary process except
logger and statistic collector.

Only superusers are allowed to request to log the backtrace
because allowing any user to issue this request at an unbounded rate
would cause lots of log messages, which can lead to denial of service.

On receipt of the request, at the next CHECK_FOR_INTERRUPTS(),
the target backend logs its backtrace at LOG_SERVER_ONLY level,
so that the backtrace will appear in the server log but not
be sent to the client.

Bump catalog version.

Authors: Vignesh C, Bharath Rupireddy
---
 doc/src/sgml/func.sgml| 62 +++
 src/backend/catalog/system_functions.sql  |  2 +
 src/backend/postmaster/autovacuum.c   |  4 ++
 src/backend/postmaster/checkpointer.c |  4 ++
 src/backend/postmaster/interrupt.c|  4 ++
 src/backend/postmaster/pgarch.c   | 10 +++-
 src/backend/postmaster/startup.c  |  4 ++
 src/backend/postmaster/walwriter.c|  4 ++
 src/backend/storage/ipc/procarray.c   | 56 +
 src/backend/storage/ipc/procsignal.c  | 40 +
 src/backend/storage/ipc/signalfuncs.c | 73 ---
 src/backend/tcop/postgres.c   |  4 ++
 src/backend/utils/adt/mcxtfuncs.c | 40 +++--
 src/backend/utils/error/elog.c| 19 --
 src/backend/utils/init/globals.c  |  1 +
 src/include/catalog/pg_proc.dat   |  5 ++
 src/include/miscadmin.h   |  1 +
 src/include/storage/procarray.h   |  2 +
 src/include/storage/procsignal.h  |  3 +-
 src/include/utils/elog.h  |  2 +
 src/test/regress/expected/backtrace.out   | 49 +++
 src/test/regress/expected/backtrace_1.out | 55 +
 src/test/regress/parallel_schedule|  2 +-
 src/test/regress/sql/backtrace.sql| 33 ++
 24 files changed, 414 insertions(+), 65 deletions(-)
 create mode 100644 src/test/regress/expected/backtrace.out
 create mode 100644 src/test/regress/expected/backtrace_1.out
 create mode 100644 src/test/regress/sql/backtrace.sql

diff --git 

Re: [BUG]Update Toast data failure in logical replication

2022-01-24 Thread Amit Kapila
On Tue, Jan 25, 2022 at 12:26 AM Euler Taveira  wrote:
>
> On Mon, Jan 24, 2022, at 12:58 AM, Michael Paquier wrote:
>
> FWIW, I find the API changes of HeapDetermineModifiedColumns() and
> ExtractReplicaIdentity() a bit grotty.  Shouldn't we try to flatten
> the old tuple instead?  There are things like
> toast_flatten_tuple_to_datum() for this purpose if a tuple satisfies
> HeapTupleHasExternal(), or just heap_copy_tuple_as_datum().
>
> I checked v4 and I don't like the HeapDetermineModifiedColumns() and
> heap_tuple_attr_equals() changes either. It seems it is hijacking these
> functions to something else. I would suggest to change the signature to
>
> static void
> heap_tuple_attr_equals(TupleDesc tupdesc, int attrnum,
>HeapTuple tup1, HeapTuple tup2,
>bool *is_equal, bool *key_has_external);
>
> and
>
> static void
> HeapDetermineModifiedColumns(Relation relation, Bitmapset *interesting_cols,
>  HeapTuple oldtup, HeapTuple newtup,
>  Bitmapset *modified_attrs, bool 
> *key_has_external);
>
> I didn't figure out a cheap way to check if the key has external value other
> than slightly modifying the HeapDetermineModifiedColumns() function and its
> subroutine heap_tuple_attr_equals().
>

I am not sure if your proposal is much different compared to v4, or how
much it improves the situation. I see you didn't consider the
'check_external_attr' parameter, and I think that is important to know
if the key has any external toast value. Overall, I see your point
that the change of APIs looks a bit ugly. But, I guess that is more
due to their names and current purpose. I think it could be better if
we brought all the code of heap_tuple_attr_equals into its only caller
HeapDetermineModifiedColumns, or at least the part of the code where we
get the attr value and can determine whether the value is stored
externally. Then change the name of HeapDetermineModifiedColumns to
HeapDetermineColumnsInfo with additional parameters.
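
In other words, something along these lines (the parameter list here is
only a guess, to illustrate the suggested shape, not a worked-out API):

/*
 * Sketch: fold the per-attribute comparison into the renamed function
 * and report the extra information directly to the caller.
 */
static Bitmapset *
HeapDetermineColumnsInfo(Relation relation,
                         Bitmapset *interesting_cols,
                         HeapTuple oldtup, HeapTuple newtup,
                         bool *has_external);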

> As Alvaro said I don't think adding
> HeapTupleHasExternal() (as in v3) is a good idea because it does not optimize
> genuine cases such as a table whose PK is an integer and contains a single
> TOAST column.
>
> Another suggestion is to keep key_changed and add another attribute
> (key_has_external) to ExtractReplicaIdentity(). If we need key_changed in the
> future we'll have to decompose it again.
>

True, but we can make the required changes at that point as well.
OTOH, we can do what you are suggesting as well but I am not sure if
that is required.

-- 
With Regards,
Amit Kapila.




Re: PublicationActions - use bit flags.

2022-01-24 Thread Greg Nancarrow
On Tue, Jan 25, 2022 at 7:31 AM Peter Eisentraut <
peter.eisentr...@enterprisedb.com> wrote:
>
> Why can't GetRelationPublicationActions() have the PublicationActions as
> a return value, instead of changing it to an output argument?

That would be OK too, for now, for the current (small, typically 4-byte)
PublicationActions struct.
But if that function were extended in the future to return more publication
information than just the PublicationActions struct (and I'm seeing that in
the filtering patches [1]), then return-by-value won't be as efficient as
pass-by-reference, and I'd tend to stick with pass-by-reference in that
case.

[1]
https://postgr.es/m/OS0PR01MB5716B899A66D2997EF28A1B3945F9%40OS0PR01MB5716.jpnprd01.prod.outlook.com
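
For illustration, the two shapes being weighed here (both signatures are
sketches, not the committed API; only one of them would exist, of course):

/* Return-by-value: fine while the struct stays this small. */
PublicationActions GetRelationPublicationActions(Relation relation);

/* Pass-by-reference: scales better if more fields are returned later. */
void GetRelationPublicationActions(Relation relation,
                                   PublicationActions *actions);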

Regards,
Greg Nancarrow
Fujitsu Australia


Re: row filtering for logical replication

2022-01-24 Thread Peter Smith
A couple more comments for the v69-0001 TAP tests.

~~~

1. src/test/subscription/t/027_row_filter.pl

+# The subscription of the ALL TABLES IN SCHEMA publication means
there should be
+# no filtering on the tablesync COPY, so all expect all 5 will be present.
+$result = $node_subscriber->safe_psql('postgres', "SELECT count(x)
FROM schema_rf_x.tab_rf_x");
+is($result, qq(5), 'check initial data copy from table tab_rf_x
should not be filtered');
+
+# Similarly, normal filtering after the initial phase will also have
not effect.
+# Expected:
+# tab_rf_x   :  5 initial rows + 2 new rows = 7 rows
+# tab_rf_partition   :  1 initial row  + 1 new row  = 2 rows
+$node_publisher->safe_psql('postgres', "INSERT INTO
schema_rf_x.tab_rf_x (x) VALUES (-99), (99)");
+$node_publisher->safe_psql('postgres', "INSERT INTO
schema_rf_x.tab_rf_partitioned (x) VALUES (5), (25)");
+$node_publisher->wait_for_catchup($appname);
+$result = $node_subscriber->safe_psql('postgres', "SELECT count(x)
FROM schema_rf_x.tab_rf_x");
+is($result, qq(7), 'check table tab_rf_x should not be filtered');
+$result = $node_subscriber->safe_psql('postgres', "SELECT * FROM
public.tab_rf_partition");
+is($result, qq(20
+25), 'check table tab_rf_partition should be filtered');

That comment ("Similarly, normal filtering after the initial phase
will also have not effect.") seems no good:
- it is too vague for the tab_rf_x tablesync
- it seems completely wrong for the tab_rf_partition table (because
that filter is working fine)

I'm not sure exactly what the comment should say, but possibly
something like this (??):

BEFORE:
Similarly, normal filtering after the initial phase will also have not effect.
AFTER:
Similarly, the table filter for tab_rf_x (after the initial phase) has
no effect when combined with the ALL TABLES IN SCHEMA. Meanwhile, the
filter for the tab_rf_partition does work because that partition
belongs to a different schema (and publish_via_partition_root =
false).

~~~

2. src/test/subscription/t/027_row_filter.pl

Here is a 2nd place with the same broken comment:

+# The subscription of the FOR ALL TABLES publication means there should be no
+# filtering on the tablesync COPY, so all expect all 5 will be present.
+my $result = $node_subscriber->safe_psql('postgres', "SELECT count(x)
FROM tab_rf_x");
+is($result, qq(5), 'check initial data copy from table tab_rf_x
should not be filtered');
+
+# Similarly, normal filtering after the initial phase will also have
not effect.
+# Expected: 5 initial rows + 2 new rows = 7 rows
+$node_publisher->safe_psql('postgres', "INSERT INTO tab_rf_x (x)
VALUES (-99), (99)");
+$node_publisher->wait_for_catchup($appname);
+$result = $node_subscriber->safe_psql('postgres', "SELECT count(x)
FROM tab_rf_x");
+is($result, qq(7), 'check table tab_rf_x should not be filtered');

Here I also think the comment maybe should just say something like:

BEFORE:
Similarly, normal filtering after the initial phase will also have not effect.
AFTER:
Similarly, the table filter for tab_rf_x (after the initial phase) has
no effect when combined with the ALL TABLES IN SCHEMA.

--
Kind Regards,
Peter Smith.
Fujitsu Australia




Re: Printing backtrace of postgres processes

2022-01-24 Thread vignesh C
On Mon, Jan 24, 2022 at 10:06 PM Fujii Masao
 wrote:
>
>
>
> On 2022/01/24 16:35, torikoshia wrote:
> > On 2022-01-14 19:48, Bharath Rupireddy wrote:
> >> On Sat, Nov 20, 2021 at 11:50 AM Bharath Rupireddy
> >>  wrote:
> >>>
> >>> On Fri, Nov 19, 2021 at 4:07 PM vignesh C  wrote:
> >>> > The Attached v15 patch has the fixes for the same.
> >>>
> >>> Thanks. The v15 patch LGTM and the cf bot is happy hence marking it as 
> >>> RfC.
> >>
> >> The patch was not applying because of the recent commit [1]. I took
> >> this opportunity and tried a bunch of things without modifying the
> >> core logic of the pg_log_backtrace feature that Vignesh has worked on.
>
> I have one question about this feature. When the PID of an auxiliary
> process like the archiver is specified, probably the function always
> reports the same result, doesn't it? Because, for example, the archiver
> always logs its backtrace in HandlePgArchInterrupts().

It can be either from HandlePgArchInterrupts in pgarch_MainLoop or
from HandlePgArchInterrupts in pgarch_ArchiverCopyLoop, since the
archiver functionality is mainly in these functions.

Regards,
Vignesh




Re: pg_dump new feature: exporting functions only. Bad or good idea ?

2022-01-24 Thread Michael Paquier
On Tue, Nov 09, 2021 at 03:23:07PM +0100, Daniel Gustafsson wrote:
> Looking at this thread I think it makes sense to go ahead with this patch.  
> The
> filter functionality worked on in another thread is dealing with 
> cherry-picking
> certain objects where this is an all-or-nothing switch, so I don't think they
> are at odds with each other.

Including both procedures and functions sounds natural from here.  Now
I have a different question, something that has not been discussed in
this thread at all.  What about patterns?  Switches like --table or
--extension are able to digest a psql-like pattern to decide which
objects to dump.  Is there a reason not to have this capability for
this new switch with procedure names?  I mean to handle the case
without the function arguments, even if the same name is used by
multiple functions with different arguments.
--
Michael




RE: Support tab completion for upper character inputs in psql

2022-01-24 Thread tanghy.f...@fujitsu.com
On Monday, January 24, 2022 6:36 PM, Peter Eisentraut 
 wrote:
> The way your patch works now is that the case-insensitive behavior you
> are implementing only works if the client encoding is a single-byte
> encoding.  This isn't what downcase_identifier() does;
> downcase_identifier() always works for ASCII characters.  As it is, this
> patch is nearly useless, since very few people use single-byte client
> encodings anymore.  Also, I think it would be highly confusing if the
> tab completion behavior depended on the client encoding in a significant
> way.

Thanks for your review. I misunderstood the logic of downcase_identifier().
Modified the code to support ASCII character input. 

> Also, as I had previously suspected, your patch treats the completion of
> enum labels in a case-insensitive way (since it all goes through
> _complete_from_query()), but enum labels are not case insensitive.  You
> can observe this behavior using this test case:
> 
> +check_completion("ALTER TYPE enum1 RENAME VALUE 'F\t\t", qr|foo|, "FIXME");
> +
> +clear_line();

Your suspicion is correct. I wasn't aware that enum labels are case sensitive.
I've added this test to the TAP tests. 

> You should devise a principled way to communicate to
> _complete_from_query() whether it should do case-sensitive or
> -insensitive completion.  We already have COMPLETE_WITH() and
> COMPLETE_WITH_CS() etc. to do this in other cases, so it should be
> straightforward to adapt a similar system.

I tried adding a flag (casesensitive) to _complete_from_query().
Now the attached patch passes all the added TAP tests.

Regards,
Tang


v13-0001-Support-tab-completion-with-a-query-result-for-u.patch
Description:  v13-0001-Support-tab-completion-with-a-query-result-for-u.patch


Re: Schema variables - new implementation for Postgres 15

2022-01-24 Thread Julien Rouhaud
Hi,

On Mon, Jan 24, 2022 at 12:33:11PM +0100, Pavel Stehule wrote:
> 
> here is updated patch with locking support

Thanks for updating the patch!

While the locking is globally working as intended, I found a few problems with
it.

First, I don't think that acquiring the lock in
get_session_variable_type_typmod_collid() and prepare_variable_for_reading() is
the correct approach.  In transformColumnRef() and transformLetStmt() you first
call IdentifyVariable() to check if the given name is a variable without
locking it and later try to lock the variable if you get a valid Oid.  This is
bug prone as any other backend could drop the variable between the two calls
and you would end up with a cache lookup failure.  I think the lock should be
acquired during IdentifyVariable.  It should probably be optional as one
codepath only needs the information to raise a warning when a variable is
shadowed, so a concurrent drop isn't a problem there.
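
Something like the following is what I have in mind (the signature here is
a sketch, not the patch's actual one):

/*
 * Sketch: resolve the possibly-qualified name to a session variable and,
 * when asked to, leave an AccessShareLock on it so that it cannot be
 * dropped between identification and use.
 */
Oid
IdentifyVariable(List *names, char **attrname, bool *not_unique,
                 bool lockit);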

For prepare_variable_for_reading(), the callers are CopySessionVariable() and
GetSessionVariable().  IIUC those should take care of executor-time locks, but
shouldn't there be some changes for planning, like in AcquirePlannerLocks()?

Some other comments on this part of the patch:

@@ -717,6 +730,9 @@ RemoveSessionVariable(Oid varid)
Relationrel;
HeapTuple   tup;

+   /* Wait, when dropped variable is not used */
+   LockDatabaseObject(VariableRelationId, varid, 0, AccessExclusiveLock);

Why do you explicitly try to acquire an AEL on the variable here?
RemoveObjects / get_object_address should guarantee that this was already done.
You could add an assert LockHeldByMe() here, but no other code path do it so it
would probably waste cycles in assert builds for nothing as it's a fundamental
guarantee.


@@ -747,6 +763,9 @@ RemoveSessionVariable(Oid varid)
 * only when current transaction will be commited.
 */
register_session_variable_xact_action(varid, ON_COMMIT_RESET);
+
+   /* Release lock */
+   UnlockDatabaseObject(VariableRelationId, varid, 0, AccessExclusiveLock);
 }

Why release the lock here?  It will be released at the end of the transaction,
and you certainly don't want other backends to start using this variable in
between.  Also, since you acquired the lock a second time, this only decreases
the lock count in the locallock, so the lock isn't released anyway.

+ * Returns type, typmod and collid of session variable.
+ *
+ * As a side effect this function acquires AccessShareLock on the
+ * related session variable.
  */
 void
-get_session_variable_type_typmod_collid(Oid varid, Oid *typid, int32 *typmod, 
Oid *collid)
+get_session_variable_type_typmod_collid(Oid varid, Oid *typid, int32 *typmod, 
Oid *collid,
+   bool lock_held)


lock_held is a bit misleading.  If you keep some similar parameter for this or
another function, maybe name it lock_it or something like that instead?

Also, the comment isn't accurate and should say that an ASL is acquired iff
the parameter is true.

+   /*
+* Acquire a lock on session variable, which we won't release until commit.
+* This ensure that one backend cannot to drop session variable used by
+* second backend.
+*/

(and similar comments)
I don't think it's necessary to explain why we acquire locks, we should just
say that the lock will be kept for the whole transaction (and not until a
commit)

And while looking at nearby code, it's probably worthwhile to add an Assert in
create_sessionvars_hashtable() to validate that sessionvars htab is NULL.




Re: Skipping logical replication transactions on subscriber side

2022-01-24 Thread Amit Kapila
On Mon, Jan 24, 2022 at 7:40 PM Peter Eisentraut
 wrote:
>
> On 22.01.22 10:41, Amit Kapila wrote:
> >> Additionally, the description for pg_stat_subscription_workers should 
> >> describe what happens once the transaction represented by last_error_xid 
> >> has either been successfully processed or skipped.  Does this "last error" 
> >> stick around until another error happens (which is hopefully very rare) or 
> >> does it reset to blanks?
> >>
> > It will be reset only on subscription drop, otherwise, it will stick
> > around until another error happens.
>
> Is this going to be a problem with transaction ID wraparound?
>

I think to avoid this we can send a message to clear this (at least to
clear the XID in the view) after skipping the xact, but there is no
guarantee that it will be received by the stats collector.
Additionally, the worker can periodically (say after every N (100,
500, etc.) successful transactions) send a clear message after a
successful apply. This will ensure that eventually the error entry
will be cleared.

>  Do we
> need to use 64-bit xids for this?
>

For 64-bit XIDs, as this reported XID is for remote transactions, I
think we would need to add 4 bytes to each transaction message (say
Begin), and that could be costly for small transactions. We would also
probably need to make logical decoding aware of 64-bit XIDs; note that
XIDs in WAL records are still 32-bit. I don't think this feature
deserves such a big (in terms of WAL and network message size) change.

-- 
With Regards,
Amit Kapila.




Re: Is it correct to update db state in control file as "shutting down" during end-of-recovery checkpoint?

2022-01-24 Thread Michael Paquier
On Mon, Jan 10, 2022 at 11:04:05AM +0530, Bharath Rupireddy wrote:
> On Mon, Jan 10, 2022 at 10:58 AM Jaime Casanova
>  wrote:
>> Now; I do think that the secondd patch, the one that just skips update
>> of the state in control file, is the way to go. The other patch adds too
>> much complexity for a small return.
> 
> Thanks. Attaching the above patch.

I agree that the addition of DB_IN_END_OF_RECOVERY_CHECKPOINT is not
necessary, as the control file state will be reflected in a live server
once the instance is ready to write WAL after promotion, just as I agree
that the state stored in the control file because of the
end-of-recovery record does not reflect reality.

Now, I also find the state of CreateCheckpoint() confusing once this
patch gets applied.  The code and comments now imply that an
end-of-recovery checkpoint is a shutdown checkpoint because they
perform the same actions, which is fine.  Could it be less confusing
to remove the "shutdown" variable completely and replace those
checks with "flags"?  What the patch is doing is one step in this
direction.
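
That is, where "shutdown" is tested today, the checks could simply read
(a sketch):

if (flags & (CHECKPOINT_IS_SHUTDOWN | CHECKPOINT_END_OF_RECOVERY))
{
    /* shutdown/end-of-recovery specific processing */
}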
--
Michael




Re: preserving db/ts/relfilenode OIDs across pg_upgrade (was Re: storing an explicit nonce)

2022-01-24 Thread Shruthi Gowda
On Tue, Jan 25, 2022 at 1:14 AM Robert Haas  wrote:
>
> On Sat, Jan 22, 2022 at 2:20 AM Shruthi Gowda  wrote:
> > Agree. In the latest patch, the template0 and postgres OIDs are fixed
> > to unused manually assigned OIDs 4 and 5 respectively. These OIDs are
> > no more listed as unused OIDs.
>
> Thanks. Committed with a few more cosmetic changes.

Thanks, Robert for committing this patch.




Re: Correct error message for end-of-recovery record TLI

2022-01-24 Thread Amul Sul
On Tue, Jan 25, 2022 at 10:08 AM Michael Paquier  wrote:
>
> On Wed, Dec 01, 2021 at 07:09:34PM +, Bossart, Nathan wrote:
> > The patch no longer applied, so I went ahead and rebased it.
>
> This was on the CF stack for some time, so applied.  I have also
> changed the messages produced for the shutdown and online checkpoint
> records, as they used the same messages, so that one can get more
> context depending on the record type.

A ton of thanks for the improvement & the commit.

Regards,
Amul




Re: Correct error message for end-of-recovery record TLI

2022-01-24 Thread Michael Paquier
On Wed, Dec 01, 2021 at 07:09:34PM +, Bossart, Nathan wrote:
> The patch no longer applied, so I went ahead and rebased it.

This was on the CF stack for some time, so applied.  I have also
changed the messages produced for the shutdown and online checkpoint
records, as they used the same messages, so that one can get more
context depending on the record type.
--
Michael




Re: Fix BUG #17335: Duplicate result rows in Gather node

2022-01-24 Thread Yura Sokolov
On Mon, 24/01/2022 at 16:24 +0300, Yura Sokolov wrote:
> On Sun, 23/01/2022 at 14:56 +0300, Yura Sokolov wrote:
> > On Thu, 20/01/2022 at 09:32 +1300, David Rowley wrote:
> > > On Fri, 31 Dec 2021 at 00:14, Yura Sokolov  
> > > wrote:
> > > > Suggested quick (and valid) fix in the patch attached:
> > > > - If Append has single child, then copy its parallel awareness.
> > > 
> > > I've been looking at this and I've gone through changing my mind about
> > > what's the right fix quite a number of times.
> > > 
> > > My current thoughts are that I don't really like the fact that we can
> > > have plans in the following shape:
> > > 
> > >  Finalize Aggregate
> > >->  Gather
> > >  Workers Planned: 1
> > >  ->  Partial Aggregate
> > >->  Parallel Hash Left Join
> > >  Hash Cond: (gather_append_1.fk = gather_append_2.fk)
> > >  ->  Index Scan using gather_append_1_ix on 
> > > gather_append_1
> > >Index Cond: (f = true)
> > >  ->  Parallel Hash
> > >->  Parallel Seq Scan on gather_append_2
> > > 
> > > It's only made safe by the fact that Gather will only use 1 worker.
> > > To me, it just seems too fragile to assume that's always going to be
> > > the case. I feel like this fix just relies on the fact that
> > > create_gather_path() and create_gather_merge_path() do
> > > "pathnode->num_workers = subpath->parallel_workers;". If someone
> > > decided that was to work a different way, then we risk this breaking
> > > again. Additionally, today we have Gather and GatherMerge, but we may
> > > one day end up with more node types that gather results from parallel
> > > workers, or even a completely different way of executing plans.
> > 
> > It seems strange parallel_aware and parallel_safe flags neither affect
> > execution nor are properly checked.
> > 
> > Except parallel_safe is checked in ExecSerializePlan which is called from
> > ExecInitParallelPlan, which is called from ExecGather and ExecGatherMerge.
> > But looks like this check doesn't affect execution as well.
> > 
> > > I think a safer way to fix this is to just not remove the
> > > Append/MergeAppend node if the parallel_aware flag of the only-child
> > > and the Append/MergeAppend don't match. I've done that in the
> > > attached.
> > > 
> > > I believe the code at the end of add_paths_to_append_rel() can remain as 
> > > is.
> > 
> > I found clean_up_removed_plan_level also called from 
> > set_subqueryscan_references.
> > Is there a need to patch there as well?
> > 
> > And there is strange state:
> > - in the loop by subpaths, pathnode->node.parallel_safe is set to AND of
> >   all its subpath's parallel_safe
> >   (therefore there were need to copy it in my patch version),
> > - that means, our AppendPath is parallel_aware but not parallel_safe.
> > It is ridiculous a bit.
> > 
> > And it is strange AppendPath could have more parallel_workers than sum of
> > its children parallel_workers.
> > 
> > So it looks like whole machinery around parallel_aware/parallel_safe has
> > no enough consistency.
> > 
> > Either way, I attach you version of fix with my tests as new patch version.
> 
> Looks like the volatile "Memory Usage:" output in EXPLAIN breaks 'make check'
> sporadically.
> 
> Applied replacement in style of memoize.sql test.
> 
> Why is there no way to disable the "Buckets: %d Buffers: %d Memory Usage: %dkB"
> output in show_hash_info?

And another attempt to fix tests volatility.
From fb09491a401f0df828faf6088158f431b2a69381 Mon Sep 17 00:00:00 2001
From: Yura Sokolov 
Date: Sun, 23 Jan 2022 14:53:21 +0300
Subject: [PATCH v4] Fix duplicate result rows after Append path removal.

It could happen that an Append path is created with the "parallel_aware"
flag, but its single child is not. The Append path's parent (Gather or
Gather Merge) thinks its child is parallel_aware, but after Append path
removal Gather's child becomes not parallel_aware. Then, when
Gather/Gather Merge decides to run the child in several workers or with
worker + leader participation, it gathers duplicate result rows from
several child path invocations.

To fix it, don't remove the Append/MergeAppend node if its
parallel_aware flag does not match its single child's.

Authors: David Rowley, Sokolov Yura.
---
 src/backend/optimizer/plan/setrefs.c  |  24 +++-
 .../expected/gather_removed_append.out| 126 ++
 src/test/regress/parallel_schedule|   1 +
 .../regress/sql/gather_removed_append.sql |  94 +
 4 files changed, 241 insertions(+), 4 deletions(-)
 create mode 100644 src/test/regress/expected/gather_removed_append.out
 create mode 100644 src/test/regress/sql/gather_removed_append.sql

diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index e44ae971b4b..a7b11b7f03a 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -1512,8 +1512,16 @@ 

Re: autovacuum prioritization

2022-01-24 Thread Dilip Kumar
On Fri, Jan 21, 2022 at 12:54 AM Robert Haas  wrote:

> So at a high level, I think that what we ought to do is, first, for
> each table, estimate the time at which we think something bad will
> occur. There are several bad events: too much bloat, XID wraparound,
> MXID wraparound. We need to estimate the time at which we think each
> of those things will occur, and then take the earliest of those
> estimates. That's the time by which we need to have finished vacuuming
> the table. Then, make an estimate of how long it will take to complete
> a vacuum of the table, subtract that from the time at which we need to
> be done, and that's the time by which we need to start. The earliest
> need-to-start time is the highest priority table.


I think we need some more parameters to compare bloat vs wraparound.
I mean, in one of your examples in the 2nd paragraph we can say that
the need-to-start of table A is earlier than table B's, so it's kind of
simple.  But when it comes to wraparound vs bloat, we need to add some
weighting to compute how much bloat is considered as bad as
wraparound.  I think the amount of bloat cannot be an absolute number;
it should be relative, w.r.t. the total database size or so.  I don't
think it can be computed w.r.t. the table size, because if the table
is e.g. just 1 GB in size and is 5 times bloated, then it is not as
bad as another 1 TB table which is just 2 times bloated.

>
> A second problem is that, if the earliest need-to-start time is in the
> past, then we definitely are in trouble and had better get to work at
> once, but if it's in the future, that doesn't necessarily mean we're
> safe. If there are three tables with a need-to-finish time that is 12
> hours in the future and each of them will take 11 hours to vacuum,
> then every need-to-start time computed according to the algorithm
> above is in the future, but in fact we're in a lot of trouble. If the
> estimates are accurate, we need 3 autovacuum workers to be available
> to start within 1 hour, or we're doomed. The very last thing we want
> to do is wait another hour before doing anything. It's not impossible
> to factor this into the calculation of need-to-start times, assuming
> we know how many workers we have. For instance, if we've got tables
> whose need-to-finish times are 30, 50, and 70 minutes in the future,
> we can see that if each one takes 20 minutes or less to vacuum, then
> the need-to-start times can just be computed by subtraction. But the
> tables with 50 or 70 minute deadlines are going to take more than 20
> minutes to vacuum, then we've got to back up the need-to-start times
> so that we finish each table in time to start on the next one. I
> haven't looked into what algorithms exist for this kind of scheduling
> problem, but I feel like a literature search, or pulling out my
> college textbooks, would probably turn up some useful ideas.


I think we should be thinking of dynamically adjusting priority as
well.  It is possible that when autovacuum starts, we prioritize the
tables based on some statistics and estimation, but the vacuuming
process can take a long time, and during that time some priorities
might change.  So if, at the start of autovacuum, we push all tables
into some priority queue and simply vacuum in that order, then we
might go wrong somewhere.

I think we need to make different priority queues based on different
factors, for example one queue for wraparound risk and another for
bloat risk.  Even though there would be multiple queues, we would have
the need_to_start time with each item, so that we know exactly from
which queue to pick the next item; but dynamically, whenever picking
an item, we can recheck the priority of the item at the head of the
queue, always assuming that each queue is arranged in order of
need_to_start time.  So what if an item in the back of the queue
becomes more important than the item at the queue's head based on some
statistics?  I don't think it is wise to recompute the need_to_start
time for all the items before picking any new item.  But I think
having multiple queues based on different factors (not only wraparound
and bloat) reduces the risk of items in the back of the queue becoming
higher priority than items in front.  This cannot be completely
avoided, but it can be reduced by creating multiple work queues based
on more of the factors which can dynamically change.
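
As a rough sketch of the bookkeeping that would need (all of the names
below are invented for illustration):

typedef struct TableVacuumItem
{
    Oid         relid;
    TimestampTz need_to_finish;  /* earliest of bloat/XID/MXID deadlines */
    long        est_duration;    /* estimated vacuum run time, in ms */
    TimestampTz need_to_start;   /* need_to_finish minus est_duration */
} TableVacuumItem;

/*
 * One queue per risk factor, each ordered by need_to_start.  At pop
 * time, re-evaluate the head of each queue before committing to it,
 * since the estimates may have shifted while other tables were being
 * vacuumed.
 */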

>
> I know that this email is kind of a giant wall of text, so my thanks
> if you've read this far, and even more if you feel inspired to write
> back with your own thoughts.


Yeah it is a long email but quite interesting.

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com




Re: Catalog version access

2022-01-24 Thread Michael Paquier
On Mon, Jan 24, 2022 at 08:40:08PM +, Bossart, Nathan wrote:
> On 1/23/22, 7:31 PM, "Michael Paquier"  wrote:
>> On Mon, Aug 16, 2021 at 06:12:54PM +, Bossart, Nathan wrote:
>>> I was looking at the --check switch for the postgres binary recently
>>> [0], and this sounds like something that might fit in nicely there.
>>> In the attached patch, --check will also check the control file if one
>>> exists.
>>
>> This would not work on a running postmaster as CreateDataDirLockFile()
>> is called beforehand, but we want this capability, no?
> 
> I was not under the impression this was a requirement, based on the
> use-case discussed upthread [0].  

Hmm.  I got a different impression as of this one:
https://www.postgresql.org/message-id/3496407.1613955...@sss.pgh.pa.us
But I can see downthread that this is not the case.  Sorry for the
confusion.

>> Abusing a bootstrap option for this purpose does not look like a good
>> idea, to be honest, especially for something only used internally now
>> and undocumented, but we want to use something aimed at an external
>> use with some documentation.  Using a separate switch would be more
>> adapted IMO.
> 
> This is the opposite of what Magnus proposed earlier [1].  Do we need
> a new pg_ctl option for this?  It seems like it is really only
> intended for use by PostgreSQL developers, but perhaps there are other
> use-cases I am not thinking of.  In any case, the pg_ctl option would
> probably end up using --check (or another similar mode) behind the
> scenes.

Based on the latest state of the thread, my understanding is that we
don't want a new pg_ctl option for this feature, and using bootstrap's
--check for this purpose is not a good idea IMO.  What I take from
Magnus' suggestion is that we would add a completely different
switch. 

Once you remove the requirement of a running server, we have basically
what has been recently implemented with postgres -C for
runtime-computed GUCs, because we already go through a read of the
control file to be able to print those GUCs with their correct
values.  This also means that it is already possible to check if a
data folder is compatible with a set of binaries with this facility,
as any postgres -C command with a runtime GUC would trigger this
check.  Using any of the existing runtime GUCs may be confusing, but
that would work.  And I am not really convinced that we have any need
to add a specific GUC for this purpose, be it a sort of
is_controlfile_valid or controlfile_checksum (CRC32 of the control
file).

>> Also, I think that we should be careful with the read of
>> the control file to avoid false positives?   We can rely on an atomic
>> read/write thanks to its maximum size of 512 bytes, but this looks
>> like a lot what has been done recently with postgres -C for runtime
>> GUCs, that *require* a read of the control file before grabbing the
>> values we are interested in.
> 
> Sorry, I'm not following this one.  In my proposed patch, the control
> file (if one exists) is read after CreateDataDirLockFile(), just like
> PostmasterMain().

While looking at the patch, I was thinking about the fact that we may
want to support the case even if a server is running.  If we don't, my
worries about the concurrent control file activities are moot. 
--
Michael




RE: refactoring basebackup.c

2022-01-24 Thread Shinoda, Noriyoshi (PN Japan FSIP)
Hi, 
Thank you for committing a great feature. I have tested the committed feature. 
The attached small patch fixes the output of the --help message: the 
previous commit listed only gzip and none, and the attached patch adds 
client-gzip and server-gzip.

Regards,
Noriyoshi Shinoda
-Original Message-
From: Robert Haas  
Sent: Saturday, January 22, 2022 3:33 AM
To: Dipesh Pandit ; Michael Paquier 

Cc: Jeevan Ladhe ; tushar 
; Dmitry Dolgov <9erthali...@gmail.com>; Mark 
Dilger ; pgsql-hack...@postgresql.org
Subject: Re: refactoring basebackup.c

On Wed, Jan 19, 2022 at 4:26 PM Robert Haas  wrote:
> I spent some time thinking about test coverage for the server-side 
> backup code today and came up with the attached (v12-0003).

I committed the base backup target patch yesterday, and today I updated the 
remaining code in light of Michael Paquier's commit 
5c649fe153367cdab278738ee4aebbfd158e0546. Here is the resulting patch.

Michael, I am proposing to that we remove this message as part of this commit:

-pg_log_info("no value specified for compression
level, switching to default");

I think most people won't want to specify a compression level, so emitting a 
message when they don't seems too verbose.

--
Robert Haas
EDB: http://www.enterprisedb.com 


pg_basebackup_option_v1.diff
Description: pg_basebackup_option_v1.diff


Re: Bug in ProcArrayApplyRecoveryInfo for snapshots crossing 4B, breaking replicas

2022-01-24 Thread Michael Paquier
On Mon, Jan 24, 2022 at 10:45:48PM +0100, Tomas Vondra wrote:
> On 1/24/22 22:28, Bossart, Nathan wrote:
>>> Attached patch is fixing this by just sorting the XIDs logically. The
>>> xidComparator is meant for places that can't do logical ordering. But
>>> these XIDs come from RUNNING_XACTS, so they actually come from the same
>>> wraparound epoch (so sorting logically seems perfectly fine).
>> 
>> The patch looks reasonable to me.
> 
> Thanks!

Would it be possible to add a TAP test?  One idea would be to rely on
pg_resetwal -x and -e close to the 4B limit to set up a node before 
stressing the scenario of this bug, so that would be rather cheap.
--
Michael




Re: 2022-01 Commitfest

2022-01-24 Thread Julien Rouhaud
Hi,

This is the 4th week of this commitfest.

Since last week, 5 entries were committed.  There are still 223 active
patches overall, the vast majority needing review.  If you signed up to
review patches, you still have a whole week to help patches make
progress and get committed!

Status summary:
- Needs review: 142.
- Waiting on Author: 57.
- Ready for Committer: 24.
- Committed: 54.
- Moved to next CF: 1.
- Returned with Feedback: 2.
- Rejected: 4.
- Withdrawn: 8.




Re: Skipping logical replication transactions on subscriber side

2022-01-24 Thread Amit Kapila
On Mon, Jan 24, 2022 at 7:36 PM Peter Eisentraut
 wrote:
>
> On 22.01.22 03:54, Amit Kapila wrote:
> > Won't we already do that for Alter Subscription command which means
> > nothing special needs to be done for this? However, it seems to me
> > that the idea we are trying to follow here is that as this option can
> > lead to data inconsistency, it is good to allow only superusers to
> > specify this option. The owner of the subscription can be changed to
> > non-superuser as well in which case I think it won't be a good idea to
> > allow this option. OTOH, if we think it is okay to allow such an
> > option to users that don't have superuser privilege then I think
> > allowing it to the owner of the subscription makes sense to me.
>
> I don't think this functionality allows a nonprivileged user to do
> anything they couldn't otherwise do.  You can create inconsistent data
> in the sense that you can choose not to apply certain replicated data.
>

I thought this would be the primary way to skip applying certain
transactions. The other could be via pg_replication_origin_advance().
Or are you talking about the case where we skip applying an
update/delete when the corresponding row is not found?

I see the point that if we can allow the owner to skip applying
updates/deletes in certain cases then this should probably also be
okay. Kindly let us know if you have something else in mind as well.

-- 
With Regards,
Amit Kapila.




Re: Why is src/test/modules/committs/t/002_standby.pl flaky?

2022-01-24 Thread Tom Lane
Thomas Munro  writes:
> On Tue, Jan 25, 2022 at 10:28 AM Andres Freund  wrote:
>> On 2022-01-24 15:35:25 -0500, Tom Lane wrote:
>>> Sure.  Do we want to revert in HEAD too?

>> Not sure. I'm also OK with trying to go with Thomas' patch to walreceiver and
>> try a bit longer to get all this working. Thomas?

> I vote for reverting in release branches only.  I'll propose a better
> WES patch set for master that hopefully also covers async append etc
> (which I was already planning to do before we knew about this Windows
> problem).  More soon.

WFM, but we'll have to remember to revert this in v15 if we don't
have a solid fix by then.

It's kinda late here, so I'll push the reverts tomorrow.

regards, tom lane




Re: fairywren is generating bogus BASE_BACKUP commands

2022-01-24 Thread Andres Freund
Hi,

On 2022-01-24 16:47:28 -0500, Andrew Dunstan wrote:
> Give me what you can and I'll see what I can do. I have a couple of
> moderately high priority items on my plate, but I will probably be able
> to fit in some testing when those make my eyes completely glaze over.

Steps:

# install msys from https://www.msys2.org/
# install dependencies in msys shell
pacman -S git bison flex make ucrt64/mingw-w64-ucrt-x86_64-perl \
  ucrt64/mingw-w64-ucrt-x86_64-gcc ucrt64/mingw-w64-ucrt-x86_64-zlib \
  ucrt64/mingw-w64-ucrt-x86_64-ccache diffutils


# start mingw ucrt64 x64 shell
cpan install -T IPC::Run
perl -MIPC::Run -e 1 # verify ipc run is installed

cd /c/dev
# I added --reference postgres to accelerate the cloning
git clone https://git.postgresql.org/git/postgresql.git postgres-mingw
cd /c/dev/postgres-mingw

git revert ed52c3707bcf8858defb0d9de4b55f5c7f18fed7
git revert 6051857fc953a62db318329c4ceec5f9668fd42a

# apply attached patch

# see below for why out-of-tree is easier for now
mkdir build
cd build
# path parameters probably not necessary, I thought I needed them earlier,
# not sure why
../configure --without-readline --cache cache --enable-tap-tests \
  PROVE=/ucrt64/bin/core_perl/prove PERL=/ucrt64/bin/perl.exe CC="ccache gcc"
make -j8 -s world-bin && make -j8 -s -C src/interfaces/ecpg/test
make -j8 -s temp-install

# pg_regress' make_temp_socketdir() otherwise picks up the wrong TMPDIR
mkdir /c/dev/postgres-mingw/build/tmp

# the TAR= ensures that tests pick up a tar accessible with a windows path
# PG_TEST_USE_UNIX_SOCKETS=1 is required for test concurrency, otherwise
# there are port conflicts

(make -Otarget -j12 check-world NO_TEMP_INSTALL=1 PG_TEST_USE_UNIX_SOCKETS=1 \
  TMPDIR=C:/dev/postgres-mingw/tmp TAR="C:\Windows\System32\tar.exe" 2>&1 \
  && echo test-world-success || echo test-world-fail) 2>&1 | tee test-world.log


To make tests in "in-tree" builds work, a bit more hackery would be
needed. The problem is that Windows chooses binaries from the current working
directory *before* PATH. That's a problem for things like initdb.exe or
pg_ctl.exe that want to find postgres.exe, as that only works with the program
in their proper location, rather than CWD.

Greetings,

Andres Freund
diff --git i/src/bin/pg_dump/t/010_dump_connstr.pl w/src/bin/pg_dump/t/010_dump_connstr.pl
index 2f7919a4b31..bac15ac748a 100644
--- i/src/bin/pg_dump/t/010_dump_connstr.pl
+++ w/src/bin/pg_dump/t/010_dump_connstr.pl
@@ -8,7 +8,8 @@ use PostgreSQL::Test::Cluster;
 use PostgreSQL::Test::Utils;
 use Test::More;
 
-if ($PostgreSQL::Test::Utils::is_msys2)
+# FIXME: needs to be readjusted
+if ($PostgreSQL::Test::Utils::is_msys2 || 1)
 {
 	plan skip_all => 'High bit name tests fail on Msys2';
 }
diff --git i/contrib/citext/Makefile w/contrib/citext/Makefile
index 789932fe366..8e569c96c48 100644
--- i/contrib/citext/Makefile
+++ w/contrib/citext/Makefile
@@ -11,7 +11,8 @@ DATA = citext--1.4.sql \
 	citext--1.0--1.1.sql
 PGFILEDESC = "citext - case-insensitive character string data type"
 
-REGRESS = citext citext_utf8
+# FIXME: broken under mingw-ucrt (and maybe mingw generally?)
+#REGRESS = citext citext_utf8
 
 ifdef USE_PGXS
 PG_CONFIG = pg_config
diff --git i/src/test/perl/PostgreSQL/Test/Utils.pm w/src/test/perl/PostgreSQL/Test/Utils.pm
index 50be10fb5af..f1268a18d40 100644
--- i/src/test/perl/PostgreSQL/Test/Utils.pm
+++ w/src/test/perl/PostgreSQL/Test/Utils.pm
@@ -310,6 +310,8 @@ The returned path uses forward slashes but has no trailing slash.
 sub perl2host
 {
 	my ($subject) = @_;
+	# FIXME: to be sure this doesn't do anything
+	return $subject;
 	return $subject unless $Config{osname} eq 'msys';
 	if ($is_msys2)
 	{


Re: Why is src/test/modules/committs/t/002_standby.pl flaky?

2022-01-24 Thread Thomas Munro
On Tue, Jan 25, 2022 at 10:28 AM Andres Freund  wrote:
> On 2022-01-24 15:35:25 -0500, Tom Lane wrote:
> > Andres Freund  writes:
> > > On 2022-01-14 17:51:52 -0500, Tom Lane wrote:
> > >> FWIW, I'm just fine with reverting, particularly in the back branches.
> > >> It seems clear that this dank corner of Windows contains even more
> > >> creepy-crawlies than we thought.
> >
> > > Seems we should revert now-ish? There's a minor release coming up and I 
> > > think
> > > it'd be bad to ship these changes to users.
> >
> > Sure.  Do we want to revert in HEAD too?
>
> Not sure. I'm also OK with trying to go with Thomas' patch to walreceiver and
> try a bit longer to get all this working. Thomas?

I vote for reverting in release branches only.  I'll propose a better
WES patch set for master that hopefully also covers async append etc
(which I was already planning to do before we knew about this Windows
problem).  More soon.




Re: typos

2022-01-24 Thread Michael Paquier
On Mon, Jan 24, 2022 at 04:55:31PM +0900, Michael Paquier wrote:
> I'm thinking about just removing that at the end.

And done this way, keeping the whole thing simpler.  I have applied most of
the things you suggested, with a backpatch down to 10 for the relevant
user-visible parts in the docs.  Thanks!
--
Michael




Re: GUC flags

2022-01-24 Thread Justin Pryzby
On Wed, Jan 05, 2022 at 11:36:41PM -0600, Justin Pryzby wrote:
> On Thu, Jan 06, 2022 at 02:19:08PM +0900, Michael Paquier wrote:
> > > + initStringInfo(&ret);
> > > + appendStringInfoChar(&ret, '{');
> > > +
> > > + if (flags & GUC_NO_SHOW_ALL)
> > > + appendStringInfo(&ret, "NO_SHOW_ALL,");
> > > + if (flags & GUC_NO_RESET_ALL)
> > > + appendStringInfo(&ret, "NO_RESET_ALL,");
> > > + if (flags & GUC_NOT_IN_SAMPLE)
> > > + appendStringInfo(&ret, "NOT_IN_SAMPLE,");
> > > + if (flags & GUC_EXPLAIN)
> > > + appendStringInfo(&ret, "EXPLAIN,");
> > > + if (flags & GUC_RUNTIME_COMPUTED)
> > > + appendStringInfo(&ret, "RUNTIME_COMPUTED,");
> > > +
> > > + /* Remove trailing comma, if any */
> > > + if (ret.len > 1)
> > > + ret.data[--ret.len] = '\0';
> > 
> > The way of building the text array is incorrect here.  See

I think you'll find that this is how it's done elsewhere in postgres.
In the frontend, see appendPQExpBufferChar and appendPGArray and 3e6e86abc.
On the backend, see: git grep -F "'{'" |grep -w appendStringInfoChar

I updated the patch with a regex to accommodate GUCs without '=', as needed
since f47ed79cc8.
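
For readers following along, the complete shape of the pattern being defended
looks roughly like this (a sketch only; the closing brace and the array_in()
conversion step are assumptions modeled on similar code elsewhere in the tree,
not taken from this thread's patch):

    StringInfoData ret;

    initStringInfo(&ret);
    appendStringInfoChar(&ret, '{');

    /* append one comma-terminated label per set flag */
    if (flags & GUC_NO_SHOW_ALL)
        appendStringInfoString(&ret, "NO_SHOW_ALL,");
    if (flags & GUC_NOT_IN_SAMPLE)
        appendStringInfoString(&ret, "NOT_IN_SAMPLE,");

    /* remove the trailing comma, if any flag was appended */
    if (ret.data[ret.len - 1] == ',')
        ret.data[--ret.len] = '\0';

    appendStringInfoChar(&ret, '}');

    /* assumed final step: parse the literal into a text[] datum */
    Datum flagarray = DirectFunctionCall3(array_in,
                                          CStringGetDatum(ret.data),
                                          ObjectIdGetDatum(TEXTOID),
                                          Int32GetDatum(-1));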

-- 
Justin
>From eee345a1685a04eeb0fedf3ba9ad0fc6fe6e3d81 Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Sun, 5 Dec 2021 23:22:04 -0600
Subject: [PATCH 1/2] check_guc: fix absurd number of false positives

Avoid false positives;
Avoid various assumptions;
Avoid list of exceptions;
Simplify shell/awk/sed/grep;

This requires GNU awk for RS as a regex.
---
 src/backend/utils/misc/check_guc | 69 +---
 1 file changed, 10 insertions(+), 59 deletions(-)

diff --git a/src/backend/utils/misc/check_guc b/src/backend/utils/misc/check_guc
index b171ef0e4ff..323ca13191b 100755
--- a/src/backend/utils/misc/check_guc
+++ b/src/backend/utils/misc/check_guc
@@ -1,78 +1,29 @@
-#!/bin/sh
+#! /bin/sh
+set -e
 
-## currently, this script makes a lot of assumptions:
+## this script makes some assumptions about the formatting of files it parses.
 ## in postgresql.conf.sample:
 ##   1) the valid config settings may be preceded by a '#', but NOT '# '
 ##  (we use this to skip comments)
-##   2) the valid config settings will be followed immediately by  ' ='
-##  (at least one space preceding the '=')
-## in guc.c:
-##   3) the options have PGC_ on the same line as the option
-##   4) the options have '{' on the same line as the option
-
-##  Problems
-## 1) Don't know what to do with TRANSACTION ISOLATION LEVEL
-
-## if an option is valid but shows up in only one file (guc.c but not
-## postgresql.conf.sample), it should be listed here so that it
-## can be ignored
-INTENTIONALLY_NOT_INCLUDED="debug_deadlocks in_hot_standby \
-is_superuser lc_collate lc_ctype lc_messages lc_monetary lc_numeric lc_time \
-pre_auth_delay role seed server_encoding server_version server_version_num \
-session_authorization trace_lock_oidmin trace_lock_table trace_locks trace_lwlocks \
-trace_notify trace_userlocks transaction_isolation transaction_read_only \
-zero_damaged_pages"
 
 ### What options are listed in postgresql.conf.sample, but don't appear
 ### in guc.c?
 
-# grab everything that looks like a setting and convert it to lower case
-SETTINGS=`grep ' =' postgresql.conf.sample |
-grep -v '^# ' | # strip comments
-sed -e 's/^#//' |
-awk '{print $1}'`
-
-SETTINGS=`echo "$SETTINGS" | tr 'A-Z' 'a-z'`
+# grab everything that looks like a setting
+SETTINGS=`sed '/^#[[:alnum:]]/!d; s/^#//; s/ =.*//; /^include/d' postgresql.conf.sample`
 
 for i in $SETTINGS ; do
-  hidden=0
   ## it sure would be nice to replace this with an sql "not in" statement
-  ## it doesn't seem to make sense to have things in .sample and not in guc.c
-#  for hidethis in $INTENTIONALLY_NOT_INCLUDED ; do
-#if [ "$hidethis" = "$i" ] ; then
-#  hidden=1
-#fi
-#  done
-  if [ "$hidden" -eq 0 ] ; then
-grep -i '"'$i'"' guc.c > /dev/null
-if [ $? -ne 0 ] ; then
-  echo "$i seems to be missing from guc.c";
-fi;
-  fi
+  grep -i "\"$i\"" guc.c >/dev/null ||
+echo "$i seems to be missing from guc.c";
 done
 
 ### What options are listed in guc.c, but don't appear
 ### in postgresql.conf.sample?
 
 # grab everything that looks like a setting and convert it to lower case
-
-SETTINGS=`grep '{.* PGC_' guc.c | awk '{print $1}' | \
-  sed -e 's/{//g' -e 's/"//g' -e 's/,//'`
-
-SETTINGS=`echo "$SETTINGS" | tr 'A-Z' 'a-z'`
-
+SETTINGS=`gawk -F '[",]' 'BEGIN{RS="\n\t{\n"} /",[[:space:]]*PGC_.*.*gettext_noop/ && !/NOT_IN_SAMPLE/{print tolower($2)}' guc.c`
 for i in $SETTINGS ; do
-  hidden=0
-  ## it sure would be nice to replace this with an sql "not in" statement
-  for hidethis in $INTENTIONALLY_NOT_INCLUDED ; do
-if [ "$hidethis" = "$i" ] ; then
-  hidden=1
-fi
-  done
-  if [ "$hidden" -eq 0 ] ; then
-grep -i '#'$i' ' postgresql.conf.sample > /dev/null
-if [ $? -ne 0 ] ; then
-  echo "$i seems to be missing from postgresql.conf.sample";
-fi
-  

Re: btree_gist into core?

2022-01-24 Thread Tom Lane
Andreas Karlsson  writes:
> On 1/19/22 09:30, Peter Eisentraut wrote:
>> I don't have a lot of experience with this module, so I don't know if 
>> there are any lingering concerns about whether it is mature enough as a 
>> built-in feature.

> While I like the idea on a conceptual level, I worry about the code 
> quality of the module. I know that the btree/cidr code is pretty broken. 
> But I am not sure if there are any issues with other data types.

Yeah :-(.  We just fixed an issue with its char(n) support too
(54b1cb7eb), so I don't have a terribly warm feeling about the
quality of the lesser-used code paths.  Still, maybe we could
do some code review/testing rather than a blind copy & paste.

I'd also opine that we don't have to preserve on-disk compatibility
while migrating into core, which'd help get out of the sticky problem
for inet/cidr.  This'd require being able to run the contrib module
alongside the core version for awhile (to support existing indexes),
but I think we could manage that if we tried.  IIRC we did something
similar when we migrated tsearch into core.

One thing I don't know anything about is how good btree_gist
indexes are performance-wise.  Do they have problems that we'd really
need to fix to consider them core-quality?

regards, tom lane




Re: CREATEROLE and role ownership hierarchies

2022-01-24 Thread Mark Dilger



> On Jan 24, 2022, at 2:21 PM, Stephen Frost  wrote:
> 
> Superuser is a problem specifically because it gives people access to do 
> absolutely anything, both for security and safety concerns. Disallowing a way 
> to curtail that same risk when it comes to role ownership invites exactly 
> those same problems.

Before the patch, users with CREATEROLE can do mischief.  After the patch, 
users with CREATEROLE can do mischief.  The difference is that the mischief 
that can be done after the patch is a proper subset of the mischief that can be 
done before the patch.  (Counter-examples highly welcome.)

Specifically, I claim that before the patch, non-superuser "bob" with 
CREATEROLE can interfere with *any* non-superuser.  After the patch, 
non-superuser "bob" with CREATEROLE can interfere with *some* non-superusers; 
specifically, with non-superusers he created himself, or which have had 
ownership transferred to him.

Restricting the scope of bob's mischief is a huge win, in my view.

The argument about whether owners should always implicitly inherit privileges 
from roles they own is a bit orthogonal to my point about mischief-making.  Do 
we at least agree on the mischief-abatement aspect of this patch set?  

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company







Re: Replace uses of deprecated Python module distutils.sysconfig

2022-01-24 Thread Tom Lane
I wrote:
> Yeah, I was just noticing that.  It looks like the whole business
> with checking both get_python_inc(False) and get_python_inc(True)
> has been useless from the start: none of the buildfarm animals report
> more than one -I switch in "checking Python include directories".

Also, that appears to be true even in the oldest runs that vendikar
still has data for, back in 2015.  So it's not something that they
cleaned up recently.

regards, tom lane




Re: btree_gist into core?

2022-01-24 Thread Andreas Karlsson

On 1/19/22 09:30, Peter Eisentraut wrote:
> So, first of all, would people agree with this course of action?
>
> I don't have a lot of experience with this module, so I don't know if
> there are any lingering concerns about whether it is mature enough as a
> built-in feature.

While I like the idea on a conceptual level, I worry about the code
quality of the module. I know that the btree/cidr code is pretty broken.
But I am not sure if there are any issues with other data types.


See 
https://www.postgresql.org/message-id/7891efc1-8378-2cf2-617b-4143848ec895%40proxel.se


Andreas





Re: XTS cipher mode for cluster file encryption

2022-01-24 Thread Bruce Momjian
On Mon, Nov 29, 2021 at 08:37:31AM +0100, Antonin Houska wrote:
> The changes to buffile.c are not trivial, but we haven't really changed the
> API, as long as you mean BufFileCreateTemp(), BufFileWrite(), BufFileRead().
> 
> What our patch affects on the caller side is that BufFileOpenTransient(),
> BufFileCloseTransient(), BufFileWriteTransient() and BufFileReadTransient()
> replace OpenTransientFile(), CloseTransientFile(), write()/fwrite() and
> read()/fread() respectively in reorderbuffer.c and in pgstat.c. These changes
> become a little bit less invasive in TDE 1.1 than they were in 1.0, see [1],
> see the diffs attached.

With pg_upgrade modified to preserve the relfilenode, tablespace oid, and
database oid, we are now closer to implementing cluster file encryption
using XTS.  I think we have a few steps left:

1.  modify temporary file I/O to use a more centralized API
2.  modify the existing cluster file encryption patch to use XTS with an
IV that uses more than the LSN
3.  add XTS regression test code like CTR
4.  create WAL encryption code using CTR

If we can do #1 in PG 15 I think I can have #2 ready for PG 16 in July.
The feature wiki page is:

https://wiki.postgresql.org/wiki/Transparent_Data_Encryption

Do people want to advance this feature forward?

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  If only the physical world exists, free will is an illusion.





Re: pg_upgrade should truncate/remove its logs before running

2022-01-24 Thread Michael Paquier
On Mon, Jan 24, 2022 at 02:44:21PM -0500, Bruce Momjian wrote:
> OK, thanks.  There are really two cleanups --- first, the "log"
> directory, and second deletion of the old cluster by running
> delete_old_cluster.sh.

Yes, this is the same thing as what's done on HEAD with a two-step
cleanup, except that we only need to remove the log directory
rather than each individual log entry.
--
Michael




Re: CREATEROLE and role ownership hierarchies

2022-01-24 Thread Mark Dilger



> On Jan 24, 2022, at 2:21 PM, Stephen Frost  wrote:
> 
> Being able to create and drop users is, in fact, effectively a superuser-only 
> task today.  We could throw out the entire idea of role ownership, in fact, 
> as being entirely unnecessary when talking about that specific task.

Wow, that's totally contrary to how I see this patch.  The heart and soul of 
this patch is to fix the fact that CREATEROLE is currently overpowered.  
Everything else is gravy.

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company







Re: Replace uses of deprecated Python module distutils.sysconfig

2022-01-24 Thread Tom Lane
Andres Freund  writes:
> ... But none of the systems report a get_python_inc(False) differing from
> get_python_inc(True), or from the value of INCLUDEPY. So I don't see a reason
> why it wouldn't be.

Yeah, I was just noticing that.  It looks like the whole business
with checking both get_python_inc(False) and get_python_inc(True)
has been useless from the start: none of the buildfarm animals report
more than one -I switch in "checking Python include directories".

It's a little bit too soon to decide that INCLUDEPY is reliably equal
to that, but if it still looks that way tomorrow, I'll be satisfied.

regards, tom lane




Re: Replace uses of deprecated Python module distutils.sysconfig

2022-01-24 Thread Andres Freund
Hi,

On 2022-01-23 19:49:57 -0800, Andres Freund wrote:
> > To avoid too noisy breakages, we could have python.m4 emit INCLUDEPY and 
> > then
> > search the bf logs in a day or three?
> 
> Maybe something like the attached? Not particularly nice, but should give us
> most of the relevant information?

FWIW, so far all 73 animals that reported on HEAD that they ran with the
change and that currently detect as with_python='yes', find Python.h via
INCLUDEPY the same as via get_python_inc(). This includes systems like gadwall
where sysconfig.get_path('include') returned the erroneous
/usr/local/include/python2.7.

Of course that doesn't guarantee in itself that Python.h is usable that
way. But none of the systems report a get_python_inc(False) differing from
get_python_inc(True), or from the value of INCLUDEPY. So I don't see a reason
why it wouldn't be.

Greetings,

Andres Freund




Re: CREATEROLE and role ownership hierarchies

2022-01-24 Thread Stephen Frost
Greetings,

On Mon, Jan 24, 2022 at 16:42 Robert Haas  wrote:

> On Mon, Jan 24, 2022 at 4:23 PM Stephen Frost  wrote:
> > The idea behind this patch is to enable creation and dropping of roles,
> which isn’t possible now without being effectively a superuser.
> >
> > Forcing owners to also implicitly have all rights of the roles they
> create is orthogonal to that and an unnecessary change.
>
> I just took a look at the first email on this thread and it says this:
>
> >>> These patches have been split off the now deprecated monolithic
> "Delegating superuser tasks to new security roles" thread at [1].
>
> Therefore I think it is pretty clear that the goals of this patch set
> include being able to delegate superuser tasks to new security roles.
> And having those tasks be delegated but *work randomly differently* is
> much less useful.


Being able to create and drop users is, in fact, effectively a
superuser-only task today.  We could throw out the entire idea of role
ownership, in fact, as being entirely unnecessary when talking about that
specific task.

> I am not saying that we would explicitly set all cases to be noninherit
> or that we would even change the default away from what it is today, only
> that we should use the existing role system and it’s concept of
> inherit-vs-noninherit rather than throwing all of that away.
>
> INHERIT vs. NOINHERIT is documented to control the behavior of role
> *membership*. This patch is introducing a new concept of role
> *ownership*. It's not self-evident that what applies to one case
> should apply to the other.


This is an argument to drop the role ownership concept, as I view it.
Privileges are driven by membership today and inventing some new
independent way to do that is increasing confusion, not improving things.
I disagree that adding role ownership should necessarily change how the
regular GRANT privilege system works or throw away basic concepts of that
system which have been in place for decades.  Increasing the number of
independent ways to answer the question of “what users have what rights on
object X” is an active bad thing.  Anything that cares about object access
will now also have to address role ownership to answer that question, while
if we don’t include this one change then they don’t need to directly have
any concern for ownership because regular object privileges still work the
same way they did before.

> Further, being able to require a SET ROLE before running a given
> operation is certainly a benefit in much the same way that having a user
> have to sudo before running an operation is.
>
> That's a reasonable point of view, but having things work similarly to
> what happens for a superuser is ALSO a very big benefit. In my
> opinion, in fact, it is a far larger benefit.


Superuser is a problem specifically because it gives people access to do
absolutely anything, both for security and safety concerns. Disallowing a
way to curtail that same risk when it comes to role ownership invites
exactly those same problems.

I appreciate that there’s an edge between the ownership system being
proposed and the existing role membership system, but we’d be much better
off trying to minimize the amount that they end up overlapping- role
ownership should be about managing roles.

To push back on the original “tenant” argument, consider that one of the
bigger issues in cloud computing today is exactly the problem that the
cloud managers can potentially gain access to the sensitive data of their
tenants and that’s not generally viewed as a positive thing.  This change
would make it so that every landlord can go and SELECT from the tables of
their tenants without so much as a by-your-leave.  The tenants likely don’t
like that idea, and almost as likely the landlords in many cases aren’t
thrilled with it either.  Should the landlords be able to DROP the tenant
due to the tenant not paying their bill?  Of course, and that should then
eliminate the tenant’s tables and other objects which take up resources,
but that’s not the same thing as saying that a landlord should be able to
unlock a tenant’s old phone that they left behind (and yeah, maybe the
analogy falls apart a bit there, but the point I’m trying to get at is that
it’s not as simple as it’s being made out to be here and we should think
about these things and not just implicitly grant all access to the owner
because that’s an easy thing to do - and is exactly what viewing owners as
“mini superusers” does and leads to many of the same issues we already have
with superusers).

Thanks,

Stephen



Re: Collecting statistics about contents of JSONB columns

2022-01-24 Thread Tomas Vondra

On 1/23/22 01:24, Nikita Glukhov wrote:

> Hi!
>
> I am glad that you found my very old patch interesting and started to
> work on it.  We failed to post it in 2016 mostly because we were not
> satisfied with JSONB storage.  Also we decided to wait for completion
> of work on extended statistics as we thought that it could help us.
> But in early 2017 we switched to SQL/JSON and forgot about this patch.



Understood. Let's see how feasible this idea is and if we can move this 
forward.




> I think custom datatype is necessary for better performance. With a
> plain JSONB we need to do a lot of work for extraction of path stats:
>   - iterate through MCV/Histogram JSONB arrays
>   - cast numeric values to float, string to text etc.
>   - build PG arrays from extracted datums
>   - form pg_statistic tuple.
>
> With a custom data type we could store the pg_statistic tuple unmodified
> and use it without any overhead.  But then we need to modify
> VariableStatData and several functions a bit to pass additional nullfrac
> corrections.



I'm not against evaluating/exploring alternative storage formats, but my 
feeling is the real impact on performance will be fairly low. At least I 
haven't seen this as very expensive while profiling the patch so far. Of 
course, I may be wrong, and it may be more significant in some cases.



> Maybe simple record type (path text, data pg_statistic, ext_data jsonb)
> would be enough.

Maybe, but then you still need to store a bunch of those, right? So
either an array (likely toasted) or a 1:M table. I'm not sure it's going
to be much cheaper than JSONB.


I'd suggest we focus on what we need to store first, which seems like
the primary question, and worry about the exact storage format then.




> Also there is an idea to store per-path stats separately in pg_statistic_ext
> rows using expressions like (jb #> '{foo,bar}') as stxexprs.  This could
> also help users select which paths to analyze simply by using some
> sort of CREATE STATISTICS.  But it is really unclear how to:
>   * store pg_statistic_ext rows from typanalyze
>   * form path expressions for array elements (maybe add new jsonpath
> operator)
>   * match various -> operator chains to stxexprs
>   * jsonpath-like languages need simple API for searching by stxexprs



Sure, you can do statistics on expressions, right? Of course, if that 
expression produces JSONB value, that's not very useful at the moment. 
Maybe we could have two typanalyze functions - one for regular analyze, 
one for extended statistics?


That being said, I'm not sure extended stats are a good match for this. 
My feeling was we should collect these stats for all JSONB columns, 
which is why I argued for putting that in pg_statistic.





> Per-path statistics should only be collected for scalars.  This can be
> enabled by the flag JsonAnalyzeContext.scalarsOnly.  But there is a
> problem: computed MCVs and histograms will be broken and we will not be
> able to use them for queries like (jb > const) in the general case.  Also
> we will not be able to use them internally in scalarineqsel() and
> var_eq_const() (see jsonSelectivity()).  Instead, we will have to
> implement our own estimator functions for JSONB comparison operators
> that will correctly use our hacked MCVs and histograms (and maybe not
> all cases will be supported; for example, comparison to scalars only).



Yeah, but maybe that's an acceptable trade-off? I mean, if we can 
improve estimates for most clauses, and there's some number of clauses 
that are estimated just like without stats, that's still an improvement, 
right?



> It's possible to replace objects and arrays with empty ones when
> scalarsOnly is set to keep correct frequencies of non-scalars.
> But there is an old bug in JSONB comparison: empty arrays are placed
> before other values in the JSONB sort order, although they should go
> after all scalars.  So we need also to distinguish empty and non-empty
> arrays here.



Hmmm ...



> I tried to fix a major part of places marked as XXX and FIXME, the new
> version of the patches is attached.  There are a lot of changes, you
> can see them in a step-by-step form in the corresponding branch
> jsonb_stats_20220122 in our GitHub repo [1].



Thanks! I'll go through the changes soon.


regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: fairywren is generating bogus BASE_BACKUP commands

2022-01-24 Thread Andrew Dunstan


On 1/24/22 16:39, Andres Freund wrote:
> Hi,
>
> On 2022-01-24 14:01:37 -0500, Andrew Dunstan wrote:
>> Well if we can get Andres' suggestion to work all this might go away,
>> which would keep everyone happy, especially me.
> I successfully tried it for a few tests. But I see tests hanging a lot
> independent of the way I run the tests, presumably due to the issues discussed
> in [1]. So we need to do something about that.
>
> I don't have the cycles to finish changing over to that way of running tests -
> do you have some time to work on it, if I clean up the bit I have?
>
> - Andres
>
> [1] 
> https://postgr.es/m/CA%2BhUKG%2BOeoETZQ%3DQw5Ub5h3tmwQhBmDA%3DnuNO3KG%3DzWfUypFAw%40mail.gmail.com



Give me what you can and I'll see what I can do. I have a couple of
moderately high priority items on my plate, but I will probably be able
to fit in some testing when those make my eyes completely glaze over.


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com





Re: Bug in ProcArrayApplyRecoveryInfo for snapshots crossing 4B, breaking replicas

2022-01-24 Thread Tomas Vondra

On 1/24/22 22:28, Bossart, Nathan wrote:

> On 1/22/22, 4:43 PM, "Tomas Vondra"  wrote:
>> There's a bug in ProcArrayApplyRecoveryInfo, introduced by 8431e296ea,
>> which may cause failures when starting a replica, making it unusable.
>> The commit message for 8431e296ea is not very clear about what exactly
>> is being done and why, but the root cause is that while processing
>> RUNNING_XACTS, the XIDs are sorted like this:
>>
>>   /*
>>    * Sort the array so that we can add them safely into
>>    * KnownAssignedXids.
>>    */
>>   qsort(xids, nxids, sizeof(TransactionId), xidComparator);
>>
>> where "safely" likely means "not violating the ordering expected by
>> KnownAssignedXidsAdd". Unfortunately, xidComparator compares the values
>> as plain uint32 values, while KnownAssignedXidsAdd actually calls
>> TransactionIdFollowsOrEquals() and compares the logical XIDs :-(


> Wow, nice find.


>> This likely explains why we never got any reports about this - most
>> systems probably don't leave transactions running for this long, so the
>> probability is much lower. And replica restarts are generally not that
>> common events either.


> I'm aware of one report with the same message [0], but I haven't read
> closely enough to determine whether it is the same issue.  It looks
> like that particular report was attributed to backup_label being
> removed.



Yeah, I saw that thread too, and I don't think it's the same issue. As 
you say, it seems to be caused by the backup_label shenanigans, and 
there's also the RUNNING_XACTS message:


Sep 20 15:00:27 ... CONTEXT: xlog redo Standby/RUNNING_XACTS: nextXid 
38585 latestCompletedXid 38571 oldestRunningXid 38572; 14 xacts: 38573 
38575 38579 38578 38574 38581 38580 38576 38577 38572 38582 38584 38583 
38583


The XIDs don't cross the 4B boundary at all, so this seems unrelated.



>> Attached patch is fixing this by just sorting the XIDs logically. The
>> xidComparator is meant for places that can't do logical ordering. But
>> these XIDs come from RUNNING_XACTS, so they actually come from the same
>> wraparound epoch (so sorting logically seems perfectly fine).


> The patch looks reasonable to me.



Thanks!
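
To make the distinction concrete, here is a minimal standalone sketch of a
wraparound-aware comparator of the kind described above (illustrative only,
with hypothetical names; this is not the code from the attached patch):

    #include <stdint.h>
    #include <stdlib.h>

    typedef uint32_t TransactionId;

    /*
     * Compare XIDs logically, the way TransactionIdPrecedes() does it:
     * interpret the difference modulo 2^32 as a signed number.  This is
     * only valid when all XIDs fall within the same 2^31 window, which
     * holds for the XIDs of a single RUNNING_XACTS record.
     */
    static int
    xid_logical_cmp(const void *a, const void *b)
    {
        TransactionId xa = *(const TransactionId *) a;
        TransactionId xb = *(const TransactionId *) b;
        int32_t diff = (int32_t) (xa - xb);

        return (diff < 0) ? -1 : (diff > 0) ? 1 : 0;
    }

    /* usage: qsort(xids, nxids, sizeof(TransactionId), xid_logical_cmp); */

Near the 4B wraparound, an array like {5, 4294967290} then sorts to
{4294967290, 5}, whereas the plain uint32 comparison keeps 5 first; that is
exactly the ordering mismatch that trips up KnownAssignedXidsAdd.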


regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [BUG]Update Toast data failure in logical replication

2022-01-24 Thread Robert Haas
On Mon, Jan 24, 2022 at 4:42 PM Andres Freund  wrote:
> On 2022-01-24 16:31:08 -0500, Robert Haas wrote:
> > That seems consistent with what's been described on this thread so
> > far, but I still don't quite understand why the logic that reassembles
> > TOAST chunks doesn't solve it.
>
> There are no toast chunks to reassemble if the update didn't change the
> primary key. So this just hits the path we'd also hit for an unchanged toasted
> non-key column.

Oh. Hmm. That's bad.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: [BUG]Update Toast data failure in logical replication

2022-01-24 Thread Andres Freund
On 2022-01-24 16:31:08 -0500, Robert Haas wrote:
> That seems consistent with what's been described on this thread so
> far, but I still don't quite understand why the logic that reassembles
> TOAST chunks doesn't solve it.

There are no toast chunks to reassemble if the update didn't change the
primary key. So this just hits the path we'd also hit for an unchanged toasted
non-key column.




Re: CREATEROLE and role ownership hierarchies

2022-01-24 Thread Robert Haas
On Mon, Jan 24, 2022 at 4:23 PM Stephen Frost  wrote:
> The idea behind this patch is to enable creation and dropping of roles, which 
> isn’t possible now without being effectively a superuser.
>
> Forcing owners to also implicitly have all rights of the roles they create is 
> orthogonal to that and an unnecessary change.

I just took a look at the first email on this thread and it says this:

>>> These patches have been split off the now deprecated monolithic "Delegating 
>>> superuser tasks to new security roles" thread at [1].

Therefore I think it is pretty clear that the goals of this patch set
include being able to delegate superuser tasks to new security roles.
And having those tasks be delegated but *work randomly differently* is
much less useful.

> I am not saying that we would explicitly set all cases to be noninherit or 
> that we would even change the default away from what it is today, only that 
> we should use the existing role system and its concept of 
> inherit-vs-noninherit rather than throwing all of that away.

INHERIT vs. NOINHERIT is documented to control the behavior of role
*membership*. This patch is introducing a new concept of role
*ownership*. It's not self-evident that what applies to one case
should apply to the other.

> Further, being able to require a SET ROLE before running a given operation is 
> certainly a benefit in much the same way that having a user have to sudo 
> before running an operation is.

That's a reasonable point of view, but having things work similarly to
what happens for a superuser is ALSO a very big benefit. In my
opinion, in fact, it is a far larger benefit.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: fairywren is generating bogus BASE_BACKUP commands

2022-01-24 Thread Andres Freund
Hi,

On 2022-01-24 14:01:37 -0500, Andrew Dunstan wrote:
> Well if we can get Andres' suggestion to work all this might go away,
> which would keep everyone happy, especially me.

I successfully tried it for a few tests. But I see tests hanging a lot
independent of the way I run the tests, presumably due to the issues discussed
in [1]. So we need to do something about that.

I don't have the cycles to finish changing over to that way of running tests -
do you have some time to work on it, if I clean up the bit I have?

- Andres

[1] 
https://postgr.es/m/CA%2BhUKG%2BOeoETZQ%3DQw5Ub5h3tmwQhBmDA%3DnuNO3KG%3DzWfUypFAw%40mail.gmail.com




Re: [BUG]Update Toast data failure in logical replication

2022-01-24 Thread Robert Haas
On Mon, Jan 24, 2022 at 4:17 PM Andres Freund  wrote:
> Possibly the root of the problem is that we/I didn't think of cases where the
> primary key is an external toast datum - in most scenarios you'd get an error
> about a too-wide index tuple. But of course that neglects cases where toasting
> happens due to SET STORAGE or due to the aggregate tuple width, rather than
> individual column width.

That seems consistent with what's been described on this thread so
far, but I still don't quite understand why the logic that reassembles
TOAST chunks doesn't solve it. I mean, decoding doesn't know whether
any changes are happening on the subscriber side, so it's not like we
can (a) query for the row and then (b) decide to reassemble TOAST
chunks if we find it, or something like that. The decoding has to say,
well, here's the new tuple and the old key columns, and then the
subscriber does whatever it does.  I guess it could check whether the
old and new values are identical to decide whether to drop that column
out of the result, but it shouldn't compare a TOAST pointer to a
detoasted value and go "yeah, that looks equal".

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Bug in ProcArrayApplyRecoveryInfo for snapshots crossing 4B, breaking replicas

2022-01-24 Thread Bossart, Nathan
On 1/22/22, 4:43 PM, "Tomas Vondra"  wrote:
> There's a bug in ProcArrayApplyRecoveryInfo, introduced by 8431e296ea,
> which may cause failures when starting a replica, making it unusable.
> The commit message for 8431e296ea is not very clear about what exactly
> is being done and why, but the root cause is that while processing
> RUNNING_XACTS, the XIDs are sorted like this:
>
>  /*
>   * Sort the array so that we can add them safely into
>   * KnownAssignedXids.
>   */
>  qsort(xids, nxids, sizeof(TransactionId), xidComparator);
>
> where "safely" likely means "not violating the ordering expected by
> KnownAssignedXidsAdd". Unfortunately, xidComparator compares the values
> as plain uint32 values, while KnownAssignedXidsAdd actually calls
> TransactionIdFollowsOrEquals() and compares the logical XIDs :-(

Wow, nice find.

> This likely explains why we never got any reports about this - most
> systems probably don't leave transactions running for this long, so the
> probability is much lower. And replica restarts are generally not that
> common events either.

I'm aware of one report with the same message [0], but I haven't read
closely enough to determine whether it is the same issue.  It looks
like that particular report was attributed to backup_label being
removed.

> Attached patch is fixing this by just sorting the XIDs logically. The
> xidComparator is meant for places that can't do logical ordering. But
> these XIDs come from RUNNING_XACTS, so they actually come from the same
> wraparound epoch (so sorting logically seems perfectly fine).

The patch looks reasonable to me.

Nathan

[0] https://postgr.es/m/1476795473014.15979.2188%40webmail4



Re: Why is src/test/modules/committs/t/002_standby.pl flaky?

2022-01-24 Thread Andres Freund
Hi,

On 2022-01-24 15:35:25 -0500, Tom Lane wrote:
> Andres Freund  writes:
> > On 2022-01-14 17:51:52 -0500, Tom Lane wrote:
> >> FWIW, I'm just fine with reverting, particularly in the back branches.
> >> It seems clear that this dank corner of Windows contains even more
> >> creepy-crawlies than we thought.
> 
> > Seems we should revert now-ish? There's a minor release coming up and I 
> > think
> > it'd be bad to ship these changes to users.
> 
> Sure.  Do we want to revert in HEAD too?

Not sure. I'm also OK with trying to go with Thomas' patch to walreceiver and
try a bit longer to get all this working. Thomas?

Greetings,

Andres Freund




Re: CREATEROLE and role ownership hierarchies

2022-01-24 Thread Stephen Frost
Greetings,

On Mon, Jan 24, 2022 at 15:33 Robert Haas  wrote:

> On Sat, Jan 22, 2022 at 4:20 PM Stephen Frost  wrote:
> > Whoah, really?  No, I don't agree with this, it's throwing away the
> > entire concept around inheritance of role rights and how you can have
> > roles which you can get the privileges of by doing a SET ROLE to them
> > but you don't automatically have those rights.
>
> I see it differently. In my opinion, what that does is make the patch
> actually useful instead of largely a waste of time.


The idea behind this patch is to enable creation and dropping of roles,
which isn’t possible now without being effectively a superuser.

Forcing owners to also implicitly have all rights of the roles they create
is orthogonal to that and an unnecessary change.

If you are a
> service provider, you want to give your customers a super-user-like
> experience without actually making them superuser. You don't want to
> actually make them superuser, because then they could do things like
> change archive_command or install plperlu and shell out to the OS
> account, which you don't want. But you do want them to be able to
> administer objects within the database just as a superuser could. And
> a superuser has privileges over objects they own and objects belonging
> to other users automatically, without needing to SET ROLE.


I am not saying that we would explicitly set all cases to be noninherit or
that we would even change the default away from what it is today, only that
we should use the existing role system and its concept of
inherit-vs-noninherit rather than throwing all of that away.

Everybody now has
> to understand the behavior of a regular account, the behavior of a
> superuser account, and the behavior of this third type of account
> which is sort of like a superuser but requires a lot more SET ROLE
> commands.


Inherit vs. noninherit roles is not a new concept, it has existed since the
role system was implemented.  Further, that system does not require a lot
of SET ROLE commands unless and until an admin sets up a non-inherit role.
At that time, however, it’s expected that the rights of a role which has
inherit set to false are not automatically allowed for the role to which it
was GRANT’d.  That’s how roles have always worked since they were
introduced.

And also every tool. So for example pg_dump and restore
> isn't going to work, not even on the set of objects this
> elevated-privilege user can access. pgAdmin isn't going to understand
> that it needs to insert a bunch of extra SET ROLE commands to
> administer objects. Ditto literally every other tool anyone has ever
> written to administer PostgreSQL. And for all of that pain, we get
> exactly zero extra security.


We have an inherit system today and pg_dump works just fine, as far as I’m
aware, and it does, indeed, issue SET ROLE at various points. Perhaps you
could explain what issue this causes with PG today?  Or what
issue pgAdmin has with PG’s existing role inherit system?

Further, being able to require a SET ROLE before running a given operation
is certainly a benefit in much the same way that having a user have to sudo
before running an operation is.

Thanks,

Stephen



Re: [BUG]Update Toast data failure in logical replication

2022-01-24 Thread Andres Freund
Hi,

On 2022-01-24 15:10:05 -0500, Robert Haas wrote:
> I think we realized when we were working on the logical decoding stuff
> that the key columns of the old tuple would have to be detoasted in
> order for the mechanism to work, because I remember worrying about
> whether it would potentially be a problem that the WAL record would
> end up huge. However, I think we believed that the new tuple wouldn't
> need to have the detoasted values, because logical decoding is
> designed to notice all the TOAST insertions for the new tuple and
> reassemble those separate chunks to get the original value back.

Possibly the root of the problem is that we/I didn't think of cases where the
primary key is an external toast datum - in most scenarios you'd get an error
about a too-wide index tuple. But of course that neglects cases where toasting
happens due to SET STORAGE or due to the aggregate tuple width, rather than
individual column width.


> And off-hand I'm not sure why that logic doesn't apply just as much to the
> key columns as any others.

The difference is that it's mostly fine to not have unchanging key columns as
part of a decoded update - you just don't update those columns. But you can't do
that without knowing the replica identity...

Greetings,

Andres Freund




Re: fairywren is generating bogus BASE_BACKUP commands

2022-01-24 Thread Andrew Dunstan


On 1/24/22 15:17, Robert Haas wrote:
> On Mon, Jan 24, 2022 at 2:27 PM Robert Haas  wrote:
>> I really hate committing stuff that turns out to be broken. It's such
>> a fire drill when the build farm turns red.
> And there's a good chance it's about to break again, because I just
> committed the next patch in the series which, shockingly, also
> includes tests.
>
> I'd like to tell you I believe I got it right this time ... but I don't.



I'll just keep playing whackamole :-)


cheers


andrew


--
Andrew Dunstan
EDB: https://www.enterprisedb.com





Re: Replace uses of deprecated Python module distutils.sysconfig

2022-01-24 Thread Tom Lane
Peter Eisentraut  writes:
> Also note that python-config is itself a Python script that uses 
> sysconfig and includes code like this:

>  elif opt in ('--includes', '--cflags'):
>  flags = ['-I' + sysconfig.get_path('include'),
>   '-I' + sysconfig.get_path('platinclude')]

> So this would just do the same thing we are already doing anyway.

It used to look like that, but at least in my 3.6.8 installation
on RHEL8, it's been rewritten to be a shell script that doesn't
depend on sysconfig at all.

The result is sufficiently bletcherous that you'd have thought
they'd reconsider getting rid of sysconfig :-(.  Also, it
definitely won't work on Windows.

regards, tom lane




Re: CREATEROLE and role ownership hierarchies

2022-01-24 Thread Andrew Dunstan


On 1/24/22 15:33, Robert Haas wrote:
> On Sat, Jan 22, 2022 at 4:20 PM Stephen Frost  wrote:
>> Whoah, really?  No, I don't agree with this, it's throwing away the
>> entire concept around inheritance of role rights and how you can have
>> roles which you can get the privileges of by doing a SET ROLE to them
>> but you don't automatically have those rights.
> I see it differently. In my opinion, what that does is make the patch
> actually useful instead of largely a waste of time. If you are a
> service provider, you want to give your customers a super-user-like
> experience without actually making them superuser. You don't want to
> actually make them superuser, because then they could do things like
> change archive_command or install plperlu and shell out to the OS
> account, which you don't want. But you do want them to be able to
> administer objects within the database just as a superuser could. And
> a superuser has privileges over objects they own and objects belonging
> to other users automatically, without needing to SET ROLE.
>

+many


I encountered such issues on a cloud provider several years ago, and
blogged about the difficulties, which would have been solved very nicely
and cleanly by this proposal. It was when I understood properly how this
proposal worked, precisely as Robert states, that I became more
enthusiastic about it.


cheers


andrew


--
Andrew Dunstan
EDB: https://www.enterprisedb.com





Re: Replace uses of deprecated Python module distutils.sysconfig

2022-01-24 Thread Andres Freund
Hi,

On 2022-01-24 21:48:21 +0100, Peter Eisentraut wrote:
> Also note that python-config is itself a Python script that uses sysconfig
> and includes code like this:

Huh. It's a shell script on my debian system. Looks like the python source
tree has both. Not sure what / who decides which is used.

Greetings,

Andres Freund




Re: Replace uses of deprecated Python module distutils.sysconfig

2022-01-24 Thread Peter Eisentraut

On 24.01.22 03:53, Andres Freund wrote:

> On 2022-01-23 21:31:52 -0500, Tom Lane wrote:
>> Andres Freund  writes:
>>> No, not really. There generally seems to be very little documentation about
>>> what one is supposed to use when embedding python (rather than building a
>>> python module). The only thing I really see is:
>>> https://docs.python.org/3/extending/embedding.html#compiling-and-linking-under-unix-like-systems
>>> which says to use python-config.
>>
>> Yeah :-(.  I don't really want to go there, because it will break
>> existing setups.
>
> Yea, it seems to introduce a whole set of new complexities (finding python
> from python-config, mismatching python-config and explicitly specified python,
> ...). And it doesn't exist on windows either :(.


Also note that python-config is itself a Python script that uses 
sysconfig and includes code like this:


    elif opt in ('--includes', '--cflags'):
        flags = ['-I' + sysconfig.get_path('include'),
                 '-I' + sysconfig.get_path('platinclude')]

So this would just do the same thing we are already doing anyway.




Re: TAP tests, directories and parameters

2022-01-24 Thread Tom Lane
Andres Freund  writes:
> It's currently not possible to just run all tap tests in one prove run,
> because a number of tests assume that they are run from specific directories
> and/or with per-directory parameters.
> For meson I "solved" this by running each individual test in a wrapper that
> changes directory etc. But that's not really a great approach.
> To me it seems we should change our tests and test invocations to not depend
> on being run from a specific directory and to unify the environment variables
> passed to tap tests to one set somewhere central.

I'd be sad if this implied that running "make [install]check" in a
particular subdirectory no longer runs just that directory's tests.
Otherwise, sounds fine.

regards, tom lane




Re: Catalog version access

2022-01-24 Thread Bossart, Nathan
Thanks for taking a look!

On 1/23/22, 7:31 PM, "Michael Paquier"  wrote:
> On Mon, Aug 16, 2021 at 06:12:54PM +, Bossart, Nathan wrote:
>> I was looking at the --check switch for the postgres binary recently
>> [0], and this sounds like something that might fit in nicely there.
>> In the attached patch, --check will also check the control file if one
>> exists.
>
> This would not work on a running postmaster as CreateDataDirLockFile()
> is called beforehand, but we want this capability, no?

I was not under the impression this was a requirement, based on the
use-case discussed upthread [0].  

> Abusing a bootstrap option for this purpose does not look like a good
> idea, to be honest, especially for something only used internally now
> and undocumented, but we want to use something aimed at an external
> use with some documentation.  Using a separate switch would be more
> adapted IMO.

This is the opposite of what Magnus proposed earlier [1].  Do we need
a new pg_ctl option for this?  It seems like it is really only
intended for use by PostgreSQL developers, but perhaps there are other
use-cases I am not thinking of.  In any case, the pg_ctl option would
probably end up using --check (or another similar mode) behind the
scenes.

> Also, I think that we should be careful with the read of
> the control file to avoid false positives?   We can rely on an atomic
> read/write thanks to its maximum size of 512 bytes, but this looks
> like a lot what has been done recently with postgres -C for runtime
> GUCs, that *require* a read of the control file before grabbing the
> values we are interested in.

Sorry, I'm not following this one.  In my proposed patch, the control
file (if one exists) is read after CreateDataDirLockFile(), just like
PostmasterMain().

Nathan

[0] https://postgr.es/m/20210222022407.ecaygvx2ise6uwyz%40alap3.anarazel.de
[1] 
https://postgr.es/m/CABUevEySovaEDci7c0DXOrV6c7JzWqYzfVwOiRUJxMao%3D9seEw%40mail.gmail.com



Re: Why is src/test/modules/committs/t/002_standby.pl flaky?

2022-01-24 Thread Tom Lane
Andres Freund  writes:
> On 2022-01-14 17:51:52 -0500, Tom Lane wrote:
>> FWIW, I'm just fine with reverting, particularly in the back branches.
>> It seems clear that this dank corner of Windows contains even more
>> creepy-crawlies than we thought.

> Seems we should revert now-ish? There's a minor release coming up and I think
> it'd be bad to ship these changes to users.

Sure.  Do we want to revert in HEAD too?

regards, tom lane




TAP tests, directories and parameters

2022-01-24 Thread Andres Freund
Hi,

Right now we run tap tests separately in each directory. Which is one of the
reasons the make output is so unreadable - instead of having one 'prove'
output listing all tests, we get a lot of different prove outputs, all
interspersed. And it severely limits parallelism on windows right now.

It's currently not possible to just run all tap tests in one prove run,
because a number of tests assume that they are run from specific directories
and/or with per-directory parameters.

For meson I "solved" this by running each individual test in a wrapper that
changes directory etc. But that's not really a great approach.


To me it seems we should change our tests and test invocations to not depend
on being run from a specific directory and to unify the environment variables
passed to tap tests to one set somewhere central.

I think this would require:

1) Moving handling of PG_TEST_EXTRA into the tap tests themselves. That's a
   good idea imo, because there's then output explaining that some tests
   aren't run.

2) teach tap test infrastructure to add the directory containing the test to
   the perl search path, so that modules like RewindTest.pm can be found.

3) teach tap test infrastructure to compute the output directory for a
   specific test from the file location of the test itself and a parameter
   like top_builddir.

4) Pass env variables like LZ4, TAR, GZIP_PROGRAM by a mechanism other than
   exports in individual makefiles. Maybe just generating a perl file at
   configure / mkvcbuild.pl / meson setup time?

While far from a small amount of work, it does seem doable? A good number of
tap tests already pass this way, btw, just not all.

Greetings,

Andres Freund




Re: Corruption during WAL replay

2022-01-24 Thread Daniel Shelepanov



On 27.09.2021 11:30, Kyotaro Horiguchi wrote:

> Thank you for the comments! (Sorry for the late response.)
>
> At Tue, 10 Aug 2021 14:14:05 -0400, Robert Haas  wrote in
>> On Thu, Mar 4, 2021 at 10:01 PM Kyotaro Horiguchi
>>  wrote:

>>> The patch assumed that the CHKPT_START/COMPLETE barriers are used
>>> exclusively of each other, but MarkBufferDirtyHint, which delays checkpoint
>>> start, is called in RelationTruncate while delaying checkpoint completion.
>>> That is neither strange nor harmful behavior.  I changed delayChkpt from an
>>> enum to a bitmap integer so that both barriers can be triggered separately.
>>>
>>> I'm not sure this is the way to go here, though.  This fixes the issue
>>> of a crash during RelationTruncate, but the issue of smgrtruncate
>>> failure during RelationTruncate still remains (unless we treat that
>>> failure as PANIC?).

>> I like this patch. As I understand it, we're currently cheating by
>> allowing checkpoints to complete without necessarily flushing all of
>> the pages that were dirty at the time we fixed the redo pointer out to
>> disk. We think this is OK because we know that those pages are going
>> to get truncated away, but it's not really OK because when the system
>> starts up, it has to replay WAL starting from the checkpoint's redo
>> pointer, but the state of the page is not the same as it was at the
>> time when the redo pointer was the end of WAL, so redo fails. In the
>> case described in
>> http://postgr.es/m/byapr06mb63739b2692dc6dbb3c5f186cab...@byapr06mb6373.namprd06.prod.outlook.com
>> modifications are made to the page before the redo pointer is fixed
>> and those changes never make it to disk, but the truncation also never
>> makes it to the disk either. With this patch, that can't happen,
>> because no checkpoint can intervene between when we (1) decide we're
>> not going to bother writing those dirty pages and (2) actually
>> truncate them away. So either the pages will get written as part of
>> the checkpoint, or else they'll be gone before the checkpoint
>> completes. In the latter case, I suppose redo that would have modified
>> those pages will just be skipped, thus dodging the problem.

> I think your understanding is right.


>> In RelationTruncate, I suggest that we ought to clear the
>> delay-checkpoint flag before rather than after calling
>> FreeSpaceMapVacuumRange. Since the free space map is not fully
>> WAL-logged, anything we're doing there should be non-critical. Also, I
>
> Agreed and fixed.


>> think it might be better if MarkBufferDirtyHint stays closer to the
>> existing coding and just uses a Boolean and an if-test to decide
>> whether to clear the bit, instead of inventing a new mechanism. I
>> don't really see anything wrong with the new mechanism, but I think
>> it's better to keep the patch minimal.
>
> Yeah, that was kind of silly. Fixed.


>> As you say, this doesn't fix the problem that truncation might fail.
>> But as Andres and Sawada-san said, the solution to that is to get rid
>> of the comments saying that it's OK for truncation to fail and make it
>> a PANIC. However, I don't think that change needs to be part of this
>> patch. Even if we do that, we still need to do this. And even if we do
>> this, we still need to do that.
>
> OK. In addition to the above, I rewrote the comment in RelationTruncate.
>
> + * Delay the concurrent checkpoint's completion until this truncation
> + * successfully completes, so that we don't establish a redo-point between
> + * buffer deletion and file-truncate. Otherwise we can leave inconsistent
> + * file content against the WAL records after the REDO position and future
> + * recovery fails.
>
> However, a problem for now is that I cannot reproduce the issue myself.
>
> To avoid further confusion, the attached is named as *.patch.
>
> regards.

Hi. This is my first attempt to review a patch so feel free to tell me 
if I missed something.


As of today's state of REL_14_STABLE 
(ef9706bbc8ce917a366e4640df8c603c9605817a), the problem is reproducible 
using the script provided by Daniel Wood in this 
(1335373813.287510.1573611814...@connect.xfinity.com) message. Also, the
latest patch no longer applies cleanly and requires some minor tweaks.
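
For context on the mechanism under review, the ordering constraint Robert
describes above boils down to a pattern along these lines (a sketch only: the
flag name follows the thread's description of delayChkpt as a bitmap, and the
calls are abbreviated, so this is not the actual patch):

    /*
     * No concurrent checkpoint may complete (i.e. establish its redo
     * pointer as durable) between deciding not to flush the doomed
     * buffers and actually truncating the file.
     */
    MyProc->delayChkpt |= DELAY_CHKPT_COMPLETE;

    /* drops dirty buffers past the new EOF and shrinks the file(s) */
    smgrtruncate(RelationGetSmgr(rel), forks, nforks, blocks);

    MyProc->delayChkpt &= ~DELAY_CHKPT_COMPLETE;  /* checkpoints may finish */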



Regards,

Daniel Shelepanov





Re: CREATEROLE and role ownership hierarchies

2022-01-24 Thread Robert Haas
On Sat, Jan 22, 2022 at 4:20 PM Stephen Frost  wrote:
> Whoah, really?  No, I don't agree with this, it's throwing away the
> entire concept around inheritance of role rights and how you can have
> roles which you can get the privileges of by doing a SET ROLE to them
> but you don't automatically have those rights.

I see it differently. In my opinion, what that does is make the patch
actually useful instead of largely a waste of time. If you are a
service provider, you want to give your customers a super-user-like
experience without actually making them superuser. You don't want to
actually make them superuser, because then they could do things like
change archive_command or install plperlu and shell out to the OS
account, which you don't want. But you do want them to be able to
administer objects within the database just as a superuser could. And
a superuser has privileges over objects they own and objects belonging
to other users automatically, without needing to SET ROLE.

Imagine what happens if we adopt your proposal here. Everybody now has
to understand the behavior of a regular account, the behavior of a
superuser account, and the behavior of this third type of account
which is sort of like a superuser but requires a lot more SET ROLE
commands. And also every tool. So for example pg_dump and restore
isn't going to work, not even on the set of objects this
elevated-privilege user can access. pgAdmin isn't going to understand
that it needs to insert a bunch of extra SET ROLE commands to
administer objects. Ditto literally every other tool anyone has ever
written to administer PostgreSQL. And for all of that pain, we get
exactly zero extra security.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: PublicationActions - use bit flags.

2022-01-24 Thread Peter Eisentraut



On 21.01.22 01:05, Greg Nancarrow wrote:

On Tue, Dec 21, 2021 at 12:55 PM Greg Nancarrow  wrote:


On Tue, Dec 21, 2021 at 11:56 AM Tom Lane  wrote:


Removing this is not good:

 if (relation->rd_pubactions)
-   {
 pfree(relation->rd_pubactions);
-   relation->rd_pubactions = NULL;
-   }

If the subsequent palloc fails, you've created a problem where
there was none before.



Oops, yeah, I got carried away; if palloc() failed and called exit(),
then it would end up crashing when trying to use/pfree rd_pubactions
again.
Better leave that line in ...



Attaching an updated patch to fix that oversight.
This patch thus fixes the original palloc issue in a minimal way,
keeping the same relcache structure.


Why can't GetRelationPublicationActions() have the PublicationActions as 
a return value, instead of changing it to an output argument?
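
For illustration, the suggested shape would be something like this (a sketch,
not code from the patch):

PublicationActions
GetRelationPublicationActions(Relation relation)
{
	PublicationActions pubactions = {0};

	/* ... compute pubactions.pubinsert, pubupdate, pubdelete, pubtruncate ... */

	return pubactions;	/* small struct returned by value: no palloc/pfree needed */
}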





Re: fairywren is generating bogus BASE_BACKUP commands

2022-01-24 Thread Robert Haas
On Mon, Jan 24, 2022 at 2:27 PM Robert Haas  wrote:
> I really hate committing stuff that turns out to be broken. It's such
> a fire drill when the build farm turns red.

And there's a good chance it's about to break again, because I just
committed the next patch in the series which, shockingly, also
includes tests.

I'd like to tell you I believe I got it right this time ... but I don't.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: refactoring basebackup.c

2022-01-24 Thread Robert Haas
On Mon, Jan 24, 2022 at 9:30 AM Dipesh Pandit  wrote:
> v13 patch does not apply on the latest head, it requires a rebase. I have 
> applied
> it on commit dc43fc9b3aa3e0fa9c84faddad6d301813580f88 to validate gzip
> decompression patches.

It only needed trivial rebasing; I have committed it after doing that.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: [BUG]Update Toast data failure in logical replication

2022-01-24 Thread Robert Haas
On Tue, Aug 10, 2021 at 1:20 AM Amit Kapila  wrote:
> It seems to me this problem exists from the time we introduced
> wal_level = logical in the commit e55704d8b2 [1], or another
> possibility is that logical replication commit didn't consider
> something to make it work. Andres, Robert, Petr, can you guys please
> comment because otherwise, we might miss something here.

I'm belatedly getting around to looking at this thread. My
recollection of this is:

I think we realized when we were working on the logical decoding stuff
that the key columns of the old tuple would have to be detoasted in
order for the mechanism to work, because I remember worrying about
whether it would potentially be a problem that the WAL record would
end up huge. However, I think we believed that the new tuple wouldn't
need to have the detoasted values, because logical decoding is
designed to notice all the TOAST insertions for the new tuple and
reassemble those separate chunks to get the original value back. And
off-hand I'm not sure why that logic doesn't apply just as much to the
key columns as any others.

But the evidence does suggest that there's some kind of bug here, so
evidently there's some flaw in that line of thinking. I'm not sure
off-hand what it is, though.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Why is src/test/modules/committs/t/002_standby.pl flaky?

2022-01-24 Thread Andres Freund
Hi,

On 2022-01-14 17:51:52 -0500, Tom Lane wrote:
> FWIW, I'm just fine with reverting, particularly in the back branches.
> It seems clear that this dank corner of Windows contains even more
> creepy-crawlies than we thought.

Seems we should revert now-ish? There's a minor release coming up and I think
it'd be bad to ship these changes to users.

Greetings,

Andres Freund




Re: Error running configure on Mac

2022-01-24 Thread Andres Freund
Hi,

On 2022-01-24 08:41:39 -0800, samay sharma wrote:
> I've also attached the test.i file.

The problem is that you got a stdint.h in /usr/local/include/. And that
stdint.h doesn't match the system one. Which explains why there's a
compilation failure and also explains why others don't see this problem.

from test.i
> # 1 "/usr/local/include/stdint.h" 1 3 4
> 
> #define _ISL_INCLUDE_ISL_STDINT_H 1
> 
> #define _GENERATED_STDINT_H "isl 0.14.1"
> 
> #define _STDINT_HAVE_STDINT_H 1
> 
> # 1 "/usr/local/include/stdint.h" 1 3 4
> # 8 "/usr/local/include/stdint.h" 2 3 4
> # 73 "/Library/Developer/CommandLineTools/SDKs/MacOSX12.1.sdk/usr/include/sys/resource.h" 2 3 4
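
A quick way to confirm a diagnosis like this (illustrative commands, not from
the original report):

$ ls -l /usr/local/include/stdint.h          # the stray header, here installed by isl
$ echo '#include <sys/resource.h>' | cc -E -xc - | grep stdint.h   # shows which stdint.h wins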


Greetings,

Andres Freund




Re: pg_upgrade should truncate/remove its logs before running

2022-01-24 Thread Bruce Momjian
On Mon, Jan 24, 2022 at 11:41:17AM -0600, Justin Pryzby wrote:
> On Mon, Jan 24, 2022 at 12:39:30PM -0500, Bruce Momjian wrote:
> > On Mon, Jan 24, 2022 at 10:59:40AM +0900, Michael Paquier wrote:
> > > On Thu, Jan 20, 2022 at 07:51:37PM +0900, Michael Paquier wrote:
> > > > Neat idea.  That would work fine for my case.  So I am fine to stick
> > > > with this suggestion. 
> > > 
> > > I have been looking at this idea, and the result is quite nice, being
> > > simpler than anything that has been proposed on this thread yet.  We
> > > get a simpler removal logic, and there is no need to perform any kind
> > > of sanity checks with the output path provided as long as we generate
> > > the paths and the dirs after adjust_data_dir().
> > ...
> > >    pg_upgrade creates various working files, such
> > > -   as schema dumps, in the current working directory.  For security, be sure
> > > -   that that directory is not readable or writable by any other users.
> > > +   as schema dumps, stored within pg_upgrade_output.d in
> > > +   the directory of the new cluster.
> > 
> > Uh, how are we instructing people to delete that pg_upgrade output
> > directory?  If pg_upgrade completes cleanly, would it be removed
> > automatically?
> 
> Clearly.

OK, thanks.  There are really two cleanups --- first, the "log"
directory, and second deletion of the old cluster by running
delete_old_cluster.sh.
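
Assuming the automatic removal works as described, the manual steps after a
successful run reduce to something like (illustrative paths):

$ ./delete_old_cluster.sh                      # generated script; removes the old cluster
$ rm -rf "$PGDATANEW/pg_upgrade_output.d"      # only needed after --retain or a failed run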

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  If only the physical world exists, free will is an illusion.





Re: preserving db/ts/relfilenode OIDs across pg_upgrade (was Re: storing an explicit nonce)

2022-01-24 Thread Robert Haas
On Sat, Jan 22, 2022 at 2:20 AM Shruthi Gowda  wrote:
> Agree. In the latest patch, the template0 and postgres OIDs are fixed
> to unused manually assigned OIDs 4 and 5 respectively. These OIDs are
> no longer listed as unused OIDs.

Thanks. Committed with a few more cosmetic changes.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: fairywren is generating bogus BASE_BACKUP commands

2022-01-24 Thread Robert Haas
On Mon, Jan 24, 2022 at 2:01 PM Andrew Dunstan  wrote:
> Well if we can get Andres' suggestion to work all this might go away,
> which would keep everyone happy, especially me. You're right that I was
> a little careless upthread. Mea culpa. Meanwhile I am committing a
> minimal one line fix.

I in no way intended to accuse you of being careless. I was just
pointing out that this stuff seems to be hard to get right, even for
smart people.

I really hate committing stuff that turns out to be broken. It's such
a fire drill when the build farm turns red.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: CREATEROLE and role ownership hierarchies

2022-01-24 Thread Andrew Dunstan


On 1/22/22 16:20, Stephen Frost wrote:
>> Subject: [PATCH v4 1/5] Add tests of the CREATEROLE attribute.
> No particular issue with this one.
>
>

I'm going to commit this piece forthwith so we get it out of the way.
That will presumably make the cfbot unhappy until Mark submits a new
patch set.


cheers


andrew


--
Andrew Dunstan
EDB: https://www.enterprisedb.com





Re: fairywren is generating bogus BASE_BACKUP commands

2022-01-24 Thread Andrew Dunstan


On 1/23/22 22:52, Robert Haas wrote:
> On Sun, Jan 23, 2022 at 4:09 PM Andrew Dunstan  wrote:
>> The most common issues we get are around this issue of virtualized paths
>> in the TAP tests. If people followed the rule I suggested upthread, 99%
>> of those problems would go away.
> Well, that makes it sound like it's the fault of people for not
> following the rules, but I don't think that's really a fair
> assessment. Even your first guess as to how to solve this particular
> problem wasn't correct, and if you can't guess right on the first try,
> I don't know how anyone else is supposed to do it. I still don't even
> understand why your first guess wasn't right. I feel like every time I
> try to add a TAP test Msys blows up, and I can't figure out how to fix
> it myself. Which is not a great feeling.
>


Well if we can get Andres' suggestion to work all this might go away,
which would keep everyone happy, especially me. You're right that I was
a little careless upthread. Mea culpa. Meanwhile I am committing a
minimal one line fix.


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com





Re: [BUG]Update Toast data failure in logical replication

2022-01-24 Thread Euler Taveira
On Mon, Jan 24, 2022, at 12:58 AM, Michael Paquier wrote:
> FWIW, I find the API changes of HeapDetermineModifiedColumns() and
> ExtractReplicaIdentity() a bit grotty.  Shouldn't we try to flatten
> the old tuple instead?  There are things like
> toast_flatten_tuple_to_datum() for this purpose if a tuple satisfies
> HeapTupleHasExternal(), or just heap_copy_tuple_as_datum().
> 
I checked v4 and I don't like the HeapDetermineModifiedColumns() and
heap_tuple_attr_equals() changes either. It seems it is hijacking these
functions to do something else. I would suggest changing the signatures to

static void
heap_tuple_attr_equals(TupleDesc tupdesc, int attrnum,
                       HeapTuple tup1, HeapTuple tup2,
                       bool *is_equal, bool *key_has_external);

and

static void
HeapDetermineModifiedColumns(Relation relation, Bitmapset *interesting_cols,
                             HeapTuple oldtup, HeapTuple newtup,
                             Bitmapset *modified_attrs, bool *key_has_external);

I didn't figure out a cheap way to check if the key has an external value other
than slightly modifying the HeapDetermineModifiedColumns() function and its
subroutine heap_tuple_attr_equals(). As Alvaro said I don't think adding
HeapTupleHasExternal() (as in v3) is a good idea because it does not optimize
genuine cases such as a table whose PK is an integer and contains a single
TOAST column.

Another suggestion is to keep key_changed and add another attribute
(key_has_external) to ExtractReplicaIdentity(). If we need key_changed in the
future we'll have to decompose it again. This also encapsulates that
optimization into the function, which helps with future improvements/fixes.

static HeapTuple
ExtractReplicaIdentity(Relation relation, HeapTuple tp, bool key_changed,
                       bool key_has_external, bool *copy);


--
Euler Taveira
EDB   https://www.enterprisedb.com/


Re: Non-decimal integer literals

2022-01-24 Thread Robert Haas
On Mon, Jan 24, 2022 at 3:41 AM Peter Eisentraut
 wrote:
> Rebased patch set

What if someone finds this new behavior too permissive?

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Index Skip Scan (new UniqueKeys)

2022-01-24 Thread Zhihong Yu
On Mon, Jan 24, 2022 at 8:51 AM Dmitry Dolgov <9erthali...@gmail.com> wrote:

> > On Sun, Jan 23, 2022 at 04:25:04PM -0800, Zhihong Yu wrote:
> > On Sat, Jan 22, 2022 at 1:32 PM Dmitry Dolgov <9erthali...@gmail.com>
> wrote:
> > > Besides that the new patch version contains some cleaning up and
> > > addresses commentaries around leaf page pinning from [1]. The idea
> > > behind the series structure is still the same: the first three patches
> > > contains the essence of the implementation (hoping to help concentrate
> > > review), the rest are more "experimental".
> > >
> > > [1]:
> > >
> https://www.postgresql.org/message-id/flat/CAH2-WzmUscvoxVkokHxP=uptdjsi0tjkfpupd-cea35dvn-...@mail.gmail.com
> >
> > Hi,
> >
> > +   /* If same EC already is already in the list, then not unique */
> >
> > The word already is duplicated.
> >
> > + * make_pathkeys_for_uniquekeyclauses
> >
> > The func name in the comment is different from the actual func name.
>
> Thanks for the review! Right, both of the above make sense. I'll wait a bit
> to see if there are more comments, and then post a new version with all the
> changes at once.
>
> > + * Portions Copyright (c) 2020, PostgreSQL Global Development Group
> >
> > The year should be 2022 :-)
>
> Now you see how old is this patch :)
>
> > make_pathkeys_for_uniquekeys() is called by build_uniquekeys(). Should
> > make_pathkeys_for_uniquekeys() be moved into uniquekeys.c ?
>
> It's actually placed there by analogy with make_pathkeys_for_sortclauses
> (the immediately preceding function), so I think moving it into uniquekeys.c
> would only cause more confusion.
>
> > +query_has_uniquekeys_for(PlannerInfo *root, List *path_uniquekeys,
> > +bool allow_multinulls)
> >
> > It seems allow_multinulls is not checked in the func. Can the parameter
> be
> > removed ?
>
> Right, it could be removed. I believe it was somewhat important when the
> patch was tightly coupled with the UniqueKeys patch, where it was put in
> use.
>
> Hi,
It would be nice to take out this unused parameter in this patch.

The parameter should instead be added in the patch series where it is used.

Cheers


explain_regress, explain(MACHINE), and default to explain(BUFFERS) (was: BUFFERS enabled by default in EXPLAIN (ANALYZE))

2022-01-24 Thread Justin Pryzby
I'm renaming this thread for better visibility, since buffers is a small,
optional part of the patches I sent.

I made a CF entry here.
https://commitfest.postgresql.org/36/3409/

On Wed, Dec 01, 2021 at 06:58:20PM -0600, Justin Pryzby wrote:
> On Mon, Nov 15, 2021 at 01:09:54PM -0600, Justin Pryzby wrote:
> > Some time ago, I had a few relevant patches:
> > 1) add explain(REGRESS) which is shorthand for (BUFFERS OFF, TIMING OFF, 
> > COSTS OFF, SUMMARY OFF)
> > 2) add explain(MACHINE) which elides machine-specific output from explain;
> >for example, Heap Fetches, sort spaceUsed, hash nbuckets, and tidbitmap 
> > stuff.
> > 
> > https://www.postgresql.org/message-id/flat/20200306213310.gm...@telsasoft.com
> 
> The attached patch series now looks like this (some minor patches are not
> included in this list):
> 
>  1. add GUC explain_regress, which disables TIMING, SUMMARY, COSTS;
>  2. enable explain(BUFFERS) by default (but disabled by explain_regress);
>  3. Add explain(MACHINE) - which is disabled by explain_regress.
> This elides various machine-specific output like Memory and Disk use.
> Maybe it should be called something else like "QUIET" or "VERBOSE_MINUS_1"
> or ??
> 
> The regression tests now automatically run with explain_regress=on, which is
> shorthand for TIMING OFF, SUMMARY OFF, COSTS OFF, and then BUFFERS OFF.
> 
> There's a further option called explain(MACHINE) which can be disabled to hide
> portions of the output that are unstable, like Memory/Disk/Batches/
> Heap Fetches/Heap Blocks.  This allows "explain analyze" to be used more 
> easily
> in regression tests, and simplifies some existing tests that currently use
> plpgsql functions to filter the output.  But it doesn't handle all the
> variations from parallel workers.
> 
> (3) is optional, but simplifies some regression tests.  The patch series could
> be rephrased with (3) first.
> 
> Unfortunately, "COSTS OFF" breaks postgres_fdw remote_estimate.  If specifying
> "COSTS ON" in postgres_fdw.c is considered to be a poor fix , then I suppose
> this patch series could do as suggested and enable buffers by default only 
> when
> ANALYZE is specified.  Then postgres_fdw is not affected, and the
> explain_regress GUC is optional: instead, we'd need to specify BUFFERS OFF in
> ~100 regression tests which use EXPLAIN ANALYZE.  (3) still seems useful on 
> its
> own.
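
A sketch of the intended usage of the proposed GUC (names taken from the
patch below):

SET explain_regress = on;
EXPLAIN SELECT 1;              -- now defaults to COSTS OFF (and TIMING/SUMMARY OFF)
EXPLAIN (COSTS ON) SELECT 1;   -- an explicitly given option still wins over the GUC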
>From 747cb7a979cf96f99403a005053e635f3bf8c3f0 Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Sat, 22 Feb 2020 21:17:10 -0600
Subject: [PATCH 1/7] Add GUC: explain_regress

This changes the defaults for explain to: costs off, timing off, summary off.
It'd be reasonable to use this for new regression tests which are not intended
to be backpatched.
---
 contrib/postgres_fdw/postgres_fdw.c |  2 +-
 src/backend/commands/explain.c  | 13 +++--
 src/backend/utils/misc/guc.c| 13 +
 src/include/commands/explain.h  |  2 ++
 src/test/regress/expected/explain.out   |  3 +++
 src/test/regress/expected/inherit.out   |  2 +-
 src/test/regress/expected/stats_ext.out |  2 +-
 src/test/regress/pg_regress.c   |  3 ++-
 src/test/regress/sql/explain.sql|  4 
 src/test/regress/sql/inherit.sql|  2 +-
 src/test/regress/sql/stats_ext.sql  |  2 +-
 11 files changed, 40 insertions(+), 8 deletions(-)

diff --git a/contrib/postgres_fdw/postgres_fdw.c b/contrib/postgres_fdw/postgres_fdw.c
index bf3f3d9e26e..71f5b145984 100644
--- a/contrib/postgres_fdw/postgres_fdw.c
+++ b/contrib/postgres_fdw/postgres_fdw.c
@@ -3112,7 +3112,7 @@ estimate_path_cost_size(PlannerInfo *root,
 		 * values, so don't request params_list.
 		 */
 		initStringInfo(&sql);
-		appendStringInfoString(&sql, "EXPLAIN ");
+		appendStringInfoString(&sql, "EXPLAIN (COSTS)");
 		deparseSelectStmtForRel(&sql, root, foreignrel, fdw_scan_tlist,
 remote_conds, pathkeys,
 fpextra ? fpextra->has_final_sort : false,
diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index b970997c346..43e8bc78778 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -154,6 +154,7 @@ static void ExplainJSONLineEnding(ExplainState *es);
 static void ExplainYAMLLineStarting(ExplainState *es);
 static void escape_yaml(StringInfo buf, const char *str);
 
+bool explain_regress = false; /* GUC */
 
 
 /*
@@ -172,6 +173,7 @@ ExplainQuery(ParseState *pstate, ExplainStmt *stmt,
 	ListCell   *lc;
 	bool		timing_set = false;
 	bool		summary_set = false;
+	bool		costs_set = false;
 
 	/* Parse options list. */
 	foreach(lc, stmt->options)
@@ -183,7 +185,11 @@ ExplainQuery(ParseState *pstate, ExplainStmt *stmt,
 		else if (strcmp(opt->defname, "verbose") == 0)
 			es->verbose = defGetBoolean(opt);
 		else if (strcmp(opt->defname, "costs") == 0)
+		{
+			/* Need to keep track if it was explicitly set to ON */
+			costs_set = true;
 			es->costs = defGetBoolean(opt);
+		}
 		else if (strcmp(opt->defname, "buffers") == 

Re: pg_upgrade should truncate/remove its logs before running

2022-01-24 Thread Bruce Momjian
On Mon, Jan 24, 2022 at 10:59:40AM +0900, Michael Paquier wrote:
> On Thu, Jan 20, 2022 at 07:51:37PM +0900, Michael Paquier wrote:
> > Neat idea.  That would work fine for my case.  So I am fine to stick
> > with this suggestion. 
> 
> I have been looking at this idea, and the result is quite nice, being
> simpler than anything that has been proposed on this thread yet.  We
> get a simpler removal logic, and there is no need to perform any kind
> of sanity checks with the output path provided as long as we generate
> the paths and the dirs after adjust_data_dir().
...
>    pg_upgrade creates various working files, such
> -   as schema dumps, in the current working directory.  For security, be sure
> -   that that directory is not readable or writable by any other users.
> +   as schema dumps, stored within pg_upgrade_output.d in
> +   the directory of the new cluster.

Uh, how are we instructing people to delete that pg_upgrade output
directory?  If pg_upgrade completes cleanly, would it be removed
automatically?

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  If only the physical world exists, free will is an illusion.





Re: pg_upgrade should truncate/remove its logs before running

2022-01-24 Thread Justin Pryzby
On Mon, Jan 24, 2022 at 12:39:30PM -0500, Bruce Momjian wrote:
> On Mon, Jan 24, 2022 at 10:59:40AM +0900, Michael Paquier wrote:
> > On Thu, Jan 20, 2022 at 07:51:37PM +0900, Michael Paquier wrote:
> > > Neat idea.  That would work fine for my case.  So I am fine to stick
> > > with this suggestion. 
> > 
> > I have been looking at this idea, and the result is quite nice, being
> > simpler than anything that has been proposed on this thread yet.  We
> > get a simpler removal logic, and there is no need to perform any kind
> > of sanity checks with the output path provided as long as we generate
> > the paths and the dirs after adjust_data_dir().
> ...
> >    pg_upgrade creates various working files, such
> > -   as schema dumps, in the current working directory.  For security, be sure
> > -   that that directory is not readable or writable by any other users.
> > +   as schema dumps, stored within pg_upgrade_output.d in
> > +   the directory of the new cluster.
> 
> Uh, how are we instructing people to delete that pg_upgrade output
> directory?  If pg_upgrade completes cleanly, would it be removed
> automatically?

Clearly.

@@ -689,28 +751,5 @@ cleanup(void)

 	/* Remove dump and log files? */
 	if (!log_opts.retain)
-	{
-		int			dbnum;
-		char	  **filename;
-
-		for (filename = output_files; *filename != NULL; filename++)
-			unlink(*filename);
-
-		/* remove dump files */
-		unlink(GLOBALS_DUMP_FILE);
-
-		if (old_cluster.dbarr.dbs)
-			for (dbnum = 0; dbnum < old_cluster.dbarr.ndbs; dbnum++)
-			{
-				char		sql_file_name[MAXPGPATH],
-							log_file_name[MAXPGPATH];
-				DbInfo	   *old_db = &old_cluster.dbarr.dbs[dbnum];
-
-				snprintf(sql_file_name, sizeof(sql_file_name), DB_DUMP_FILE_MASK, old_db->db_oid);
-				unlink(sql_file_name);
-
-				snprintf(log_file_name, sizeof(log_file_name), DB_DUMP_LOG_FILE_MASK, old_db->db_oid);
-				unlink(log_file_name);

Re: preserving db/ts/relfilenode OIDs across pg_upgrade (was Re: storing an explicit nonce)

2022-01-24 Thread Bruce Momjian
On Sat, Jan 22, 2022 at 12:47:35PM +0530, Shruthi Gowda wrote:
> On Sat, Jan 22, 2022 at 12:27 AM Tom Lane  wrote:
> >
> > Robert Haas  writes:
> > > It seems to me that what this comment is saying is that OIDs in the
> > > second and third categories are doled out by counters. Therefore, we
> > > can't know which of those OIDs will get used, or how many of them will
> > > get used, or which objects will get which OIDs. Therefore, I think we
> > > should go back to the approach that you were using for template0 and
> > > handle both that database and postgres using that method. That is,
> > > assign an OID manually, and make sure unused_oids knows that it should
> > > be counted as already used.
> >
> > Indeed.  If you're going to manually assign OIDs to these databases,
> > do it honestly, and put them into the range intended for that purpose.
> > Trying to take short-cuts is just going to cause trouble down the road.
> 
> Understood. I will rework the patch accordingly. Thanks

Thanks.  To get the rsync update reduction we need to preserve database
oids.

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  If only the physical world exists, free will is an illusion.





Re: DSA failed to allocate memory

2022-01-24 Thread Robert Haas
On Mon, Jan 24, 2022 at 4:59 AM Dongming Liu  wrote:
> Maybe we can use one of the following methods to fix it:
> 1. re-bin segment to suitable segment index when called dsa_free
> 2. get_best_segment search all bins

(2) is definitely the wrong idea. The comments say:

/*
 * What is the lowest bin that holds segments that *might* have n contiguous
 * free pages?  There is no point in looking in segments in lower bins; they
 * definitely can't service a request for n free pages.
 */
#define contiguous_pages_to_segment_bin(n) Min(fls(n), DSA_NUM_SEGMENT_BINS - 1)

So it's OK for a segment to be in a bin that suggests that it has more
consecutive free pages than it really does. But it's NOT ok for a
segment to be in a bin that suggests it has fewer consecutive pages
than it really does. If dsa_free() is putting things back into the
wrong place, that's what we need to fix.
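
For what it's worth, direction (1) could look roughly like this (a sketch:
fpm_largest() and unlink_segment() exist in freepage.h/dsa.c, but
relink_segment_to_bin() is a hypothetical helper standing in for the bin-list
manipulation):

static void
rebin_segment(dsa_area *area, dsa_segment_map *segment_map)
{
	size_t		new_bin =
		contiguous_pages_to_segment_bin(fpm_largest(segment_map->fpm));

	/* Move the segment only if its largest free run no longer matches its bin. */
	if (segment_map->header->bin != new_bin)
	{
		unlink_segment(area, segment_map);
		relink_segment_to_bin(area, segment_map, new_bin);	/* hypothetical */
	}
}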

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: [PATCH] Tracking statements entry timestamp in pg_stat_statements

2022-01-24 Thread Anton A. Melnikov

Hi, Andrey!

I've checked the 5th version of the patch and have some remarks.

>I've created a new view named pg_stat_statements_aux. But for now both
>views are using the same function pg_stat_statements which returns all
>fields. It seems reasonable to me - if sampling solution will need all
>values it can query the function.

Agreed, it might be useful in some cases.

>But it seems "stats_reset" term is not quite correct in this case. The
>"stats_since" field holds the timestamp of hashtable entry, but not the
>reset time. The same applies to aux_stats_since - for new statement it
>holds its entry time, but in case of reset it will hold the reset time.

Thanks for the clarification. Indeed, if the word 'reset' means the
removal of all the hashtable entries during pg_stat_statements_reset(),
then we shouldn't use it for the first-occurrence timestamp in the
struct pgssEntry.

So with the stats_since field everything is clear.
On the other hand aux_stats_since field can be updated for two reasons:
1) The same as for stats_since, due to the first occurrence of the entry in
the hashtable; it will be equal to the stats_since timestamp in that case.

2) Due to an external reset from an upper level sampler.
I think it's not very important how to name this field, but it would be 
better to mention both these reasons in the comment.


As for more important things: if the aux_min_time initial value is zero,
as it is now, then the if condition on line 1385 of pg_stat_statements.c
will never be true and aux_min_time will remain zero. Initializing
aux_min_time with INT_MAX would solve this problem.
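
In other words, something like this (field and variable names assumed from
the patch, not verified):

	/* At hashtable entry creation (was 0, so the test below never fired): */
	entry->aux_min_time = INT_MAX;

	/* Around line 1385 of the patched pg_stat_statements.c: */
	if (exec_time < entry->aux_min_time)
		entry->aux_min_time = exec_time;	/* now also taken on the first execution */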


It is possible to reduce the size of the entry_reset_aux() function by
removing the if condition on line 2606 and the entire third branch starting
at line 2626. Thanks to the check at line 2612, this will work in all cases.


Also, it would be nice to move lines 2582-2588, which are repeated several
times, into a separate function. I think this would make entry_reset_aux()
shorter and clearer.


In general, the 5th patch applies with no problems, make check-world and
CI give no errors, and the patch seems close to becoming RFC.


With best regards,
--
Anton A. Melnikov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company




Re: Index Skip Scan (new UniqueKeys)

2022-01-24 Thread Dmitry Dolgov
> On Sun, Jan 23, 2022 at 04:25:04PM -0800, Zhihong Yu wrote:
> On Sat, Jan 22, 2022 at 1:32 PM Dmitry Dolgov <9erthali...@gmail.com> wrote:
> > Besides that the new patch version contains some cleaning up and
> > addresses commentaries around leaf page pinning from [1]. The idea
> > behind the series structure is still the same: the first three patches
> > contains the essence of the implementation (hoping to help concentrate
> > review), the rest are more "experimental".
> >
> > [1]:
> > https://www.postgresql.org/message-id/flat/CAH2-WzmUscvoxVkokHxP=uptdjsi0tjkfpupd-cea35dvn-...@mail.gmail.com
>
> Hi,
>
> +   /* If same EC already is already in the list, then not unique */
>
> The word already is duplicated.
>
> + * make_pathkeys_for_uniquekeyclauses
>
> The func name in the comment is different from the actual func name.

Thanks for the review! Right, both of the above make sense. I'll wait a bit
to see if there are more comments, and then post a new version with all the
changes at once.

> + * Portions Copyright (c) 2020, PostgreSQL Global Development Group
>
> The year should be 2022 :-)

Now you see how old is this patch :)

> make_pathkeys_for_uniquekeys() is called by build_uniquekeys(). Should
> make_pathkeys_for_uniquekeys() be moved into uniquekeys.c ?

It's actually placed there by analogy with make_pathkeys_for_sortclauses
(the immediately preceding function), so I think moving it into uniquekeys.c
would only cause more confusion.

> +query_has_uniquekeys_for(PlannerInfo *root, List *path_uniquekeys,
> +bool allow_multinulls)
>
> It seems allow_multinulls is not checked in the func. Can the parameter be
> removed ?

Right, it could be removed. I believe it was somewhat important when the
patch was tightly coupled with the UniqueKeys patch, where it was put in
use.

> +   Path   *newpath;
> +
> +   newpath = (Path *) create_projection_path(root, rel, subpath,
> + scanjoin_target);
>
> You can remove variable newpath and assign to lfirst(lc) directly.

Yes, but I've followed the same style for create_projection_path as in
many other invocations of this function in planner.c -- I would prefer
to keep it uniform.

>  +add_path(RelOptInfo *parent_rel, Path *new_path)
> +add_unique_path(RelOptInfo *parent_rel, Path *new_path)
>
> It seems the above two func's can be combined into one func which
> takes parent_rel->pathlist / parent_rel->unique_pathlist as third parameter.

Sure, but here I've intentionally split it into separate functions;
otherwise a lot of unrelated call sites would have to be updated to provide
the third parameter.




Re: Printing backtrace of postgres processes

2022-01-24 Thread Fujii Masao




On 2022/01/24 16:35, torikoshia wrote:

On 2022-01-14 19:48, Bharath Rupireddy wrote:

On Sat, Nov 20, 2021 at 11:50 AM Bharath Rupireddy
 wrote:


On Fri, Nov 19, 2021 at 4:07 PM vignesh C  wrote:
> The Attached v15 patch has the fixes for the same.

Thanks. The v15 patch LGTM and the cf bot is happy hence marking it as RfC.


The patch was not applying because of the recent commit [1]. I took
this opportunity and tried a bunch of things without modifying the
core logic of the pg_log_backtrace feature that Vignesh has worked on.


I have one question about this feature. When the PID of an auxiliary process
like the archiver is specified, the function probably always reports the same
result, doesn't it? That's because, for example, the archiver always logs its
backtrace from HandlePgArchInterrupts().

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION




Re: makefiles writing to $@ should first write to $@.new

2022-01-24 Thread Peter Eisentraut

On 24.01.22 04:23, Justin Pryzby wrote:

There are many Makefile rules like

foo: bar
	./tool $< > $@

If the rule is interrupted (due to ^C or ENOSPC), foo can be 0 bytes or
partially written, but won't be rebuilt until someone runs distclean or debugs
it and removes the individual file, as I did for errcodes.h.
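
The pattern proposed in the subject line would look like this (illustrative
rule):

foo: bar
	./tool $< > $@.new
	mv $@.new $@	# rename is atomic, so foo is never left half-written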


If a rule fails, make removes the target file.  So I don't see how this 
can happen unless you hard kill -9 make or something like that.





Re: PSA: Autoconf has risen from the dead

2022-01-24 Thread Peter Eisentraut

On 24.01.22 15:14, Tom Lane wrote:

Do these versions fix any bugs that affect us (i.e., that we've
not already created workarounds for)?


The only thing that could be of interest is that the workaround we are 
carrying in config/check_decls.m4 was originally upstreamed by Noah, but 
was then later partially reverted and replaced by a different solution. 
Further explanation is here:


https://git.savannah.gnu.org/cgit/autoconf.git/commit/?id=ec90049dfcf4538750e61d675d885157fa5ca7f8

I don't think it has affected us in practice, though.




Re: UNIQUE null treatment option

2022-01-24 Thread Peter Eisentraut

On 13.01.22 19:36, Peter Geoghegan wrote:

I wonder if the logic for setting BTScanInsertData.anynullkeys inside
_bt_mkscankey() is the place to put your test for
rel->rd_index->indnullsnotdistinct -- not inside _bt_doinsert(). That
would probably necessitate renaming anynullkeys, but that's okay. This
feels more natural to me because a NULL key column in a NULLS NOT
DISTINCT unique constraint is very similar to a NULL non-key column in
an INCLUDE index, as far as our requirements go -- and so both cases
should probably be dealt with at the same point.


Makes sense.  Here is an updated patch with this change.

I didn't end up renaming anynullkeys.  I came up with names like 
"anyalwaysdistinctkeys", but in the end that felt too abstract, and 
moreover, it would require rewriting a bunch of code comments that refer 
to null values in this context.  Since as you wrote, anynullkeys is just 
a local concern between two functions, this slight inaccuracy is perhaps 
better than some highly general but unclear terminology.

From c07b757ff5a31427de07e394f7c4736cb0a05378 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut 
Date: Mon, 24 Jan 2022 16:47:25 +0100
Subject: [PATCH v3] Add UNIQUE null treatment option

The SQL standard has been ambiguous about whether null values in
unique constraints should be considered equal or not.  Different
implementations have different behaviors.  In the SQL:202x draft, this
has been formalized by making this implementation-defined and adding
an option on unique constraint definitions UNIQUE [ NULLS [NOT]
DISTINCT ] to choose a behavior explicitly.

This patch adds this option to PostgreSQL.  The default behavior
remains UNIQUE NULLS DISTINCT.  Making this happen in the btree code
is pretty easy; most of the patch is just to carry the flag around to
all the places that need it.

The CREATE UNIQUE INDEX syntax extension is not from the standard,
it's my own invention.

XXX I named all the internal flags, catalog columns, etc. in the
negative ("nulls not distinct") so that the default PostgreSQL
behavior is the default if the flag is false.  But perhaps the double
negatives make some code harder to read.

Discussion: 
https://www.postgresql.org/message-id/flat/84e5ee1b-387e-9a54-c326-9082674bd...@enterprisedb.com
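
For illustration, the added syntax behaves like this (hypothetical tables):

CREATE TABLE t1 (a int UNIQUE);                     -- default: NULLS DISTINCT
CREATE TABLE t2 (a int UNIQUE NULLS NOT DISTINCT);  -- nulls compare equal
CREATE UNIQUE INDEX t2_a_idx ON t2 (a) NULLS NOT DISTINCT;  -- equivalent index syntax

INSERT INTO t1 VALUES (NULL), (NULL);   -- ok, multiple null values allowed
INSERT INTO t2 VALUES (NULL), (NULL);   -- fails with a duplicate-key error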
---
 doc/src/sgml/catalogs.sgml | 13 +
 doc/src/sgml/ddl.sgml  | 29 --
 doc/src/sgml/information_schema.sgml   | 12 +
 doc/src/sgml/ref/alter_table.sgml  |  4 +-
 doc/src/sgml/ref/create_index.sgml | 13 +
 doc/src/sgml/ref/create_table.sgml | 11 ++--
 src/backend/access/nbtree/nbtinsert.c  |  6 +--
 src/backend/access/nbtree/nbtsort.c| 15 +-
 src/backend/access/nbtree/nbtutils.c   |  7 +++
 src/backend/catalog/index.c|  7 +++
 src/backend/catalog/information_schema.sql |  9 +++-
 src/backend/catalog/sql_features.txt   |  1 +
 src/backend/catalog/toasting.c |  1 +
 src/backend/commands/indexcmds.c   |  3 +-
 src/backend/nodes/copyfuncs.c  |  2 +
 src/backend/nodes/equalfuncs.c |  2 +
 src/backend/nodes/makefuncs.c  |  3 +-
 src/backend/nodes/outfuncs.c   |  2 +
 src/backend/parser/gram.y  | 47 ++---
 src/backend/parser/parse_utilcmd.c |  3 ++
 src/backend/utils/adt/ruleutils.c  | 23 +---
 src/backend/utils/cache/relcache.c |  1 +
 src/backend/utils/sort/tuplesort.c |  8 ++-
 src/bin/pg_dump/pg_dump.c  | 19 +--
 src/bin/pg_dump/pg_dump.h  |  1 +
 src/bin/psql/describe.c| 19 +--
 src/include/catalog/pg_index.h |  1 +
 src/include/nodes/execnodes.h  |  1 +
 src/include/nodes/makefuncs.h  |  2 +-
 src/include/nodes/parsenodes.h |  2 +
 src/include/utils/tuplesort.h  |  1 +
 src/test/regress/expected/constraints.out  | 23 
 src/test/regress/expected/create_index.out | 61 ++
 src/test/regress/sql/constraints.sql   | 14 +
 src/test/regress/sql/create_index.sql  | 37 +
 35 files changed, 347 insertions(+), 56 deletions(-)

diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index 1e65c426b2..2acd6ce98c 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -4265,6 +4265,19 @@ pg_index Columns
   
  
 
+ 
+  
+   indnullsnotdistinct bool
+  
+  
+   This value is only used for unique indexes.  If false, this unique
+   index will consider null values distinct (so the index can contain
+   multiple null values in a column, the default PostgreSQL behavior).  If
+   it is true, it will consider null values to be equal (so the index can
+   only contain one null value in a column).
+  
+ 
+
  
   
indisprimary bool
diff --git a/doc/src/sgml/ddl.sgml 

Re: automatically generating node support functions

2022-01-24 Thread Peter Eisentraut

Rebased patch to resolve some merge conflicts

On 29.12.21 12:08, Peter Eisentraut wrote:

On 12.10.21 15:52, Andrew Dunstan wrote:

I haven't been through the whole thing, but I did notice this: the
comment stripping code looks rather fragile. I think it would blow up if
there were a continuation line not starting with  qr/\s*\*/. It's a lot
simpler and more robust to do this if you slurp the file in whole.
Here's what we do in the buildfarm code:

 my $src = file_contents($_);
 # strip C comments
 # We used to use the recipe in perlfaq6 but there is actually no point.
 # We don't need to keep the quoted string values anyway, and
 # on some platforms the complex regex causes perl to barf and crash.
 $src =~ s{/\*.*?\*/}{}gs;

After you've done that splitting it into lines is pretty simple.


Here is an updated patch, with some general rebasing, and the above 
improvement.  It now also generates #include lines necessary in 
copyfuncs etc. to pull in all the node types it operates on.


Further, I have looked more into the "metadata" approach discussed in 
[0].  It's pretty easy to generate that kind of output from the data 
structures my script produces.  You just loop over all the node types 
and print stuff and keep a few counters.  I don't plan to work on that 
at this time, but I just wanted to point out that if people wanted to 
move into that direction, my patch wouldn't be in the way.



[0]: 
https://www.postgresql.org/message-id/flat/20190828234136.fk2ndqtld3onfrrp%40alap3.anarazel.de
From d99c818447ee4a68b762fb8a8c645ee6ef051ff9 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut 
Date: Mon, 24 Jan 2022 16:13:21 +0100
Subject: [PATCH v4] Automatically generate node support functions

Add a script to automatically generate the node support functions
(copy, equal, out, and read, as well as the node tags enum) from the
struct definitions.

For each of the four node support files, it creates two include files,
e.g., copyfuncs.inc1.c and copyfuncs.inc2.c, to include in the main
file.  All the scaffolding of the main file stays in place.

TODO: In this patch, I have only ifdef'ed out the code that could be
removed, mainly so that it won't constantly have merge conflicts.
Eventually, that should all be changed to delete the code.  When we do
that, some code comments should probably be preserved elsewhere, so
that will need another pass of consideration.

I have tried to mostly make the coverage of the output match what is
currently there.  For example, one could now do out/read coverage of
utility statement nodes, but I have manually excluded those for now.
The reason is mainly that it's easier to diff the before and after,
and adding a bunch of stuff like this might require a separate
analysis and review.

Subtyping (TidScan -> Scan) is supported.

For the hard cases, you can just write a manual function and exclude
generating one.  For the not so hard cases, there is a way of
annotating struct fields to get special behaviors.  For example,
pg_node_attr(equal_ignore) has the field ignored in equal functions.

Discussion: 
https://www.postgresql.org/message-id/flat/c1097590-a6a4-486a-64b1-e1f9cc0533ce%40enterprisedb.com
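
As an illustration, an annotated field looks roughly like this (hypothetical
node type; only the pg_node_attr() placement follows the patch):

typedef struct FooExpr
{
	Expr		xpr;
	int			fookind;	/* copied/compared/printed by the generated code as usual */
	int			location pg_node_attr(equal_ignore);	/* skipped by the generated equal function */
} FooExpr;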
---
 src/backend/Makefile  |   8 +-
 src/backend/nodes/.gitignore  |   3 +
 src/backend/nodes/Makefile|  46 ++
 src/backend/nodes/copyfuncs.c |  19 +-
 src/backend/nodes/equalfuncs.c|  22 +-
 src/backend/nodes/gen_node_support.pl | 660 ++
 src/backend/nodes/outfuncs.c  |  30 +-
 src/backend/nodes/readfuncs.c |  23 +-
 src/include/nodes/.gitignore  |   2 +
 src/include/nodes/nodes.h |   8 +
 src/include/nodes/parsenodes.h|   2 +-
 src/include/nodes/pathnodes.h | 136 +++---
 src/include/nodes/plannodes.h |  90 ++--
 src/include/nodes/primnodes.h |  20 +-
 src/include/pg_config_manual.h|   4 +-
 src/include/utils/rel.h   |   6 +-
 src/tools/msvc/Solution.pm|  46 ++
 17 files changed, 977 insertions(+), 148 deletions(-)
 create mode 100644 src/backend/nodes/.gitignore
 create mode 100644 src/backend/nodes/gen_node_support.pl
 create mode 100644 src/include/nodes/.gitignore

diff --git a/src/backend/Makefile b/src/backend/Makefile
index add9560be4..f9ce7cefcc 100644
--- a/src/backend/Makefile
+++ b/src/backend/Makefile
@@ -143,11 +143,15 @@ storage/lmgr/lwlocknames.h: 
storage/lmgr/generate-lwlocknames.pl storage/lmgr/lw
 submake-catalog-headers:
$(MAKE) -C catalog distprep generated-header-symlinks
 
+# run this unconditionally to avoid needing to know its dependencies here:
+submake-nodes-headers:
+   $(MAKE) -C nodes distprep generated-header-symlinks
+
 # run this unconditionally to avoid needing to know its dependencies here:
 submake-utils-headers:
$(MAKE) -C utils distprep generated-header-symlinks
 
-.PHONY: submake-catalog-headers submake-utils-headers
+.PHONY: 

Re: Skipping logical replication transactions on subscriber side

2022-01-24 Thread David G. Johnston
On Monday, January 24, 2022, Amit Kapila  wrote:

> On Mon, Jan 24, 2022 at 1:30 PM David G. Johnston
>  wrote:
> >
> > That said, at present my two dislikes:
> >
> > 1) ALTER SYSTEM SKIP accepts any xid value (I need to consider further
> > the timing of when this resets to zero)
> >
>
> I think this is required for future extension of this feature, wherein
> there could be multiple such xids, say when we support parallel
> apply workers. I think we can find a good way to do it even after the
> first version, for example by making the xid an optional parameter.
>
>
Extending the behavior is doable, and maybe we end up without this
limitation in the future; so be it.  But I’m having a hard time imagining a
scenario where the xid is not already known to the system and the user,
and where the setting needs to be in effect for only a very short window.

David J.


Re: makefiles writing to $@ should first write to $@.new

2022-01-24 Thread Tom Lane
Julien Rouhaud  writes:
> On Mon, Jan 24, 2022 at 12:41:49PM +0900, Michael Paquier wrote:
>> Honestly, I am not sure that this worth bothering about.  This comes
>> down to a balance between the code complexity and the likelihood of a
>> failure, and the odds are not in favor of the later IMO.  Now, it
>> could be perhaps possible to make such a change simple enough while it
>> avoids a lot of technical debt, but we have a lot of custom rules
>> particularly in src/bin/, so changing all that or even require that in
>> future changes is not really appealing.

> I agree, it doesn't seem worth it.

Agreed.  Another reason to not bother about this is the likelihood
that it'd all be wasted effort as soon as we switch to meson.
If that project fails, I'd be open to revisiting this issue;
but I don't think we should spend time improving the Makefiles
right now.

regards, tom lane




Re: refactoring basebackup.c

2022-01-24 Thread Dipesh Pandit
Hi,

> Here is a more detailed review.

Thanks for the feedback, I have incorporated the suggestions
and updated a new version of the patch (v3-0001).

The required documentation changes are also incorporated in
updated patch (v3-0001).

> Interesting approach. This unfortunately has the effect of making that
> test case file look a bit incoherent -- the comment at the top of the
> file isn't really accurate any more, for example, and the plain_format
> flag does more than just cause us to use -Fp; it also causes us NOT to
> use --target server:X. However, that might be something we can figure
> out a way to clean up. Alternatively, we could have a new test case
> file that is structured like 002_algorithm.pl but looping over
> compression methods rather than checksum algorithms, and testing each
> one with --server-compress and -Fp. It might be easier to make that
> look nice (but I'm not 100% sure).

Added a new test case file "009_extract.pl" to test server-compressed
plain-format backup (v3-0002).

> I committed the base backup target patch yesterday, and today I
> updated the remaining code in light of Michael Paquier's commit
> 5c649fe153367cdab278738ee4aebbfd158e0546. Here is the resulting patch.

v13 patch does not apply on the latest head, it requires a rebase. I have
applied
it on commit dc43fc9b3aa3e0fa9c84faddad6d301813580f88 to validate gzip
decompression patches.

Thanks,
Dipesh
From 9ec2efcc908e988409cd9ba19ea64a50012163a2 Mon Sep 17 00:00:00 2001
From: Dipesh Pandit 
Date: Mon, 24 Jan 2022 15:28:48 +0530
Subject: [PATCH 1/2] Support for extracting gzip compressed archive

pg_basebackup can support server-side compression using gzip. In
order to support plain-format backup with option '-Fp' we need to
add support for decompressing the compressed blocks at the client. This
patch addresses the extraction of gzip-compressed blocks at the client.
---
 doc/src/sgml/ref/pg_basebackup.sgml |   8 +-
 src/bin/pg_basebackup/Makefile  |   1 +
 src/bin/pg_basebackup/bbstreamer.h  |   1 +
 src/bin/pg_basebackup/bbstreamer_file.c | 182 
 src/bin/pg_basebackup/bbstreamer_gzip.c | 376 
 src/bin/pg_basebackup/pg_basebackup.c   |  19 +-
 6 files changed, 401 insertions(+), 186 deletions(-)
 create mode 100644 src/bin/pg_basebackup/bbstreamer_gzip.c

diff --git a/doc/src/sgml/ref/pg_basebackup.sgml b/doc/src/sgml/ref/pg_basebackup.sgml
index 1d0df34..19849be 100644
--- a/doc/src/sgml/ref/pg_basebackup.sgml
+++ b/doc/src/sgml/ref/pg_basebackup.sgml
@@ -428,8 +428,12 @@ PostgreSQL documentation


 When the tar format is used, the suffix .gz will
-automatically be added to all tar filenames. Compression is not
-available in plain format.
+automatically be added to all tar filenames.
+   
+   
+Server compression can be specified with plain format backup. It
+enables compression of the archive at server and extract the
+compressed archive at client.

   
  
diff --git a/src/bin/pg_basebackup/Makefile b/src/bin/pg_basebackup/Makefile
index 5b18851..78d96c6 100644
--- a/src/bin/pg_basebackup/Makefile
+++ b/src/bin/pg_basebackup/Makefile
@@ -38,6 +38,7 @@ OBJS = \
 BBOBJS = \
 	pg_basebackup.o \
 	bbstreamer_file.o \
+	bbstreamer_gzip.o \
 	bbstreamer_inject.o \
 	bbstreamer_tar.o
 
diff --git a/src/bin/pg_basebackup/bbstreamer.h b/src/bin/pg_basebackup/bbstreamer.h
index fc88b50..270b0df 100644
--- a/src/bin/pg_basebackup/bbstreamer.h
+++ b/src/bin/pg_basebackup/bbstreamer.h
@@ -205,6 +205,7 @@ extern bbstreamer *bbstreamer_extractor_new(const char *basepath,
 			const char *(*link_map) (const char *),
 			void (*report_output_file) (const char *));
 
+extern bbstreamer *bbstreamer_gzip_extractor_new(bbstreamer *next);
 extern bbstreamer *bbstreamer_tar_parser_new(bbstreamer *next);
 extern bbstreamer *bbstreamer_tar_terminator_new(bbstreamer *next);
 extern bbstreamer *bbstreamer_tar_archiver_new(bbstreamer *next);
diff --git a/src/bin/pg_basebackup/bbstreamer_file.c b/src/bin/pg_basebackup/bbstreamer_file.c
index 77ca222..d721f87 100644
--- a/src/bin/pg_basebackup/bbstreamer_file.c
+++ b/src/bin/pg_basebackup/bbstreamer_file.c
@@ -11,10 +11,6 @@
 
 #include "postgres_fe.h"
 
-#ifdef HAVE_LIBZ
-#include <zlib.h>
-#endif
-
 #include <unistd.h>
 
 #include "bbstreamer.h"
@@ -30,15 +26,6 @@ typedef struct bbstreamer_plain_writer
 	bool		should_close_file;
 } bbstreamer_plain_writer;
 
-#ifdef HAVE_LIBZ
-typedef struct bbstreamer_gzip_writer
-{
-	bbstreamer	base;
-	char	   *pathname;
-	gzFile		gzfile;
-}			bbstreamer_gzip_writer;
-#endif
-
 typedef struct bbstreamer_extractor
 {
 	bbstreamer	base;
@@ -62,22 +49,6 @@ const bbstreamer_ops bbstreamer_plain_writer_ops = {
 	.free = bbstreamer_plain_writer_free
 };
 
-#ifdef HAVE_LIBZ
-static void bbstreamer_gzip_writer_content(bbstreamer *streamer,
-		   bbstreamer_member *member,
-		   const char *data, int 

Re: PSA: Autoconf has risen from the dead

2022-01-24 Thread Tom Lane
Peter Eisentraut  writes:
> I have patches ready for this at 
> https://github.com/petere/postgresql/tree/autoconf-updates.
> My thinking was to wait until Autoconf 2.71 has trickled down into the 
> OS versions that developers are likely to use.

I find that kind of irrelevant, because we expect people to install
autoconf from source anyway to avoid distro-specific behavior.
I suppose that waiting for it to get out into the wild might be good
from the standpoint of being sure it's bug-free, though.

Do these versions fix any bugs that affect us (i.e., that we've
not already created workarounds for)?

regards, tom lane




Re: Skipping logical replication transactions on subscriber side

2022-01-24 Thread Peter Eisentraut

On 22.01.22 10:41, Amit Kapila wrote:

Additionally, the description for pg_stat_subscription_workers should describe what 
happens once the transaction represented by last_error_xid has either been successfully 
processed or skipped.  Does this "last error" stick around until another error 
happens (which is hopefully very rare) or does it reset to blanks?


It will be reset only on subscription drop, otherwise, it will stick
around until another error happens.


Is this going to be a problem with transaction ID wraparound?  Do we 
need to use 64-bit xids for this?





Re: Skipping logical replication transactions on subscriber side

2022-01-24 Thread Peter Eisentraut

On 22.01.22 03:54, Amit Kapila wrote:

Won't we already do that for Alter Subscription command which means
nothing special needs to be done for this? However, it seems to me
that the idea we are trying to follow here is that as this option can
lead to data inconsistency, it is good to allow only superusers to
specify this option. The owner of the subscription can be changed to
non-superuser as well in which case I think it won't be a good idea to
allow this option. OTOH, if we think it is okay to allow such an
option to users that don't have superuser privilege then I think
allowing it to the owner of the subscription makes sense to me.


I don't think this functionality allows a nonprivileged user to do 
anything they couldn't otherwise do.  You can create inconsistent data 
in the sense that you can choose not to apply certain replicated data. 
But a subscription owner has to have write access to the target tables 
of the subscription, so they already have the ability to write or not 
write any data they want.






Re: Fix BUG #17335: Duplicate result rows in Gather node

2022-01-24 Thread Yura Sokolov
On Sun, 23/01/2022 at 14:56 +0300, Yura Sokolov wrote:
> On Thu, 20/01/2022 at 09:32 +1300, David Rowley wrote:
> > On Fri, 31 Dec 2021 at 00:14, Yura Sokolov  wrote:
> > > Suggested quick (and valid) fix in the patch attached:
> > > - If Append has single child, then copy its parallel awareness.
> > 
> > I've been looking at this and I've gone through changing my mind about
> > what's the right fix quite a number of times.
> > 
> > My current thoughts are that I don't really like the fact that we can
> > have plans in the following shape:
> > 
> >  Finalize Aggregate
> >->  Gather
> >  Workers Planned: 1
> >  ->  Partial Aggregate
> >->  Parallel Hash Left Join
> >  Hash Cond: (gather_append_1.fk = gather_append_2.fk)
> >  ->  Index Scan using gather_append_1_ix on 
> > gather_append_1
> >Index Cond: (f = true)
> >  ->  Parallel Hash
> >->  Parallel Seq Scan on gather_append_2
> > 
> > It's only made safe by the fact that Gather will only use 1 worker.
> > To me, it just seems too fragile to assume that's always going to be
> > the case. I feel like this fix just relies on the fact that
> > create_gather_path() and create_gather_merge_path() do
> > "pathnode->num_workers = subpath->parallel_workers;". If someone
> > decided that was to work a different way, then we risk this breaking
> > again. Additionally, today we have Gather and GatherMerge, but we may
> > one day end up with more node types that gather results from parallel
> > workers, or even a completely different way of executing plans.
> 
> It seems strange that the parallel_aware and parallel_safe flags neither
> affect execution nor are properly checked.
> 
> Except parallel_safe is checked in ExecSerializePlan which is called from
> ExecInitParallelPlan, which is called from ExecGather and ExecGatherMerge.
> But looks like this check doesn't affect execution as well.
> 
> > I think a safer way to fix this is to just not remove the
> > Append/MergeAppend node if the parallel_aware flag of the only-child
> > and the Append/MergeAppend don't match. I've done that in the
> > attached.
> > 
> > I believe the code at the end of add_paths_to_append_rel() can remain as is.
> 
> I found that clean_up_removed_plan_level is also called from
> set_subqueryscan_references. Is there a need to patch there as well?
> 
> And there is a strange state:
> - in the loop over subpaths, pathnode->node.parallel_safe is set to the AND
>   of all its subpaths' parallel_safe flags
>   (therefore it was necessary to copy it in my patch version),
> - that means our AppendPath can be parallel_aware but not parallel_safe.
> That is a bit ridiculous.
> 
> And it is strange that an AppendPath could have more parallel_workers than
> the sum of its children's parallel_workers.
> 
> So it looks like the whole machinery around parallel_aware/parallel_safe is
> not consistent enough.
> 
> Either way, I attach your version of the fix with my tests as the new patch version.

Looks like the volatile "Memory Usage:" output in EXPLAIN breaks 'make check'
sporadically.

I applied a replacement in the style of the memoize.sql test.

Why is there no way to disable the "Buckets: %d Buffers: %d Memory Usage: %dkB"
output in show_hash_info?

regards,
Yura Sokolov
From 9ed2139495b2026433ff5e7c4092fcfa8f10e4d1 Mon Sep 17 00:00:00 2001
From: Yura Sokolov 
Date: Sun, 23 Jan 2022 14:53:21 +0300
Subject: [PATCH v3] Fix duplicate result rows after Append path removal.

It can happen that an Append path is created with the "parallel_aware" flag
set, but its single child is not parallel_aware. The Append path's parent
(Gather or Gather Merge) thinks its child is parallel_aware, but after
Append path removal the Gather's child is no longer parallel_aware. Then,
when Gather/Gather Merge decides to run the child with several workers or
with worker + leader participation, it gathers duplicate result rows from
the several child path invocations.

To fix this, don't remove the Append/MergeAppend node if its parallel_aware
flag does not match that of its single child.

Authors: David Rowley, Sokolov Yura.
---
 src/backend/optimizer/plan/setrefs.c  |  24 ++-
 .../expected/gather_removed_append.out| 154 ++
 src/test/regress/parallel_schedule|   1 +
 .../regress/sql/gather_removed_append.sql | 102 
 4 files changed, 277 insertions(+), 4 deletions(-)
 create mode 100644 src/test/regress/expected/gather_removed_append.out
 create mode 100644 src/test/regress/sql/gather_removed_append.sql

diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index e44ae971b4b..a7b11b7f03a 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -1512,8 +1512,16 @@ set_append_references(PlannerInfo *root,
 		lfirst(l) = set_plan_refs(root, (Plan *) lfirst(l), rtoffset);
 	}
 
-	/* Now, if there's just one, forget the Append and return that child */
-	if (list_length(aplan->appendplans) == 1)
+	/*
+	 * See if it's 

Re: missing indexes in indexlist with partitioned tables

2022-01-24 Thread Arne Roland
Hi!

> From: Zhihong Yu 
> Subject: Re: missing indexes in indexlist with partitioned tables
>
> Hi,
>
> -   if (indexRelation->rd_rel->relkind == RELKIND_PARTITIONED_INDEX)
> +   if (inhparent && (!index->indisunique || 
> indexRelation->rd_rel->relkind != RELKIND_PARTITIONED_INDEX))
>
> The check on RELKIND_PARTITIONED_INDEX seems to negate what the comment above 
> says:
>
> +* Don't add partitioned indexes to the indexlist
>
> Cheers

The comment at my end goes on:


/*
 * Don't add partitioned indexes to the indexlist, since they are
 * not usable by the executor. If they are unique add them to the
 * partindexlist instead, to use for further pruning. If they
 * aren't that either, simply skip them.
 */

Regarding the structure: I think that we should probably remove the first two 
sentences here. They reoccur 50 lines below anyway, which seems a dubious 
practice. The logic that enforces the first two sentences is mainly down below, 
so that place is probably the one to keep.

Regarding the semantics: This is sort of what the statement checks for (skip 
for inhparent, if not unique or not a partitioned index), i.e. it checks for 
the case where the index shouldn't be added to either list.

Side note: I personally think the name inhparent is mildly confusing, since 
it's not really about inheritance. I don't have a significantly better idea 
though.

From: Alvaro Herrera 
Sent: Wednesday, January 19, 2022 23:26
> Ah, apologies, I didn't realize that that test was so new.

No offense taken. Unless one was involved in the creation of the corresponding 
patch, it's unreasonable to know that. I like the second part of your message 
very much:

> See src/tools/ci/README (for multi-platform testing of patches on
> several platforms) and http://commitfest.cputube.org/

Thanks, I didn't know of cputube. Neat! That's pretty much what I was looking 
for!
Is there a way to get an email notification if some machine fails (turns bright 
red)? For the threads I'm explicitly subscribed to, that would seem helpful to 
me.

Regards
Arne


