Re: Use FD_CLOEXEC on ListenSockets (was Re: Refactoring backend fork+exec code)

2023-10-05 Thread Michael Paquier
On Thu, Oct 05, 2023 at 03:08:37PM +0300, Heikki Linnakangas wrote:
> This seems pretty uncontroversial, and I heard no objections, so I went
> ahead and committed that.

It looks like e29c4643951 is causing issues here.  While doing
benchmarking on a cluster compiled with -O2, I got a crash:
LOG:  system logger process (PID 27924) was terminated by signal 11: Segmentation fault

Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x55ef3b9aed20 in pfree ()
(gdb) bt
#0  0x55ef3b9aed20 in pfree ()
#1  0x55ef3b7e0e41 in ClosePostmasterPorts ()
#2  0x55ef3b7e6649 in SysLogger_Start ()
#3  0x55ef3b7e4413 in PostmasterMain () 

Okay, the backtrace is not that useful.  I'll see if I can get
something better; still, it seems like this has broken the way the
syslogger closes these ports.
--
Michael


signature.asc
Description: PGP signature


Re: Synchronizing slots from primary to standby

2023-10-05 Thread Peter Smith
Hi Ajin. Thanks for addressing my previous review comments from v19.

I checked all the changes. Below are a few follow-up remarks.

On Thu, Oct 5, 2023 at 7:54 PM Ajin Cherian  wrote:
>
> On Wed, Sep 27, 2023 at 2:37 PM Peter Smith  wrote:
> >
> > Here are some more review comments for the patch v19-0002.

> > 3. get_local_synced_slot_names
> >
> > + for (int i = 0; i < max_replication_slots; i++)
> > + {
> > + ReplicationSlot *s = &ReplicationSlotCtl->replication_slots[i];
> > +
> > + /* Check if it is logical synchronized slot */
> > + if (s->in_use && SlotIsLogical(s) && s->data.synced)
> > + {
> > + for (int j = 0; j < MySlotSyncWorker->dbcount; j++)
> > + {
> >
> > Loop variables are not declared in the common PG code way.
> >
>
> fixed.

Yes, new declarations were added, but some of them (e.g. 'j') could
have been declared at a lower scope closer to where they are being
used.
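
To illustrate, here is the shape I had in mind (a sketch based on the
v19 hunk quoted above, not your actual v21 code):

	for (int i = 0; i < max_replication_slots; i++)
	{
		ReplicationSlot *s = &ReplicationSlotCtl->replication_slots[i];

		if (s->in_use && SlotIsLogical(s) && s->data.synced)
		{
			/* 'j' is only needed inside this block, so declare it here */
			for (int j = 0; j < MySlotSyncWorker->dbcount; j++)
			{
				/* per-database work */
			}
		}
	}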

> > 5. use_slot_in_query
> >
> > +static bool
> > +use_slot_in_query(char *slot_name, Oid *dbids)
> >
> > There are multiple non-standard for-loop variable declarations in this 
> > function.
> >
>
> fixed.

Yes, new declarations were added, but some of them (e.g. 'j') could
have been declared at a lower scope closer to where they are being
used.

> > 11. get_remote_invalidation_cause
> >
> > +/*
> > + * Get Remote Slot's invalidation cause.
> > + *
> > + * This gets invalidation cause of remote slot.
> > + */
> > +static ReplicationSlotInvalidationCause
> > +get_remote_invalidation_cause(WalReceiverConn *wrconn, char *slot_name)
> > +{
> >
> > Isn't that function comment just repeating itself?
> >
>
> Fixed.

/remote slot./the remote slot./

> > 27.
> > + /* We are done, free remot_slot_list elements */
> > + foreach(cell, remote_slot_list)
> > + {
> > + RemoteSlot *remote_slot = (RemoteSlot *) lfirst(cell);
> > +
> > + pfree(remote_slot);
> > + }
> >
> > 27a.
> > /remot_slot_list/remote_slot_list/
> >
>
> Fixed.
>
> > ~
> >
> > 27b.
> > Isn't this just the same as the one-liner:
> >
> > list_free_deep(remote_slot_list);

It looks like the #27b comment was accidentally missed (??)

> > 29. remote_connect
> >
> > +/*
> > + * Connect to remote (primary) server.
> > + *
> > + * This uses primary_conninfo in order to connect to primary. For slot-sync
> > + * to work, primary_conninfo is expected to have dbname as well.
> > + */
> > +static WalReceiverConn *
> > +remote_connect()
> >
> > 29a.
> > I felt it might be more helpful to say "GUC primary_conninfo" instead
> > of just 'primary_conninfo' the first time this is mentioned.
> >
>
> fixed.

The changed v21 comment now refers to "GUC PrimaryConnInfo" but I
think that is wrong. The GUC is really called "primary_conninfo" ---
PrimaryConnInfo is just the name of the static variable in the code.

==
Kind Regards,
Peter Smith.
Fujitsu Australia




Re: Opportunistically pruning page before update

2023-10-05 Thread Dilip Kumar
On Thu, Oct 5, 2023 at 2:35 AM James Coleman  wrote:
>
> I talked to Andres and Peter again today, and out of that conversation
> I have some observations and ideas for future improvements.
>
> 1. The most trivial case where this is useful is INSERT: we have a
> target page, and it may have dead tuples, so trying to prune may
> result in us being able to use the target page rather than getting a
> new page.
> 2. The next most trivial case is where UPDATE (potentially after
> failing to find space for a HOT tuple on the source tuple's page);
> much like the INSERT case our backend's target page may benefit from
> pruning.

Looking at the patch, I believe v2-0003 implements these 2 ideas.  So
my question is: are we planning to prune only the backend's current
target page, or, if we cannot find space there, do we also intend to
prune the other candidate pages we get from the FSM?  In the patch you
have put the check inside a loop, so it will try to prune every page
it gets from the FSM, not just the backend's current target page.  I
just wanted to understand whether this is intentional.
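
To make sure we are talking about the same thing, the "current target
page only" variant I had in mind looks roughly like this (a hand-written
sketch using existing heapam calls, not code taken from v2-0003):

	/*
	 * Sketch only: before falling back to the FSM for another page,
	 * opportunistically prune the backend's existing target page and
	 * re-check its free space.  heap_page_prune_opt() needs a pin on
	 * the buffer but no lock.
	 */
	buffer = ReadBuffer(relation, targetBlock);
	heap_page_prune_opt(relation, buffer);

	LockBuffer(buffer, BUFFER_LOCK_EXCLUSIVE);
	if (PageGetHeapFreeSpace(BufferGetPage(buffer)) >= len)
		return buffer;			/* target page is usable after all */

	LockBuffer(buffer, BUFFER_LOCK_UNLOCK);
	ReleaseBuffer(buffer);
	/* ... only now consult the FSM for other candidate pages ... */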

In general, all 4 ideas look promising.

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com




Re: pg_get_indexdef() modification to use TxnSnapshot

2023-10-05 Thread Tom Lane
kuroda.keis...@nttcom.co.jp writes:
> On 2023-06-14 15:31, vignesh C wrote:
>> I have attempted to convert pg_get_indexdef() to use
>> systable_beginscan() based on transaction-snapshot rather than using
>> SearchSysCache().

Has anybody thought about the fact that ALTER TABLE ALTER TYPE
(specifically RememberIndexForRebuilding) absolutely requires seeing
the latest version of the index's definition?

>>> it is going to be a large refactoring and could potentially make the
>>> future implementations such as pg_get_tabledef() etc. hard. Have you
>>> considered changes to the SearchSysCache() family so that they
>>> internally use a transaction snapshot that is registered in advance?

A very significant fraction of other SearchSysCache callers likewise
cannot afford to see stale data.  We might be able to fix things so
that the SQL-accessible ruleutils functionality works differently, but
we can't just up and change the behavior of cache lookups everywhere.

regards, tom lane




Re: pg_get_indexdef() modification to use TxnSnapshot

2023-10-05 Thread kuroda . keisuke

Hi hackers,

With the '0001-pg_get_indexdef-modification-to-access-catalog-based.patch'
patch,

I confirmed that the index definition information can still be
collected even if the index is dropped during pg_dump.
The regression tests (make check-world) pass.

I also tested view definitions for a similar problem.
As per the attached patch and test case, by using SetupHistoricSnapshot
for pg_get_viewdef() as well, the definition information can likewise
be collected even if the view is dropped during pg_dump.
The regression tests (make check-world) pass,
and pg_dump's SERIALIZABLE results are improved.
However, in a SERIALIZABLE transaction, actually trying to access the
dropped view still causes an ERROR, so it may cause confusion.
I think the scope of this improvement should be limited
to the pg_get_*** functions at the moment.

---

# test2 pg_get_indexdef,pg_get_viewdef

## create table,index,view
drop table test1;
create table test1(id int);
create index idx1_test1 on test1(id);
create view view1_test1 as select * from test1;

## begin Transaction-A
begin transaction isolation level serializable;
select pg_current_xact_id();

## begin Transaction-B
begin transaction isolation level serializable;
select pg_current_xact_id();

## drop index,view in Transaction-A
drop index idx1_test1;
drop view view1_test1;
commit;

## SELECT pg_get_indexdef,pg_get_viewdef in Transaction-B
SELECT pg_get_indexdef(oid) FROM pg_class WHERE relname = 'idx1_test1';
SELECT pg_get_viewdef(oid) FROM pg_class WHERE relname = 'view1_test1';

## correct info from index and view
postgres=*# SELECT pg_get_indexdef(oid) FROM pg_class WHERE relname = 
'idx1_test1';

                      pg_get_indexdef
-----------------------------------------------------------
 CREATE INDEX idx1_test1 ON public.test1 USING btree (id)
(1 row)

postgres=*# SELECT pg_get_viewdef(oid) FROM pg_class WHERE relname = 
'view1_test1';

 pg_get_viewdef
----------------
  SELECT id    +
     FROM test1;
(1 row)

## However, SELECT * FROM view1_test1 cause ERROR because view does not 
exist

postgres=*# SELECT * FROM view1_test1;
ERROR:  relation "view1_test1" does not exist
LINE 1: SELECT * FROM view1_test1;

Best Regards,
Keisuke Kuroda
NTT COMWARE

On 2023-06-14 15:31, vignesh C wrote:
On Tue, 13 Jun 2023 at 11:49, Masahiko Sawada  
wrote:


On Fri, May 26, 2023 at 6:55 PM shveta malik  
wrote:

>
> I have attempted to convert pg_get_indexdef() to use
> systable_beginscan() based on transaction-snapshot rather than using
> SearchSysCache(). The latter does not have any old info and thus
> provides only the latest info as per the committed txns, which could
> result in errors in some scenarios. One such case is mentioned atop
> pg_dump.c. The patch is an attempt to fix the same pg_dump's issue.
> Any feedback is welcome.

Even only in pg_get_indexdef_worker(), there are still many places
where we use syscache lookup: generate_relation_name(),
get_relation_name(), get_attname(), get_atttypetypmodcoll(),
get_opclass_name() etc.

>
> There is a long list of pg_get_* functions which use SearchSysCache()
> and thus may expose similar issues. I can give it a try to review the
> possibility of converting all of them. Thoughts?
>

it is going to be a large refactoring and could potentially make the
future implementations such as pg_get_tabledef() etc. hard. Have you
considered changes to the SearchSysCache() family so that they
internally use a transaction snapshot that is registered in advance?
Since we are already doing similar things for catalog lookup in
logical decoding, it might be feasible. That way, the changes to
pg_get_XXXdef() functions would be much easier.


I feel registering an active snapshot before accessing the system
catalog as suggested by Sawada-san will be a better fix to solve the
problem. If this approach is fine, we will have to similarly fix the
other pg_get_*** functions like pg_get_constraintdef,
pg_get_function_arguments, pg_get_function_result,
pg_get_function_identity_arguments, pg_get_function_sqlbody,
pg_get_expr, pg_get_partkeydef, pg_get_statisticsobjdef and
pg_get_triggerdef.
The attached patch has the changes for the same.

Regards,
Vignesh

diff --git a/src/backend/utils/adt/ruleutils.c b/src/backend/utils/adt/ruleutils.c
index 8d5eac4791..c95287104e 100644
--- a/src/backend/utils/adt/ruleutils.c
+++ b/src/backend/utils/adt/ruleutils.c
@@ -657,7 +657,14 @@ pg_get_viewdef(PG_FUNCTION_ARGS)

prettyFlags = PRETTYFLAG_INDENT;

+   /*
+* Setup the snapshot to get the catalog information using this snapshot
+* which will prevent accessing a future catalog data which has occurred
+* due to concurrent catalog change.
+*/
+   SetupHistoricSnapshot(GetActiveSnapshot(), NULL);
res = pg_get_viewdef_worker(viewoid, prettyFlags, WRAP_COLUMN_DEFAULT);
+   TeardownHistoricSnapshot(false);

if (res == NULL)
PG_RETURN_NULL();
@@ -1154,10 +1161,17 

Re: Rethink the wait event names for postgres_fdw, dblink and etc

2023-10-05 Thread Masahiro Ikeda

On 2023-10-05 10:28, Michael Paquier wrote:

On Wed, Oct 04, 2023 at 05:19:40PM +0900, Michael Paquier wrote:

I am lacking a bit of time now, but I have applied the bits for
test_shm_mq and worker_spi.  Note that I have not added tests for
test_shm_mq as it may be possible that the two events (for the
bgworker startup and for a message to be queued) are never reached
depending on the timing.  I'll handle the rest tomorrow, with likely
some adjustments to the tests.  (I may as well just remove them, this
API is already covered by worker_spi.)


After sleeping on it, I've taken the decision to remove the tests.  As
far as I have tested, this was stable, but this does not really
improve the test coverage as WaitEventExtensionNew() is covered in
worker_spi.  I have done tweaks to the docs and the variable names,
and applied that into its own commit.

Note as well that the docs of dblink were wrong for DblinkGetConnect:
the wait event could be seen in other functions than dblink() and
dblink_exec().


Thanks for modifying and committing. I agree with your comments.

Regards,
--
Masahiro Ikeda
NTT DATA CORPORATION




Re: Remove distprep

2023-10-05 Thread Michael Paquier
On Thu, Oct 05, 2023 at 05:46:46PM +0200, Peter Eisentraut wrote:
> Ok, I think I found a better way to address this.  It requires keeping a
> subset of the old distprep target in src/backend/Makefile for use by nls.mk.
> I have checked that the above sequence now works, and that the generated
> .pot files are identical to before this patch.

generated-parser-sources looks like an elegant split.

> (Not sure if it would be good to commit it that way, but it's easier to look
> at for now for sure.)

Not sure.  I'd be OK if the patch set is committed into two pieces as
well.  I guess that's up to how you feel about that at the end ;)

While looking at the last references of the distribution tarball, this
came out:
# This allows removing some files from the distribution tarballs while
# keeping the dependencies satisfied.
.SECONDARY: $(GENERATED_SGML)
.SECONDARY: postgres-full.xml
.SECONDARY: INSTALL.html INSTALL.xml
.SECONDARY: postgres-A4.fo postgres-US.fo

That's not really something for this patch, but I got to ask.  What's
the final plan for the documentation when it comes to releases?  A
second tarball separated from the git-only tarball that includes all
that and the HTML docs, generated with a new "doc-only" set of meson
commands?
--
Michael


signature.asc
Description: PGP signature


Re: Move bki file pre-processing from initdb to bootstrap

2023-10-05 Thread Krishnakumar R
Thank you, Peter, Andres and Tom for your comments and thoughts.

Hi Peter,

> For example, for NAMEDATALEN, SIZEOF_POINTER, ALIGNOF_POINTER,
> FLOAT8PASSBYVAL, these are known at build time, so we could have
> genbki.pl substitute them at build time.

I have modified the patch to use genbki to generate these ones during
build time.

> The locale-related symbols (ENCODING, LC_COLLATE, etc.), I wonder
> whether we can eliminate the need for them.  Right now, these are only
> used in the bki entry for the template1 database.  How about initdb
> creates template0 first, with hardcoded default encoding, collation,
> etc., and then creates template1 from that, using the normal CREATE
> DATABASE command with the appropriate options.  Or initdb could just run
> an UPDATE on pg_database to put the right settings in place.

Using a combination of this suggestion and the discussions Andres
pointed to in this thread, I updated the patch to add placeholder
values first into template1 and then do UPDATEs in initdb itself.

> You should use an option letter that isn't already in use in one of the
> other modes of "postgres".  We try to keep those consistent.
>
> New options should be added to the --help output (usage() in main.c).

Used a -b option under bootstrap mode and added help.

> elog(INFO, "Open bki file %s\n", bki_file);
> +   boot_yyin = fopen(bki_file, "r");
>
> Why is this needed?  It already reads the bki file from stdin?

We no longer open the bki file in initdb and pass it to postgres to
parse from stdin; instead we open the bki file directly in bootstrap
and pass the file stream to the parser.  Hence the need to switch
yyin.  I have added a comment in the commit log to capture this.

The version comparison has been moved from initdb to bootstrap.  This
created some compatibility problems with the Windows tests.  For now I
kept the version check without the \n appended, which works fine and
serves the purpose.  However, I hope to have something better in v3,
in addition to addressing any other comments.

Please let me know your thoughts and review comments.

--
Thanks and Regards,
Krishnakumar (KK).
[Microsoft]

On Tue, Sep 19, 2023 at 3:18 AM Peter Eisentraut  wrote:
>
> On 01.09.23 10:01, Krishnakumar R wrote:
> > This patch moves the pre-processing for tokens in the bki file from
> > initdb to bootstrap. With these changes the bki file will only be
> > opened once in bootstrap and parsing will be done by the bootstrap
> > parser.
>
> I did some rough performance tests on this.  I get about a 10%
> improvement on initdb run time, so this appears to have merit.
>
> I wonder whether we can reduce the number of symbols that we need this
> treatment for.
>
> For example, for NAMEDATALEN, SIZEOF_POINTER, ALIGNOF_POINTER,
> FLOAT8PASSBYVAL, these are known at build time, so we could have
> genbki.pl substitute them at build time.
>
> The locale-related symbols (ENCODING, LC_COLLATE, etc.), I wonder
> whether we can eliminate the need for them.  Right now, these are only
> used in the bki entry for the template1 database.  How about initdb
> creates template0 first, with hardcoded default encoding, collation,
> etc., and then creates template1 from that, using the normal CREATE
> DATABASE command with the appropriate options.  Or initdb could just run
> an UPDATE on pg_database to put the right settings in place.
>
> I don't like this part so much, because it adds like 4 more places each
> of these variables is mentioned, which increases the mental and testing
> overhead for dealing with these features (which are an area of active
> development).
>
> In general, it would be good if this could be factored a bit more so
> each variable doesn't have to be hardcoded in so many places.
>
>
> Some more detailed comments on the code:
>
> +   boot_yylval.str = pstrdup(yytext);
> +   sprintf(boot_yylval.str, "%d", NAMEDATALEN);
>
> This is weird.  You are first assigning the text and then overwriting it
> with the numeric value?
>
> You can also use boot_yylval.ival for storing numbers.
>
> +   if (bootp_null(ebootp, ebootp->username)) return
> NULLVAL;
>
> Add proper line breaks in the code.
>
> +bool bootp_null(extra_bootstrap_params *e, char *s)
>
> Add a comment what this function is supposed to do.
>
> This function could be static.
>
> +   while ((flag = getopt(argc, argv, "B:c:d:D:Fi:kr:X:-:")) != -1)
>
> You should use an option letter that isn't already in use in one of the
> other modes of "postgres".  We try to keep those consistent.
>
> New options should be added to the --help output (usage() in main.c).
>
> +   elog(INFO, "Open bki file %s\n", bki_file);
> +   boot_yyin = fopen(bki_file, "r");
>
> Why is this needed?  It already reads the bki file from stdin?
>
> +   printfPQExpBuffer(, "\"%s\" --boot -X %d %s %s %s %s -i
> %s=%s,%s=%s,%s=%s,"
> + "%s=%s,%s=%s,%s=%s,%s=%s,%s=%c",
> + backend_exec,
> + 

Re: pg16: invalid page/page verification failed

2023-10-05 Thread Michael Paquier
On Thu, Oct 05, 2023 at 11:45:18AM -0500, Justin Pryzby wrote:
> This table is what it sounds like: a partition into which CSV logs are
> COPY'ed.  It would've been created around 8am.  There's no special
> params set for the table nor for autovacuum.

This may be an important bit of information.  31966b151e6a is new as
of Postgres 16, has changed the way relations are extended and COPY
was one area touched.  I am adding Andres in CC.
--
Michael


signature.asc
Description: PGP signature


Re: Add a new BGWORKER_BYPASS_ROLELOGINCHECK flag

2023-10-05 Thread Michael Paquier
On Thu, Oct 05, 2023 at 05:51:31PM +0530, Bharath Rupireddy wrote:
> I think changing BGWORKER_BYPASS_ALLOWCONN to 0x0001 and having bits32
> for InitPostgres inputs load_session_libraries and
> override_allow_connections can be 0001 in this patch series so that it
> can go first, no? This avoids the new code being in the old format and
> changing things right after it commits.

I am not completely sure what you mean here.  We've never supported
load_session_libraries for background workers, and I'm not quite sure
that there is a case for it.  FWIW, my idea around that would be to
have two separate sets of flags: one set for the bgworkers and one set
for InitPostgres(), since load_session_libraries looks like a fuzzy
concept for bgworkers.
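
Roughly like this (a sketch only; the names and values are
illustrative, nothing is settled):

/* Flags for BackgroundWorkerInitializeConnection*() */
#define BGWORKER_BYPASS_ALLOWCONN		0x0001
#define BGWORKER_BYPASS_ROLELOGINCHECK	0x0002

/* Separate flags for InitPostgres() */
#define INIT_PG_LOAD_SESSION_LIBS		0x0001
#define INIT_PG_OVERRIDE_ALLOW_CONNS	0x0002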
--
Michael


signature.asc
Description: PGP signature


Re: False "pg_serial": apparent wraparound” in logs

2023-10-05 Thread Imseih (AWS), Sami
Correct a typo in my last message: 

Instead of:
"I added an additional condition to make sure that the tailPage
proceeds the headPage as well."

It should be:
"I added an additional condition to make sure that the tailPage
precedes the headPage as well."


Regards,

Sami

> On Oct 5, 2023, at 6:29 PM, Imseih (AWS), Sami  wrote:
> 
> I added an additional condition to make sure that the tailPage proceeds the 
> headPage
> as well.


Re: False "pg_serial": apparent wraparound” in logs

2023-10-05 Thread Imseih (AWS), Sami
> I think the smallest fix here would be to change CheckPointPredicate()
> so that if tailPage > headPage, pass headPage to SimpleLruTruncate()
> instead of tailPage. Or perhaps it should go into the "The SLRU is no
> longer needed" codepath in that case. If tailPage > headPage, the SLRU
> isn't needed at the moment.

I spent some time studying this, and it appears to be a good approach.

Passing the cutoff page as headPage (the "SLRU not needed" code path)
instead of the tailPage to SimpleLruTruncate is already done when the
tailXid is not a valid XID.
I added an additional condition to make sure that the tailPage proceeds the 
headPage
as well. 

Attached is v2 of the patch.
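
For reference, the shape of the change is roughly the following (a
simplified sketch using the existing names in predicate.c, not the
patch verbatim):

	int		tailPage = SerialPage(serialControl->tailXid);
	int		headPage = serialControl->headPage;
	int		truncateCutoffPage;

	if (SerialPagePrecedesLogically(tailPage, headPage))
		truncateCutoffPage = tailPage;	/* normal case */
	else
		truncateCutoffPage = headPage;	/* SLRU is no longer needed */

	SimpleLruTruncate(SerialSlruCtl, truncateCutoffPage);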

> In addition to that, we could change SerialAdd() to not zero out the
> pages between old headXid and tailXid unnecessarily, but that's more of
> an optimization than bug fix.

Yes, I did notice that in my debugging, but will not address this in the 
current patch.


Regards,

Sami



0001-v2-Fix-false-apparent-wraparound-detected-in-pg_serial.patch
Description: 0001-v2-Fix-false-apparent-wraparound-detected-in-pg_serial.patch


Re: [PoC/RFC] Multiple passwords, interval expirations

2023-10-05 Thread Gurjeet Singh
On Thu, Oct 5, 2023 at 1:09 PM Jeff Davis  wrote:
>
> >
> > I think this would be mighty confusing to users since it's not clear
> > that
> > adding a password will potentially invalidate a current password
> > (which
> > might be actively in use), but only if there are already 2 in the
> > stack.  I
> > worry that such a design might be too closely tailored to the
> > implementation details.  If we proceed with this design, perhaps we
> > should
> > consider ERROR-ing if a user tries to add a third password.
>
> I agree that the proposed language is confusing, especially because ADD
> causes a password to be added and another one to be removed. But
> perhaps there are some non-confusing ways to expose a similar idea.

How about a language like the following (I haven't checked whether this
will work with the grammar we have):

CREATE ROLE u1 PASSWORD 'p1';

ALTER ROLE u1 ADD NEW PASSWORD 'p2';

ALTER ROLE u1 ADD NEW PASSWORD 'p3';
ERROR: Cannot have more than 2 passwords at the same time.

ALTER ROLE u1 DROP OLD PASSWORD;

ALTER ROLE u1 ADD NEW PASSWORD 'p3';
-- succeeds; forgets password 'p1'; p2 and p3 can be used to login

ALTER ROLE u1 DROP NEW PASSWORD;
-- forgets password 'p3'. Only 'p2' can be used to login

ALTER ROLE u1 ADD NEW PASSWORD 'p4';
-- succeeds; 'p2' and 'p4' can be used to login

-- Set the valid-until of 'new' (p4) password
ALTER ROLE u1 VALID UNTIL '2024/01/01';

-- If we need the ability to change valid-until of both old and new,
we may allow something like the following.
ALTER ROLE u1 [_NEW_ | OLD] VALID UNTIL '2024/01/01';

This way there's a notion of 'new' and 'old' passwords. The user cannot
add a third password without explicitly dropping one of the existing
passwords (either old or new). At any time the user can choose to drop
the old or the new password. Adding a new password will mark the
current password as 'old'; if there's only an old password (because
'new' was dropped), the 'old' password will remain intact and the new
one will be placed in the 'current'/new spot.

So in the normal course of operation, even for automated jobs, the
expected flow to roll over the passwords would be:

ALTER USER u1 DROP OLD PASSWORD;
-- success if there is an old password; otherwise NOTICE: no old password
ALTER USER u1 ADD NEW PASSWORD 'new-secret';

> The thing I like about Gurjeet's proposal is that it's well-targeted at
> a specific use case rather than trying to be too general. That makes it
> a little easier to avoid certain problems like having a process that
> adds passwords and never removes the old ones (leading to weird
> problems like 47000 passwords for one user).
>
> But it also feels strange to be limited to two -- perhaps the password
> rotation schedule or policy just doesn't work with a limit of two, or
> perhaps that introduces new kinds of mistakes.
>
> Another idea: what if we introduce the notion of deprecating a
> password?

I'll have to think more about it, but perhaps my above proposal
addresses the use case you describe.

> if we allow multiple deprecated passwords, we'd still have to come up
> with some way to address them (names, numbers, validity period,
> something). But by isolating the problem to deprecated passwords only,
> it feels like the system is still being restored to a clean state with
> at most one single current password. The awkwardness is contained to
> old passwords which will hopefully go away soon anyway and not
> represent permanent clutter.

+1

Best regards,
Gurjeet
http://Gurje.et




Re: [PoC/RFC] Multiple passwords, interval expirations

2023-10-05 Thread Gurjeet Singh
On Thu, Oct 5, 2023 at 12:04 PM Nathan Bossart  wrote:
>
> On Wed, Oct 04, 2023 at 10:41:15PM -0700, Gurjeet Singh wrote:
> > The patches posted in this thread so far attempt to add the ability to
> > allow the user to have an arbitrary number of passwords. I believe
> > that allowing arbitrary number of passwords is not only unnecessary,
> > but the need to name passwords, the need to store them in a shared
> > catalog, etc. may actually create problems in the field. The
> > users/admins will have to choose names for passwords, which they
> > didn't have to previously. The need to name them may also lead to
> > users storing password-hints in the password names (e.g. 'mom''s
> > birthday', 'ex''s phone number', 'third password'), rendering the
> > passwords weak.
> >
> > Moreover, allowing an arbitrarily many number of passwords will
> > require us to provide additional infrastructure to solve problems like
> > observability (which passwords are currently in use, and which ones
> > have been effectively forgotten by applications), or create a nuisance
> > for admins that can create more problems than it solves.
>
> IMHO neither of these problems seems insurmountable.

Agreed.

> Besides advising
> against using hints as names, we could also automatically generate safe
> names, or even disallow user-provided names entirely.

Somehow naming passwords doesn't feel palatable to me.

> And adding
> observability for passwords seems worthwhile anyway.

Agreed.

> > So I propose that the feature should allow no more than 2 passwords
> > for a role, each with their own validity periods. This eliminates the
> > need to name passwords, because at any given time there are no more
> > than 2 passwords; current one, and old one. This also eliminates the
> > need for a shared catalog to hold passwords, because with the limit of
> > 2 imposed, we can store the old password and its validity period in
> > additional columns in the pg_authid table.
>
> Another approach could be to allow an arbitrary number of passwords but to
> also allow administrators to limit how many passwords can be associated to
> each role.  That way, we needn't restrict this feature to 2 passwords for
> everyone.  Perhaps 2 should be the default, but in any case, IMO we
> shouldn't design to only support 2.

I don't see a real use case for supporting more than 2 passwords.
Allowing an arbitrary number of passwords might look good and clean
from an aesthetics and documentation perspective (no artificially
enforced limits, as in the zero-one-infinity rule), but in the absence
of real use cases for that many passwords, I'm afraid we might end up
with a feature that creates more and worse problems for the users than
it solves.

> > In essence, we create a stack that can hold 2 passwords. Pushing an
> > element when it's full will make it forget the bottom element. Popping
> > the stack makes it forget the top element, and the only remaining
> > element, if any, becomes the top.
>
> I think this would be mighty confusing to users since it's not clear that
> adding a password will potentially invalidate a current password (which
> might be actively in use), but only if there are already 2 in the stack.

Fair point. We can aid the user by emitting a NOTICE (or a WARNING)
message whenever an old password is removed from the system because of
the addition of a new password.

> I
> worry that such a design might be too closely tailored to the
> implementation details.  If we proceed with this design, perhaps we should
> consider ERROR-ing if a user tries to add a third password.

I did not have a stack in mind when developing the use case and the
grammar, so implementation details did not drive this design. This new
design was more of a response to the manageability nightmare that I
could see the old approach leading to. When writing the email I
thought mentioning the stack analogy might make it easier to develop a
mental model. I certainly won't suggest using it in the docs as an
explanation :-)

> > -- If, for some reason, the user wants to get rid of the latest password 
> > added.
> > -- Remove 'p2' (current password). The old password (p1), will be restored 
> > to
> > -- rolpassword, along with its valid-until value.
> > ALTER ROLE u1 PASSWORD NULL;
> > -- This may come as a surprise to some users, because currently they expect 
> > the
> > -- user to completely lose the ability to use passwords for login after this
> > -- command. To get the old behavior, the user must now use the ALL PASSWORD
> > -- NULL incantation; see below.
> > -- Issuing this command one more time will remove even the restored 
> > password,
> > -- hence leaving the user with no passwords.
>
> Is it possible to remove the oldest password added without removing the
> latest password added?

In the patch I have so far, ALTER ROLE u1 ADD PASSWORD '' (empty
string) will drop the old password (what you asked for), and move the
current password to rololdpassword (which is not exactly what you
asked 

Re: Allow tests to pass in OpenSSL FIPS mode

2023-10-05 Thread Tom Lane
I found another bit of fun we'll need to deal with: on my F38
platform, pgcrypto/3des fails as attached.  Some googling finds
this relevant info:

https://github.com/pyca/cryptography/issues/6875

That is, FIPS deprecation of 3DES is happening even as we speak.
So apparently we'll have little choice but to deal with two
different behaviors for that.

As before, I'm not too pleased with the user-friendliness
of the error:

+ERROR:  encrypt error: Cipher cannot be initialized

That's even less useful to a user than "unsupported".

FWIW, everything else seems to pass with this patchset.
I ran check-world as well as the various "must run manually"
test suites.

regards, tom lane

diff -U3 /home/tgl/pgsql/contrib/pgcrypto/expected/3des.out /home/tgl/pgsql/contrib/pgcrypto/results/3des.out
--- /home/tgl/pgsql/contrib/pgcrypto/expected/3des.out	2023-10-05 15:25:46.922080156 -0400
+++ /home/tgl/pgsql/contrib/pgcrypto/results/3des.out	2023-10-05 16:29:32.416972002 -0400
@@ -5,61 +5,25 @@
 SELECT encrypt('\x8000',
'\x010101010101010101010101010101010101010101010101',
'3des-ecb/pad:none');
-  encrypt   
-
- \x95f8a5e5dd31d900
-(1 row)
-
+ERROR:  encrypt error: Cipher cannot be initialized
 select encrypt('', 'foo', '3des');
-  encrypt   
-
- \x752111e37a2d7ac3
-(1 row)
-
+ERROR:  encrypt error: Cipher cannot be initialized
 -- 10 bytes key
 select encrypt('foo', '0123456789', '3des');
-  encrypt   
-
- \xd2fb8baa1717cb02
-(1 row)
-
+ERROR:  encrypt error: Cipher cannot be initialized
 -- 22 bytes key
 select encrypt('foo', '0123456789012345678901', '3des');
-  encrypt   
-
- \xa44360e699269817
-(1 row)
-
+ERROR:  encrypt error: Cipher cannot be initialized
 -- decrypt
 select encode(decrypt(encrypt('foo', '0123456', '3des'), '0123456', '3des'), 'escape');
- encode 
-
- foo
-(1 row)
-
+ERROR:  encrypt error: Cipher cannot be initialized
 -- iv
 select encrypt_iv('foo', '0123456', 'abcd', '3des');
- encrypt_iv 
-
- \x50735067b073bb93
-(1 row)
-
+ERROR:  encrypt_iv error: Cipher cannot be initialized
 select encode(decrypt_iv('\x50735067b073bb93', '0123456', 'abcd', '3des'), 'escape');
- encode 
-
- foo
-(1 row)
-
+ERROR:  decrypt_iv error: Cipher cannot be initialized
 -- long message
 select encrypt('Lets try a longer message.', '0123456789012345678901', '3des');
-  encrypt   
-
- \xb71e3422269d0ded19468f33d65cd663c28e0871984792a7b3ba0ddcecec8d2c
-(1 row)
-
+ERROR:  encrypt error: Cipher cannot be initialized
 select encode(decrypt(encrypt('Lets try a longer message.', '0123456789012345678901', '3des'), '0123456789012345678901', '3des'), 'escape');
-   encode   
-
- Lets try a longer message.
-(1 row)
-
+ERROR:  encrypt error: Cipher cannot be initialized


Re: [PoC/RFC] Multiple passwords, interval expirations

2023-10-05 Thread Jeff Davis
On Thu, 2023-10-05 at 14:04 -0500, Nathan Bossart wrote:
> IMHO neither of these problems seems insurmountable.  Besides
> advising
> against using hints as names, we could also automatically generate
> safe
> names, or even disallow user-provided names entirely.  

I'd like to see what this looks like as a user-interface. Using a name
seems weird because of the reasons Gurjeet mentioned.

Using a number seems weird to me because either:

 (a) if the number is always increasing, you'd have to look up the
number of the new slot to add and the old slot to remove; or
 (b) if you switched between two numbers (say 0 and 1), it would be
error-prone because you'd have to remember which is the old one that
can be safely replaced

Maybe a password is best described by its validity period rather than a
name? But what about passwords that don't expire?

> And adding
> observability for passwords seems worthwhile anyway.

That might be useful just to know whether a user's password is even
being used -- in case the admin makes a mistake and some other auth
method is being used. Also it would help to know when a password can
safely be removed.
> 

>   That way, we needn't restrict this feature to 2 passwords for
> everyone.  Perhaps 2 should be the default, but in any case, IMO we
> shouldn't design to only support 2.

Are there use cases for lots of passwords, or is it just a matter of
not introducing an artificial limitation?

Would it ever make sense to have a role that has two permanent
passwords, or would that be an abuse of this feature? Any use cases I
can think of would be better solved with multiple users that are part
of the same group.

> 
> I think this would be mighty confusing to users since it's not clear
> that
> adding a password will potentially invalidate a current password
> (which
> might be actively in use), but only if there are already 2 in the
> stack.  I
> worry that such a design might be too closely tailored to the
> implementation details.  If we proceed with this design, perhaps we
> should
> consider ERROR-ing if a user tries to add a third password.

I agree that the proposed language is confusing, especially because ADD
causes a password to be added and another one to be removed. But
perhaps there are some non-confusing ways to expose a similar idea.

The thing I like about Gurjeet's proposal is that it's well-targeted at
a specific use case rather than trying to be too general. That makes it
a little easier to avoid certain problems like having a process that
adds passwords and never removes the old ones (leading to weird
problems like 47000 passwords for one user).

But it also feels strange to be limited to two -- perhaps the password
rotation schedule or policy just doesn't work with a limit of two, or
perhaps that introduces new kinds of mistakes.

Another idea: what if we introduce the notion of deprecating a
password? To remove a password, it would have to be deprecated first,
and maybe that would cause a LOG or WARNING message to be emitted when
used, or show up differently in some system view. And perhaps you could
have at most one non-deprecated password. That would give a workflow
something like (I'm not proposing these exact keywords):

  CREATE USER foo PASSWORD 'secret1';
  ALTER USER foo DEPRECATE PASSWORD; -- 'secret1' still works
  ALTER USER foo PASSWORD 'secret2'; -- 'secret1' or 'secret2' works
  ... fix some applications
  SET log_deprecated_password_use = WARNING;

  ...
  WARNING: deprecated password used for user 'foo'
  ... fix some applications you forgot about
  ... warnings quiet down
  ALTER USER foo DROP DEPRECATED PASSWORD; -- only 'secret2' works

If the user wants to un-deprecate a password (before they drop it, of
course), they can do something like:

  ALTER USER foo PASSWORD NULL; -- 'secret2' removed
  ALTER USER foo UNDEPRECATE PASSWORD; -- 'secret1' restored

if we allow multiple deprecated passwords, we'd still have to come up
with some way to address them (names, numbers, validity period,
something). But by isolating the problem to deprecated passwords only,
it feels like the system is still being restored to a clean state with
at most one single current password. The awkwardness is contained to
old passwords which will hopefully go away soon anyway and not
represent permanent clutter.

Regards,
Jeff Davis





Re: Allow tests to pass in OpenSSL FIPS mode

2023-10-05 Thread Tom Lane
Peter Eisentraut  writes:
> Continuing this, we have fixed many issues since.  Here is a patch set 
> to fix all remaining issues.

On the way to testing this, I discovered that we have a usability
regression with recent OpenSSL releases.  The Fedora 35 installation
I used to use for testing FIPS-mode behavior would produce errors like

 select md5('') = 'd41d8cd98f00b204e9800998ecf8427e' AS "TRUE";
- TRUE 
---
- t
-(1 row)
-
+ERROR:  could not compute MD5 hash: disabled for FIPS

In the shiny new Fedora 38 installation I just set up for the
same purpose, I'm seeing

 select md5('') = 'd41d8cd98f00b204e9800998ecf8427e' AS "TRUE";
- TRUE 
---
- t
-(1 row)
-
+ERROR:  could not compute MD5 hash: unsupported


This is less user-friendly; moreover it indicates that we're
going to get different output depending on the vintage of
OpenSSL we're testing against, which is going to be a pain for
expected-file maintenance.

I think we need to make an effort to restore the old output
if possible, although I grant that this may be mostly a whim
of OpenSSL's that we can't do much about.

The F35 installation has openssl 1.1.1q, where F38 has
openssl 3.0.9.

regards, tom lane




Re: Good News Everyone! + feature proposal

2023-10-05 Thread Laurenz Albe
On Thu, 2023-10-05 at 14:58 +, Jon Erdman wrote:
> > > > For the proposal (this one is a bit Apple specific): because my team
> > > > offers managed postgres to our Apple-internal customers, many of whom
> > > > are not database experts, or at least not postgres experts, we'd like to
> > > > be able to toggle the availability of UNLOGGED tables in
> > > > postgresql.conf, so our less clueful users have fewer footguns.
> 
Someone on LinkedIn suggested an event trigger, so now I'm thinking of
a custom extension that would do nothing but create said event trigger,
and maybe could be toggled with a customized setting (but that might
allow users to turn it off themselves...which is maybe ok).

An event trigger is the perfect solution for this requirement.

> If the extension were installed by the DBA user, the customer wouldn't 
> be able to drop it, and if we decided to give them an exception, we just 
> drop or disable the extension.

Right.  Also, only a superuser can create or drop event triggers.

> As a second more general question: could my original idea (i.e. sans 
> event trigger) be implemented in an extension somehow, or is that not 
> technically possible (I suspect not)?

You could perhaps use "object_access_hook" in an extension.
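
Or a ProcessUtility hook, which is simpler for this particular case.
An untested sketch of such an extension (the error wording is up to
you; ALTER TABLE ... SET UNLOGGED would need a similar check):

#include "postgres.h"

#include "catalog/pg_class.h"
#include "fmgr.h"
#include "nodes/parsenodes.h"
#include "tcop/utility.h"

PG_MODULE_MAGIC;

static ProcessUtility_hook_type prev_ProcessUtility = NULL;

/* Reject CREATE UNLOGGED TABLE before it gets anywhere */
static void
no_unlogged_ProcessUtility(PlannedStmt *pstmt, const char *queryString,
						   bool readOnlyTree, ProcessUtilityContext context,
						   ParamListInfo params, QueryEnvironment *queryEnv,
						   DestReceiver *dest, QueryCompletion *qc)
{
	Node	   *parsetree = pstmt->utilityStmt;

	if (IsA(parsetree, CreateStmt) &&
		((CreateStmt *) parsetree)->relation->relpersistence ==
		RELPERSISTENCE_UNLOGGED)
		ereport(ERROR,
				(errmsg("unlogged tables are disabled by policy")));

	if (prev_ProcessUtility)
		prev_ProcessUtility(pstmt, queryString, readOnlyTree,
							context, params, queryEnv, dest, qc);
	else
		standard_ProcessUtility(pstmt, queryString, readOnlyTree,
								context, params, queryEnv, dest, qc);
}

void
_PG_init(void)
{
	prev_ProcessUtility = ProcessUtility_hook;
	ProcessUtility_hook = no_unlogged_ProcessUtility;
}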

Yours,
Laurenz Albe




Re: Pre-proposal: unicode normalized text

2023-10-05 Thread Nico Williams
On Thu, Oct 05, 2023 at 03:49:37PM -0400, Tom Lane wrote:
> Nico Williams  writes:
> > Text+encoding can be just like bytea with a one- or two-byte prefix
> > indicating what codeset+encoding it's in.  That'd be how to encode
> > such text values on the wire, though on disk the column's type should
> > indicate the codeset+encoding, so no need to add a prefix to the value.
> 
> The precedent of BOMs (byte order marks) suggests strongly that
> such a solution would be horrible to use.

This is just how you encode the type of the string.  You have any number
of options.  The point is that PG can already encode binary data, so if
the question is how to encode text of disparate encodings on the wire,
building on top of the encoding of bytea is an option.




Re: Pre-proposal: unicode normalized text

2023-10-05 Thread Tom Lane
Nico Williams  writes:
> Text+encoding can be just like bytea with a one- or two-byte prefix
> indicating what codeset+encoding it's in.  That'd be how to encode
> such text values on the wire, though on disk the column's type should
> indicate the codeset+encoding, so no need to add a prefix to the value.

The precedent of BOMs (byte order marks) suggests strongly that
such a solution would be horrible to use.

regards, tom lane




Re: Rights Control within DB (which SuperUser cannot access, but user can)

2023-10-05 Thread Tom Lane
Rajesh Mittal  writes:
> Is there a way for an authorized user (who creates tables / inserts data)
> in a DB to prevent the SuperUser from accessing the same?
> I understand the SuperUser can revoke the user's access, but he should not
> be able to see the table structure and the data inserted in those tables.

You might be able to do something with contrib/sepgsql, if you're
on a selinux-enabled platform.  But TBH the correct solution here
is to not give out superuser to people you don't trust.  There is
no way that you're likely to make an entirely bulletproof setup.
(Consider, just to begin with, how you're going to prevent a rogue
superuser from de-installing sepgsql, or even simply doing
"set role other_user".)

Also keep in mind that "prevent user A from seeing the structure
of user B's tables" is not part of Postgres' threat models at all.
Most system catalogs are world-readable, and you can't change that
without breaking an awful lot of tools.  If you don't like this,
a plausible answer is to give each user their own database.

regards, tom lane




Rights Control within DB (which SuperUser cannot access, but user can)

2023-10-05 Thread Rajesh Mittal
Hello,

Is there a way for an authorized user (who creates tables / inserts data)
in a DB to prevent the SuperUser from accessing the same?

I understand the SuperUser can revoke the user's access, but he should not
be able to see the table structure and the data inserted in those tables.

Best,

Rajesh


Re: Pre-proposal: unicode normalized text

2023-10-05 Thread Jeff Davis
On Thu, 2023-10-05 at 09:10 -0400, Isaac Morland wrote:
> In the case you describe, the users don’t have text at all; they have
> bytes, and a vague belief about what encoding the bytes might be in
> and therefore what characters they are intended to represent. The
> correct way to store that in the database is using bytea.

I wouldn't be so absolute. It's text data to the user, and is
presumably working fine for them now, and if they switched to bytea
today then 'foo' would show up as '\x666f6f' in psql.

The point is that this is a somewhat messy problem because there's so
much software out there that treats byte strings and textual data
interchangably. Rust goes the extra mile to organize all of this, and
it ends up with:

  * String -- always UTF-8, never NUL-terminated
  * CString -- NUL-terminated byte sequence with no internal NULs
  * OsString[3] -- needed to make a Path[4], which is needed to open a
file[5]
  * Vec<u8> -- any byte sequence

and I suppose we could work towards offering better support for these
different types, the casts between them, and delivering them in a form
the client can understand. But I wouldn't describe it as a solved
problem with one "correct" solution.

One takeaway from this discussion is that it would be useful to provide
more flexibility in how values are represented to the client in a more
general way. In addition to encoding, representational issues have come
up with binary formats, bytea, extra_float_digits, etc.

The collection of books by CJ Date & Hugh Darwen, et al. (sorry I don't
remember exactly which books) made the theoretical case for explicitly
distinguishing values from representations at the language level. We're
starting to see that representational issues can't be satisfied with a
few special cases and hacks -- it's worth thinking about a general
solution to that problem. There was also a lot of relevant discussion
about how to think about overlapping domains (e.g. ASCII is valid in
any of these text domains).

>  Text types should be for when you know what characters you want to
> store. In this scenario, the implementation detail of what encoding
> the database uses internally to write the data on the disk doesn't
> matter, any more than it matters to a casual user how a table is
> stored on disk.

Perhaps the user and application do know, and there's some kind of
subtlety that we're missing, or some historical artefact that we're not
accounting for, and that somehow makes UTF-8 unsuitable. Surely there
are applications that treat certain byte sequences in non-standard
ways, and perhaps not all of those byte sequences can be reproduced by
transcoding from UTF-8 to the client_encoding. In any case, I would
want to understand in detail why a user thinks UTF8 is not good enough
before I make too strong of a statement here.

Even the terminal font that I use renders some "identical" unicode
characters slightly differently depending on the code points from which
they are composed. I believe that's an intentional convenience to make
it more apparent why the "diff" command (or other byte-based tool) is
showing a difference between two textually identical strings, but it's
also a violation of unicode. (This is another reason why normalization
might not be for everyone, but I believe it's still good in typical
cases.)

Regards,
Jeff Davis





Re: Pre-proposal: unicode normalized text

2023-10-05 Thread Nico Williams
On Thu, Oct 05, 2023 at 07:31:54AM -0400, Robert Haas wrote:
> [...] On the other hand, to do that in PostgreSQL, we'd need to
> propagate the character set/encoding information into all of the
> places that currently get the typmod and collation, and that is not a
> small number of places. It's a lot of infrastructure for the project
> to carry around for a feature that's probably only going to continue
> to become less relevant.

Text+encoding can be just like bytea with a one- or two-byte prefix
indicating what codeset+encoding it's in.  That'd be how to encode
such text values on the wire, though on disk the column's type should
indicate the codeset+encoding, so no need to add a prefix to the value.
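
As a sketch of the wire shape I have in mind (all names hypothetical):

/*
 * Hypothetical wire representation: a varlena-style value carrying a
 * small prefix identifying the codeset+encoding of the bytes that
 * follow.  On disk the column's type carries this instead, so the
 * prefix is not stored.
 */
typedef struct EncodedText
{
	int32	vl_len_;		/* varlena header (do not touch directly) */
	uint16	encoding_id;	/* hypothetical codeset+encoding identifier */
	char	data[FLEXIBLE_ARRAY_MEMBER];	/* raw encoded bytes */
} EncodedText;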

Complexity would creep in around when and whether to perform automatic
conversions.  The easy answer would be "never, on the server side", but
on the client side it might be useful to convert to/from the locale's
codeset+encoding when displaying to the user or accepting user input.

If there's no automatic server-side codeset/encoding conversions then
the server-side cost of supporting non-UTF-8 text should not be too high
dev-wise -- it's just (famous last words) a generic text type
parameterized by codeset+encoding type.  There would not even be a hard
need for functions for conversions, though there would be demand for
them.

But I agree that if there's no need, there's no need.  UTF-8 is great,
and if only all PG users would just switch then there's not much more to
do.

Nico
-- 




Re: [PoC/RFC] Multiple passwords, interval expirations

2023-10-05 Thread Nathan Bossart
On Wed, Oct 04, 2023 at 10:41:15PM -0700, Gurjeet Singh wrote:
> The patches posted in this thread so far attempt to add the ability to
> allow the user to have an arbitrary number of passwords. I believe
> that allowing arbitrary number of passwords is not only unnecessary,
> but the need to name passwords, the need to store them in a shared
> catalog, etc. may actually create problems in the field. The
> users/admins will have to choose names for passwords, which they
> didn't have to previously. The need to name them may also lead to
> users storing password-hints in the password names (e.g. 'mom''s
> birthday', 'ex''s phone number', 'third password'), rendering the
> passwords weak.
> 
> Moreover, allowing an arbitrarily many number of passwords will
> require us to provide additional infrastructure to solve problems like
> observability (which passwords are currently in use, and which ones
> have been effectively forgotten by applications), or create a nuisance
> for admins that can create more problems than it solves.

IMHO neither of these problems seems insurmountable.  Besides advising
against using hints as names, we could also automatically generate safe
names, or even disallow user-provided names entirely.  And adding
observability for passwords seems worthwhile anyway.

> So I propose that the feature should allow no more than 2 passwords
> for a role, each with their own validity periods. This eliminates the
> need to name passwords, because at any given time there are no more
> than 2 passwords; current one, and old one. This also eliminates the
> need for a shared catalog to hold passwords, because with the limit of
> 2 imposed, we can store the old password and its validity period in
> additional columns in the pg_authid table.

Another approach could be to allow an arbitrary number of passwords but to
also allow administrators to limit how many passwords can be associated to
each role.  That way, we needn't restrict this feature to 2 passwords for
everyone.  Perhaps 2 should be the default, but in any case, IMO we
shouldn't design to only support 2.

> In essence, we create a stack that can hold 2 passwords. Pushing an
> element when it's full will make it forget the bottom element. Popping
> the stack makes it forget the top element, and the only remaining
> element, if any, becomes the top.

I think this would be mighty confusing to users since it's not clear that
adding a password will potentially invalidate a current password (which
might be actively in use), but only if there are already 2 in the stack.  I
worry that such a design might be too closely tailored to the
implementation details.  If we proceed with this design, perhaps we should
consider ERROR-ing if a user tries to add a third password.

> -- If, for some reason, the user wants to get rid of the latest password 
> added.
> -- Remove 'p2' (current password). The old password (p1), will be restored to
> -- rolpassword, along with its valid-until value.
> ALTER ROLE u1 PASSWORD NULL;
> -- This may come as a surprise to some users, because currently they expect 
> the
> -- user to completely lose the ability to use passwords for login after this
> -- command. To get the old behavior, the user must now use the ALL PASSWORD
> -- NULL incantation; see below.
> -- Issuing this command one more time will remove even the restored password,
> -- hence leaving the user with no passwords.

Is it possible to remove the oldest password added without removing the
latest password added?

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com




Re: pg16: invalid page/page verification failed

2023-10-05 Thread Justin Pryzby
On Thu, Oct 05, 2023 at 07:16:31PM +0200, Matthias van de Meent wrote:
> On Thu, 5 Oct 2023 at 18:48, Justin Pryzby  wrote:
> >
> > On an instance running pg16.0:
> >
> > log_time | 2023-10-05 10:03:00.014-05
> > backend_type | autovacuum worker
> > left | page verification failed, calculated checksum 5074 but expected 5050
> > context  | while scanning block 119 of relation "public.postgres_log_2023_10_05_0900"
> >
> > This is the only error I've seen so far, and for all I know there's a
> > issue on the storage behind the VM, or a cosmic ray hit.  But I moved
> > the table out of the way and saved a copy of get_raw_page() in case
> > someone wants to ask about it.
> >
> > postgres=# SELECT * FROM heap_page_item_attrs(get_raw_page(801594131::regclass::text, 119), 801594131);
> >  lp  | lp_off | lp_flags | lp_len | t_xmin | t_xmax | t_field3 | t_ctid | t_infomask2 | t_infomask | t_hoff | t_bits | t_oid | t_attrs
> >    1 |   2304 |        1 |     16 |        |        |          |        |             |            |        |        |       |
> >    2 |   8160 |        1 |     16 |        |        |          |        |             |            |        |        |       |
> >    3 |   8144 |        1 |     16 |        |        |          |        |             |            |        |        |       |
> > ...all the same except for lp_off...
> >  365 |   2352 |        1 |     16 |        |        |          |        |             |            |        |        |       |
> >  366 |   2336 |        1 |     16 |        |        |          |        |             |            |        |        |       |
> >  367 |   2320 |        1 |     16 |        |        |          |        |             |            |        |        |       |
> 
> That's not a HEAP page; it looks more like a btree page: lp_len is too
> short for heap (which starts at lp_len = 24), and there are too many
> line pointers for an 8KiB heap page. btree often has lp_len of 16: 8
> bytes indextuple header, one maxalign of data (e.g. int or bigint).
> 
> So, assuming it's a block of a different relation kind, then it's also
> likely it was originally located elsewhere in that other relation,
> indeed causing the checksum failure. You can further validate this by
> looking at the page header's pd_special value - if it is 8176, that'd
> be another indicator for it being a btree.

Nice point.

postgres=# SET ignore_checksum_failure=on; SELECT * FROM generate_series(115,119) AS a, page_header(get_raw_page(801594131::regclass::text, a)) AS b;
WARNING:  page verification failed, calculated checksum 5074 but expected 5050
  a  |     lsn      | checksum | flags | lower | upper | special | pagesize | version | prune_xid
-----+--------------+----------+-------+-------+-------+---------+----------+---------+-----------
 115 | B61/A9436C8  |   -23759 |     4 |    92 |   336 |    8192 |     8192 |       4 |         0
 116 | B61/A944FA0  |     3907 |     4 |   104 |   224 |    8192 |     8192 |       4 |         0
 117 | B61/A946828  |   -24448 |     4 |    76 |   264 |    8192 |     8192 |       4 |         0
 118 | B61/A94CCE0  |    26915 |     4 |    28 |  6256 |    8192 |     8192 |       4 |         0
 119 | B5C/9F30D1C8 |     5050 |     0 |  1492 |  2304 |    8176 |     8192 |       4 |         0

The table itself has a few btree indexes on text columns and a brin
index on log_timestamp, but not on the integers.

That sounds like what's expected at this point, but after I ran
"SET ignore_checksum_failure=on" and read the page in, vacuum kicked
off and then crashed (in heap_page_prune(), if that half of the stack
trace can be trusted).

*** stack smashing detected ***: postgres: autovacuum worker postgres terminated

< 2023-10-05 12:35:30.764 CDT  >LOG:  server process (PID 30692) was terminated by signal 11: Segmentation fault
< 2023-10-05 12:35:30.764 CDT  >DETAIL:  Failed process was running: autovacuum: VACUUM ANALYZE public.BROKEN_postgres_log_2023_10_05_0900

I took the opportunity to fsck the FS, which showed no errors.

I was curious if the relfilenodes had gotten confused/corrupted/??
But this seems to indicate not; the problem is only one block.

postgres=# SELECT oid, relfilenode, oid=relfilenode, relname FROM pg_class WHERE oid BETWEEN 80155 AND 801594199 ORDER BY 1;
    oid    | relfilenode | ?column? |                     relname
-----------+-------------+----------+--------------------------------------------------
 801564542 |   801564542 | t        | postgres_log_2023_10_05_0800
 801564545 |   801564545 | t        | pg_toast_801564542
 801564546 |   801564546 | t        | pg_toast_801564542_index
 801564547 |   801564547 | t        | postgres_log_2023_10_05_0800_log_time_idx
 801564548 |   801564548 | t        | postgres_log_2023_10_05_0800_error_severity_idx
 801564549 |   801564549 | t        |

Re: Add a new BGWORKER_BYPASS_ROLELOGINCHECK flag

2023-10-05 Thread Nathan Bossart
On Thu, Oct 05, 2023 at 05:51:31PM +0530, Bharath Rupireddy wrote:
> I think changing BGWORKER_BYPASS_ALLOWCONN to 0x0001 and having bits32
> for InitPostgres inputs load_session_libraries and
> override_allow_connections can be 0001 in this patch series so that it
> can go first, no? This avoids the new code being in the old format and
> changing things right after it commits.

+1

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com




Re: New WAL record to detect the checkpoint redo location

2023-10-05 Thread Andres Freund
Hi,

On 2023-10-02 10:42:37 -0400, Robert Haas wrote:
> I was trying to think of a test case where XLogInsertRecord would be
> exercised as heavily as possible, so I really wanted to generate a lot
> of WAL while doing as little real work as possible. The best idea that
> I had was to run pg_create_restore_point() in a loop.

What I use for that is pg_logical_emit_message(). Something like

SELECT count(*)
FROM
(
    SELECT pg_logical_emit_message(false, '1', 'short'), generate_series(1, 1)
) AS s;

run via pgbench does seem to exercise that path nicely.


> One possible conclusion is that the differences here aren't actually
> big enough to get stressed about, but I don't want to jump to that
> conclusion without investigating the competing hypothesis that this
> isn't the right way to test this, and that some better test would show
> clearer results. Suggestions?

I saw some small differences in runtime running pgbench with the above query,
with a single client. Comparing profiles showed a surprising degree of
difference. That turns out to be mostly a consequence of the fact that
ReserveXLogInsertLocation() isn't inlined anymore, because there now are two
callers of the function in XLogInsertRecord().

Unfortunately, I still see a small performance difference after that. To get
the most reproducible numbers, I disable turbo boost, bound postgres to one
cpu core, bound pgbench to another core. Over a few runs I quite reproducibly
get ~319.323 tps with your patches applied (+ always inline), and ~324.674
with master.

If I add an unlikely() around the if (rechdr->xl_rmid == RM_XLOG_ID) test, the
performance does improve. But that "only" brings it up to 322.406. Not sure
what the rest is.


One thing that's notable, but not related to the patch, is that we waste a
fair bit of cpu time below XLogInsertRecord() with divisions. I think they're
all due to the use of UsableBytesInSegment in
XLogBytePosToRecPtr/XLogBytePosToEndRecPtr.  The multiplication of
XLogSegNoOffsetToRecPtr() also shows.

Greetings,

Andres Freund




Re: Annoying build warnings from latest Apple toolchain

2023-10-05 Thread Tom Lane
Andres Freund  writes:
> I think you can just pass c_args directly to executable() here, I think adding
> c_args to default_bin_args would be a bad idea.

Hm.  IIUC that would result in an error if someone did try to
put some c_args into default_bin_args, and I didn't think it would
be appropriate for this code to fail in such a case.  Still, I see
there are a bunch of other ways to inject globally-used compilation
flags, so maybe you're right that it'd never need to happen.

regards, tom lane




Re: Annoying build warnings from latest Apple toolchain

2023-10-05 Thread Andres Freund
Hi,

On 2023-10-05 13:17:25 -0400, Tom Lane wrote:
> I wrote:
> > So I experimented with fixing things so that the versions of these
> > functions exported by libpq have physically different names from those
> > that you'd get from linking to libpgcommon.a or libpgcommon_srv.a.
> > Then, there's certainty about which one a given usage will link to,
> > based on what the #define environment is when the call is compiled.

I think that's a good plan. IIRC I previously complained about the symbols
existing in multiple places...  Don't remember the details, but IIRC I saw
warnings about symbol conflicts in extensions using libpq.


> > This leads to a pleasingly small patch, at least in the Makefile
> > universe (I've not attempted to sync the meson or MSVC infrastructure
> > with this yet).
> 
> Here's a v2 that addresses the meson infrastructure as well.
> (This is my first attempt at writing meson code, so feel free
> to critique.)  I propose not bothering to fix src/tools/msvc/,
> on the grounds that (a) that infrastructure will likely be gone
> before v17 ships, and (b) the latent problem with version
> skew between libpq.dll and calling programs seems much less
> likely to matter in the Windows world in the first place.

Makes sense.


> +# Note: it's important that we link to encnames.o from libpgcommon, not
> +# from libpq, else we have risks of version skew if we run with a libpq
> +# shared library from a different PG version.  Define
> +# USE_PRIVATE_ENCODING_FUNCS to ensure that that happens.
> +c_args = default_bin_args.get('c_args', []) + 
> ['-DUSE_PRIVATE_ENCODING_FUNCS']
> +
>  initdb = executable('initdb',
>initdb_sources,
>include_directories: [timezone_inc],
>dependencies: [frontend_code, libpq, icu, icu_i18n],
> -  kwargs: default_bin_args,
> +  kwargs: default_bin_args + {
> +'c_args': c_args,
> +  },

I think you can just pass c_args directly to executable() here, I think adding
c_args to default_bin_args would be a bad idea.

Greetings,

Andres Freund




Re: Pre-proposal: unicode normalized text

2023-10-05 Thread Jeff Davis
On Thu, 2023-10-05 at 07:31 -0400, Robert Haas wrote:
> It's a lot of infrastructure for the project to carry
> around for a feature that's probably only going to continue to become
> less relevant.

Agreed, at least until we understand the set of users per-column
encoding is important to. I acknowledge that the presence of per-column
encoding in the standard is some kind of signal there, but not enough
by itself to justify something so invasive.

> I suppose you never know, though.

On balance I think it's better to keep the code clean enough that we
can adapt to whatever unanticipated things happen in the future; rather
than to make the code very complicated trying to anticipate everything,
and then being completely unable to adapt it when something
unanticipated happens anyway.

Regards,
Jeff Davis





Re: Annoying build warnings from latest Apple toolchain

2023-10-05 Thread Tom Lane
I wrote:
> So I experimented with fixing things so that the versions of these
> functions exported by libpq have physically different names from those
> that you'd get from linking to libpgcommon.a or libpgcommon_srv.a.
> Then, there's certainty about which one a given usage will link to,
> based on what the #define environment is when the call is compiled.

> This leads to a pleasingly small patch, at least in the Makefile
> universe (I've not attempted to sync the meson or MSVC infrastructure
> with this yet).

Here's a v2 that addresses the meson infrastructure as well.
(This is my first attempt at writing meson code, so feel free
to critique.)  I propose not bothering to fix src/tools/msvc/,
on the grounds that (a) that infrastructure will likely be gone
before v17 ships, and (b) the latent problem with version
skew between libpq.dll and calling programs seems much less
likely to matter in the Windows world in the first place.

Barring comments or CI failures, I intend to push this soon.

regards, tom lane

diff --git a/src/bin/initdb/Makefile b/src/bin/initdb/Makefile
index d69bd89572..e80e57e457 100644
--- a/src/bin/initdb/Makefile
+++ b/src/bin/initdb/Makefile
@@ -16,13 +16,12 @@ subdir = src/bin/initdb
 top_builddir = ../../..
 include $(top_builddir)/src/Makefile.global
 
-override CPPFLAGS := -I$(libpq_srcdir) -I$(top_srcdir)/src/timezone $(ICU_CFLAGS) $(CPPFLAGS)
-
 # Note: it's important that we link to encnames.o from libpgcommon, not
 # from libpq, else we have risks of version skew if we run with a libpq
-# shared library from a different PG version.  The libpq_pgport macro
-# should ensure that that happens.
-#
+# shared library from a different PG version.  Define
+# USE_PRIVATE_ENCODING_FUNCS to ensure that that happens.
+override CPPFLAGS := -DUSE_PRIVATE_ENCODING_FUNCS -I$(libpq_srcdir) -I$(top_srcdir)/src/timezone $(ICU_CFLAGS) $(CPPFLAGS)
+
 # We need libpq only because fe_utils does.
 LDFLAGS_INTERNAL += -L$(top_builddir)/src/fe_utils -lpgfeutils $(libpq_pgport) $(ICU_LIBS)
 
diff --git a/src/bin/initdb/meson.build b/src/bin/initdb/meson.build
index 49743630aa..28a2df99e4 100644
--- a/src/bin/initdb/meson.build
+++ b/src/bin/initdb/meson.build
@@ -7,19 +7,25 @@ initdb_sources = files(
 
 initdb_sources += timezone_localtime_source
 
-#fixme: reimplement libpq_pgport logic
-
 if host_system == 'windows'
   initdb_sources += rc_bin_gen.process(win32ver_rc, extra_args: [
 '--NAME', 'initdb',
 '--FILEDESC', 'initdb - initialize a new database cluster',])
 endif
 
+# Note: it's important that we link to encnames.o from libpgcommon, not
+# from libpq, else we have risks of version skew if we run with a libpq
+# shared library from a different PG version.  Define
+# USE_PRIVATE_ENCODING_FUNCS to ensure that that happens.
+c_args = default_bin_args.get('c_args', []) + ['-DUSE_PRIVATE_ENCODING_FUNCS']
+
 initdb = executable('initdb',
   initdb_sources,
   include_directories: [timezone_inc],
   dependencies: [frontend_code, libpq, icu, icu_i18n],
-  kwargs: default_bin_args,
+  kwargs: default_bin_args + {
+'c_args': c_args,
+  },
 )
 bin_targets += initdb
 
diff --git a/src/common/Makefile b/src/common/Makefile
index cc5c54dcee..70884be00c 100644
--- a/src/common/Makefile
+++ b/src/common/Makefile
@@ -140,6 +140,13 @@ libpgcommon.a: $(OBJS_FRONTEND)
 	rm -f $@
 	$(AR) $(AROPT) $@ $^
 
+#
+# Files in libpgcommon.a should use/export the "xxx_private" versions
+# of pg_char_to_encoding() and friends.
+#
+$(OBJS_FRONTEND): CPPFLAGS += -DUSE_PRIVATE_ENCODING_FUNCS
+
+
 #
 # Shared library versions of object files
 #
diff --git a/src/common/meson.build b/src/common/meson.build
index 3b97497d1a..ae05ac63cf 100644
--- a/src/common/meson.build
+++ b/src/common/meson.build
@@ -112,8 +112,8 @@ common_sources_frontend_static += files(
   'logging.c',
 )
 
-# Build pgport once for backend, once for use in frontend binaries, and once
-# for use in shared libraries
+# Build pgcommon once for backend, once for use in frontend binaries, and
+# once for use in shared libraries
 #
 # XXX: in most environments we could probably link_whole pgcommon_shlib
 # against pgcommon_static, instead of compiling twice.
@@ -131,6 +131,9 @@ pgcommon_variants = {
   '': default_lib_args + {
 'sources': common_sources_frontend_static,
 'dependencies': [frontend_common_code],
+# Files in libpgcommon.a should use/export the "xxx_private" versions
+# of pg_char_to_encoding() and friends.
+'c_args': ['-DUSE_PRIVATE_ENCODING_FUNCS'],
   },
   '_shlib': default_lib_args + {
 'pic': true,
diff --git a/src/include/mb/pg_wchar.h b/src/include/mb/pg_wchar.h
index 25276b199f..29cd5732f1 100644
--- a/src/include/mb/pg_wchar.h
+++ b/src/include/mb/pg_wchar.h
@@ -13,6 +13,9 @@
  *		included by libpq client programs.  In particular, a libpq client
  *		should not assume that the encoding IDs used by the version of libpq
  *		it's linked to match up with the IDs declared here.
+ *		

Re: pg16: invalid page/page verification failed

2023-10-05 Thread Matthias van de Meent
On Thu, 5 Oct 2023 at 18:48, Justin Pryzby  wrote:
>
> On an instance running pg16.0:
>
> log_time     | 2023-10-05 10:03:00.014-05
> backend_type | autovacuum worker
> left         | page verification failed, calculated checksum 5074 but expected 5050
> context      | while scanning block 119 of relation "public.postgres_log_2023_10_05_0900"
>
> This is the only error I've seen so far, and for all I know there's an
> issue on the storage behind the VM, or a cosmic ray hit.  But I moved
> the table out of the way and saved a copy of get_raw_page() in case
> someone wants to ask about it.
>
> postgres=# SELECT * FROM
> heap_page_item_attrs(get_raw_page(801594131::regclass::text, 119), 801594131);
>  lp  | lp_off | lp_flags | lp_len | t_xmin | t_xmax | t_field3 | t_ctid | t_infomask2 | t_infomask | t_hoff | t_bits | t_oid | t_attrs
>    1 |   2304 |        1 |     16 |        |        |          |        |             |            |        |        |       |
>    2 |   8160 |        1 |     16 |        |        |          |        |             |            |        |        |       |
>    3 |   8144 |        1 |     16 |        |        |          |        |             |            |        |        |       |
> ...all the same except for lp_off...
>  365 |   2352 |        1 |     16 |        |        |          |        |             |            |        |        |       |
>  366 |   2336 |        1 |     16 |        |        |          |        |             |            |        |        |       |
>  367 |   2320 |        1 |     16 |        |        |          |        |             |            |        |        |       |

That's not a HEAP page; it looks more like a btree page: lp_len is too
short for heap (which starts at lp_len = 24), and there are too many
line pointers for an 8KiB heap page. btree often has lp_len of 16: 8
bytes indextuple header, one maxalign of data (e.g. int or bigint).

So, assuming it's a block of a different relation kind, then it's also
likely it was originally located elsewhere in that other relation,
indeed causing the checksum failure. You can further validate this by
looking at the page header's pd_special value - if it is 8176, that'd
be another indicator for it being a btree.

Kind regards,

Matthias van de Meent.




pg16: invalid page/page verification failed

2023-10-05 Thread Justin Pryzby
On an instance running pg16.0:

log_time     | 2023-10-05 10:03:00.014-05
backend_type | autovacuum worker
left         | page verification failed, calculated checksum 5074 but expected 5050
context      | while scanning block 119 of relation "public.postgres_log_2023_10_05_0900"

This is the only error I've seen so far, and for all I know there's an
issue on the storage behind the VM, or a cosmic ray hit.  But I moved
the table out of the way and saved a copy of get_raw_page() in case
someone wants to ask about it.

 public | BROKEN_postgres_log_2023_10_05_0900 | table | postgres | permanent | heap | 1664 kB

This table is what it sounds like: a partition into which CSV logs are
COPY'ed.  It would've been created around 8am.  There's no special
params set for the table nor for autovacuum.

Although we have a ZFS tablespace, these tables aren't on it, and
full_page_writes=on.  There's no crashes, and the instance has been up
since it was pg_upgraded from v15.4 on sep25.

pg_stat_all_tables indicates that the table was never (successfully)
vacuumed.
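
That check was along these lines (a sketch, not the exact query):

SELECT relname, last_vacuum, last_autovacuum, vacuum_count, autovacuum_count
FROM pg_stat_all_tables
WHERE relname = 'BROKEN_postgres_log_2023_10_05_0900';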

This was compiled to RPM on centos7, and might include a few commits
made since v16.0.

postgres=# SELECT * FROM
heap_page_item_attrs(get_raw_page(801594131::regclass::text, 119), 801594131);
 lp  | lp_off | lp_flags | lp_len | t_xmin | t_xmax | t_field3 | t_ctid | t_infomask2 | t_infomask | t_hoff | t_bits | t_oid | t_attrs
   1 |   2304 |        1 |     16 |        |        |          |        |             |            |        |        |       |
   2 |   8160 |        1 |     16 |        |        |          |        |             |            |        |        |       |
   3 |   8144 |        1 |     16 |        |        |          |        |             |            |        |        |       |
...all the same except for lp_off...
 365 |   2352 |        1 |     16 |        |        |          |        |             |            |        |        |       |
 366 |   2336 |        1 |     16 |        |        |          |        |             |            |        |        |       |
 367 |   2320 |        1 |     16 |        |        |          |        |             |            |        |        |       |

postgres=# SELECT FROM (SELECT tuple_data_split(801594131, t_data, t_infomask, 
t_infomask2, t_bits) a FROM 
heap_page_items(get_raw_page(801594131::regclass::text, 119))) WHERE a IS NOT 
NULL;
WARNING:  page verification failed, calculated checksum 5074 but expected 5050
(0 rows)

-- 
Justin




Re: Add a new BGWORKER_BYPASS_ROLELOGINCHECK flag

2023-10-05 Thread Bharath Rupireddy
On Thu, Oct 5, 2023 at 9:32 PM Drouvot, Bertrand
 wrote:
>
> +  CREATE ROLE nologrole with nologin;
> +  GRANT CREATE ON DATABASE mydb TO nologrole;

A few nit-picks:

1. s/with/WITH
2. s/nologin/NOLOGIN
3. +   is specified as flags it is possible to
bypass the login check to connect to databases.
How about "it is possible to bypass the login check for the role used
to connect to databases."?

-- 
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: Add a new BGWORKER_BYPASS_ROLELOGINCHECK flag

2023-10-05 Thread Drouvot, Bertrand

Hi,

On 10/5/23 2:21 PM, Bharath Rupireddy wrote:

On Thu, Oct 5, 2023 at 12:22 PM Drouvot, Bertrand
 wrote:



A comment on v6-0002:
1.
+  CREATE ROLE nologrole with nologin;
+  ALTER ROLE nologrole with superuser;
+]);
We don't need superuser privileges here, do we? Or do we need it for
the worker_spi to access pg_catalog and stuff in worker_spi_main? If
not, can we remove it to showcase non-superusers requesting bg
workers?


superuser is not needed here.
I removed it, but in the attached v7 I had to change it to:

+  CREATE ROLE nologrole with nologin;
+  GRANT CREATE ON DATABASE mydb TO nologrole;

To avoid things like:

"
2023-10-05 15:59:39.189 UTC [2830732] LOG:  worker_spi dynamic worker 13 
initialized with schema13.counted
2023-10-05 15:59:39.191 UTC [2830732] ERROR:  permission denied for database 
mydb
2023-10-05 15:59:39.191 UTC [2830732] CONTEXT:  SQL statement "CREATE SCHEMA "schema13" 
CREATE TABLE "counted"
"

Regards,
 
--

Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

From 6dd6cdb14646cb32f4aedc1905229a8b5c1d215c Mon Sep 17 00:00:00 2001
From: Michael Paquier 
Date: Thu, 5 Oct 2023 14:06:37 +0900
Subject: [PATCH v7 2/2] Allow background workers to bypass login check

This adds a new flag value (BGWORKER_BYPASS_ROLELOGINCHECK) to the flags
being used in BackgroundWorkerInitializeConnection() and
BackgroundWorkerInitializeConnectionByOid().  This flag allows the
background workers to bypass the login check done in
InitializeSessionUserId().
---
 doc/src/sgml/bgworker.sgml|  3 +-
 src/backend/bootstrap/bootstrap.c |  2 +-
 src/backend/postmaster/autovacuum.c   |  5 +--
 src/backend/postmaster/postmaster.c   |  2 +
 src/backend/tcop/postgres.c   |  1 +
 src/backend/utils/init/miscinit.c |  4 +-
 src/backend/utils/init/postinit.c |  5 ++-
 src/include/miscadmin.h   |  4 +-
 src/include/postmaster/bgworker.h |  6 ++-
 .../modules/worker_spi/t/001_worker_spi.pl| 37 +++
 src/test/modules/worker_spi/worker_spi.c  |  2 +
 11 files changed, 59 insertions(+), 12 deletions(-)
   9.1% doc/src/sgml/
  10.7% src/backend/postmaster/
  15.1% src/backend/utils/init/
   9.4% src/include/postmaster/
  47.5% src/test/modules/worker_spi/t/
   3.1% src/test/modules/worker_spi/
   4.8% src/

diff --git a/doc/src/sgml/bgworker.sgml b/doc/src/sgml/bgworker.sgml
index 9ad1146ba0..7335c02ed8 100644
--- a/doc/src/sgml/bgworker.sgml
+++ b/doc/src/sgml/bgworker.sgml
@@ -200,7 +200,8 @@ typedef struct BackgroundWorker
InvalidOid, the process will run as the superuser created
during initdb. If 
BGWORKER_BYPASS_ALLOWCONN
is specified as flags it is possible to bypass the 
restriction
-   to connect to databases not allowing user connections.
+   to connect to databases not allowing user connections. If 
BGWORKER_BYPASS_ROLELOGINCHECK
+   is specified as flags it is possible to bypass the login 
check to connect to databases.
A background worker can only call one of these two functions, and only
once.  It is not possible to switch databases.
   
diff --git a/src/backend/bootstrap/bootstrap.c 
b/src/backend/bootstrap/bootstrap.c
index 5810f8825e..bdd005b66a 100644
--- a/src/backend/bootstrap/bootstrap.c
+++ b/src/backend/bootstrap/bootstrap.c
@@ -345,7 +345,7 @@ BootstrapModeMain(int argc, char *argv[], bool check_only)
if (pg_link_canary_is_frontend())
elog(ERROR, "backend is incorrectly linked to frontend 
functions");
 
-   InitPostgres(NULL, InvalidOid, NULL, InvalidOid, false, false, NULL);
+   InitPostgres(NULL, InvalidOid, NULL, InvalidOid, false, false, false, 
NULL);
 
/* Initialize stuff for bootstrap-file processing */
for (i = 0; i < MAXATTR; i++)
diff --git a/src/backend/postmaster/autovacuum.c 
b/src/backend/postmaster/autovacuum.c
index ae9be9b911..4686b81f68 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -488,7 +488,7 @@ AutoVacLauncherMain(int argc, char *argv[])
/* Early initialization */
BaseInit();
 
-   InitPostgres(NULL, InvalidOid, NULL, InvalidOid, false, false, NULL);
+   InitPostgres(NULL, InvalidOid, NULL, InvalidOid, false, false, false, 
NULL);
 
SetProcessingMode(NormalProcessing);
 
@@ -1706,8 +1706,7 @@ AutoVacWorkerMain(int argc, char *argv[])
 * Note: if we have selected a just-deleted database (due to 
using
 * stale stats info), we'll fail and exit here.
 */
-   InitPostgres(NULL, dbid, NULL, InvalidOid, false, false,
-dbname);
+   InitPostgres(NULL, dbid, NULL, InvalidOid, false, false, false, 
dbname);
SetProcessingMode(NormalProcessing);
set_ps_display(dbname);
ereport(DEBUG1,

Re: Good News Everyone! + feature proposal

2023-10-05 Thread Julien Rouhaud
On Thu, Oct 5, 2023 at 11:11 PM Jon Erdman  wrote:
>
> As a second more general question: could my original idea (i.e. sans
> event trigger) be implemented in an extension somehow, or is that not
> technically possible (I suspect not)?

It should be easy to do using the ProcessUtility_hook hook, defined in
a custom module written in C.  As long as your module is preloaded
(one of the *_preload_libraries GUC), your code will be called without
the need for any SQL-level object and you would be free to add any
custom GUC you want to enable it on a per-user basis or anything else.
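
In SQL terms, wiring up such a module would look roughly like this (the module
and GUC names are made up for illustration; the module itself has to be
written in C as described):

ALTER SYSTEM SET shared_preload_libraries = 'deny_unlogged';  -- needs a restart
-- once the module defines its custom GUC, it can be toggled per user:
ALTER ROLE trusted_dba SET deny_unlogged.enabled = off;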




Re: tablecmds.c/MergeAttributes() cleanup

2023-10-05 Thread Peter Eisentraut

On 19.09.23 15:11, Peter Eisentraut wrote:
Here is an updated version of this patch set.  I resolved some conflicts 
and addressed this comment of yours.  I also dropped the one patch with 
some catalog header edits that people didn't seem to particularly like.


The patches that are now 0001 through 0004 were previously reviewed and 
just held for the not-null constraint patches, I think, so I'll commit 
them in a few days if there are no objections.


Patches 0005 through 0007 are also ready in my opinion, but they haven't 
really been reviewed, so this would be something for reviewers to focus 
on.  (0005 and 0007 are trivial, but they go to together with 0006.)


The remaining 0008 and 0009 were still under discussion and contemplation.


I have committed through 0007, and I'll now close this patch set as 
"Committed", and I will (probably) bring back the rest (especially 0008) 
as part of a different patch set soon.






Re: Remove distprep

2023-10-05 Thread Peter Eisentraut

On 26.09.23 06:49, Michael Paquier wrote:

On Wed, Aug 23, 2023 at 12:46:45PM +0200, Peter Eisentraut wrote:

Apparently, the headerscheck and cpluspluscheck targets still didn't work
right in the Cirrus jobs.  Here is an updated patch to address that.  This
is also rebased over some recent changes that affected this patch (generated
wait events stuff).


-gettext-files: distprep
+gettext-files: postgres

This bit in src/backend/nls.mk does not seem right to me.  The
following sequence works on HEAD, but breaks with the patch because
the files that should be automatically generated from the perl scripts
aren't anymore:
$ ./configure ...
$ make -C src/backend/ gettext-files
[...]
In file included from ../../../../src/include/postgres.h:46,
from brin.c:16:
../../../../src/include/utils/elog.h:79:10: fatal error:
utils/errcodes.h: No such file or directory
79 | #include "utils/errcodes.h"


Ok, I think I found a better way to address this.  It requires keeping a 
subset of the old distprep target in src/backend/Makefile for use by 
nls.mk.  I have checked that the above sequence now works, and that the 
generated .pot files are identical to before this patch.



# Technically, this should depend on Makefile.global, but then
# version.sgml would need to be rebuilt after every configure run,
# even in distribution tarballs.  So this is cheating a bit, but it

There is this comment in doc/src/sgml/Makefile.  Does it still apply?


I have removed the "even in distribution tarballs" bit, but the general 
idea is still valid I think.



Note that building PostgreSQL from the source
repository requires reasonably up-to-date versions of 
bison,
flex, and Perl.
These tools are not needed to build from a distribution tarball, because
the files generated with these tools are included in the tarball.
Other tool requirements

This paragraph exists in sourcerepo.sgml, but it should be updated, I
guess, because now these three binaries would be required when
building from a tarball.


I have removed that paragraph.


# specparse.c and specscanner.c are in the distribution tarball,
# so do not clean them here

This comment in src/test/isolation/Makefile should be removed.


done

The attached updated patch is also split up like Andres suggested 
nearby.  (Not sure if it would be good to commit it that way, but it's 
easier to look at for now for sure.)
From 9b4504ce0e247b840b8cef424e44d922bc7cde88 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut 
Date: Thu, 5 Oct 2023 17:39:24 +0200
Subject: [PATCH v5 1/2] Make bison, flex, perl hard build requirements

... see next patch
---
 config/perl.m4 |  9 +---
 config/programs.m4 | 21 ++---
 configure  | 62 --
 doc/src/sgml/installation.sgml | 80 +++---
 doc/src/sgml/sourcerepo.sgml   | 10 -
 src/Makefile.global.in | 16 +--
 6 files changed, 52 insertions(+), 146 deletions(-)

diff --git a/config/perl.m4 b/config/perl.m4
index 8126e79f67..1a3bb55649 100644
--- a/config/perl.m4
+++ b/config/perl.m4
@@ -13,19 +13,14 @@ if test "$PERL"; then
   if echo "$pgac_perl_version" | sed ['s/[.a-z_]/ /g'] | \
 $AWK '{ if ([$]1 == 5 && ([$]2 >= 14)) exit 1; else exit 0;}'
   then
-AC_MSG_WARN([
+AC_MSG_ERROR([
 *** The installed version of Perl, $PERL, is too old to use with PostgreSQL.
 *** Perl version 5.14 or later is required, but this is $pgac_perl_version.])
-PERL=""
   fi
 fi
 
 if test -z "$PERL"; then
-  AC_MSG_WARN([
-*** Without Perl you will not be able to build PostgreSQL from Git.
-*** You can obtain Perl from any CPAN mirror site.
-*** (If you are using the official distribution of PostgreSQL then you do not
-*** need to worry about this, because the Perl output is pre-generated.)])
+  AC_MSG_ERROR([Perl not found])
 fi
 ])# PGAC_PATH_PERL
 
diff --git a/config/programs.m4 b/config/programs.m4
index 8a118b4e03..490ec9fe9f 100644
--- a/config/programs.m4
+++ b/config/programs.m4
@@ -33,10 +33,9 @@ if test "$BISON"; then
   AC_MSG_NOTICE([using $pgac_bison_version])
   if echo "$pgac_bison_version" | $AWK '{ if ([$]4 < 2.3) exit 0; else exit 
1;}'
   then
-AC_MSG_WARN([
+AC_MSG_ERROR([
 *** The installed version of Bison, $BISON, is too old to use with PostgreSQL.
 *** Bison version 2.3 or later is required, but this is $pgac_bison_version.])
-BISON=""
   fi
   # Bison >=3.0 issues warnings about %name-prefix="base_yy", instead
   # of the now preferred %name-prefix "base_yy", but the latter
@@ -49,12 +48,7 @@ if test "$BISON"; then
 fi
 
 if test -z "$BISON"; then
-  AC_MSG_WARN([
-*** Without Bison you will not be able to build PostgreSQL from Git nor
-*** change any of the parser definition files.  You can obtain Bison from
-*** a GNU mirror site.  (If you are using the official distribution of
-*** PostgreSQL then you do not need to worry about this, because the Bison
-*** output is pre-generated.)])
+  

Re: Good News Everyone! + feature proposal

2023-10-05 Thread Jon Erdman


On 10/5/23 8:53 AM, Tom Lane wrote:
> Laurenz Albe  writes:
>> On Thu, 2023-10-05 at 02:22 +, Jon Erdman wrote:
>>> For the proposal (this one is a bit Apple specific): because my team
>>> offers managed postgres to our Apple-internal customers, many of whom
>>> are not database experts, or at least not postgres experts, we'd like to
>>> be able to toggle the availability of UNLOGGED tables in
>>> postgresql.conf, so our less clueful users have fewer footguns.
> 
> I'm doubtful that this is a problem that needs a solution.
> If anything, the right answer is to fix whatever part of the
> documentation isn't warning of the hazards strongly enough.
> 
> Even more to the point: if we accept this, how many other
> footgun-preventing GUCs will have the same or stronger claim to
> existence?
> 
>> It certainly sounds harmless, but there are two things that make me
>> unhappy about this:
> 
>> - Yet another GUC.  It's not like we don't have enough of them.
>>(This is a small quibble.)
> 
>> - This setting would influence the way SQL is processed.
>>We have had bad experiences with those; an often-quoted example is
>>the "autocommit" parameter that got removed in 7.4.
>>This certainly is less harmful, but still another pitfall that
>>can confuse users.
> 
> Same objections here.  Also note that the semantics we've defined
> for GUCs (when they can be set and where) don't always line up
> nicely with requirements of this sort.  It's far from clear to me
> whether such a GUC should be SUSET (making it a hard prohibition
> for ordinary users) or USERSET (making it just a training wheel).

Someone on LinkedIn suggested an event trigger, so now I'm thinking of 
a custom extension that would do nothing but create said event trigger, 
and maybe could be toggled with a custom setting (though that might 
allow users to turn it off themselves... which is maybe ok).

If the extension were installed by the DBA user, the customer wouldn't 
be able to drop it, and if we decided to give them an exception, we just 
drop or disable the extension.
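
Something like this minimal sketch is what I have in mind for the trigger 
itself (untested; it only catches the plain CREATE TABLE / ALTER TABLE tags, 
so e.g. SELECT ... INTO UNLOGGED would need extra handling):

CREATE FUNCTION forbid_unlogged_tables() RETURNS event_trigger
LANGUAGE plpgsql AS $$
DECLARE
    obj record;
BEGIN
    FOR obj IN SELECT * FROM pg_event_trigger_ddl_commands()
    LOOP
        IF EXISTS (SELECT 1 FROM pg_class
                   WHERE oid = obj.objid AND relpersistence = 'u')
        THEN
            RAISE EXCEPTION 'unlogged tables are disabled on this cluster';
        END IF;
    END LOOP;
END;
$$;

CREATE EVENT TRIGGER no_unlogged_tables
    ON ddl_command_end
    WHEN TAG IN ('CREATE TABLE', 'ALTER TABLE')
    EXECUTE FUNCTION forbid_unlogged_tables();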

As a second more general question: could my original idea (i.e. sans 
event trigger) be implemented in an extension somehow, or is that not 
technically possible (I suspect not)?
-- 
Jon Erdman (aka StuckMojo on IRC)
 PostgreSQL Zealot




Two Window aggregate node for logically same over clause

2023-10-05 Thread "Anitha S"
Hi team,

      We have observed that two different window aggregate nodes are created
in the plan for logically identical OVER clauses.
The query below contains two window functions. Both OVER clauses contain the
same partition and order clauses, but one spells the ordering option ASC
explicitly while the other omits it, which means the default, ascending.

 
Sample Query:
 







 CREATE TEMPORARY TABLE empsalary (
     depname varchar,
     empno bigint,
     salary int,
     enroll_date date
 );

 INSERT INTO empsalary VALUES ('develop', 10, 5200, '2007-08-01'), ('sales', 1, 5000, '2006-10-01'),
 ('personnel', 5, 3500, '2007-12-10'), ('sales', 4, 4800, '2007-08-08'), ('personnel', 2, 3900, '2006-12-23'),
 ('develop', 7, 4200, '2008-01-01'), ('develop', 9, 4500, '2008-01-01'), ('sales', 3, 4800, '2007-08-01'),
 ('develop', 8, 6000, '2006-10-01'), ('develop', 11, 5200, '2007-08-15');

 explain verbose select rank() over (partition by depname order by salary),
 percent_rank() over (partition by depname order by salary asc) from empsalary;

                                  QUERY PLAN
 --------------------------------------------------------------------------------
  WindowAgg  (cost=174.54..1000114.66 rows=1070 width=52)
    Output: (rank() OVER (?)), percent_rank() OVER (?), salary, depname
    ->  WindowAgg  (cost=174.54..195.94 rows=1070 width=44)
          Output: salary, depname, rank() OVER (?)
          ->  Sort  (cost=174.54..177.21 rows=1070 width=36)
                Output: salary, depname
                Sort Key: empsalary.depname, empsalary.salary
                ->  Seq Scan on pg_temp_7.empsalary  (cost=0.00..20.70 rows=1070 width=36)
                      Output: salary, depname





The sort ordering option is represented by the enum SortByDir (parsenodes.h).
A list of SortBy nodes is present in the WindowDef structure, which holds the
ORDER BY clause of the OVER clause:







 /* Sort ordering options for ORDER BY and CREATE INDEX */
 typedef enum SortByDir
 {
     SORTBY_DEFAULT,
     SORTBY_ASC,
     SORTBY_DESC,
     SORTBY_USING   /* not allowed in CREATE INDEX ... */
 } SortByDir;

 typedef struct SortBy
 {
     NodeTag     type;
     Node       *node;          /* expression to sort on */
     SortByDir   sortby_dir;    /* ASC/DESC/USING/default */
     SortByNulls sortby_nulls;  /* NULLS FIRST/LAST */
     List       *useOp;         /* name of op to use, if SORTBY_USING */
     int         location;      /* operator location, or -1 if none/unknown */
 } SortBy;





In the transformWindowFuncCall() API, the equality check on the ORDER BY
clauses of the window definitions fails when comparing the SortByDir values of
the two OVER clauses, i.e. SORTBY_DEFAULT is not equal to SORTBY_ASC. Hence two
window clauses are created in the parse tree, resulting in two different
WindowAgg nodes.

This check could be modified so that a single WindowAgg node is formed for the
case above, resulting in faster query execution. Is there any reason for
creating two different WindowAgg nodes?
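
FWIW, a workaround in the meantime is to use a named window, so that both
functions share one window definition; a quick sketch against the table above:

 explain verbose select rank() over w, percent_rank() over w
 from empsalary
 window w as (partition by depname order by salary);

This is planned with a single WindowAgg node.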

Thanks 
Anitha S

 

Re: Allow tests to pass in OpenSSL FIPS mode

2023-10-05 Thread Daniel Gustafsson
> On 5 Oct 2023, at 15:44, Peter Eisentraut  
> wrote:
> 
> On 04.10.22 17:45, Peter Eisentraut wrote:
>> While working on the column encryption patch, I wanted to check that what is 
>> implemented also works in OpenSSL FIPS mode.  I tried running the normal 
>> test suites after switching the OpenSSL installation to FIPS mode, but that 
>> failed all over the place.  So I embarked on fixing that.   Attached is a 
>> first iteration of a patch.
> 
> Continuing this, we have fixed many issues since.  Here is a patch set to fix 
> all remaining issues.
> 
> v4-0001-citext-Allow-tests-to-pass-in-OpenSSL-FIPS-mode.patch
> v4-0002-pgcrypto-Allow-tests-to-pass-in-OpenSSL-FIPS-mode.patch

+ERROR:  crypt(3) returned NULL

Not within scope here, but I wish we had a better error message for this.
That's clearly for another patch though.

> v4-0003-Allow-tests-to-pass-in-OpenSSL-FIPS-mode-TAP-test.patch
> 
> This one does some delicate surgery and could use some thorough review.

I don't have a FIPS enabled build handy to test in, but reading the patch I
don't see anything that sticks out apart from very minor comments:

+my $md5_works = ($node->psql('postgres', "select md5('')") == 0);

I think this warrants an explanatory comment for readers not familiar with
FIPS, without that it may seem quite an odd test.

+), 0, 'created user with scram password');

Tiny nitpick, I think we use SCRAM when writing it in text.

> v4-0004-Allow-tests-to-pass-in-OpenSSL-FIPS-mode-rest.patch
> 
> This just adds alternative expected files.  The question is mainly just 
> whether there are better ways to organize this.

Without inventing a new structure for alternative outputs I don't see how.

--
Daniel Gustafsson





Re: Good News Everyone! + feature proposal

2023-10-05 Thread David G. Johnston
On Wednesday, October 4, 2023, Jon Erdman  wrote:

>
> So I'd like to get a general idea how likely this would be to getting
> accepted if it did it, and did it right?
>

Run a cron job checking for them.  Allow for overrides by adding a comment
to any unlogged tables you’ve identified as being acceptable.
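
Something along these lines for the check itself (a sketch; the comment lookup
is one way to implement the override):

SELECT n.nspname, c.relname, obj_description(c.oid, 'pg_class') AS override
FROM pg_class c
JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE c.relpersistence = 'u' AND c.relkind = 'r';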

David J.


Re: Good News Everyone! + feature proposal

2023-10-05 Thread Tom Lane
Laurenz Albe  writes:
> On Thu, 2023-10-05 at 02:22 +, Jon Erdman wrote:
>> For the proposal (this one is a bit Apple specific): because my team 
>> offers managed postgres to our Apple-internal customers, many of whom 
>> are not database experts, or at least not postgres experts, we'd like to 
>> be able to toggle the availability of UNLOGGED tables in 
>> postgresql.conf, so our less clueful users have fewer footguns.

I'm doubtful that this is a problem that needs a solution.
If anything, the right answer is to fix whatever part of the
documentation isn't warning of the hazards strongly enough.

Even more to the point: if we accept this, how many other
footgun-preventing GUCs will have the same or stronger claim to
existence?

> It certainly sounds harmless, but there are two things that make me
> unhappy about this:

> - Yet another GUC.  It's not like we don't have enough of them.
>   (This is a small quibble.)

> - This setting would influence the way SQL is processed.
>   We have had bad experiences with those; an often-quoted example is
>   the "autocommit" parameter that got removed in 7.4.
>   This certainly is less harmful, but still another pitfall that
>   can confuse users.

Same objections here.  Also note that the semantics we've defined
for GUCs (when they can be set and where) don't always line up
nicely with requirements of this sort.  It's far from clear to me
whether such a GUC should be SUSET (making it a hard prohibition
for ordinary users) or USERSET (making it just a training wheel).

regards, tom lane




Re: Allow tests to pass in OpenSSL FIPS mode

2023-10-05 Thread Peter Eisentraut

On 04.10.22 17:45, Peter Eisentraut wrote:
While working on the column encryption patch, I wanted to check that 
what is implemented also works in OpenSSL FIPS mode.  I tried running 
the normal test suites after switching the OpenSSL installation to FIPS 
mode, but that failed all over the place.  So I embarked on fixing that. 
  Attached is a first iteration of a patch.


Continuing this, we have fixed many issues since.  Here is a patch set 
to fix all remaining issues.


v4-0001-citext-Allow-tests-to-pass-in-OpenSSL-FIPS-mode.patch
v4-0002-pgcrypto-Allow-tests-to-pass-in-OpenSSL-FIPS-mode.patch

These two are pretty straightforward.

v4-0003-Allow-tests-to-pass-in-OpenSSL-FIPS-mode-TAP-test.patch

This one does some delicate surgery and could use some thorough review.

v4-0004-Allow-tests-to-pass-in-OpenSSL-FIPS-mode-rest.patch

This just adds alternative expected files.  The question is mainly just 
whether there are better ways to organize this.


v4-0005-WIP-Use-fipshash-in-brin_multi-test.patch

Here, some previously fixed md5() uses have snuck back in.  I will need 
to track down the origin of this and ask for a proper fix there.  This 
is just included here for completeness.
From 7faeec85be6d445eca21e8132b4bf151ee6f8ee2 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut 
Date: Thu, 5 Oct 2023 14:45:35 +0200
Subject: [PATCH v4 1/5] citext: Allow tests to pass in OpenSSL FIPS mode

citext doesn't define an md5() function, so the value of using it in
its tests is dubious.  At best this shows in an indirect way that the
cast from citext to text works.  Avoid the issue and remove the test.
---
 contrib/citext/expected/citext.out   | 9 -
 contrib/citext/expected/citext_1.out | 9 -
 contrib/citext/sql/citext.sql| 1 -
 3 files changed, 19 deletions(-)

diff --git a/contrib/citext/expected/citext.out 
b/contrib/citext/expected/citext.out
index 1c55598136..8c0bf54f0f 100644
--- a/contrib/citext/expected/citext.out
+++ b/contrib/citext/expected/citext.out
@@ -1744,15 +1744,6 @@ SELECT ltrim('zzzytrim'::citext, 'xyz'::text  ) = 'trim' 
AS t;
  t
 (1 row)
 
-SELECT md5( name ) = md5( name::text ) AS t FROM srt;
- t 

- t
- t
- t
- t
-(4 rows)
-
 -- pg_client_encoding() takes no args and returns name.
 SELECT quote_ident( name ) = quote_ident( name::text ) AS t FROM srt;
  t 
diff --git a/contrib/citext/expected/citext_1.out 
b/contrib/citext/expected/citext_1.out
index 4a979d7a0d..c5e5f180f2 100644
--- a/contrib/citext/expected/citext_1.out
+++ b/contrib/citext/expected/citext_1.out
@@ -1744,15 +1744,6 @@ SELECT ltrim('zzzytrim'::citext, 'xyz'::text  ) = 'trim' 
AS t;
  t
 (1 row)
 
-SELECT md5( name ) = md5( name::text ) AS t FROM srt;
- t 

- t
- t
- t
- t
-(4 rows)
-
 -- pg_client_encoding() takes no args and returns name.
 SELECT quote_ident( name ) = quote_ident( name::text ) AS t FROM srt;
  t 
diff --git a/contrib/citext/sql/citext.sql b/contrib/citext/sql/citext.sql
index b329253d37..aa1cf9abd5 100644
--- a/contrib/citext/sql/citext.sql
+++ b/contrib/citext/sql/citext.sql
@@ -574,7 +574,6 @@ CREATE TABLE caster (
 SELECT ltrim('zzzytrim'::text,   'xyz'::citext) = 'trim' AS t;
 SELECT ltrim('zzzytrim'::citext, 'xyz'::text  ) = 'trim' AS t;
 
-SELECT md5( name ) = md5( name::text ) AS t FROM srt;
 -- pg_client_encoding() takes no args and returns name.
 SELECT quote_ident( name ) = quote_ident( name::text ) AS t FROM srt;
 SELECT quote_literal( name ) = quote_literal( name::text ) AS t FROM srt;

base-commit: 4f2994647ff1e1209829a0085ca0c8d237d4
-- 
2.42.0

From 693baf3cc2d2dfa6399ddb8d8c874d1ef56df86d Mon Sep 17 00:00:00 2001
From: Peter Eisentraut 
Date: Thu, 5 Oct 2023 14:45:35 +0200
Subject: [PATCH v4 2/5] pgcrypto: Allow tests to pass in OpenSSL FIPS mode

This adds several alternative expected files for when md5 is not
available.  This is similar to the alternative expected files for when
the legacy provider is disabled.  In fact, running the pgcrypto tests
in FIPS mode makes use of some of these existing alternative expected
files as well (e.g., for blowfish).
---
 contrib/pgcrypto/expected/crypt-md5_1.out   |  16 ++
 contrib/pgcrypto/expected/hmac-md5_1.out|  44 +
 contrib/pgcrypto/expected/md5_1.out |  17 ++
 contrib/pgcrypto/expected/pgp-encrypt_1.out | 204 
 4 files changed, 281 insertions(+)
 create mode 100644 contrib/pgcrypto/expected/crypt-md5_1.out
 create mode 100644 contrib/pgcrypto/expected/hmac-md5_1.out
 create mode 100644 contrib/pgcrypto/expected/md5_1.out
 create mode 100644 contrib/pgcrypto/expected/pgp-encrypt_1.out

diff --git a/contrib/pgcrypto/expected/crypt-md5_1.out 
b/contrib/pgcrypto/expected/crypt-md5_1.out
new file mode 100644
index 00..0ffda34ab4
--- /dev/null
+++ b/contrib/pgcrypto/expected/crypt-md5_1.out
@@ -0,0 +1,16 @@
+--
+-- crypt() and gen_salt(): md5
+--
+SELECT crypt('', '$1$Szzz0yzz');
+ERROR:  crypt(3) returned NULL
+SELECT crypt('foox', '$1$Szzz0yzz');
+ERROR:  crypt(3) 

Re: FDW pushdown of non-collated functions

2023-10-05 Thread Jean-Christophe Arnu
Dear Hackers,

I figured out this email was sent at release time, the worst time to ask
for thoughts on a subject IMHO. Anyway, I hope this email will pop the
topic back to the top of the stack!
Thank you!

Le ven. 8 sept. 2023 à 16:41, Jean-Christophe Arnu  a
écrit :

> Dear hackers,
>
> I recently found a weird behaviour involving FDW (postgres_fdw) and
> planning.
>
> Here’s a simplified use-case:
>
> Given a remote table (say on server2) with the following definition:
>
> CREATE TABLE t1(
>   ts timestamp without time zone,
>   x  bigint,
>   x2 text
> );
> -- Then populate t1 table:
> INSERT INTO t1
>SELECT
> current_timestamp - 1000*random()*'1 day'::interval
> ,x
> ,''||x
>FROM
> generate_series(1,10) as x;
>
>
> This table is imported in a specific schema on server1 (we do not use
> use_remote_estimate) also with t1 name in a specific schema:
>
> On server1:
>
> CREATE SERVER server2
>FOREIGN DATA WRAPPER  postgres_fdw
>OPTIONS (
>   host '127.0.0.1',
>   port '9002',
>   dbname 'postgres',
>   use_remote_estimate 'false'
>);
> CREATE USER MAPPING FOR jc
>SERVER server2
>OPTIONS (user 'jc');
> CREATE SCHEMA remote;
>
> IMPORT FOREIGN SCHEMA public
>FROM SERVER server2
>INTO remote ;
>
> On a classic PostgreSQL 15 version the following query using date_trunc()
> is executed and results in the following plan:
>
> jc=# explain (verbose,analyze) select date_trunc('day',ts), count(1) from 
> remote.t1 group by date_trunc('day',ts) order by 1;
> QUERY PLAN
>  
> ---
>  Sort  (cost=216.14..216.64 rows=200 width=16) (actual time=116.699..116.727 
> rows=1001 loops=1)
>Output: (date_trunc('day'::text, ts)), (count(1))
>Sort Key: (date_trunc('day'::text, t1.ts))
>Sort Method: quicksort  Memory: 79kB
>->  HashAggregate  (cost=206.00..208.50 rows=200 width=16) (actual 
> time=116.452..116.532 rows=1001 loops=1)
>  Output: (date_trunc('day'::text, ts)), count(1)
>  Group Key: date_trunc('day'::text, t1.ts)
>  Batches: 1  Memory Usage: 209kB
>  ->  Foreign Scan on remote.t1  (cost=100.00..193.20 rows=2560 
> width=8) (actual time=0.384..106.225 rows=10 loops=1)
>Output: date_trunc('day'::text, ts)
>Remote SQL: SELECT ts FROM public.t1
>  Planning Time: 0.077 ms
>  Execution Time: 117.028 ms
>
>
> Whereas the same query with date_bin()
>
> jc=# explain (verbose,analyze) select date_bin('1day',ts,'2023-01-01'), 
> count(1) from remote.t1 group by 1 order by 1;
>   
>   QUERY PLAN  
>   
> 
> --
>  Foreign Scan  (cost=113.44..164.17 rows=200 width=16) (actual 
> time=11.297..16.312 rows=1001 loops=1)
>Output: (date_bin('1 day'::interval, ts, '2023-01-01 00:00:00'::timestamp 
> without time zone)), (count(1))
>Relations: Aggregate on (remote.t1)
>Remote SQL: SELECT date_bin('1 day'::interval, ts, '2023-01-01 
> 00:00:00'::timestamp without time zone), count(1) FROM public.t1 GROUP BY 1 
> ORDER BY date_bin('1 day'::interval, ts, '2023-01-01 00:00:00'::timestamp 
> without time zone) ASC NULLS LAST
>  Planning Time: 0.114 ms
>  Execution Time: 16.599 ms
>
>
>
> With date_bin() the whole expression is pushed down to the remote server,
> whereas with date_trunc() it’s not.
>
> I dived into the code and live debugged. It turns out that decisions to
> pushdown or not a whole query depends on many factors like volatility and
> collation. In the date_trunc() case, the problem is all about collation (
> date_trunc() on timestamp without time zone). And decision is made in the
> foreign_expr_walker() in deparse.c (
> https://git.postgresql.org/gitweb/?p=postgresql.git;a=blob;f=contrib/postgres_fdw/deparse.c;h=efaf387890e3f85c419748ec3af5d1e9696c9c4c;hb=86648dcdaec67b83cec20a9d25b45ec089a7c624#l468
> )
>
> First the function is tested as shippable (able to be pushed down) and
> date_trunc() and date_bin() both are.
>
> Then parameters sub-expressions are evaluated with collation and
> “shippability”, and they all are with both functions.
>
> Then we arrive at this code portion:
>
> if (fe->inputcollid == InvalidOid)
>   /* OK, inputs are all noncollatable */ ;else if 

Re: [PoC] pg_upgrade: allow to upgrade publisher node

2023-10-05 Thread Bharath Rupireddy
On Thu, Oct 5, 2023 at 1:48 AM Amit Kapila  wrote:
>
> Then, we should also try to create slots before invoking pg_resetwal.
> The idea is that we can write a new binary mode function that will do
> exactly what pg_resetwal does to compute the next segment and use that
> location as a new location (restart_lsn) to create the slots in a new
> node. Then, pass it pg_resetwal by using the existing option '-l
> walfile'. As we don't have any API that takes restart_lsn as input, we
> can write a new API probably for binary mode to create slots that do
> take restart_lsn as input. This will ensure that there is no new WAL
> inserted by background processes between resetwal and the creation of
> slots.

+1. I think this approach makes it foolproof. pg_resetwal uses
FindEndOfXLOG and we need that to be in a binary mode SQL callable
function. FindEndOfXLOG ignores TLI to compute the new WAL file name,
but that seems to be okay for the new binary mode function because
pg_upgrade uses TLI 1 anyways and doesn't copy WAL files from old
cluster.

FWIW, pg_upgrade does use -l in copy_xact_xlog_xid; I'm not sure if
it has anything to do with the above proposed change.

> The other potential problem Andres pointed out is that during shutdown
> if due to some reason, the walreceiver goes down, we won't be able to
> send the required WAL and users won't be able to ensure that because
> even after restart the same situation can happen. The ideal way is to
> have something that puts the system in READ ONLY state during shutdown
> and then we can probably allow walreceivers to reconnect and receive
> the required WALs. As we don't have such functionality available and
> it won't be easy to achieve the same, we can leave this for now.
>
> Thoughts?

You mean walreceiver for streaming replication? Or the apply workers
going down for logical replication? If there's yet-to-be-sent-out WAL,
pg_upgrade will fail no? How does the above scenario a problem for
pg_upgrade of a cluster with just logical replication slots?

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: Pre-proposal: unicode normalized text

2023-10-05 Thread Isaac Morland
On Thu, 5 Oct 2023 at 07:32, Robert Haas  wrote:


> But I do think that sometimes users are reluctant to perform encoding
> conversions on the data that they have. Sometimes they're not
> completely certain what encoding their data is in, and sometimes
> they're worried that the encoding conversion might fail or produce
> wrong answers. In theory, if your existing data is validly encoded and
> you know what encoding it's in and it's easily mapped onto UTF-8,
> there's no problem. You can just transcode it and be done. But a lot
> of times the reality is a lot messier than that.
>

In the case you describe, the users don’t have text at all; they have
bytes, and a vague belief about what encoding the bytes might be in and
therefore what characters they are intended to represent. The correct way
to store that in the database is using bytea. Text types should be for when
you know what characters you want to store. In this scenario, the
implementation detail of what encoding the database uses internally to
write the data on the disk doesn't matter, any more than it matters to a
casual user how a table is stored on disk.
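
To make the pattern concrete (a sketch; convert_from() is the explicit,
deliberate decoding step):

-- store the raw bytes until the encoding is actually known
CREATE TABLE raw_docs (id bigint, payload bytea);

-- decode explicitly once the encoding has been established
SELECT convert_from(payload, 'LATIN1') AS doc_text FROM raw_docs;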

Similarly, I don't believe we have a "YMD" data type which stores year,
month, and day, without being specific as to whether it's Gregorian or
Julian; if you have that situation, make a 3-tuple type or do something
else. "Date" is for when you actually know what day you want to record.


Re: pgBufferUsage.blk_{read|write}_time are zero although there are pgBufferUsage.local_blks_{read|written}

2023-10-05 Thread Robert Haas
On Thu, Oct 5, 2023 at 6:25 AM Nazir Bilal Yavuz  wrote:
> What do you think about the second patch, counting extend calls'
> timings in blk_write_time? In my opinion if something increments
> {shared|local}_blks_written, then it needs to be counted in
> blk_write_time too. I am not sure why it is decided like that.

I agree that an extend should be counted the same way as a write. But
I'm suspicious that here too we have confusion about whether
blk_write_time is supposed to be covering shared buffers and local
buffers or just shared buffers.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Change of behaviour for creating same type name in multiple schemas

2023-10-05 Thread Matthias van de Meent
On Thu, 5 Oct 2023 at 14:13, Dave Cramer  wrote:
>
> Greetings,
>
> Before 16, if I created an array type in schema1 it would be named
> schema1._array_type;
> if I created the same type in schema2 it would have been named
>
> schema2.__array_type
>
> Can someone point me to where the code was changed ?

This was with commit 70988b7b [0] in July 2022, based on this thread
[1] (moved from -bugs).

Kind regards,

Matthias van de Meent
Neon (https://neon.tech)

[0] 
https://github.com/postgres/postgres/commits/70988b7b0a0bd03c59a2314d0b5bcf2135692349
[1] 
https://www.postgresql.org/message-id/flat/b84cd82c-cc67-198a-8b1c-60f44e1259ad%40postgrespro.ru




Re: Add a new BGWORKER_BYPASS_ROLELOGINCHECK flag

2023-10-05 Thread Bharath Rupireddy
On Thu, Oct 5, 2023 at 12:22 PM Drouvot, Bertrand
 wrote:
>
> >
> > @@ -719,6 +719,7 @@ InitPostgres(const char *in_dbname, Oid dboid,
> >const char *username, Oid useroid,
> >bool load_session_libraries,
> >bool override_allow_connections,
> > +  bool bypass_login_check,
> >char *out_dbname)
> >
> > I was not paying much attention here, but load_session_libraries gives
> > a good argument in favor of switching all these booleans to a single
> > bits32 argument as that would make three of them now, with a different
> > set of flags than the bgworker ones.  This can be refactored on its
> > own.
>
> Yeah good point, will work on it once the current one is committed.
>
> >
> > -#define BGWORKER_BYPASS_ALLOWCONN 1
> > +#define BGWORKER_BYPASS_ALLOWCONN 0x0001
> >
> > Can be a change of its own as well.
>
> Yeah, but I think it's simple enough to just keep this change here
> (and I don't think it's "really" needed without introducing 0x0002)

I think changing BGWORKER_BYPASS_ALLOWCONN to 0x0001 and having bits32
for InitPostgres inputs load_session_libraries and
override_allow_connections can be 0001 in this patch series so that it
can go first, no? This avoids the new code being in the old format and
changing things right after it commits.

v6-0001 LGTM.

A comment on v6-0002:
1.
+  CREATE ROLE nologrole with nologin;
+  ALTER ROLE nologrole with superuser;
+]);
We don't need superuser privileges here, do we? Or do we need it for
the worker_spi to access pg_catalog and stuff in worker_spi_main? If
not, can we remove it to showcase non-superusers requesting bg
workers?

-- 
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: Good News Everyone! + feature proposal

2023-10-05 Thread Gurjeet Singh
On Thu, Oct 5, 2023 at 1:52 AM Jon Erdman  wrote:
>
> So I have some good news! At long last I've found a company/manager that
> wants to actually factually pay me to do some work on PG!!

Congratulations!

> For the proposal (this one is a bit Apple specific): because my team
> offers managed postgres to our Apple-internal customers, many of whom
> are not database experts, or at least not postgres experts, we'd like to
> be able to toggle the availability of UNLOGGED tables in
> postgresql.conf, so our less clueful users have fewer footguns.
>
> So, my proposal is for a GUC to implement that, which would *OF COURSE*
> undefault to allowing UNLOGGED.

That was difficult to parse at first glance. I guess you mean the
GUC's default value will not change the current behaviour, as you
mention below.

> The reasoning here is we have autofailover set up for our standard
> cluster offering that we give to customers, using sync-rep to guarantee
> no data loss if we flop to the HA node. Any users not up to speed on
> what UNLOGGED really means could inadvertently lose data, so we'd like
> to be able to force it to be off, and turn it on upon request after
> educating the customer in question it's it's a valid use case.
>
> So to begin with: I'm sure some folks will hate this idea, but maybe a
> good many wont, and remember, this would default to UNLOGGED enabled, so
> no change to current behaviour. and no encouragement to disallow it, but
> just the ability to do so, which i think is useful in
> hosted/managed/multi-tenant environment where most things are taken care
> of for the customer.

I see the need to disable this feature and agree that some
installations may need it, where the users are not savvy enough to
realize its dangers and fall for its promise to increase INSERT/UPDATE
performance. Your specific example of an internal hosted/managed
service is a good example. Even in smaller installations the DBA might
want to disable this, so that unwary developers don't willy-nilly
create unlogged tables and end up losing data after a failover. I hope
others can provide more examples and situations where this may be
useful, to make it obvious that we need this feature.

My first reaction was to make it a GRANTable permission. But since one
can always do `ALTER USER savvy_user SET allow_unlogged_tables TO
true` and override the system-wide setting in postgresql.conf, for a
specific user, I feel a GUC would be the right way to implement it.
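
In other words, something like this (the GUC name is hypothetical, taken from
the proposal):

ALTER SYSTEM SET allow_unlogged_tables = off;          -- cluster-wide default
ALTER USER savvy_user SET allow_unlogged_tables = on;  -- per-user override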

The complaint about too many GUCs is a valid one, but I'd worry more
about it if it were an optimizer/performance improvement being hidden
behind a GUC. This new GUC would be a on-off switch to override the
SQL/grammar feature provided by the UNLOGGED keyword, hence not really
a concern IMO.

> So I'd like to get a general idea how likely this would be to getting
> accepted if it did it, and did it right?

Like others said, there are no guarantees. A working patch may help
guide people's opinion one way or the other, so I'd work on submitting
a patch while (some) people are in agreement.

> Let the flame war begin!

Heh. I'm sure you already know this, but this community's flame wars
are way more timid compared to what members of other communities may
be used to :-) I consider it lucky if someone throws as much as a lit
match.

> PS: I'm SO happy that this phase of my postgres journey has finally
> started

I, too, am very happy for you! :-)

> Jon Erdman (aka StuckMojo on IRC)

TIL.

I wish there was a directory of IRC identities that pointed to real
identities (at least for folks who don't mind this mapping available
in the open), so that when we converse in IRC, we have a face to go
with the IRC handles. As a human I feel that necessity.

Best regards,
Gurjeet
http://Gurje.et




Change of behaviour for creating same type name in multiple schemas

2023-10-05 Thread Dave Cramer
Greetings,

Before 16, if I created an array type in schema1 it would be named
schema1._array_type;
if I created the same type in schema2 it would have been named

schema2.__array_type

Can someone point me to where the code was changed ?

Thanks,
Dave Cramer


Re: Use FD_CLOEXEC on ListenSockets (was Re: Refactoring backend fork+exec code)

2023-10-05 Thread Heikki Linnakangas

On 29/08/2023 09:58, Heikki Linnakangas wrote:

On 29/08/2023 09:21, Heikki Linnakangas wrote:

Thinking about this some more, the ListenSockets array is a bit silly
anyway. We fill the array starting from index 0, always append to the
end, and never remove entries from it. It would seem more
straightforward to keep track of the used size of the array. Currently
we always loop through the unused parts too, and e.g.
ConfigurePostmasterWaitSet() needs to walk the array to count how many
elements are in use.


Like this.


This seems pretty uncontroversial, and I heard no objections, so I went 
ahead and committed that.


--
Heikki Linnakangas
Neon (https://neon.tech)





Re: [PoC] pg_upgrade: allow to upgrade publisher node

2023-10-05 Thread Bharath Rupireddy
On Thu, Oct 5, 2023 at 4:24 PM Amit Kapila  wrote:
>
> > > Today, I discussed this problem with Andres at PGConf NYC and he
> > > suggested as following. To verify, if there is any pending unexpected
> > > WAL after shutdown, we can have an API like
> > > pg_logical_replication_slot_advance() which will simply process
> > > records without actually sending anything downstream.

+1 for this approach. It looks neat.

I think we also need to add TAP tests to generate decodable WAL
records (RUNNING_XACT, CHECKPOINT_ONLINE, XLOG_FPI_FOR_HINT,
XLOG_SWITCH, XLOG_PARAMETER_CHANGE, XLOG_HEAP2_PRUNE) during
pg_upgrade as described here
https://www.postgresql.org/message-id/TYAPR01MB58660273EACEFC5BF256B133F50DA%40TYAPR01MB5866.jpnprd01.prod.outlook.com.
Basically, these were the exceptional WAL records that may be
generated by pg_upgrade, so having tests for them is good.

> > So I assume in each lower-level decode function (e.g. heap_decode() )
> > we will add the check that if we are checking the WAL for an upgrade
> > then from that level we will return true or false based on whether the
> > WAL is decodable or not.  Is my understanding correct?
> >
>
> Yes, this is one way to achieve it, but I think this will require changing
> the return value of many APIs. Can we somehow just get this via
> LogicalDecodingContext or some other way at the caller, by allowing some
> variable to be set at the required places?

+1 for adding the required flags to the decoding context similar to
fast_forward.

Another way, without adding any new variables, is to pass the WAL record
to LogicalDecodingProcessRecord, and upon return check the reorder
buffer for any decoded change generated for the xid associated with the
WAL record. If any decoded change related to that xid is found, the new
function is done. Here's roughly what I have in mind [1]; I haven't
tested it.

[1]
change_found = false;
end_of_wal = false;
ctx = CreateDecodingContext();

XLogBeginRead(ctx->reader, MyReplicationSlot->data.restart_lsn);

while (!end_of_wal && !change_found)
{
    XLogRecord *record;
    char       *errm = NULL;
    TransactionId xid;
    ReorderBufferTXN *txn;

    record = XLogReadRecord(ctx->reader, &errm);

    if (record == NULL)
    {
        /* no more WAL to read */
        end_of_wal = true;
        break;
    }

    LogicalDecodingProcessRecord(ctx, ctx->reader);

    xid = XLogRecGetXid(ctx->reader);

    txn = ReorderBufferTXNByXid(ctx->reorder, xid, false, NULL,
                                InvalidXLogRecPtr, false);

    /* stop as soon as any decoded change exists for this xid */
    if (txn != NULL)
    {
        change_found = true;
        break;
    }

    CHECK_FOR_INTERRUPTS();
}

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: Pre-proposal: unicode normalized text

2023-10-05 Thread Robert Haas
On Wed, Oct 4, 2023 at 9:02 PM Isaac Morland  wrote:
>> > What about characters not in UTF-8?
>>
>> Honestly I'm not clear on this topic. Are the "private use" areas in
>> unicode enough to cover use cases for characters not recognized by
>> unicode? Which encodings in postgres can represent characters that
>> can't be automatically transcoded (without failure) to unicode?
>
> Here I’m just anticipating a hypothetical objection, “what about characters 
> that can’t be represented in UTF-8?” to my suggestion to always use UTF-8 and 
> I’m saying we shouldn’t care about them. I believe the answers to your 
> questions in this paragraph are “yes”, and “none”.

Years ago, I remember SJIS being cited as an example of an encoding
that had characters which weren't part of Unicode. I don't know
whether this is still a live issue.

But I do think that sometimes users are reluctant to perform encoding
conversions on the data that they have. Sometimes they're not
completely certain what encoding their data is in, and sometimes
they're worried that the encoding conversion might fail or produce
wrong answers. In theory, if your existing data is validly encoded and
you know what encoding it's in and it's easily mapped onto UTF-8,
there's no problem. You can just transcode it and be done. But a lot
of times the reality is a lot messier than that.

Which gives me some sympathy with the idea of wanting multiple
character sets within a database. Such a feature exists in some other
database systems and is, presumably, useful to some people. On the
other hand, to do that in PostgreSQL, we'd need to propagate the
character set/encoding information into all of the places that
currently get the typmod and collation, and that is not a small number
of places. It's a lot of infrastructure for the project to carry
around for a feature that's probably only going to continue to become
less relevant.

I suppose you never know, though. Maybe the Unicode consortium will
explode in a tornado of fiery rage and there will be dueling standards
making war over the proper way of representing an emoji of a dog
eating broccoli for decades to come. In that case, our hypothetical
multi-character-set feature might seem prescient.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: [PoC] pg_upgrade: allow to upgrade publisher node

2023-10-05 Thread Amit Kapila
On Thu, Oct 5, 2023 at 2:29 PM Dilip Kumar  wrote:
>
> On Thu, Oct 5, 2023 at 1:48 AM Amit Kapila 
wrote:
> >
> > On Tue, Oct 3, 2023 at 9:58 AM Bharath Rupireddy
> >  wrote:
> > >
> > > On Fri, Sep 29, 2023 at 5:27 PM Hayato Kuroda (Fujitsu)
> > >  wrote:
> > > >
> > > > Yeah, the approach enforces developers to check the decodability.
> > > > But the benefit seems smaller than required efforts for it because
the function
> > > > would be used only by pg_upgrade. Could you tell me if you have
another use case
> > > > in mind? We may able to adopt if we have...
> > >
> > > I'm attaching 0002 patch (on top of v45) which implements the new
> > > decodable callback approach that I have in mind. IMO, this new
> > > approach is extensible, better than the current approach (hard-coding
> > > of certain WAL records that may be generated during pg_upgrade) taken
> > > by the patch, and helps deal with the issue that custom WAL resource
> > > managers can have with the current approach taken by the patch.
> > >
> >
> > Today, I discussed this problem with Andres at PGConf NYC and he
> > suggested as following. To verify, if there is any pending unexpected
> > WAL after shutdown, we can have an API like
> > pg_logical_replication_slot_advance() which will simply process
> > records without actually sending anything downstream.
>
> So I assume in each lower-level decode function (e.g. heap_decode() )
> we will add the check that if we are checking the WAL for an upgrade
> then from that level we will return true or false based on whether the
> WAL is decodable or not.  Is my understanding correct?
>

Yes, this is one way to achieve it, but I think this will require changing
the return value of many APIs. Can we somehow just get this via
LogicalDecodingContext or some other way at the caller, by allowing some
variable to be set at the required places?


Re: pgBufferUsage.blk_{read|write}_time are zero although there are pgBufferUsage.local_blks_{read|written}

2023-10-05 Thread Nazir Bilal Yavuz
Hi,

On Tue, 3 Oct 2023 at 19:44, Robert Haas  wrote:
>
> Given that, I'm inclined to agree that this is a bug. But we might
> need to go through and make sure all of the code that deals with these
> counters is on the same page about what the values represent. Maybe
> there is code lurking somewhere that thinks these counters are only
> supposed to include "shared" rather than, as the fragment above
> suggests, "shared/local".

Thank you for the guidance.

What do you think about the second patch, counting extend calls'
timings in blk_write_time? In my opinion if something increments
{shared|local}_blks_written, then it needs to be counted in
blk_write_time too. I am not sure why it was decided the way it is.
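Concretely, the second patch's idea is to wrap the extend calls in the
same timing pattern already used for reads and writes (a sketch of the
pattern, not the actual hunk):

instr_time  io_start,
            io_time;

if (track_io_timing)
    INSTR_TIME_SET_CURRENT(io_start);

/* the extend bumps {shared|local}_blks_written ... */
smgrextend(reln, forknum, blocknum, buf_block, false);

if (track_io_timing)
{
    INSTR_TIME_SET_CURRENT(io_time);
    /* ... so its elapsed time should land in blk_write_time as well */
    INSTR_TIME_ACCUM_DIFF(pgBufferUsage.blk_write_time, io_time, io_start);
}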

Regards,
Nazir Bilal Yavuz
Microsoft




RE: [PoC] pg_upgrade: allow to upgrade publisher node

2023-10-05 Thread Hayato Kuroda (Fujitsu)
Dear Amit, Andres,

Thank you for giving the decision! Basically I will follow your idea and make
a patch accordingly.

> Today, I discussed this problem with Andres at PGConf NYC and he
> suggested as following. To verify, if there is any pending unexpected
> WAL after shutdown, we can have an API like
> pg_logical_replication_slot_advance() which will simply process
> records without actually sending anything downstream. In this new API,
> we will start with each slot's restart_lsn location and try to process
> till the end of WAL, if we encounter any WAL that needs to be
> processed (like we need to send the decoded WAL downstream) we can
> return a false indicating that there is an unexpected WAL. The reason
> to start with restart_lsn is that it is the location that we use to
> start scanning the WAL anyway.

The approach seems similar to Hou-san's suggestion [1], but it avoids
using test_decoding. I'm planning to have the upgrade function decode
the WAL and check whether any reorderbuffer changes are produced.

> Then, we should also try to create slots before invoking pg_resetwal.
> The idea is that we can write a new binary mode function that will do
> exactly what pg_resetwal does to compute the next segment and use that
> location as a new location (restart_lsn) to create the slots in a new
> node. Then, pass it pg_resetwal by using the existing option '-l
> walfile'. As we don't have any API that takes restart_lsn as input, we
> can write a new API probably for binary mode to create slots that do
> take restart_lsn as input. This will ensure that there is no new WAL
> inserted by background processes between resetwal and the creation of
> slots.

It seems better because we can create every object before pg_resetwal.

I will handle the above two points and see how it works.

[1]: 
https://www.postgresql.org/message-id/OS0PR01MB5716506A1A1B20EFBFA7B52994C1A%40OS0PR01MB5716.jpnprd01.prod.outlook.com

Best Regards,
Hayato Kuroda
FUJITSU LIMITED




Re: Good News Everyone! + feature proposal

2023-10-05 Thread Aleksander Alekseev
Hi Jon,

> Had my performance review today, and Apple wants me to get a patch
> accepted this quarter, with the promise of more to come after that.
> Luckily, this first patch can be anything (doesn't have to be of use to
> Apple, more to prove that I can get a patch accepted), so I'm open to
> suggestions of smaller stuff that is in demand at the moment.

My sincere congratulations!

From personal experience, however, delivering any non-trivial patch may
take from several years up to infinity, even if the RFC is in agreement
with the community and generally everyone is enthusiastic about the
proposed change. Take "Clarify the behavior of the system when
approaching XID wraparound" [1] as a recent example. It's a fairly
simple change, but after 10 months it has yet to be committed. I
know people who were working on a single patch for 5 years.

Please make sure your employer understands the specifics of working on
open source, especially the fact that the community does not care about
your employer's internal deadlines, and that this is reflected in your
team metrics. There are also many other things to be mindful of. I
would recommend making sure that your team owns only one product (i.e.
PostgreSQL Upstream), no extensions, no forks etc. Make sure the team
charter reflects this, otherwise other products will always be a
priority.

Regarding your deliverables for this quarter. If the size of the patch
is not critical, I would suggest focusing on simple refactorings and
also code reviews. Especially code reviews. Practice shows that it's
realistic for one person to deliver somewhere between 10 and 20 patches
per quarter this way. Then compare the number you got to the average
number of patches one person (except for the Core Team) typically
contributes. Your goal is to be above the median. If, on top of that,
you are able, let's say, to give internal technical talks about
PostgreSQL internals and/or speak at conferences and/or ... this will
look great on your PFD and your manager will be extremely happy with
your performance.

I know this may sound like gaming the metrics or something but this is
exactly how large companies work.

I honestly wish you all the best at your new job and will be happy to
share my findings regarding building the processes around OSS
development. Please don't hesitate to reach out to me off-list.

[1]: https://commitfest.postgresql.org/45/4128/

-- 
Best regards,
Aleksander Alekseev




Re: Good News Everyone! + feature proposal

2023-10-05 Thread Laurenz Albe
On Thu, 2023-10-05 at 02:22 +, Jon Erdman wrote:

> For the proposal (this one is a bit Apple specific): because my team 
> offers managed postgres to our Apple-internal customers, many of whom 
> are not database experts, or at least not postgres experts, we'd like to 
> be able to toggle the availability of UNLOGGED tables in 
> postgresql.conf, so our less clueful users have fewer footguns.
> 
> So, my proposal is for a GUC to implement that, which would *OF COURSE* 
> undefault to allowing UNLOGGED.

It certainly sounds harmless, but there are two things that make me
unhappy about this:

- Yet another GUC.  It's not like we don't have enough of them.
  (This is a small quibble.)

- This setting would influence the way SQL is processed.
  We have had bad experiences with those; an often-quoted example is
  the "autocommit" parameter that got removed in 7.4.
  This certainly is less harmful, but it is still another pitfall that
  can confuse users.

This reminds me of the proposal for a GUC to forbid UPDATE and DELETE
without a WHERE clause.  That didn't get anywhere, see
https://www.postgresql.org/message-id/flat/20160721045746.GA25043%40fetter.org

> PS: I'm SO happy that this phase of my postgres journey has finally 
> started

I am happy for you.

Please don't be discouraged if some of your patches get stuck because
no consensus can be reached or because nobody cares enough.  Your
contributions are still welcome.  One good way to gain experience
is to review others' patches.  In fact, you are expected to do that
if you submit your own.

Yours,
Laurenz Albe




Re: Removing unneeded self joins

2023-10-05 Thread Andrei Lepikhov

On 4/10/2023 14:34, Alexander Korotkov wrote:

 > Relid replacement machinery is the most contradictory code here. We used
 > a utilitarian approach and implemented a simplistic variant.

 > > 2) It would be nice to skip the insertion of IS NOT NULL checks when
 > > they are not necessary.  [1] points that infrastructure from [2] might
 > > be useful.  The patchset from [2] seems committed mow.  However, I
 > > can't see it is directly helpful in this matter.  Could we just skip
 > > adding IS NOT NULL clause for the columns, that have
 > > pg_attribute.attnotnull set?
 > Thanks for the links, I will look into that case.
To be more precise, in the attachment, you can find a diff to the main 
patch, which shows the volume of changes to achieve the desired behaviour.

Some EXPLAIN outputs in the regression tests shifted, so I've made additional tests:

DROP TABLE test CASCADE;
CREATE TABLE test (a int, b int not null);
CREATE UNIQUE INDEX abc ON test(b);
explain SELECT * FROM test t1 JOIN test t2 ON (t1.a=t2.a)
WHERE t1.b=t2.b;
CREATE UNIQUE INDEX abc1 ON test(a,b);
explain SELECT * FROM test t1 JOIN test t2 ON (t1.a=t2.a)
WHERE t1.b=t2.b;
explain SELECT * FROM test t1 JOIN test t2 ON (t1.a=t2.a)
WHERE t1.b=t2.b AND (t1.a=t2.a OR t2.a=t1.a);
DROP INDEX abc1;
explain SELECT * FROM test t1 JOIN test t2 ON (t1.a=t2.a)
WHERE t1.b=t2.b AND (t1.b=t2.b OR t2.b=t1.b);

We get almost all the results we wanted. But in the last explain
you can see that nothing happened with the OR clause. We should use the
expression mutator instead of the walker to handle such clauses, but it
doesn't process the RestrictInfo node ... I'm inclined to put off a
solution to this issue for a while.


--
regards,
Andrey Lepikhov
Postgres Professional
diff --git a/src/backend/optimizer/plan/analyzejoins.c b/src/backend/optimizer/plan/analyzejoins.c
index a127239d30..c12aa15fc9 100644
--- a/src/backend/optimizer/plan/analyzejoins.c
+++ b/src/backend/optimizer/plan/analyzejoins.c
@@ -73,7 +73,7 @@ static bool is_innerrel_unique_for(PlannerInfo *root,
                                    List *restrictlist,
                                    List **extra_clauses);
 static Bitmapset *replace_relid(Relids relids, int oldId, int newId);
-static void replace_varno(Node *node, int from, int to);
+static void replace_varno(PlannerInfo *root, Node *node, int from, int to);
 
 
 /*
@@ -388,7 +388,7 @@ remove_rel_from_query_common(PlannerInfo *root, RelOptInfo *rel,
         }
 
         /* Update lateral references. */
-        replace_varno((Node *) otherrel->lateral_vars, relid, subst);
+        replace_varno(root, (Node *) otherrel->lateral_vars, relid, subst);
     }
 
     /*
@@ -425,7 +425,7 @@ remove_rel_from_query_common(PlannerInfo *root, RelOptInfo *rel,
     sjinf->commute_below_l = replace_relid(sjinf->commute_below_l, ojrelid, subst);
     sjinf->commute_below_r = replace_relid(sjinf->commute_below_r, ojrelid, subst);
 
-    replace_varno((Node *) sjinf->semi_rhs_exprs, relid, subst);
+    replace_varno(root, (Node *) sjinf->semi_rhs_exprs, relid, subst);
 }
 
 /*
@@ -1399,6 +1399,7 @@ is_innerrel_unique_for(PlannerInfo *root,
 
 typedef struct ReplaceVarnoContext
 {
+    PlannerInfo *root;
     int         from;
     int         to;
 } ReplaceVarnoContext;
@@ -1420,6 +1421,11 @@ replace_varno_walker(Node *node, ReplaceVarnoContext *ctx)
         }
         return false;
     }
+
+    /*
+     * Expression walker doesn't know about RestrictInfo node. Do recursive
+     * pass into the clauses manually.
+     */
     if (IsA(node, RestrictInfo))
     {
         RestrictInfo   *rinfo = (RestrictInfo *) node;
@@ -1429,20 +1435,26 @@ replace_varno_walker(Node *node, ReplaceVarnoContext *ctx)
 
         if (bms_is_member(ctx->from, rinfo->clause_relids))
         {
-            replace_varno((Node *) rinfo->clause, ctx->from, ctx->to);
-            replace_varno((Node *) rinfo->orclause, ctx->from, ctx->to);
-            rinfo->clause_relids = replace_relid(rinfo->clause_relids, ctx->from, ctx->to);
-            rinfo->left_relids = replace_relid(rinfo->left_relids, ctx->from, ctx->to);
-            rinfo->right_relids = replace_relid(rinfo->right_relids, ctx->from, ctx->to);
+            replace_varno(ctx->root, (Node *) rinfo->clause, ctx->from, ctx->to);
+            replace_varno(ctx->root, (Node *) rinfo->orclause, ctx->from, ctx->to);
+            rinfo->clause_relids =
+                replace_relid(rinfo->clause_relids, ctx->from, ctx->to);
+            rinfo->left_relids =
+                replace_relid(rinfo->left_relids, ctx->from, ctx->to);
+
Re: pgsql: Some refactoring to export json(b) conversion functions

2023-10-05 Thread Alvaro Herrera
On 2023-Aug-08, Alvaro Herrera wrote:

> One idea that Tom floated was to allow the JsonLexContext to be
> optionally stack-allocated.  That reduces palloc() traffic; but some
> callers do need it to be palloc'ed.  Here's a patch that does it that
> way, and adds a freeing routine that knows what to do in either case.
> It may make sense to do some further analysis and remove useless free
> calls.

Pushed this, after some further tweaking.
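For anyone catching up, the resulting usage pattern is roughly the
following (a sketch from memory; treat the exact argument list as
approximate):

/* stack-allocated context: no palloc for the context itself */
JsonLexContext lex;

makeJsonLexContextCstringLen(&lex, json, strlen(json), PG_UTF8, false);
/* ... run the parser against &lex ... */
freeJsonLexContext(&lex);   /* knows it must not pfree a stack context */

/* heap-allocated context, for callers that need it to live longer */
JsonLexContext *lexp =
    makeJsonLexContextCstringLen(NULL, json, strlen(json), PG_UTF8, false);
/* ... */
freeJsonLexContext(lexp);   /* here the context itself is freed */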

-- 
Álvaro HerreraBreisgau, Deutschland  —  https://www.EnterpriseDB.com/
"La verdad no siempre es bonita, pero el hambre de ella sí"




Re: [PoC] pg_upgrade: allow to upgrade publisher node

2023-10-05 Thread Dilip Kumar
On Thu, Oct 5, 2023 at 1:48 AM Amit Kapila  wrote:
>
> On Tue, Oct 3, 2023 at 9:58 AM Bharath Rupireddy
>  wrote:
> >
> > On Fri, Sep 29, 2023 at 5:27 PM Hayato Kuroda (Fujitsu)
> >  wrote:
> > >
> > > Yeah, the approach enforces developers to check the decodability.
> > > But the benefit seems smaller than required efforts for it because the 
> > > function
> > > would be used only by pg_upgrade. Could you tell me if you have another 
> > > use case
> > > in mind? We may able to adopt if we have...
> >
> > I'm attaching 0002 patch (on top of v45) which implements the new
> > decodable callback approach that I have in mind. IMO, this new
> > approach is extensible, better than the current approach (hard-coding
> > of certain WAL records that may be generated during pg_upgrade) taken
> > by the patch, and helps deal with the issue that custom WAL resource
> > managers can have with the current approach taken by the patch.
> >
>
> Today, I discussed this problem with Andres at PGConf NYC and he
> suggested as following. To verify, if there is any pending unexpected
> WAL after shutdown, we can have an API like
> pg_logical_replication_slot_advance() which will simply process
> records without actually sending anything downstream.

So I assume in each lower-level decode function (e.g. heap_decode() )
we will add the check that if we are checking the WAL for an upgrade
then from that level we will return true or false based on whether the
WAL is decodable or not.  Is my understanding correct?  At first
thought this approach looks better and more generic.

 In this new API,
> we will start with each slot's restart_lsn location and try to process
> till the end of WAL, if we encounter any WAL that needs to be
> processed (like we need to send the decoded WAL downstream) we can
> return a false indicating that there is an unexpected WAL. The reason
> to start with restart_lsn is that it is the location that we use to
> start scanning the WAL anyway.

Yeah, that makes sense.

> Then, we should also try to create slots before invoking pg_resetwal.
> The idea is that we can write a new binary mode function that will do
> exactly what pg_resetwal does to compute the next segment and use that
> location as a new location (restart_lsn) to create the slots in a new
> node. Then, pass it pg_resetwal by using the existing option '-l
> walfile'. As we don't have any API that takes restart_lsn as input, we
> can write a new API probably for binary mode to create slots that do
> take restart_lsn as input. This will ensure that there is no new WAL
> inserted by background processes between resetwal and the creation of
> slots.

Yeah, that looks cleaner IMHO.

> The other potential problem Andres pointed out is that during shutdown
> if due to some reason, the walreceiver goes down, we won't be able to
> send the required WAL and users won't be able to ensure that because
> even after restart the same situation can happen. The ideal way is to
> have something that puts the system in READ ONLY state during shutdown
> and then we can probably allow walreceivers to reconnect and receive
> the required WALs. As we don't have such functionality available and
> it won't be easy to achieve the same, we can leave this for now.

+1

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com




Re: [HACKERS] Logical replication and multimaster

2023-10-05 Thread Jon Erdman

Oops,

Looking at my old message from 2015 on this subject, it was Magnus who
pish-poshed it, and Page who liked it. Don't want to throw shade at the
wrong person ;)
--
Jon Erdman (aka StuckMojo)
 PostgreSQL Zealot

On 10/5/23 12:14 AM, Jon Erdman wrote:
> 
> Well,
> 
> Given my earlier message about Apple wanting to pay me for writing 
> patches now, maybe I can revisit this idea.
> 
> For background: I brought up the idea of an FDW that could read custom 
> dump files and expose them as tables so that you could grab just a 
> single record (or of course more) from a custom dump file without having 
> to load the whole thing up, if you were stuck reaching into a backup to 
> get at accidentally deleted tables, rows, etc.. The stopper, which was 
> communicated to me by Tom at the following pgcon was that the code for 
> parsing custom dumps is duplicated in pg_dump only, and would need to be 
> duplicated into the server for the FDW, or broken out into a library.
> 
> And for posterity, Dave Page said that was a stupid idea, while Magnus 
> said that it sounded useful. And IIRC Bruce and Robert H said it was 
> doable, just a good deal of work on the refactor needed.
> 
> This convo went down at the Amsterdam conf where I spoke about using 
> writeable LVM snapshots to expose each developer a copy of prod to 
> noodle on, without having to actually make a full copy for each dev.
> 
> Added trivia: I gave the talk with a can of Heineken in my hand at the 
> podium, and my lightning talk had the word F( Cool in the title ;)
> 
> That was also when I bought my plush Slony which was supposedly the very 
> last one. (turns out they made more)
> -- 
> Jon Erdman (aka StuckMojo)
>      PostgreSQL Zealot
> 
> On 12/15/15 9:48 PM, Jim Nasby wrote:
>> On 12/13/15 7:37 AM, David Fetter wrote:
>>> As I understand it, pushing these into a library has been proposed but
>>> not rejected.  That it hasn't happened yet is mostly about the lack of
>>> tuits (the round ones) to rewrite the functionality as libraries and
>>> refactor pg_dump/pg_restore to use only calls to same.  As usual, it's
>>> less about writing the code and more about the enormous amount of
>>> testing any such a refactor would entail.
>>
>> My understanding as well. IIRC Jon Erdman brought this question up a 
>> couple years ago and the response was "It'd probably be accepted, it's 
>> just that no one has done the work."
>>
>>> I believe that refactoring much of pg_dump's functionality for the
>>> current version of the server into SQL-accessible functions and making
>>> pg_dump use only those functions is achievable with available
>>> resources.
>>>
>>> Such a refactor need not be all-or-nothing.  For example, the
>>> dependency resolution stuff is a first step that appears to be worth
>>> doing by itself even if the effort then pauses, possibly for some
>>> time.
>>
>> If someone wanted to spend time on this, I suspect it'd be worth 
>> looking at how bad some of the backward compatibility issues would be 
>> if done in the server. Maybe they wouldn't be that bad. I suspect the 
>> audience for this code would be much larger if it was in the server as 
>> opposed to a C library.





Re: [HACKERS] Logical replication and multimaster

2023-10-05 Thread Jon Erdman

Well,

Given my earlier message about Apple wanting to pay me for writing 
patches now, maybe I can revisit this idea.

For background: I brought up the idea of an FDW that could read custom 
dump files and expose them as tables so that you could grab just a 
single record (or of course more) from a custom dump file without having 
to load the whole thing up, if you were stuck reaching into a backup to 
get at accidentally deleted tables, rows, etc.. The stopper, which was 
communicated to me by Tom at the following pgcon was that the code for 
parsing custom dumps is duplicated in pg_dump only, and would need to be 
duplicated into the server for the FDW, or broken out into a library.

And for posterity, Dave Page said that was a stupid idea, while Magnus 
said that it sounded useful. And IIRC Bruce and Robert H said it was 
doable, just a good deal of work on the refactor needed.

This convo went down at the Amsterdam conf where I spoke about using 
writeable LVM snapshots to expose each developer a copy of prod to 
noodle on, without having to actually make a full copy for each dev.

Added trivia: I gave the talk with a can of Heineken in my hand at the 
podium, and my lightning talk had the word F( Cool in the title ;)

That was also when I bought my plush Slony which was supposedly the very 
last one. (turns out they made more)
--
Jon Erdman (aka StuckMojo)
 PostgreSQL Zealot

On 12/15/15 9:48 PM, Jim Nasby wrote:
> On 12/13/15 7:37 AM, David Fetter wrote:
>> As I understand it, pushing these into a library has been proposed but
>> not rejected.  That it hasn't happened yet is mostly about the lack of
>> tuits (the round ones) to rewrite the functionality as libraries and
>> refactor pg_dump/pg_restore to use only calls to same.  As usual, it's
>> less about writing the code and more about the enormous amount of
>> testing any such a refactor would entail.
> 
> My understanding as well. IIRC Jon Erdman brought this question up a 
> couple years ago and the response was "It'd probably be accepted, it's 
> just that no one has done the work."
> 
>> I believe that refactoring much of pg_dump's functionality for the
>> current version of the server into SQL-accessible functions and making
>> pg_dump use only those functions is achievable with available
>> resources.
>>
>> Such a refactor need not be all-or-nothing.  For example, the
>> dependency resolution stuff is a first step that appears to be worth
>> doing by itself even if the effort then pauses, possibly for some
>> time.
> 
> If someone wanted to spend time on this, I suspect it'd be worth looking 
> at how bad some of the backward compatibility issues would be if done in 
> the server. Maybe they wouldn't be that bad. I suspect the audience for 
> this code would be much larger if it was in the server as opposed to a C 
> library.





Good News Everyone! + feature proposal

2023-10-05 Thread Jon Erdman


Hiya Hackers!

So I have some good news! At long last I've found a company/manager that 
wants to actually factually pay me to do some work on PG!!

Had my performance review today, and Apple wants me to get a patch 
accepted this quarter, with the promise of more to come after that. 
Luckily, this first patch can be anything (doesn't have to be of use to 
Apple, more to prove that I can get a patch accepted), so I'm open to 
suggestions of smaller stuff that is in demand at the moment.

For the proposal (this one is a bit Apple specific): because my team 
offers managed postgres to our Apple-internal customers, many of whom 
are not database experts, or at least not postgres experts, we'd like to 
be able to toggle the availability of UNLOGGED tables in 
postgresql.conf, so our less clueful users have fewer footguns.

So, my proposal is for a GUC to implement that, which would *OF COURSE* 
default to allowing UNLOGGED.

The reasoning here is we have autofailover set up for our standard 
cluster offering that we give to customers, using sync-rep to guarantee 
no data loss if we flop to the HA node. Any users not up to speed on 
what UNLOGGED really means could inadvertently lose data, so we'd like 
to be able to force it to be off, and turn it on upon request after 
educating the customer in question, if it's a valid use case.
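To spell out the footgun (this is just standard UNLOGGED behaviour):

CREATE UNLOGGED TABLE orders (id int PRIMARY KEY, payload text);
INSERT INTO orders VALUES (1, 'paid');

-- UNLOGGED data is not WAL-logged: the table is truncated during crash
-- recovery and its contents never reach a streaming-replication standby,
-- so after a failover to the HA node the row above is simply gone.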

So to begin with: I'm sure some folks will hate this idea, but maybe a 
good many won't. And remember, this would default to UNLOGGED enabled, so 
no change to current behaviour, and no encouragement to disallow it, just 
the ability to do so, which I think is useful in a 
hosted/managed/multi-tenant environment where most things are taken care 
of for the customer.

So I'd like to get a general idea of how likely this would be to get 
accepted if I did it, and did it right?

Let the flame war begin!

PS: I'm SO happy that this phase of my postgres journey has finally 
started
-- 
Jon Erdman (aka StuckMojo on IRC)
 PostgreSQL Zealot




Re: Making aggregate deserialization (and WAL receive) functions slightly faster

2023-10-05 Thread David Rowley
On Thu, 5 Oct 2023 at 18:23, Michael Paquier  wrote:
>
> On Wed, Oct 04, 2023 at 07:47:11PM +1300, David Rowley wrote:
> > The original patch had a new function in stringinfo.c which allowed a
> > StringInfoData to be initialised from an existing string with some
> > given length.  Tom wasn't a fan of that because there wasn't any
> > protection against someone trying to use the given StringInfoData and
> > then calling appendStringInfo to append another string. That can't be
> > done in this case as we can't repalloc the VARDATA_ANY(state) pointer
> > due to it not pointing directly to a palloc'd chunk.  Tom's complaint
> > seemed to be about having a reusable function which could be abused,
> > so I modified the patch to remove the reusable code.  I think your
> > macro idea in stringinfo.h would put the patch in the same position as
> > it was initially.
>
> Ahem, well.  Based on this argument my own argument does not hold
> much.  Perhaps I'd still use a macro at the top of array_userfuncs.c
> and numeric.c, to avoid repeating the same pattern respectively two
> and four times, documenting once on top of both macros that this is a
> fake StringInfo because of the reasons documented in these code paths.

I looked at the patch again and I just couldn't bring myself to change
it to that.  If it were a macro going into stringinfo.h then I'd agree
with having a macro or inline function as it would allow the reader to
conceptualise what's happening after learning what the function does.
Having multiple macros defined in various C files means that much
harder as there are more macros to learn.  Since we're only talking 4
lines of code, I think I'd rather reduce the number of hops the reader
must do to find out what's going on and just leave the patch as is.

I considered if it might be better to reduce the 4 lines down to 3 by
chaining the assignments like:

buf.maxlen = buf.cursor = 0;

but I think I might instead change it so that maxlen gets set to -1 to
follow what's done in LogicalParallelApplyLoop() and
LogicalRepApplyLoop(). In the absence of having a function/macro in
stringinfo.h, it might make grepping for this type of thing easier.
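For context, the pattern under discussion looks like this (a sketch of
the idea, not the committed hunk):

/*
 * Wrap an existing buffer in a StringInfoData without copying.  buf.data
 * doesn't point directly to a palloc'd chunk, so the string must never
 * be appended to; maxlen = -1 marks it as such.
 */
StringInfoData buf;

buf.data = VARDATA_ANY(state);
buf.len = VARSIZE_ANY_EXHDR(state);
buf.maxlen = -1;
buf.cursor = 0;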

If anyone else has a good argument for having multiple macros for this
purpose then I could reconsider.

David




Re: make add_paths_to_append_rel aware of startup cost

2023-10-05 Thread David Rowley
On Thu, 5 Oct 2023 at 14:11, Andy Fan  wrote:
> Patch LGTM.

Thanks. Pushed.

David




Re: Remove MSVC scripts from the tree

2023-10-05 Thread Peter Eisentraut

On 02.10.23 09:38, Michael Paquier wrote:

Attached is a v2 with these adjustments, for now.


General comments:

- I think we can't just delete install-windows.sgml.  Some of that 
content needs to be moved over to installation.sgml.  As a simple 
example, install-windows.sgml shows which MSVC versions are supported. 
That information should surely be kept.


- Is src/backend/utils/README.Gen_dummy_probes still correct after this? 
 AFAICT, the Perl-based MSVC build system uses Gen_dummy_probes.pl, but 
the meson build uses Gen_dummy_probes.sed even on Windows.  Is that 
correct, intended?


- src/port/pgstrsignal.c contains a comment that it is not "built in 
MSVC builds", but AFAICT, that is only correct for the legacy Perl-based 
build system, not for meson.  Again, is that correct, intended?



Detail comments:

(Btw., whatever orderfile you used for the diff, I find that confusing.)

* config/perl.m4: This now contains all the required information, but 
maybe break the text into paragraphs a bit?



* doc/src/sgml/installation.sgml:

I think this paragraph should just be removed altogether:

  
   If you are building PostgreSQL for Microsoft
-  Windows, read this chapter if you intend to build with MinGW or Cygwin;
-  but if you intend to build with Microsoft's Visual
-  C++, see  instead.
+  Windows, read this chapter if you intend to build with Meson, MinGW or
+  Cygwin.
  

Here


 PostgreSQL can be built using Cygwin, a Linux-like environment for
 Windows, but that method is inferior to the native Windows build
-(see <xref linkend="install-windows"/>) and
-running a server under Cygwin is no longer recommended.
+with Meson, and running a server under Cygwin is no longer recommended.


I think "with Meson" should be removed.  The tradeoff is Cygwin vs. 
native, it doesn't have anything to do with Meson.


Also, I think this paragraph needs a complete revision, along with 
however install-windows.sgml gets integrated:



-PostgreSQL for Windows can be built using MinGW, a Unix-like build
 [...]


* meson.build:  I think these comments are unnecessary and can be removed:

-# From Project.pm
+# MSVC flags

+  # Preprocessor definitions.


* src/bin/pgevent/meson.build:  After consideration, I think this 
comment should just be removed:


-# FIXME: copied from Mkvcbuild.pm, but I don't think that's the right 
approach

+# FIXME: this may not not the right approach..

The original site in Mkvcbuild.pm does not contain a comment, so we 
should accept that as canonical.  It doesn't help much if we carry 
around a comment like "this might be wrong" indefinitely without any 
further supporting material.



* src/common/Makefile and src/common/meson.build:  This change is losing 
the period at the end of the first sentence:


 # A few files are currently only built for frontend, not server
-# (Mkvcbuild.pm has a copy of this list, too).  logging.c is excluded
-# from OBJS_FRONTEND_SHLIB (shared library) as a matter of policy,
-# because it is not appropriate for general purpose libraries such
-# as libpq to report errors directly.
+# logging.c is excluded from OBJS_FRONTEND_SHLIB (shared library) as
+# a matter of policy, because it is not appropriate for general purpose
+# libraries such as libpq to report errors directly.





Re: [PATCH] Add CANONICAL option to xmlserialize

2023-10-05 Thread Jim Jones

Hi Chap

On 04.10.23 23:05, Chapman Flack wrote:

I hope I'm not butting in, but I too would be leery of any default
behavior that's going to say thing1 and thing2 are the same thing
but ignoring (name part of thing here). If that's the comparison
I mean to make, and it's as easy as CANONICAL WITHOUT COMMENTS
to say that's what I mean, I'd be happy to write that. It also means
that the next person reading my code will know "oh, he means
'same' in *that* way", without having to think about it.


That's a very compelling argument. Thanks for that!

It is indeed clearer to only remove items from the result set if 
explicitly said so.


The attached v8 changes the default behaviour to WITH COMMENTS.
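With v8 the two forms then behave like this (a sketch of the proposed
syntax; it is not in any released version):

-- default: comments are kept in the canonical output
SELECT xmlserialize(DOCUMENT '<r><!-- note --><a>1</a></r>' AS text CANONICAL);

-- explicitly strip comments before comparing documents
SELECT xmlserialize(DOCUMENT '<r><!-- note --><a>1</a></r>' AS text
                    CANONICAL WITH NO COMMENTS);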

Best,

Jim
From fa3c9c7d5a37c407993f4e2076afed2891e1bfc1 Mon Sep 17 00:00:00 2001
From: Jim Jones 
Date: Wed, 4 Oct 2023 17:58:24 +0200
Subject: [PATCH v8] Add CANONICAL output format to xmlserialize

This patch introduces the CANONICAL option to xmlserialize, which
serializes xml documents in their canonical form - as described in
the W3C Canonical XML Version 1.1 specification. This option can
be used with the additional parameter WITH [NO] COMMENTS to keep
or remove xml comments from the canonical xml output. In case no
parameter is provided, WITH COMMENTS will be used as default. This
feature is based on the function xmlC14NDocDumpMemory from the C14N
module of libxml2.

This patch also includes regression tests and documentation.
---
 doc/src/sgml/datatype.sgml|  41 +++-
 src/backend/executor/execExprInterp.c |   2 +-
 src/backend/parser/gram.y |  21 +-
 src/backend/parser/parse_expr.c   |   2 +-
 src/backend/utils/adt/xml.c   | 272 --
 src/include/nodes/parsenodes.h|   1 +
 src/include/nodes/primnodes.h |  12 +-
 src/include/parser/kwlist.h   |   1 +
 src/include/utils/xml.h   |   2 +-
 src/test/regress/expected/xml.out | 114 +++
 src/test/regress/expected/xml_1.out   | 108 ++
 src/test/regress/expected/xml_2.out   | 114 +++
 src/test/regress/sql/xml.sql  |  63 ++
 src/tools/pgindent/typedefs.list  |   1 +
 14 files changed, 642 insertions(+), 112 deletions(-)

diff --git a/doc/src/sgml/datatype.sgml b/doc/src/sgml/datatype.sgml
index 5d23765705..df7525208a 100644
--- a/doc/src/sgml/datatype.sgml
+++ b/doc/src/sgml/datatype.sgml
@@ -4465,7 +4465,7 @@ xml 'bar'
 xml, uses the function
 xmlserialize:xmlserialize
 
-XMLSERIALIZE ( { DOCUMENT | CONTENT } value AS type [ [ NO ] INDENT ] )
+XMLSERIALIZE ( { DOCUMENT | CONTENT } value AS type [ { [ NO ] INDENT | CANONICAL [ WITH [ NO ] COMMENTS ] } ] )
 
 type can be
 character, character varying, or
@@ -4482,6 +4482,45 @@ XMLSERIALIZE ( { DOCUMENT | CONTENT } value AS 
 
+   
+The option CANONICAL converts a given
+XML document to its <ulink url="https://www.w3.org/TR/xml-c14n11/#Terminology">canonical form</ulink>
+based on the <ulink url="https://www.w3.org/TR/xml-c14n11/">W3C Canonical XML 1.1 Specification</ulink>.
+It is basically designed to provide applications the ability to compare xml documents or test if they
+have been changed. The optional parameters WITH COMMENTS (which is the default) or
+WITH NO COMMENTS, respectively, keep or remove XML comments from the given document.
+
+
+ 
+ Example:
+
+
+   

 When a character string value is cast to or from type
 xml without going through XMLPARSE or
diff --git a/src/backend/executor/execExprInterp.c b/src/backend/executor/execExprInterp.c
index 24c2b60c62..d9305c28b0 100644
--- a/src/backend/executor/execExprInterp.c
+++ b/src/backend/executor/execExprInterp.c
@@ -3943,7 +3943,7 @@ ExecEvalXmlExpr(ExprState *state, ExprEvalStep *op)
 *op->resvalue =
 	PointerGetDatum(xmltotext_with_options(DatumGetXmlP(value),
 		   xexpr->xmloption,
-		   xexpr->indent));
+		   xexpr->format));
 *op->resnull = false;
 			}
 			break;
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index e56cbe77cb..4c738ad4b2 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -615,12 +615,13 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
 %type 	xml_root_version opt_xml_root_standalone
 %type 	xmlexists_argument
 %type 	document_or_content
-%type 	xml_indent_option xml_whitespace_option
+%type 	xml_whitespace_option
 %type 	xmltable_column_list xmltable_column_option_list
 %type 	xmltable_column_el
 %type 	xmltable_column_option_el
 %type 	xml_namespace_list
 %type 	xml_namespace_el
+%type  	opt_xml_serialize_format
 
 %type 	func_application func_expr_common_subexpr
 %type 	func_expr func_expr_windowless
@@ -692,7 +693,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
 	BACKWARD BEFORE BEGIN_P BETWEEN BIGINT BINARY BIT
 	BOOLEAN_P BOTH BREADTH BY
 
-	CACHE CALL CALLED CASCADE CASCADED CASE CAST CATALOG_P CHAIN CHAR_P
+	CACHE 

Re: [PATCH] Fix memory leak in memoize for numeric key

2023-10-05 Thread David Rowley
On Wed, 4 Oct 2023 at 21:08, Orlov Aleksej  wrote:
> I've finished testing the patch.
> I confirm that the patch solves the problem and works just as fast.

Thanks for checking that.

I've pushed the patch now.

David




Re: Removing unneeded self joins

2023-10-05 Thread Andrei Lepikhov

On 4/10/2023 14:34, Alexander Korotkov wrote:

Hi!

On Wed, Oct 4, 2023 at 9:56 AM Andrei Lepikhov
<a.lepik...@postgrespro.ru> wrote:

 > On 4/10/2023 07:12, Alexander Korotkov wrote:
 > > On Tue, Sep 12, 2023 at 4:58 PM Andrey Lepikhov
 > > <a.lepik...@postgrespro.ru> wrote:
 > >> On 5/7/2023 21:28, Andrey Lepikhov wrote:
 > >>> During the significant code revision in v.41 I lost some replacement
 > >>> operations. Here is the fix and extra tests to check this in the 
future.

 > >>> Also, Tom added the JoinDomain structure five months ago, and I added
 > >>> code to replace relids for that place too.
 > >>> One more thing, I found out that we didn't replace SJs, defined by
 > >>> baserestrictinfos if no one self-join clause have existed for the 
join.

 > >>> Now, it is fixed, and the test has been added.
 > >>> To understand changes readily, see the delta file in the attachment.
 > >> Here is new patch in attachment. Rebased on current master and some
 > >> minor gaffes are fixed.
 > >
 > > I went through the thread and I think the patch gets better shape.  A
 > > couple of notes from my side.
 > > 1) Why replace_relid() makes a copy of lids only on insert/replace of
 > > a member, but performs deletion in-place?
 >
 > Shortly speaking, it was done according to the 'Paranoid' strategy.
 > The main reason for copying before deletion was the case with the rinfo
 > required_relids and clause_relids. They both point to the same Bitmapset
 > in some cases. And we feared such things for other fields.
 > Right now, it may be redundant because we resolved the issue mentioned
 > above in replace_varno_walker.

OK, but my point is still that you should be paranoid in all the cases 
or none of the cases.  Right now (newId < 0) branch doesn't copy source 
relids, but bms_is_member(oldId, relids) does copy.  Also, I think 
whether we copy or not should be reflected in the function comment.


/*
 * Substitute oldId with newId in relids.
 */
static Bitmapset *
replace_relid(Relids relids, int oldId, int newId)
{
    if (oldId < 0)
        return relids;

    if (newId < 0)
        /* Delete relid without substitution. */
        return bms_del_member(relids, oldId);

    if (bms_is_member(oldId, relids))
        return bms_add_member(bms_del_member(bms_copy(relids), oldId),
                              newId);

    return relids;
}


We tried to use replace_relid() for both cases of JOIN deletion: 
unneeded outer join and self-join, and the relid deletion is used only 
in the first case. Such an approach was used there for a long time, and 
we just didn't change it.

I am inclined to remove the copy operation in the relid replacement code.



 > Relid replacement machinery is the most contradictory code here. We used
 > a utilitarian approach and implemented a simplistic variant.

 > > 2) It would be nice to skip the insertion of IS NOT NULL checks when
 > > they are not necessary.  [1] points that infrastructure from [2] might
 > > be useful.  The patchset from [2] seems committed mow.  However, I
 > > can't see it is directly helpful in this matter.  Could we just skip
 > > adding IS NOT NULL clause for the columns, that have
 > > pg_attribute.attnotnull set?
 > Thanks for the links, I will look into that case.

Thanks for the curious issue.
The new field Var::varnullingrels introduced in [2] doesn't make sense
here, as I see it: we operate with plain relations only, and I don't know
how it can be applied to an arbitrary subtree containing OUTER JOINs.
The second option, the attnotnull flag, can be used in this code. We
haven't implemented it because the process_equivalence routine doesn't
check attnotnull before creating the NullTest.
In general, it is not a difficult operation - we just need to add a
trivial get_attnotnull() routine to lsyscache.c, like
get_attgenerated() and other functions.
But replace_varno uses the walker to change the relid. The mentioned
replacement, like

X=X --> X IS NOT NULL

can be applied at different levels of the expression, for example:

A a1 JOIN A a2 ON (a1.id=a2.id) WHERE (a1.x AND (a1.y=a2.y))

Here, we can replace id=id and y=y. It may need some 'unwanted clauses'
collection procedure and a second pass through the expression tree to
remove them, which may add some unpredictable overhead.
We could replace such a clause with a trivial 'TRUE' clause, of course.
But is that the feature you have requested?


--
regards,
Andrey Lepikhov
Postgres Professional





Re: Modernize const handling with readline

2023-10-05 Thread Peter Eisentraut

On 04.10.23 17:09, Aleksander Alekseev wrote:

Hi,


On 03.10.23 13:28, Aleksander Alekseev wrote:

While examining the code for similar places I noticed that the
following functions can also be const'ified:

- crc32_sz


I suppose this could be changed.


OK, that's a simple change. Here is the patch.


committed this and my patch





Re: Extract numeric filed in JSONB more effectively

2023-10-05 Thread Andy Fan
Hi,

I feel this topic has been well discussed and the only pending
issues are the ones below. It would be great if a committer could have a
look at these, so I have marked this entry as "Ready for Committer".

Things not yet addressed:

> 1. the error message handling.

You can check [1] for more background on this; I think blocking this
feature over the error message alone is not really reasonable.

> 2. if we have chances to optimize the _tz functions, I guess no.

Patch 002 is dedicated to this; I think it should not be committed,
for the reason described in the commit message.

> 3. the function naming issue.

I think I can get it modified once all the other issues are addressed.

[1]
https://www.postgresql.org/message-id/d70280648894e56f9f0d12c75090c3d8%40anastigmatix.net


-- 
Best Regards
Andy Fan


Re: Add a new BGWORKER_BYPASS_ROLELOGINCHECK flag

2023-10-05 Thread Drouvot, Bertrand

Hi,

On 10/5/23 7:10 AM, Michael Paquier wrote:

On Wed, Oct 04, 2023 at 12:54:24PM +0200, Drouvot, Bertrand wrote:

Except the Nit that I mentioned in 0001, that looks all good to me (with the
new wait in 001_worker_spi.pl).


Thanks, I've applied the refactoring, including in it the stuff to be
able to control the flags used when launching a dynamic worker.  0001
also needed some indenting.  You'll notice that the diffs in
worker_spi are minimal now.  worker_spi_main is no more, as an effect
of..  Cough..  c8e318b1b.


Thanks!


+# An error message should be issued.
+my $nb_errors = 0;
+for (my $i = 0; $i < 10 * $PostgreSQL::Test::Utils::timeout_default; $i++)
+{
+   if ($node->log_contains(
+   "role \"nologrole\" is not permitted to log in", 
$log_start))
+   {
+   $nb_errors = 1;
+   last;
+   }
+   usleep(100_000);
+}

This can be switched to use $node->wait_for_log, making the test
simpler.  No need for Time::HiRes, either.



Oh, thanks, did not know about $node->wait_for_log, good to know!


-extern void BackgroundWorkerInitializeConnection(const char *dbname,
const char *username, uint32 flags);
+extern void BackgroundWorkerInitializeConnection(const char *dbname,
const char *username, bits32 flags);
[...]
-BackgroundWorkerInitializeConnectionByOid(Oid dboid, Oid useroid,
uint32 flags)
+BackgroundWorkerInitializeConnectionByOid(Oid dboid, Oid useroid,
bits32 flags)

That's changing signatures just for the sake of breaking them.  I
would leave that alone, I guess..


Ok, switched back to uint32 in v6-0002 attached (v6-0001 is your v5-0001
unchanged).



@@ -719,6 +719,7 @@ InitPostgres(const char *in_dbname, Oid dboid,
 const char *username, Oid useroid,
 bool load_session_libraries,
 bool override_allow_connections,
+bool bypass_login_check,
 char *out_dbname)

I was not paying much attention here, but load_session_libraries gives
a good argument in favor of switching all these booleans to a single
bits32 argument as that would make three of them now, with a different
set of flags than the bgworker ones.  This can be refactored on its
own.


Yeah good point, will work on it once the current one is committed.



-#define BGWORKER_BYPASS_ALLOWCONN 1
+#define BGWORKER_BYPASS_ALLOWCONN 0x0001

Can be a change of its own as well.


Yeah, but I think it's simple enough to just keep this change here
(and I don't think it's "really" needed without introducing 0x0002)
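Right - the hex form only starts to pay off now that there is a second
bit, as in the attached patch:

#define BGWORKER_BYPASS_ALLOWCONN       0x0001
#define BGWORKER_BYPASS_ROLELOGINCHECK  0x0002  /* added by this patch */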



While looking at the refactoring at worker_spi.  I've noticed that it
would be simple and cheap to have some coverage for BYPASS_ALLOWCONN,
something we've never done since this has been introduced.  Let me
suggest 0001 to add some coverage.


Good idea! I looked at 0001 and it looks ok to me.



0002 is your own patch, with the tests simplified a bit.


Thanks, LGTM.

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

From 5ebe8aa15f22851ae7771137f58366ea9aa21087 Mon Sep 17 00:00:00 2001
From: Michael Paquier 
Date: Thu, 5 Oct 2023 14:06:37 +0900
Subject: [PATCH v6 2/2] Allow background workers to bypass login check

This adds a new flag value (BGWORKER_BYPASS_ROLELOGINCHECK) to the flags
being used in BackgroundWorkerInitializeConnection() and
BackgroundWorkerInitializeConnectionByOid().  This flag allows the
background workers to bypass the login check done in
InitializeSessionUserId().
---
 doc/src/sgml/bgworker.sgml|  3 +-
 src/backend/bootstrap/bootstrap.c |  2 +-
 src/backend/postmaster/autovacuum.c   |  5 +--
 src/backend/postmaster/postmaster.c   |  2 +
 src/backend/tcop/postgres.c   |  1 +
 src/backend/utils/init/miscinit.c |  4 +-
 src/backend/utils/init/postinit.c |  5 ++-
 src/include/miscadmin.h   |  4 +-
 src/include/postmaster/bgworker.h |  6 ++-
 .../modules/worker_spi/t/001_worker_spi.pl| 37 +++
 src/test/modules/worker_spi/worker_spi.c  |  2 +
 11 files changed, 59 insertions(+), 12 deletions(-)
   9.2% doc/src/sgml/
  10.7% src/backend/postmaster/
  15.1% src/backend/utils/init/
   9.5% src/include/postmaster/
  47.4% src/test/modules/worker_spi/t/
   3.1% src/test/modules/worker_spi/
   4.8% src/

diff --git a/doc/src/sgml/bgworker.sgml b/doc/src/sgml/bgworker.sgml
index 9ad1146ba0..7335c02ed8 100644
--- a/doc/src/sgml/bgworker.sgml
+++ b/doc/src/sgml/bgworker.sgml
@@ -200,7 +200,8 @@ typedef struct BackgroundWorker
    InvalidOid, the process will run as the superuser created
    during initdb. If BGWORKER_BYPASS_ALLOWCONN
    is specified as flags it is possible to bypass the restriction
-   to connect to databases not allowing user connections.
+   to connect to databases not allowing user connections. If 

Re: pg16: XX000: could not find pathkey item to sort

2023-10-05 Thread David Rowley
On Tue, 3 Oct 2023 at 20:16, David Rowley  wrote:
> I wonder if the attached patch is too much of a special case fix.  I
> guess from the lack of complaints previously that there are no other
> cases where we could possibly have pathkeys that belong to columns
> that are aggregated.  I've not gone to much effort to see if I can
> craft a case that hits this without the ORDER BY/DISTINCT aggregate
> optimisation, however.

I spent more time on this today.  I'd been wondering if there was any
reason why create_agg_path() would receive a subpath with pathkeys
that were anything but the PlannerInfo's group_pathkeys.  I mean, how
could we do Group Aggregate if it wasn't?  I wondered if grouping sets
might change that, but it seems the group_pathkeys will be set to the
initial grouping set.

Given that, it would seem it's safe to just trim off any pathkey that
was added to the group_pathkeys by
adjust_group_pathkeys_for_groupagg().
PlannerInfo.num_groupby_pathkeys marks the number of pathkeys that
existed in group_pathkeys before adjust_group_pathkeys_for_groupagg()
made any additions, so we can just trim the list length back to that.
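In code form, the fix amounts to something like this (a sketch of the
idea, not the exact hunk):

/*
 * root->group_pathkeys may have extra pathkeys appended by
 * adjust_group_pathkeys_for_groupagg() for ORDER BY / DISTINCT
 * aggregates; only the first num_groupby_pathkeys entries belong to
 * GROUP BY itself, so trim the list back to that length.
 */
List *pathkeys = list_copy_head(root->group_pathkeys,
                                root->num_groupby_pathkeys);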

I've done this in the attached patch.   I also considered if it was
worth adding a regression test for this and I concluded that there are
better ways to test for this and considered if we should add some code
to createplan.c to check that all Path pathkeys have corresponding
items in the PathTarget.  I've included an additional patch which adds
some code in USE_ASSERT_CHECKING builds to verify this.  Without the
fix it's simple enough to trigger this with a query such as:

select two,count(distinct four) from tenk1 group by two order by two;

Without the fix the additional asserts cause the regression tests to
fail, but with the fix everything passes.

Justin's case is quite an obscure way to hit this as it requires
partitionwise aggregation plus a single partition so that the Append
is removed due to only having a single subplan in setrefs.c.  If there
had been 2 partitions, then the AppendPath wouldn't have inherited the
subpath's pathkeys per code at the end of create_append_path().

So in short, I propose the attached fix without any regression tests
because I feel that any regression test would just mark that there was
a bug in create_agg_path() and not really help with ensuring we don't
end up with some similar problem in the future.

I have some concerns that the assert_pathkeys_in_target() function
might be a little heavyweight for USE_ASSERT_CHECKING builds. So I'm
not proposing to commit that without further discussion.

Does anyone feel differently?

If not, I plan to push the attached
strip_aggregate_pathkeys_from_aggpaths_v2.patch early next week.

David


strip_aggregated_pathkeys_from_aggpaths_v2.patch
Description: Binary data


assert_pathkeys_in_target.patch
Description: Binary data