Re: pltcl crashes due to a syntax error

2024-06-02 Thread Tom Lane
Erik Wienhold  writes:
> Tcl_GetVar returns null if errorInfo does not exist.  Omitting econtext
> from errcontext in that case looks like the proper fix to me.

Yeah, that was the conclusion I came to last night while sitting in
the airport, but I didn't have time to prepare a cleaned-up patch.

The new bit of information that this bug report provides is that it's
possible to get a TCL_ERROR result without Tcl having set errorInfo.
That seems a tad odd, and it must happen only in weird corner cases,
else we'd have heard of this decades ago.  Not sure if it's worth
trying to characterize those cases further, however.

> Or just do away with throw_tcl_error and call ereport directly.

I'd say this adds to the importance of having throw_tcl_error,
because now it's even more complex than before, and there are
multiple call sites.
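
For concreteness, a minimal sketch of the kind of NULL check being
discussed (the real code in pltcl.c may differ; emsg and proname stand
for the message text and function name already computed there):

const char *econtext = Tcl_GetVar(interp, "errorInfo", TCL_GLOBAL_ONLY);

if (econtext != NULL)
    ereport(ERROR,
            (errcode(ERRCODE_EXTERNAL_ROUTINE_EXCEPTION),
             errmsg("%s", emsg),
             errcontext("%s\nin PL/Tcl function \"%s\"", econtext, proname)));
else
    ereport(ERROR,
            (errcode(ERRCODE_EXTERNAL_ROUTINE_EXCEPTION),
             errmsg("%s", emsg)));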

> Compare
> that to pltcl_trigger_handler where the same case is handled like this:

Hm, I wonder why that's not using throw_tcl_error.  I guess because
it wants to give its own primary message, but still ...

        regards, tom lane




Re: meson and check-tests

2024-06-01 Thread Tom Lane
"Tristan Partin"  writes:
> On Fri May 31, 2024 at 12:02 PM CDT, Ashutosh Bapat wrote:
>> We talked this off-list at the conference. It seems we have to somehow
>> avoid passing pg_regress --schedule argument and instead pass the list of
>> tests. Any idea how to do that?

> I think there are 3 solutions to this.
> 1. Avoid passing --schedule by default, which doesn't sound like a great
>    solution.
> 2. Teach pg_regress to ignore the --schedule option if specific tests
>    are passed instead.
> 3. Add a --no-schedule option to pg_regress which would override the
>    previously added --schedule option.
> I personally prefer 2 or 3.

Just to refresh people's memory of what the Makefiles do:
src/test/regress/GNUmakefile has

check: all
	$(pg_regress_check) $(REGRESS_OPTS) --schedule=$(srcdir)/parallel_schedule $(MAXCONNOPT) $(EXTRA_TESTS)

check-tests: all | temp-install
	$(pg_regress_check) $(REGRESS_OPTS) $(MAXCONNOPT) $(TESTS) $(EXTRA_TESTS)

(and parallel cases for installcheck etc).  AFAICS, meson.build has
no equivalent to the EXTRA_TESTS add-on, nor does it have behavior
equivalent to check-tests' substitution of $(TESTS) for --schedule.
But I suggest that those behaviors have stood for a long time and
so the appropriate thing to do is duplicate them as best we can,
not invent something different.

regards, tom lane




Re: pltcl crashes due to a syntax error

2024-06-01 Thread Tom Lane
"a.kozhemyakin"  writes:
> When executing the following query on master branch:

> CREATE EXTENSION pltcl;
> CREATE or replace PROCEDURE test_proc5(INOUT a text)
>      LANGUAGE pltcl
>      AS $$
>      set aa [concat $1 "+" $1]
>      return [list $aa $aa])
>      $$;

> CALL test_proc5('abc');
> CREATE EXTENSION
> CREATE PROCEDURE
> server closed the connection unexpectedly

Replicated here.  I'll look into it later if nobody beats me
to it.  Thanks for the report!

regards, tom lane




Re: Explicit specification of index ensuring uniqueness of foreign columns

2024-05-31 Thread Tom Lane
Kaiting Chen  writes:
> I'd like to resurrect a subset of my proposal in [1], specifically that:
>   The FOREIGN KEY constraint syntax gains a [ USING INDEX index_name ] clause
>   optionally following the referenced column list.
> ...
> While, in this minimal reproduction, the two indexes are interchangeable,
> there are situations that may reasonably occur in the course of ordinary
> use in which they aren't. For example, a redundant unique index with
> different storage parameters may exist during the migration of an
> application schema.

I agree that there's a hazard there, but I question if the case is
sufficiently real-world to justify the costs of introducing a
non-SQL-standard clause in foreign key constraints.

One such cost is that pg_dump output would become less able to be
loaded into other DBMSes, or even into older PG versions.

I also wonder if this wouldn't just trade one fragility for another.
Specifically, I am not sure that we guarantee that the names of
indexes underlying constraints remain the same across dump/reload.
If they don't, the USING INDEX clause might fail unnecessarily.

As against that, I'm not sure I've ever seen a real-world case with
intentionally-duplicate unique indexes.

So on the whole I'm unconvinced that this is worth changing.

        regards, tom lane




Re: pgsql: Add more SQL/JSON constructor functions

2024-05-29 Thread Tom Lane
Amit Langote  writes:
> On Mon, May 27, 2024 at 7:10 PM Alvaro Herrera  wrote:
>> On 2024-May-27, Alvaro Herrera wrote:
>> I just noticed this behavior, which looks like a bug to me:
>> 
>> select json_serialize('{"a":1, "a":2}' returning varchar(5));
>>  json_serialize
>> ----------------
>>  {"a":
>> 
>> I think this function should throw an error if the destination type
>> doesn't have room for the output json.  Otherwise, what good is the
>> serialization function?

> This behavior comes from using COERCE_EXPLICIT_CAST when creating the
> coercion expression to convert json_*() functions' argument to the
> RETURNING type.

Yeah, I too think this is a cast, and truncation is the spec-defined
behavior for casting to varchar with a specific length limit.  I see
little reason that this should work differently from

select json_serialize('{"a":1, "a":2}' returning text)::varchar(5);
 json_serialize 
----------------
 {"a":
(1 row)


regards, tom lane




Re: Need clarification on compilation errors in PG 16.2

2024-05-28 Thread Tom Lane
Pradeep Kumar  writes:
> Yes it was defined in "pg_config.h"
> /* Define to 1 if you have the `memset_s' function. */
> #define HAVE_MEMSET_S 1

That's correct for recent versions of macOS.  I see you are
building against a recent SDK:

/Library/Developer/CommandLineTools/SDKs/MacOSX14.2.sdk/usr/include/string.h

but I wonder if maybe the actual OS version is back-rev?
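
(For reference, a guarded use of that configure symbol typically looks
like this; an illustrative sketch, not a quote of the failing code:)

#ifdef HAVE_MEMSET_S
    /* C11 Annex K: errno_t memset_s(void *s, rsize_t smax, int c, rsize_t n) */
    memset_s(buf, sizeof(buf), 0, sizeof(buf));
#else
    memset(buf, 0, sizeof(buf));
#endif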

        regards, tom lane




Re: DROP OWNED BY fails to clean out pg_init_privs grants

2024-05-26 Thread Tom Lane
Hannu Krosing  writes:
> Attached is a minimal patch to allow missing roles in REVOKE command

FTR, I think this is a very bad idea.

It might be OK if we added some kind of IF EXISTS option,
but I'm not eager about that concept either.

The right thing here is to fix the backend so that pg_dump doesn't
see these bogus ACLs.

regards, tom lane




Re: DISCARD ALL does not force re-planning of plpgsql functions/procedures

2024-05-26 Thread Tom Lane
Jelte Fennema-Nio  writes:
> I got a report on the PgBouncer repo[1] that running DISCARD ALL was
> not sufficient between connection handoffs to force replanning of
> stored procedures. Turns out that while DISCARD ALL and DISCARD PLAN
> reset the plan cache they do not reset the num_custom_plans fields of
> the existing PlanSources. So while the generic plan is re-planned
> after DISCARD ALL, the decision on whether to choose it or not won't
> be changed.

Hm, should it be?  That's hard-won knowledge, and I'm not sure there
is a good reason to believe it's no longer applicable.

Note that any change in behavior there would affect prepared
statements in general, not only plpgsql.

    regards, tom lane




Re: Fix an incorrect assertion condition in mdwritev().

2024-05-25 Thread Tom Lane
I wrote:
> Hmm ... I agree that this is better normally.  But there's an
> edge case where it would fail to notice a problem that the
> existing code does notice: if blocknum is close to UINT32_MAX
> and adding nblocks causes it to wrap around to a small value.
> Is there an inexpensive way to catch that?

After a few minutes' thought, how about:

Assert((uint64) blocknum + (uint64) nblocks <= (uint64) mdnblocks(reln, forknum));

This'd stop being helpful if we ever widen BlockNumber to 64 bits,
but I think that's unlikely.  (Partitioning seems like a better answer
for giant tables.)

        regards, tom lane




Re: Fix an incorrect assertion condition in mdwritev().

2024-05-25 Thread Tom Lane
Michael Paquier  writes:
> On Sat, May 25, 2024 at 11:52:22PM +0800, Xing Guo wrote:
>> #ifdef CHECK_WRITE_VS_EXTEND
>> -Assert(blocknum < mdnblocks(reln, forknum));
>> +Assert(blocknum + nblocks <= mdnblocks(reln, forknum));
>> #endif

> Yes, it looks like you're right that this can be made stricter by
> including the number of blocks we're adding in the calculated bound
> (adding one extra block to this number fails immediately at initdb).

Hmm ... I agree that this is better normally.  But there's an
edge case where it would fail to notice a problem that the
existing code does notice: if blocknum is close to UINT32_MAX
and adding nblocks causes it to wrap around to a small value.
Is there an inexpensive way to catch that?  (If not, it's
not a reason to block this patch; but let's think about it
while we're here.)
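
To illustrate the hazard with made-up numbers:

/* BlockNumber is a uint32; hypothetical values */
BlockNumber blocknum = 0xFFFFFFFE;   /* UINT32_MAX - 1 */
int         nblocks  = 4;

/*
 * In 32-bit arithmetic blocknum + nblocks wraps around to 2, so the
 * stricter check "blocknum + nblocks <= mdnblocks(reln, forknum)"
 * could pass even though the write targets blocks far past the end.
 */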

regards, tom lane




Re: DROP OWNED BY fails to clean out pg_init_privs grants

2024-05-25 Thread Tom Lane
Hannu Krosing  writes:
> Having a pg_init_privs entry referencing a non-existing user is
> certainly of no practical use.

Sure, that's not up for debate.  What I think we're discussing
right now is

1. What other cases are badly handled by the pg_init_privs
mechanisms.

2. How much of that is practical to fix in v17, seeing that
it's all long-standing bugs and we're already past beta1.

I kind of doubt that the answer to #2 is "all of it".
But perhaps we can do better than "none of it".

        regards, tom lane




Re: Schema variables - new implementation for Postgres 15

2024-05-24 Thread Tom Lane
Pavel Stehule  writes:
> we can introduce a special safe mode started by
> set enable_direct_variable_read to off;
> and allow access to variables only by using a dedicated function
> (supported by the parser) named like variable or pg_variable

Didn't we learn twenty years ago that GUCs that change query
semantics are an awful idea?  Pick a single access method
for these things and stick to it.

    regards, tom lane




Re: commitfest.postgresql.org is no longer fit for purpose

2024-05-24 Thread Tom Lane
Tomas Vondra  writes:
> I personally don't think the FOSDEM triage is a very productive use of
> our time - we go through patches top to bottom, often with little idea
> what the current state of the patch is. We always ran out of time after
> looking at maybe 1/10 of the list.

> Having an in-person discussion about patches would be good, but I think
> we should split the meeting into much smaller groups for that, each
> looking at a different subset. And maybe it should be determined in
> advance, so that people can look at those patches in advance ...

Yeah, subgroups of 3 or 4 people sounds about right.  And definitely
some advance looking to see which patches need discussion.

        regards, tom lane




Re: commitfest.postgresql.org is no longer fit for purpose

2024-05-24 Thread Tom Lane
Joe Conway  writes:
> On 5/24/24 15:45, Tom Lane wrote:
>> I was *not* proposing doing a regular review, unless of course
>> somebody really wants to do that.  What I am thinking about is
>> suggesting how to make progress on patches that are stuck, or in some
>> cases delivering the bad news that this patch seems unlikely to ever
>> get accepted and it's time to cut our losses.  (Patches that seem to
>> be moving along in good order probably don't need any attention in
>> this process, beyond determining that that's the case.)  That's why
>> I think we need some senior people doing this, as their opinions are
>> more likely to be taken seriously.

> Maybe do a FOSDEM-style dev meeting with triage review at PG.EU would at 
> least move us forward? Granted it is less early and perhaps less often 
> than the thread seems to indicate, but has been tossed around before and 
> seems doable.

Perhaps.  The throughput of an N-person meeting is (at least) a factor
of N less than the same N people looking at patches individually.
On the other hand, the consensus of a meeting is more likely to be
taken seriously than a single person's opinion, senior or not.
So it could work, but I think we'd need some prefiltering so that
the meeting only spends time on those patches already identified as
needing help.

regards, tom lane




Re: DROP OWNED BY fails to clean out pg_init_privs grants

2024-05-24 Thread Tom Lane
Robert Haas  writes:
> On Fri, May 24, 2024 at 2:57 PM Tom Lane  wrote:
>> Doesn't seem right to me.  That will give pg_dump the wrong idea
>> of what the initial privileges actually were, and I don't see how
>> it can construct correct delta GRANT/REVOKE on the basis of false
>> information.  During the dump reload, the extension will be
>> recreated with the original owner (I think), causing its objects'
>> privileges to go back to the original pg_init_privs values.

> Oh! That does seem like it would make what I said wrong, but how would
> it even know who the original owner was? Shouldn't we be recreating
> the object with the owner it had at dump time?

Keep in mind that the whole point here is for the pg_dump script to
just say "CREATE EXTENSION foo", not to mess with the individual
objects therein.  So the objects are (probably) going to be owned by
the user that issued CREATE EXTENSION.

In the original conception, that was the end of it: what you got for
the member objects was whatever state CREATE EXTENSION left behind.
The idea of pg_init_privs is to support dump/reload of subsequent
manual alterations of privileges for extension-created objects.
I'm not, at this point, 100% certain that that's a fully realizable
goal.  But I definitely think it's insane to expect that to work
without also tracking changes in the ownership of said objects.

Maybe forbidding ALTER OWNER on extension-owned objects isn't
such a bad idea?

regards, tom lane




Re: commitfest.postgresql.org is no longer fit for purpose

2024-05-24 Thread Tom Lane
Tomas Vondra  writes:
> On 5/24/24 19:09, Tom Lane wrote:
>>>> ... Maybe we could divide and conquer: get a
>>>> dozen-or-so senior contributors to split up the list of pending
>>>> patches and then look at each one with an eye to what needs to
>>>> happen to move it along (*not* to commit it right away, although
>>>> in some cases maybe that's the thing to do).

> I think meeting all these conditions - a week early in the cycle, but
> not in the summer, when everyone can focus on this - will be difficult.

True.  Perhaps in the fall there'd be a better chance?

> So maybe it'd be better to just set some deadline by which this needs to
> be done, and make sure every pending patch has someone expected to look
> at it? IMHO we're not in position to assign stuff to people, so I guess
> people would just volunteer anyway - the CF app might track this.

One problem with a time-extended process is that the set of CF entries
is not static, so a predetermined division of labor will result in
missing some newly-arrived entries.  Maybe that's not a problem
though; anything newly-arrived is by definition not "stuck".  But we
would definitely need some support for keeping track of what's been
looked at and what remains, whereas if it happens over just a few
days that's probably not so essential.

> It's not entirely clear to me if this would effectively mean doing a
> regular review of those patches, or something less time consuming.

I was *not* proposing doing a regular review, unless of course
somebody really wants to do that.  What I am thinking about is
suggesting how to make progress on patches that are stuck, or in some
cases delivering the bad news that this patch seems unlikely to ever
get accepted and it's time to cut our losses.  (Patches that seem to
be moving along in good order probably don't need any attention in
this process, beyond determining that that's the case.)  That's why
I think we need some senior people doing this, as their opinions are
more likely to be taken seriously.

regards, tom lane




Re: Add SPLIT PARTITION/MERGE PARTITIONS commands

2024-05-24 Thread Tom Lane
The partition_split test has unstable results, as shown at [1].
I suggest adding "ORDER BY conname" to the two queries shown
to fail there.  Better look at other queries in the test for
possible similar problems, too.

    regards, tom lane

[1] 
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=jackdaw&dt=2024-05-24%2015%3A58%3A17




Re: DROP OWNED BY fails to clean out pg_init_privs grants

2024-05-24 Thread Tom Lane
Robert Haas  writes:
> On Fri, May 24, 2024 at 11:59 AM Tom Lane  wrote:
>> So this goal seems to
>> mean that neither ALTER OWNER nor REASSIGN OWNED should touch
>> pg_init_privs at all, as that would break its function of recording
>> a historical state.  Only DROP OWNED should get rid of pg_init_privs
>> grants, and that only because there's no choice -- if the role is
>> about to go away, we can't hang on to a reference to its OID.

> But I would have thought that the right thing to do to pg_init_privs
> here would be essentially s/$OLDOWNER/$NEWOWNER/g.

Doesn't seem right to me.  That will give pg_dump the wrong idea
of what the initial privileges actually were, and I don't see how
it can construct correct delta GRANT/REVOKE on the basis of false
information.  During the dump reload, the extension will be
recreated with the original owner (I think), causing its objects'
privileges to go back to the original pg_init_privs values.
Applying a delta that starts from some other state seems pretty
questionable in that case.

It could be that if we expect pg_dump to issue an ALTER OWNER
to move ownership of the altered extension object to its new
owner, and only then apply its computed delta GRANT/REVOKEs,
then indeed the right thing is for the original ALTER OWNER
to apply s/$OLDOWNER/$NEWOWNER/g to pg_init_privs.  I've not
thought this through in complete detail, but it feels like
that might work, because the reload-time ALTER OWNER would
apply exactly that change to both the object's ACL and its
pg_init_privs, and then the delta is starting from the right state.
Of course, pg_dump can't do that right now because it lacks the
information that such an ALTER is needed.

Although ... this is tickling a recollection that pg_dump doesn't
try very hard to run CREATE EXTENSION with the same owner that
the extension had originally.  That's a leftover from the times
when basically all extensions required superuser to install,
and of course one superuser is as good as the next.  There might
be some work we have to do on that side too if we want to up
our game in this area.

Another case that's likely not handled well is what if the extension
really shouldn't have its original owner (e.g. you're using
--no-owner).  If it's restored under a new owner then the
pg_init_privs data certainly doesn't apply, and it feels like it'd
be mostly luck if the precomputed delta GRANT/REVOKEs lead to a
state you like.

regards, tom lane




Re: PG17beta1: Unable to test Postgres on Fedora due to fatal Error in psql: undefined symbol: PQsocketPoll

2024-05-24 Thread Tom Lane
Hans Buschmann  writes:
> When I tried to connect to the restored database with psql \c I got:
> ...
> postgres=# \c cpsdb
> pgbeta/bin/psql: symbol lookup error: pgbeta/bin/psql: undefined symbol: 
> PQsocketPoll

> (To my understanding) the problem comes from incompatible libpq.so libraries 
> on the system.

Right, you must have a v16-or-earlier libpq lying around somewhere,
and psql has bound to that not to the beta-test version.
PQsocketPoll is new in v17.

> - Why doesn't psql use the just created lib64/libpq.so.5.17 from ninja 
> install?

It's on you to ensure that happens, especially on Linux systems which
have a strong bias towards pulling libraries from /usr/lib[64].
Normally our --enable-rpath option is sufficient; while that's
default in an autoconf-based build, I'm not sure that it is
in a meson build.  Also, if your beta libpq is not where the
rpath option expected it to get installed, the linker will silently
fall back to /usr/lib[64].

> The loading of the locally available libpq.so should always have
> priority over a system-wide one in /usr/lib64

Tell it to the Linux developers --- they think the opposite.
Likewise, all of your other proposals need to be addressed to
the various distros' packagers; this is not the place to complain.

The main thing that is bothering me about the behavior you
describe is that it didn't fail until psql actually tried to
call PQsocketPoll.  (AFAICT from a quick look, that occurs
during \c but not during the startup connection.)  I had thought
that we select link options that result in early binding and
hence startup-time failure for a case like this.  I can confirm
though that this acts as described on my RHEL8 box if I force
current psql to link to v16 libpq, so either we've broken that
or it never did apply to frontend programs.  But it doesn't
seem to me to be a great thing for it to behave like this.
You could easily miss that you have a broken setup until
after you deploy it.

    regards, tom lane




Re: commitfest.postgresql.org is no longer fit for purpose

2024-05-24 Thread Tom Lane
Tomas Vondra  writes:
> On 5/20/24 16:16, Robert Haas wrote:
>> On Sun, May 19, 2024 at 3:18 PM Tom Lane  wrote:
>>> * Before starting this thread, Robert did a lot of very valuable
>>> review of some individual patches.  I think what prompted him to
>>> start the thread was the realization that he'd only made a small
>>> dent in the problem.  Maybe we could divide and conquer: get a
>>> dozen-or-so senior contributors to split up the list of pending
>>> patches and then look at each one with an eye to what needs to
>>> happen to move it along (*not* to commit it right away, although
>>> in some cases maybe that's the thing to do).  It'd be great if
>>> that could happen just before each commitfest, but that's probably
>>> not practical with the current patch volume.  What I'm thinking
>>> for the moment is to try to make that happen once a year or so.

>> I like this idea.

> Me too. Do you think it'd happen throughout the whole year, or at some
> particular moment?

I was imagining a focused community effort spanning a few days to
a week.  It seems more likely to actually happen if we attack it
that way than if it's spread out as something people will do when
they get around to it.  Of course the downside is finding a week
when everybody is available.

> We used to do a "triage" at the FOSDEM PGDay meeting, but that used to
> be more of an ad hoc thing to use the remaining time, with only a small
> subset of people. But that seems pretty late in the dev cycle, I guess
> we'd want it to happen early, perhaps during the first CF?

Yeah, early in the cycle seems more useful, although the summer might
be the worst time for peoples' availability.

regards, tom lane




Re: DROP OWNED BY fails to clean out pg_init_privs grants

2024-05-24 Thread Tom Lane
Daniel Gustafsson  writes:
> On 24 May 2024, at 16:20, Tom Lane  wrote:
>> Another point: shdepReassignOwned explicitly does not touch grants
>> or default ACLs.  It feels like the same should be true of
>> pg_init_privs entries,

> Agreed, I can't see why pg_init_privs should be treated differently.

Thinking about this some more: the point of pg_init_privs is to record
an object's privileges as they stood at the end of CREATE EXTENSION
(or extension update), with the goal that pg_dump should be able to
compute the delta between that and the object's current privileges
and emit GRANT/REVOKE commands to restore those current privileges
after a fresh extension install.  (We slide gently past the question
of whether the fresh extension install is certain to create privileges
matching the previous pg_init_privs entry.)  So this goal seems to
mean that neither ALTER OWNER nor REASSIGN OWNED should touch
pg_init_privs at all, as that would break its function of recording
a historical state.  Only DROP OWNED should get rid of pg_init_privs
grants, and that only because there's no choice -- if the role is
about to go away, we can't hang on to a reference to its OID.

However ... then what are the implications of doing ALTER OWNER on
an extension-owned object?  Is pg_dump supposed to recognize that
that's happened and replay it too?  If not, is it sane at all to
try to restore the current privileges, which are surely dependent
on the current owner?  I kind of doubt that that's possible at all,
and even if it is it might result in security issues.  It seems
like pg_init_privs has missed a critical thing, which is to record
the original owner not only the original privileges.

(Alternatively, maybe we should forbid ALTER OWNER on extension-owned
objects?  Or at least on those having pg_init_privs entries?)


I'm wondering too about this scenario:

1. CREATE EXTENSION installs an object and sets some initial privileges.

2. DBA manually modifies the object's privileges.

3. ALTER EXTENSION UPDATE further modifies the object's privileges.

I think what will happen is that at the end of ALTER EXTENSION,
we'll store the object's current ACL verbatim in pg_init_privs,
therefore including the effects of step 2.  This seems undesirable,
but I'm not sure how to get around it.


Anyway, this is starting to look like the sort of can of worms
best not opened post-beta1.  v17 has made some things better in this
area, and I don't think it's made anything worse; so maybe we should
declare victory for the moment and hope to address these additional
concerns later.  I've added an open item though.

        regards, tom lane




Re: Convert node test compile-time settings into run-time parameters

2024-05-24 Thread Tom Lane
Peter Eisentraut  writes:
> Ok, I have an improved plan.  I'm wrapping all the code related to this 
> in #ifdef DEBUG_NODE_TESTS_ENABLED.  This in turn is defined in 
> assert-enabled builds, or if you define it explicitly, or if you define 
> one of the legacy individual symbols.  That way you get the run-time 
> settings in a normal development build, but there is no new run-time 
> overhead.  This is the same scheme that we use for debug_discard_caches.

+1; this addresses my concern about not adding effectively-dead code
to production builds.  Your point upthread about debug_print_plan and
other legacy debug switches was not without merit; should we also fold
those into this plan?  (In that case we'd need a symbol named more
generically than DEBUG_NODE_TESTS_ENABLED.)

> (An argument could be made to enable this code if and only if assertions 
> are enabled, since these tests are themselves kind of assertions.  But I 
> think having a separate symbol documents the purpose of the various code 
> sections better.)

Agreed.

>> Maybe not three settings, but a single setting, with multiple values, like
>> debug_io_direct?

> Yeah, good idea.  Let's get some more feedback on this before I code up 
> a complicated list parser.

Kinda doubt it's worth the trouble, either to code the GUC support or
to use it.  I don't object to having the booleans in a debug build,
I was just concerned about whether they should exist in production.

regards, tom lane




Re: DROP OWNED BY fails to clean out pg_init_privs grants

2024-05-24 Thread Tom Lane
Daniel Gustafsson  writes:
> I had a look, but I didn't beat you to a fix since it's not immediately
> clear to me how this should work for REASSIGN OWNED (DROP OWNED seems a
> simpler case).  Should REASSIGN OWNED alter the rows in pg_shdepend
> matching init privs from SHARED_DEPENDENCY_OWNER to
> SHARED_DEPENDENCY_INITACL, so that these can be mopped up with a later
> DROP OWNED?  Trying this in a POC patch it fails with
> RemoveRoleFromInitPriv not removing the rows; short-circuiting that for
> a test seems to make it work, but is it the right approach?

I've tentatively concluded that I shouldn't have modeled
SHARED_DEPENDENCY_INITACL so closely on SHARED_DEPENDENCY_ACL,
in particular the decision that we don't need such an entry if
there's also SHARED_DEPENDENCY_OWNER.  I think one reason we
can get away with omitting a SHARED_DEPENDENCY_ACL entry for the
owner is that the object's normal ACL is part of its primary
catalog row, so it goes away automatically if the object is
dropped.  But obviously that's not true for a pg_init_privs
entry.  I can see two routes to a solution:

1. Create SHARED_DEPENDENCY_INITACL, if applicable, whether the
role is the object's owner or not.  Then, clearing out the
pg_shdepend entry cues us to go delete the pg_init_privs entry.

2. Just always search pg_init_privs for relevant entries
when dropping an object.

I don't especially like #2 on performance grounds, but it has
a lot fewer moving parts than #1.  In particular, there's some
handwaving in changeDependencyOnOwner() about why we should
drop SHARED_DEPENDENCY_ACL when changing owner, and I've not
wrapped my head around how those concerns map to INITACL
if we treat it in this different way.

Another point: shdepReassignOwned explicitly does not touch grants
or default ACLs.  It feels like the same should be true of
pg_init_privs entries, or at least if not, why not?  In that case
there's nothing to be done in shdepReassignOwned (although maybe its
comments should be adjusted to mention this explicitly).  The bug is
just that DROP OWNED isn't getting rid of the entries because there's
no INITACL entry to cue it to do so.

Another thing I'm wondering about right now is privileges on global
objects (roles, databases, tablespaces).  The fine manual says
"Although an extension script is not prohibited from creating such
objects, if it does so they will not be tracked as part of the
extension".  Presumably, that also means that no pg_init_privs
entries are made; but do we do that correctly?

Anyway, -ENOCAFFEINE for the moment.  I'll look more later.

regards, tom lane




Re: PG catalog

2024-05-24 Thread Tom Lane
"Karki, Sanjay"  writes:
> I need to grant select privilege on pg_catalog to a user so I can
> connect via Toad Data Point,

Why do you think you need to do that?  Most catalogs have public
select privilege already, and for the ones that don't, there are
very good reasons why not.  I don't know what "Toad Data point"
is, but if it thinks it needs more privilege than is normally
granted, you should be asking very pointed questions about why
and why that shouldn't be considered a security breach.

(Usually we get complaints that the default permissions on the
catalogs are too loose, not too tight.)

        regards, tom lane




Re: struct RelOptInfo member relid comments

2024-05-23 Thread Tom Lane
Ashutosh Bapat  writes:
> OJ is an outer join, AFAIU. OJ's have their own relids. If you are
> wondering why not all joins - I think inner joins need not be tracked as
> separated relations in parse tree, but OJ's need to be.

An outer join is necessarily associated with explicit JOIN syntax
in the FROM clause, and each such JOIN has its own rangetable entry
and hence a relid.  Inner joins might arise from comma syntax
(that is, "SELECT ... FROM tab1, tab2").  For perhaps-historical
reasons that syntax doesn't give rise to an explicit RTE_JOIN
rangetable entry, so the implied join has no relid.

        regards, tom lane




Re: struct RelOptInfo member relid comments

2024-05-23 Thread Tom Lane
jian he  writes:
> imho, the above comment is not very helpful.
> we should say more about what kind of information relid says about a base rel?

"Relid" is defined at the very top of the file:

/*
 * Relids
 *  Set of relation identifiers (indexes into the rangetable).
 */
typedef Bitmapset *Relids;

Repeating that everyplace the term "relid" appears would not be
tremendously helpful.

        regards, tom lane




Re: First draft of PG 17 release notes

2024-05-23 Thread Tom Lane
Bruce Momjian  writes:
> On Thu, May 23, 2024 at 02:27:07PM +1200, David Rowley wrote:
>> I also don't agree these should be left to "Source code" section. I
>> feel that section is best suited for extension authors who might care
>> about some internal API change.  I'm talking of stuff that makes some
>> user queries possibly N times (!) faster. Surely "Source Code" isn't
>> the place to talk about that?

> I said a new section "after our 'Source code' section," not in the
> source code section.

Surely, if we make this a separate section, it would come before
'Source code'?

I am not sure Bruce that you realize that your disregard for
performance improvements is shared by nobody.  Arguably,
performance is 90% of what we do these days, and it's also
90% of what users care about.

regards, tom lane




Re: DROP OWNED BY fails to clean out pg_init_privs grants

2024-05-23 Thread Tom Lane
Hannu Krosing  writes:
> While the 'DROP OWNED BY fails to clean out pg_init_privs grants'
> issue is now fixed, we have a similar issue with REASSIGN OWNED BY that
> is still there:

Ugh, how embarrassing.  I'll take a look tomorrow, if no one
beats me to it.

regards, tom lane




Re: Add new protocol message to change GUCs for usage with future protocol-only GUCs

2024-05-23 Thread Tom Lane
Robert Haas  writes:
> I think part of the reason we ended up with the protocol parameters =
> GUCs thing is because you seemed to be concurring with that approach
> upthread. I think it was Jelte's idea originally, but I interpreted
> some of your earlier remarks to be supporting it. I'm not sure whether
> you've revised your opinion, or just refined it, or whether we
> misinterpreted your earlier remarks.

I don't recall exactly what I thought earlier, but now I think we'd
be better off with separate infrastructure.  guc.c is unduly complex
already.  Perhaps there are bits of it that could be factored out
and shared, but I bet not a lot.

        regards, tom lane




Re: Add new protocol message to change GUCs for usage with future protocol-only GUCs

2024-05-23 Thread Tom Lane
Jacob Champion  writes:
> Would it be good to expand on that idea of criticality? IIRC one of
> Jelte's complaints earlier was that middleware has to know all the
> extension types anyway, to be able to figure out whether it has to do
> something about them or not. HTTP has the concept of hop-by-hop vs
> end-to-end headers for related reasons.

Yeah, perhaps.  We'd need to figure out just which classes we need
to divide protocol parameters into, and then think about a way for
code to understand which class a parameter falls into even when
it doesn't specifically know that parameter.  That seems possible
though.  PNG did it with spelling rules for the chunk labels.
Here, since we don't yet have any existing _pq_.xxx parameter names,
we could maybe say that the names shall follow a pattern like
"_pq_.class.param".  (That works only if the classes are
non-overlapping, an assumption not yet justified by evidence;
but we could do something more complicated if we have to.)
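
For illustration only, a hypothetical sketch of how middleware could act
on such a naming rule without knowing individual parameters (the class
names and handler functions here are invented):

/* classify an unknown "_pq_.class.param" name by its class label */
if (strncmp(name, "_pq_.critical.", strlen("_pq_.critical.")) == 0)
{
    /* unknown but critical: refuse rather than silently ignore */
    reject_parameter(name);         /* hypothetical handler */
}
else if (strncmp(name, "_pq_.ignorable.", strlen("_pq_.ignorable.")) == 0)
{
    /* unknown but ignorable: safe to pass through or drop */
    pass_through_parameter(name);   /* hypothetical handler */
}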

    regards, tom lane




Re: Add new protocol message to change GUCs for usage with future protocol-only GUCs

2024-05-23 Thread Tom Lane
Jelte Fennema-Nio  writes:
> On Fri, 17 May 2024 at 21:24, Robert Haas  wrote:
>> Perhaps these are all minority positions, but I can't tell you what
>> everyone thinks, only what I think.

> I'd love to hear some opinions from others on these design choices. So
> far it seems like we're the only two that have opinions on this (which
> seems hard to believe) and our opinions are clearly conflicting. And
> above all I'd like to move forward with this, be it my way or yours
> (although I'd prefer my way of course ;) )

I got around to looking through this thread in preparation for next
week's patch review session.  I have a couple of opinions to offer:

1. Protocol versions suck.  Bumping them is seldom a good answer,
and should never be done if you have any finer-grained negotiation
mechanism available.  My aversion to this is over thirty years old:
I learned that lesson from watching the GIF87-to-GIF89 transition mess.
Authors of GIF-writing tools tended to take the easy way out and write
"GIF89" in the header whether they were actually using any of the new
version's features or not.  This led to an awful lot of pictures that
couldn't be read by available GIF-displaying tools, for no good reason
whatsoever.  The PNG committee, a couple years later, reacted to that
mess by designing PNG to have no version number whatsoever, and yet
be extensible in a fine-grained way.  (Basically, a PNG file is made
up of labeled chunks.  If a reader doesn't recognize a particular
chunk code, it can still tell whether the chunk is "critical" or not,
and thereby decide if it must give up or can proceed while ignoring
that chunk.)

So overall, I have a strong preference for using the _pq_.xxx
mechanism instead of a protocol version bump.  I do not believe
the latter has any advantage.

2. I share Robert's suspicion of equating protocol parameters
with GUCs.  The GUC mechanism is quite opinionated and already
serves multiple masters.  In particular, the fact that GUC
settings are normally transactional does not play nice with
the way protocol parameters need to behave.  Yeah, no doubt you
could add another dollop of complexity to guc.c to make parameters
work differently from other GUCs, but I think it's the wrong design
direction.  We should handle protocol parameters with a separate
mechanism.  It's not, for instance, clear to me that protocol
parameters should be exposed at the SQL level at all; but if we
don't feel they need to be available via SHOW and pg_settings,
what benefit is guc.c really bringing to the table?

regards, tom lane




Re: about cross-compiling issue

2024-05-23 Thread Tom Lane
"=?utf-8?B?6ZmI5Lqa5p2w?=" <1441147...@qq.com> writes:
> Hello, I have a question about cross-compiling. I get an error when doing 
> initdb for postgresql for arm64 architecture devices.
> The error information is Error relocating 
> /data/postgresql/postgresql-16.3-arm64-v8a-build/tmp_install/usr/postgresql/arm64-v8a/lib/dict_snowball.so:
>  palloc0: symbol not found.

We don't really support cross-compiling, because there are too many
things that the configure script can't check for if it can't run a
test program.  In this particular case I think what is biting you
is that configure won't add -Wl,--export-dynamic to the backend
link switches.

You might think that that shouldn't require a test program to
verify, but c-compiler.m4 says differently:

# Given a string, check if the compiler supports the string as a
# command-line option. If it does, add to the given variable.
# For reasons you'd really rather not know about, this checks whether
# you can link to a particular function, not just whether you can link.
# In fact, we must actually check that the resulting program runs :-(

This check dates to 2008, and maybe it's no longer necessary on
any modern system, but I'm unexcited about trying to find out.

        regards, tom lane




Re: Sort operation displays more tuples than it contains its subnode

2024-05-22 Thread Tom Lane
"a.rybakina"  writes:
> I faced an issue where the sort node, in the actual (EXPLAIN ANALYZE)
> information, shows a larger number of tuples than it actually contains.
> And I cannot understand why.

If I'm reading this correctly, the sort node you're worrying about
feeds the inner side of a merge join.  Merge join will rewind its
inner side to the start of the current group of equal-keyed tuples
whenever it sees that the next outer tuple must also be joined to
that group.  Since what EXPLAIN is counting is the number of tuples
returned from the node, that causes it to double-count those tuples.
The more duplicate-keyed tuples on the outer side, the bigger the
effect.

You can see the same thing happening at the Materialize a little
further up, which is feeding the inside of the other merge join.

        regards, tom lane




Re: Schema variables - new implementation for Postgres 15

2024-05-22 Thread Tom Lane
Peter Eisentraut  writes:
> On 18.05.24 13:29, Alvaro Herrera wrote:
>> I want to note that when we discussed this patch series at the dev
>> meeting in FOSDEM, a sort-of conclusion was reached that we didn't want
>> schema variables at all because of the fact that creating a variable
>> would potentially change the meaning of queries by shadowing table
>> columns.  But this turns out to be incorrect: it's _variables_ that are
>> shadowed by table columns, not the other way around.

> But that's still bad, because seemingly unrelated schema changes can 
> make variables appear and disappear.  For example, if you have
>   SELECT a, b FROM table1
> and then you drop column b, maybe the above query continues to work 
> because there is also a variable b.

Yeah, that seems pretty dangerous.  Could we make it safe enough
by requiring some qualification on variable names?  That is, if
you mean b to be a variable, then you must write something like

SELECT a, pg_variables.b FROM table1

This is still ambiguous if you use "pg_variables" as a table alias in
the query, but the alias would win so the query still means what it
meant before.  Also, table aliases (as opposed to actual table names)
don't change readily, so I don't think there's much risk of the query
suddenly meaning something different than it did yesterday.

regards, tom lane




Re: Possible Bug in relation_open

2024-05-21 Thread Tom Lane
Robert Haas  writes:
> On Tue, May 21, 2024 at 9:58 AM Pradeep Kumar  wrote:
>> If the user tries to open the relation in RangeVar and NoLock mode by
>> calling table_openrv(relation, NoLock), it will internally call
>> relation_openrv() --> relation_open(). In relation_open() we check
>> Assert(lockmode >= NoLock && lockmode < MAX_LOCKMODES); here we expect
>> the lockmode to be NoLock or greater. But in the same function we again
>> check this assert case: Assert(lockmode != NoLock ||
>> IsBootstrapProcessingMode() || CheckRelationLockedByMe(r,
>> AccessShareLock, true)); here we expect (lockmode != NoLock). So why
>> are there two cases that contradict? And what if the user tries to open
>> the relation in NoLock mode? That will definitely cause the assert
>> failure. Suppose a user writes some extension and reads some relation
>> oid that is constant, and wants to acquire NoLock? I need some
>> clarification on this.

> You need to acquire a lock. Otherwise, the relcache entry could change
> underneath you while you're accessing it, which would result in
> PostgreSQL crashing.

To clarify: the rule is that it's only allowed to pass NoLock if you
know for certain that some suitable lock on that relation is already
held by the current query.  That's why these conditions are complicated.
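
For concreteness, a minimal sketch of the rule in hypothetical extension
code (relid stands for a relation OID obtained earlier):

Relation rel = table_open(relid, AccessShareLock);

/*
 * ... later in the same query, while that lock is still held, a second
 * open may legitimately pass NoLock ...
 */
Relation rel2 = table_open(relid, NoLock);

table_close(rel2, NoLock);
table_close(rel, AccessShareLock);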

regards, tom lane




Re: typedef extraction - back to the future

2024-05-20 Thread Tom Lane
Andrew Dunstan  writes:
> Attached is an attempt to thread this needle. The core is a new perl 
> module that imports the current buildfarm client logic. The intention is 
> that once we have this, the buildfarm client will switch to using the 
> module (if found) rather than its own built-in logic. There is precedent 
> for this sort of arrangement (AdjustUpgrade.pm). Accompanying the new 
> module is a standalone perl script that uses the new module, and 
> replaces the current shell script (thus making it more portable).

Haven't read the code in detail, but +1 for concept.  A couple of
minor quibbles:

* Why not call the wrapper script "find_typedefs"?  Without the "s"
it seems rather confusing --- "which typedef is this supposed to
find, exactly?"

* The header comment for sub typedefs seems to have adequate detail
about what the arguments are, but that all ought to be propagated
into the --help output for the wrapper script.  Right now you
couldn't figure out how to use the wrapper without reading the
underlying module.

regards, tom lane




Re: Reading timestamp values from Datums gives garbage values

2024-05-20 Thread Tom Lane
Tomas Vondra  writes:
> On 5/20/24 16:37, Sushrut Shivaswamy wrote:
>> When trying to read the query response from the Datum, I get garbage values.
>> I've tried various types and none of them read the correct value.

> TimestampTz is int64, so using DatumGetInt64 is probably the simplest
> solution. And it's the number of microseconds, so X/1e6 should give you
> the epoch.

Don't forget that TimestampTz uses an epoch (time zero) of 2000-01-01.
If you want a Unix-convention value where the epoch is 1970-01-01,
you'll need to add 30 years to the result.
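
For illustration, a minimal sketch of that conversion (assuming the
Datum really does hold a timestamptz; 946684800 is the number of seconds
between the two epochs):

TimestampTz ts = DatumGetTimestampTz(datum);
int64 unix_usecs = ts + INT64CONST(946684800) * USECS_PER_SEC;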

The reported values seem pretty substantially off, though ---
5293917674 would be barely an hour and a half later than the
epoch, which seems unlikely to be the value Sushrut intended
to test with.  I suspect a mistake that's outside the fragment
of code we were shown.

        regards, tom lane




Re: Convert node test compile-time settings into run-time parameters

2024-05-20 Thread Tom Lane
Peter Eisentraut  writes:
> This patch converts the compile-time settings
>  COPY_PARSE_PLAN_TREES
>  WRITE_READ_PARSE_PLAN_TREES
>  RAW_EXPRESSION_COVERAGE_TEST

> into run-time parameters

>  debug_copy_parse_plan_trees
>  debug_write_read_parse_plan_trees
>  debug_raw_expression_coverage_test

I'm kind of down on this.  It seems like forcing a bunch of
useless-in-production debug support into the standard build.
What of this would be of any use to any non-developer?

        regards, tom lane




Re: Underscore in positional parameters?

2024-05-19 Thread Tom Lane
Erik Wienhold  writes:
> On 2024-05-20 03:26 +0200, jian he wrote:
>> /* Check parameter number is in range */
>> if (paramno <= 0 || paramno > MaxAllocSize / sizeof(Oid))
>> ereport(ERROR, ...

> Yes, it makes sense to show the upper bound.  How about a hint such as
> "Valid parameters range from $%d to $%d."?

I kind of feel like this upper bound is ridiculous.  In what scenario
is parameter 25000 not a mistake, if not indeed somebody trying
to break the system?

The "Bind" protocol message only allows an int16 parameter count,
so rejecting parameter numbers above 32K would make sense to me.
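
(Purely as a hypothetical illustration of that suggestion, mirroring the
check quoted above:)

/* hypothetical: use the Bind message's int16 parameter count
 * rather than the MaxAllocSize-derived bound */
if (paramno <= 0 || paramno > PG_INT16_MAX)
    ereport(ERROR, ...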

regards, tom lane




Re: commitfest.postgresql.org is no longer fit for purpose

2024-05-19 Thread Tom Lane
Bruce Momjian  writes:
> On Sun, May 19, 2024 at 03:18:11PM -0400, Tom Lane wrote:
>> * Another reason for things sitting a long time is that they're too
>> big to review without an unreasonable amount of effort.  We should
>> encourage authors to break large patches into smaller stepwise
>> refinements.  It may seem like that will result in taking forever
>> to reach the end goal, but dropping a huge patchset on the community
>> isn't going to give speedy results either.

> I think it is sometimes hard to incrementally apply patches if the
> long-term goal isn't agreed or know to be achievable.

True.  The value of the earlier patches in the series can be unclear
if you don't understand what the end goal is.  But I think people
could post a "road map" of how they intend a patch series to go.

Another way of looking at this is that sometimes people do post large
chunks of work in long many-patch sets, but we tend to look at the
whole series as something to review and commit as one (or I do, at
least).  We should be more willing to bite off and push the earlier
patches in such a series even when the later ones aren't entirely
done.

(The cfbot tends to discourage this, since as soon as one of the
patches is committed it no longer knows how to apply the rest.
Can we improve on that tooling somehow?)

regards, tom lane




Re: commitfest.postgresql.org is no longer fit for purpose

2024-05-19 Thread Tom Lane
Dmitry Dolgov <9erthali...@gmail.com> writes:
> There are lots of good takes on this in the thread. It also makes clear
> what's at stake -- as Melanie pointed out with the patch about EXPLAIN
> for parallel bitmap heap scan, we're losing potential contributors for
> no reason. But I'm a bit concerned about what the next steps are: if
> memory serves, every couple of years there is a discussion about
> everything that goes wrong with the review process, commitfests, etc.
> Yet to my (admittedly limited) insight into the community, not many
> things have changed due to those discussions. How do we make sure this
> time it will be different?

Things *have* changed, if you take the long view.  We didn't have
commitfests at all until around 2007, and we've changed the ground
rules for them a couple of times since then.  We didn't have the CF
app at all until, well, I don't recall when, but the first few CFs
were managed by keeping patch lists on a wiki page.  It's not that
people are unwilling to change this stuff, but that it's hard to
identify what will make things better.

IMV one really fundamental problem is the same as it's been for a
couple of decades: too many patch submissions, too few committers.
We can't fix that by just appointing a ton more committers, at least
not if we want to keep the project's quality up.  We have to grow
qualified committers.  IIRC, one of the main reasons for instituting
the commitfests at all was the hope that if we got more people to
spend time reading the code and reviewing patches, some of them would
learn enough to become good committers.  I think that's worked, again
taking a long view.  I just did some quick statistics on the commit
history, and I see that we were hovering at somewhere around ten
active committers from 1999 to 2012, but since then it's slowly crept
up to about two dozen today.  (I'm counting "active" as "at least 10
commits per year", which is an arbitrary cutoff --- feel free to slice
the data for yourself.)  Meanwhile the number of submissions has also
grown, so I'm not sure how much better the load ratio is.

My point here is not that things are great, but just that we are
indeed improving, and I hope we can continue to.  Let's not be
defeatist about it.

I think this thread has already identified a few things we have
consensus to improve in the CF app, and I hope somebody goes off
and makes those happen (I lack the web skillz to help myself).
However, the app itself is just a tool; what counts more is our
process around it.  I have a couple of thoughts about that:

* Patches that sit in the queue a long time tend to be ones that lack
consensus, either about the goal or the details of how to achieve it.
Sometimes "lacks consensus" really means "nobody but the author thinks
this is a good idea, but we haven't mustered the will to say no".
But I think it's more usually the case that there are plausible
competing opinions about what the patch should do or how it should
do it.  How can we resolve such differences and get something done?

* Another reason for things sitting a long time is that they're too
big to review without an unreasonable amount of effort.  We should
encourage authors to break large patches into smaller stepwise
refinements.  It may seem like that will result in taking forever
to reach the end goal, but dropping a huge patchset on the community
isn't going to give speedy results either.

* Before starting this thread, Robert did a lot of very valuable
review of some individual patches.  I think what prompted him to
start the thread was the realization that he'd only made a small
dent in the problem.  Maybe we could divide and conquer: get a
dozen-or-so senior contributors to split up the list of pending
patches and then look at each one with an eye to what needs to
happen to move it along (*not* to commit it right away, although
in some cases maybe that's the thing to do).  It'd be great if
that could happen just before each commitfest, but that's probably
not practical with the current patch volume.  What I'm thinking
for the moment is to try to make that happen once a year or so.

> What I think wasn't discussed yet in detail is the question of
> motivation. Surely, it would be great to have a process that introduces
> as little burden as possible. But giving people more motivation to
> follow the process / use the tool is just as important. What motivates
> folks to review patches, figure out the status of a complicated patch
> thread, maintain a list of open items, etc?

Yeah, all this stuff ultimately gets done "for the good of the
project", which isn't the most reliable motivation perhaps.
I don't see a better one...

regards, tom lane




State of pg_createsubscriber

2024-05-19 Thread Tom Lane
I'm fairly disturbed about the readiness of pg_createsubscriber.
The 040_pg_createsubscriber.pl TAP test is moderately unstable
in the buildfarm [1], and there are various unaddressed complaints
at the end of the patch thread (pretty much everything below [2]).
I guess this is good enough to start beta with, but it's far from
being good enough to ship, IMO.  If there were active work going
on to fix these things, I'd feel better, but neither the C code
nor the test script have been touched since 1 April.

regards, tom lane

[1] 
https://buildfarm.postgresql.org/cgi-bin/show_failures.pl?max_days=30&branch=HEAD&member=&stage=pg_basebackupCheck&filter=Submit
[2] 
https://www.postgresql.org/message-id/flat/3fa9ef0f-b277-4c13-850a-8ccc04de1406%40eisentraut.org#152dacecfc8f0cf08cbd8ecb79d6d38f




Re: [PATCH] Avoid mixing custom and OpenSSL BIO functions

2024-05-19 Thread Tom Lane
Daniel Gustafsson  writes:
> On 19 Apr 2024, at 15:13, David Benjamin  wrote:
>> Circling back here, anything else needed from my end on this patch?

> Patience mostly, v17 very recently entered feature freeze so a lof of the
> developers are busy finding and fixing bugs to stabilize the
> release.

Per the cfbot [1], this patch needs a rebase over the ALPN-related
commits.  It still isn't likely to get human attention before the
July commitfest, but you can save some time by making sure it's
in a reviewable state before that.

        regards, tom lane

[1] http://commitfest.cputube.org/david-benjamin.html




Re: generic plans and "initial" pruning

2024-05-18 Thread Tom Lane
David Rowley  writes:
> With the caveat of not yet having looked at the latest patch, my
> thoughts are that having the executor startup responsible for taking
> locks is a bad idea and I don't think we should go down this path.

OK, it's certainly still up for argument, but ...

> 1. No ability to control the order that the locks are obtained. The
> order in which the locks are taken will be at the mercy of the plan
> the planner chooses.

I do not think I buy this argument, because plancache.c doesn't
provide any "ability to control the order" today, and never has.
The order in which AcquireExecutorLocks re-gets relation locks is only
weakly related to the order in which the parser/planner got them
originally.  The order in which AcquirePlannerLocks re-gets the locks
is even less related to the original.  This doesn't cause any big
problems that I'm aware of, because these locks are fairly weak.

I think we do have a guarantee that for partitioned tables, parents
will be locked before children, and that's probably valuable.
But an executor-driven lock order could preserve that property too.

> 2. It introduces lots of complexity regarding how to cleanly clean up
> after a failed executor startup which is likely to make exec startup
> slower and the code more complex

Perhaps true, I'm not sure.  But the patch we'd been discussing
before this proposal was darn complex as well.

> 3. It puts us even further down the path of actually needing an
> executor startup phase.

Huh?  We have such a thing already.

> For #1, the locks taken for SELECT queries are less likely to conflict
> with other locks obtained by PostgreSQL, but at least at the moment if
> someone is getting deadlocks with a DDL type operation, they can
> change their query or DDL script so that locks are taken in the same
> order.  If we allowed executor startup to do this then if someone
> comes complaining that PG18 deadlocks when PG17 didn't we'd just have
> to tell them to live with it.  There's a comment at the bottom of
> find_inheritance_children_extended() just above the qsort() which
> explains about the deadlocking issue.

The reason it's important there is that function is (sometimes)
used for lock modes that *are* exclusive.

> For #3, I've been thinking about what improvements we can do to make
> the executor more efficient. In [1], Andres talks about some very
> interesting things. In particular, in his email items 3) and 5) are
> relevant here. If we did move lots of executor startup code into the
> planner, I think it would be possible to one day get rid of executor
> startup and have the plan record how much memory is needed for the
> non-readonly part of the executor state and tag each plan node with
> the offset in bytes they should use for their portion of the executor
> working state.

I'm fairly skeptical about that idea.  The entire reason we have an
issue here is that we want to do runtime partition pruning, which
by definition can't be done at plan time.  So I doubt it's going
to play nice with what we are trying to accomplish in this thread.

Moreover, while "replace a bunch of small pallocs with one big one"
would save some palloc effort, what are you going to do to ensure
that that memory has the right initial contents?  I think this idea is
likely to make the executor a great deal more notationally complex
without actually buying all that much.  Maybe Andres can make it work,
but I don't want to contort other parts of the system design on the
purely hypothetical basis that this might happen.

> I think what Amit had before your objection was starting to turn into
> something workable and we should switch back to working on that.

The reason I posted this idea was that I didn't think the previously
existing patch looked promising at all.

regards, tom lane




Re: Requiring LLVM 14+ in PostgreSQL 18

2024-05-18 Thread Tom Lane
Thomas Munro  writes:
> Oops, right, I didn't know we had that documented.  Thanks.  Will hold
> off doing anything until the thaw.

FWIW, I don't think the release freeze precludes docs-only fixes.
But if you prefer to sit on this, that's fine too.

regards, tom lane




Re: Why is citext/regress failing on hamerkop?

2024-05-18 Thread Tom Lane
Andrew Dunstan  writes:
> I've pushed a small change, that should just mark with an asterisk any 
> gitref that is more than 2 days older than the tip of the branch at the 
> time of reporting.

Thanks!

regards, tom lane




Re: Ignore Visual Studio's Temp Files While Working with PG on Windows

2024-05-18 Thread Tom Lane
Josef Šimánek  writes:
> But this is different. If I understand it well, just by following
> https://www.postgresql.org/docs/16/install-windows-full.html you'll
> get those files no matter what is your specific environment (or
> specific set of tools).

Hm?  Visual Studio seems like quite a specific tool from here.

I did some googling around the question of project .gitignore
files ignoring .vs/, and was amused to come across this:

https://github.com/github/gitignore/blob/main/VisualStudio.gitignore

which seems like a mighty fine example of where we *don't*
want to go.

    regards, tom lane




Re: Ignore Visual Studio's Temp Files While Working with PG on Windows

2024-05-18 Thread Tom Lane
Yasir  writes:
> We can add it to "~/.config/git/ignore" as it will ignore globally on
> windows which we don't want. Also we don't have ".git/info/exclude" in PG
> project's so the best place left is projects's .gitignore. That's what was
> patched.

As Peter said, we're not going to do that.  The intention with
the project's .gitignore files is to ignore files that are
intentionally built by our "make" targets (and, hopefully, will be
removed by "make maintainer-clean").  Anything else that you want
git to ignore should be in a personal ignore list; especially
files that are platform-specific.  The fact that it's reasonable
to ignore ".vs" files when working with your toolset doesn't mean
that it's reasonable to ignore them when working on some other
platform.

If we used some other policy, we'd have tons of debates about
which files were reasonable to exclude.  For myself, for example,
I exclude "*~" (Emacs backup files) and "*.orig" (patch(1)
backup files) but those choices are very much dependent on the
set of tools I choose to use.  Other developers have other
personal exclusion lists.  If we tried to make the project's
files be the union of all those lists, we'd be at serious risk
of ignoring stuff we absolutely shouldn't ignore in some contexts.
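
For instance, the right home for exclusions like those is a per-user
or per-clone ignore file:

	# in ~/.config/git/ignore, or .git/info/exclude for one clone
	*~
	*.orig
	.vs/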

regards, tom lane




Re: Speed up clean meson builds by ~25%

2024-05-18 Thread Tom Lane
Alvaro Herrera  writes:
> On 2024-May-17, Robert Haas wrote:
>> Anyone else want to vote?

> I had pretty much the same thought as you.  It seems a waste to leave
> the code in existing branches be slow only because we have a much better
> approach for a branch that doesn't even exist yet.

I won't complain too loudly as long as we remember to revert the
patch from HEAD once the ecpg fix goes in.

        regards, tom lane




Re: commitfest.postgresql.org is no longer fit for purpose

2024-05-17 Thread Tom Lane
Robert Haas  writes:
> On Fri, May 17, 2024 at 3:51 PM Laurenz Albe  wrote:
>> I think it is a good rule.  I don't think that it should lead to putting
>> people on the pillory or kicking their patches, but I imagine that a
>> committer looking for somebody else's patch to work on could prefer
>> patches by people who are doing their share of reviews.

> If you give me an automated way to find that out, I'll consider paying
> some attention to it.

Yeah, I can't imagine that any committer (or reviewer, really) is
doing any such thing, because it would take far too much effort to
figure out how much work anyone else is doing.  I see CFMs reminding
everybody that this rule exists, but I don't think they ever try to
check it either.  It's pretty much the honor system, and I'm sure
some folk ignore it.

    regards, tom lane




Re: Speed up clean meson builds by ~25%

2024-05-17 Thread Tom Lane
Robert Haas  writes:
> If this is the consensus opinion, then
> https://commitfest.postgresql.org/48/4914/ should be marked Rejected.
> However, while I think the improvements that Tom was able to make here
> sound fantastic, I don't understand why that's an argument against
> Jelte's patches. After all, Tom's work will only go into v18, but this
> patch could be adopted in v17 and back-patched to all releases that
> support meson builds, saving oodles of compile time for as long as
> those releases are supported. The most obvious beneficiary of that
> course of action would seem to be Tom himself, since he back-patches
> more fixes than anybody, last I checked, but it'd be also be useful to
> get slightly quicker results from the buildfarm and slightly quicker
> results for anyone using CI on back-branches and for other hackers who
> are looking to back-patch bug fixes. I don't quite understand why we
> want to throw those potential benefits out the window just because we
> have a better fix for the future.

As I mentioned upthread, I'm more worried about confusing error
reports than the machine time.  It would save me personally exactly
nada, since (a) I usually develop with gcc not clang, (b) when
I do use clang it's not a heavily-affected version, and (c) since
we *very* seldom change the grammar in stable branches, ccache will
hide the problem pretty effectively anyway in the back branches.

(If you're not using ccache, please don't complain about build time.)

I grant that there are people who are more affected, but still, I'd
just as soon not contort the build rules for a temporary problem.

regards, tom lane




Re: Postgres and --config-file option

2024-05-17 Thread Tom Lane
Robert Haas  writes:
> If the reason that somebody is upset is because it's not technically
> true to say that you *must* do one of those things, we could fix that
> with "You must" -> "You can" or with "You must specify" -> "Specify".
> The patch you propose is also not terrible or anything, but it goes in
> the direction of listing every alternative, which will become
> unpalatable as soon as somebody adds one more way to do it, or maybe
> it's unpalatable already.

I agree that it's not necessary or particularly useful for this hint
to be exhaustive.  I could get behind your suggestion of
s/You must specify/Specify/, but I also think it's fine to do nothing.

regards, tom lane




Re: problems with "Shared Memory and Semaphores" section of docs

2024-05-17 Thread Tom Lane
Nathan Bossart  writes:
> [ many, many problems in documented formulas ]

> At a bare minimum, we should probably fix the obvious problems, but I
> wonder if we could simplify this section a bit, too.

Yup.  "The definition of insanity is doing the same thing over and
over and expecting different results."  Time to give up on documenting
these things in such detail.  Anybody who really wants to know can
look at the source code.

> If the exact values
> are important, maybe we could introduce more GUCs like
> shared_memory_size_in_huge_pages that can be consulted (instead of
> requiring users to break out their calculators).

I don't especially like shared_memory_size_in_huge_pages, and I don't
want to introduce more of those.  GUCs are not the right way to expose
values that you can't actually set.  (Yeah, I'm guilty of some of the
existing ones like that, but it's still not a good thing.)  Maybe it's
time to introduce a system view for such things?  It could be really
simple, with name and value, or we could try to steal some additional
ideas such as units from pg_settings.
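
To illustrate the shape of it, usage might look like this (the view
name is invented here; nothing like it exists today):

	SELECT name, value
	  FROM pg_runtime_info		-- hypothetical name/value system view
	 WHERE name = 'shared_memory_size_in_huge_pages';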

        regards, tom lane




Re: Why is citext/regress failing on hamerkop?

2024-05-17 Thread Tom Lane
TAKATSUKA Haruka  writes:
> I'm a hamerkop maintainer.
> Sorry I have missed the scm error for so long.

> Today I switched scmrepo from git.postgresql.org/git/postgresql.git 
> to github.com/postgres/postgres.git and successfully modernized
> the build target code.

Thanks very much!  I see hamerkop has gone green in HEAD.

It looks like it succeeded in v13 too but failed in v12,
which suggests that the isolationcheck problem is intermittent,
which is not too surprising given our current theory about
what's causing that.

At this point I think we are too close to the 17beta1 release
freeze to mess with it, but I'd support pushing Thomas'
proposed patch after the freeze is over.

        regards, tom lane




Re: commitfest.postgresql.org is no longer fit for purpose

2024-05-17 Thread Tom Lane
"David G. Johnston"  writes:
> On Friday, May 17, 2024, Joe Conway  wrote:
>> A solution to both of these issues (yours and mine) would be to allow
>> things to be added *during* the CF month. What is the point of having a
>> "freeze" before every CF anyway? Especially if they start out clean. If
>> something is ready for review on day 8 of the CF, why not let it be added
>> for review?

> In conjunction with WIP removing this limitation on the bimonthlies makes
> sense to me.

I think that the original motivation for this was two-fold:

1. A notion of fairness, that you shouldn't get your patch reviewed
ahead of others that have been waiting (much?) longer.  I'm not sure
how much this is really worth.  In particular, even if we do delay
an incoming patch by one CF, it's still going to compete with the
older stuff in future CFs.  There's already a sort feature in the CF
dashboard whereby patches that have lingered for more CFs appear ahead
of patches that are newer, so maybe just ensuring that late-arriving
patches sort as "been around for 0 CFs" is sufficient.

2. As I mentioned a bit ago, the original idea was that we didn't exit
a CF until it was empty of un-handled patches, so obviously allowing
new patches to come in would mean we'd never get to empty.  That idea
didn't work and we don't think that way anymore.

So yeah, I'm okay with abandoning the must-submit-to-a-future-CF
restriction.

regards, tom lane




Re: commitfest.postgresql.org is no longer fit for purpose

2024-05-17 Thread Tom Lane
Melanie Plageman  writes:
> So, anyway, I'd argue that we need a parking lot for patches which we
> all agree are important and have a path forward but need someone to do
> the last 20-80% of the work. To avoid this being a dumping ground,
> patches should _only_ be allowed in the parking lot if they have a
> clear path forward. Patches which haven't gotten any interest don't go
> there. Patches in which the author has clearly not addressed feedback
> that is reasonable for them to address don't go there. These are
> effectively community TODOs which we agree need to be done -- if only
> someone had the time.

Hmm.  I was envisioning "parking lot" as meaning "this is on my
personal TODO list, and I'd like CI support for it, but I'm not
expecting anyone else to pay attention to it yet".  I think what
you are describing is valuable but different.  Maybe call it
"pending" or such?  Or invent a different name for the other thing.

regards, tom lane




Re: Why does pgindent's README say to download typedefs.list from the buildfarm?

2024-05-17 Thread Tom Lane
Peter Eisentraut  writes:
> On 16.05.24 16:45, Tom Lane wrote:
>> Yeah, that was bothering me too, but I went for the minimum delta.
>> I did think that a couple of integer macros would be a better idea,
>> so +1 for what you did here.

> I committed this, and Michael took care of WaitEventExtension, so we 
> should be all clear here.

Thanks.  I just made the committed typedefs.list exactly match the
current buildfarm output, so we're clean for now.

        regards, tom lane




Re: commitfest.postgresql.org is no longer fit for purpose

2024-05-17 Thread Tom Lane
Robert Haas  writes:
> On Fri, May 17, 2024 at 10:31 AM Tom Lane  wrote:
>> If we go back to the old its-development-mode-all-the-time approach,
>> what is likely to happen is that the commit rate for not-your-own-
>> patches goes to zero, because it's always possible to rationalize
>> your own stuff as being more important.

> We already have gone back to that model. We just haven't admitted it
> yet. And we're never going to get out of it until we find a way to get
> the contents of the CommitFest application down to a more reasonable
> size and level of complexity. There's just no way everyone's up for
> that level of pain. I'm not sure I'm up for that level of pain.

Yeah, we clearly need to get the patch list to a point of
manageability, but I don't agree that abandoning time-boxed CFs
will improve anything.

regards, tom lane




Re: commitfest.postgresql.org is no longer fit for purpose

2024-05-17 Thread Tom Lane
Robert Haas  writes:
> On Fri, May 17, 2024 at 9:54 AM Joe Conway  wrote:
>> I agree with you. Before commitfests were a thing, we had no structure
>> at all as I recall. The dates for the dev cycle were more fluid as I
>> recall, and we had no CF app to track things. Maybe retaining the
>> structure but going back to the continuous development would be just the
>> thing, with the CF app tracking just the currently (as of this
>> week/month/???) active stuff.

> The main thing that we'd gain from that is to avoid all the work of
> pushing lots of things forward to the next CommitFest at the end of
> every CommitFest.

To my mind, the point of the time-boxed commitfests is to provide
a structure wherein people will (hopefully) pay some actual attention
to other peoples' patches.  Conversely, the fact that we don't have
one running all the time gives committers some defined intervals
where they can work on their own stuff without feeling guilty that
they're not handling other people's patches.

If we go back to the old its-development-mode-all-the-time approach,
what is likely to happen is that the commit rate for not-your-own-
patches goes to zero, because it's always possible to rationalize
your own stuff as being more important.

> Like, we could also just have a button that says "move everything
> that's left to the next CommitFest".

Clearly, CFMs would appreciate some more tooling to make that sort
of thing easier.  Perhaps we omitted it in the original CF app
coding because we expected the end-of-CF backlog to be minimal,
but it's now obvious that it generally isn't.

BTW, I was reminded while trawling old email yesterday that
we used to freeze the content of a CF at its start and then
hold the CF open until the backlog actually went to zero,
which resulted in multi-month death-march CFs and no clarity
at all as to release timing.  Let's not go back to that.
But the CF app was probably built with that model in mind.

regards, tom lane




Re: Why does pgindent's README say to download typedefs.list from the buildfarm?

2024-05-16 Thread Tom Lane
Michael Paquier  writes:
> On Thu, May 16, 2024 at 10:45:18AM -0400, Tom Lane wrote:
>> ... I think the enum should be nuked altogether, but
>> it's a bit late to be redesigning that for v17 perhaps.

> You're right, WaitEventExtension is better gone.  The only thing that
> matters is that we want to start computing the IDs assigned to the
> custom wait events for extensions with a number higher than the
> existing WAIT_EXTENSION to avoid interferences in pg_stat_activity, so
> this could be cleaned up as the attached.

WFM, and this is probably a place where we don't want to change the
API in v17 and again in v18, so I agree with pushing now.

Reminder though: beta1 release freeze begins Saturday.
Not many hours left.

        regards, tom lane




Re: commitfest.postgresql.org is no longer fit for purpose

2024-05-16 Thread Tom Lane
Peter Eisentraut  writes:
> On 16.05.24 23:46, Tom Lane wrote:
>> Right, so what can we do about that?  Does Needs Review state need to
>> be subdivided, and if so how?

> Maybe a new state "Unclear". ...

> I think, if we consider the core mission of the commitfest app, we need 
> to be more protective of the Needs Review state.

Yeah, makes sense.

> So a third status that encompasses the various other situations like 
> maybe forgotten by author, disagreements between author and reviewer, 
> process difficulties, needs some senior developer intervention, etc. 
> could be helpful.

Hmm, "forgotten by author" seems to generally turn into "this has been
in WOA state a long time".  Not sure we have a problem representing
that, only with a process for eventually retiring such entries.
Your other three examples all sound like "needs senior developer
attention", which could be a helpful state that's distinct from "ready
for committer".  It's definitely not the same as "Unclear".

regards, tom lane




Re: commitfest.postgresql.org is no longer fit for purpose

2024-05-16 Thread Tom Lane
Daniel Gustafsson  writes:
> Pulling in the information from the CFBot regarding test failures and whether
> the patch applies at all, and automatically altering the state to WOA for at
> least the latter category would be nice.

+1.  There are enough intermittent test failures that I don't think
changing state for that would be good, but patch apply failure is
usually persistent.

I gather that the CFBot is still a kluge that is web-scraping the CF
app rather than being actually integrated with it, and that there are
plans to make that better that haven't been moving fast.  Probably
that would have to happen first, but there would be a lot of benefit
from having the info flow be two-way.

regards, tom lane




Re: commitfest.postgresql.org is no longer fit for purpose

2024-05-16 Thread Tom Lane
Peter Eisentraut  writes:
> The problem is if we have 180 patches in Needs Review, and only 20 are 
> really actually ready to be reviewed.  And a second-order problem is 
> that if you already know that this will be the case, you give up before 
> even looking.

Right, so what can we do about that?  Does Needs Review state need to
be subdivided, and if so how?

If it's just that a patch should be in some other state altogether,
we should simply encourage people to change the state as soon as they
discover that.  I think the problem is not so much "90% are in the
wrong state" as "each potential reviewer has to rediscover that".

At this point it seems like there's consensus to have a "parking"
section of the CF app, separate from the time-boxed CFs, and I hope
somebody will go make that happen.  But I don't think that's our only
issue, so we need to keep thinking about what should be improved.

regards, tom lane




Re: Why is citext/regress failing on hamerkop?

2024-05-16 Thread Tom Lane
Andrew Dunstan  writes:
> On 2024-05-16 Th 17:15, Tom Lane wrote:
>> I'd rather have some visible status on the BF dashboard.  Invariably,
>> with a problem like this, the animal's owner is unaware there's a
>> problem.  If it's just silently not reporting, then no one else will
>> notice either, and we effectively lose an animal (despite it still
>> burning electricity to perform those rejected runs).

> Fair enough. That will mean some database changes and other stuff, so it 
> will take a bit longer.

Sure, I don't think it's urgent.

    regards, tom lane




Re: Why is citext/regress failing on hamerkop?

2024-05-16 Thread Tom Lane
Andrew Dunstan  writes:
> On 2024-05-16 Th 16:18, Tom Lane wrote:
>> Andrew: maybe the buildfarm server could be made to flag
>> animals building exceedingly old commits?  This is the second
>> problem of this sort that I've noticed this month, and you
>> really have to look closely to realize it's happening.

> Yeah, that should be doable. Since we have the git ref these days we 
> should be able to mark it as old, or maybe just reject builds for very 
> old commits (the latter would be easier).

I'd rather have some visible status on the BF dashboard.  Invariably,
with a problem like this, the animal's owner is unaware there's a
problem.  If it's just silently not reporting, then no one else will
notice either, and we effectively lose an animal (despite it still
burning electricity to perform those rejected runs).

    regards, tom lane




Re: commitfest.postgresql.org is no longer fit for purpose

2024-05-16 Thread Tom Lane
Jacob Champion  writes:
> On Thu, May 16, 2024 at 1:31 PM Joe Conway  wrote:
>> Maybe we should just make it a policy that *nothing* gets moved forward
>> from commitfest-to-commitfest and therefore the author needs to care
>> enough to register for the next one?

> I think that's going to severely disadvantage anyone who doesn't do
> this as their day job. Maybe I'm bristling a bit too much at the
> wording, but not having time to shepherd a patch is not the same as
> not caring.

Also, I doubt that there are all that many patches that have simply
been abandoned by their authors.  Our problem is the same as it's
been for many years: not enough review person-power, rather than
not enough patches.  So I think authors would just jump through that
hoop, or enough of them would that it wouldn't improve matters.

    regards, tom lane




Re: commitfest.postgresql.org is no longer fit for purpose

2024-05-16 Thread Tom Lane
Jacob Champion  writes:
> ... Similar to how people currently use the
> Reviewer field as a personal TODO list... it might be nice to
> officially separate the ideas a bit.

Oh, that's an independent pet peeve of mine.  Usually, if I'm
looking over the CF list for a patch to review, I skip over ones
that already show an assigned reviewer, because I don't want to
step on that person's toes.  But it seems very common to put
one's name down for review without any immediate intention of
doing work.  Or to do a review and wander off, leaving the patch
apparently being tended to but not really.  (And I confess I'm
occasionally guilty of both things myself.)

I think it'd be great if we could separate "I'm actively reviewing
this" from "I'm interested in this".  As a bonus, adding yourself
to the "interested" list would be a fine proxy for the thumbs-up
or star markers mentioned upthread.

If those were separate columns, we could implement some sort of
aging scheme whereby somebody who'd not commented for (say)
a week or two would get quasi-automatically moved from the "active
reviewer" column to the "interested" column, whereupon it wouldn't
be impolite for someone else to sign up for active review.

regards, tom lane




Re: commitfest.postgresql.org is no longer fit for purpose

2024-05-16 Thread Tom Lane
Melanie Plageman  writes:
> I was reflecting on why a general purpose patch tracker sounded
> appealing to me, and I realized that, at least at this time of year, I
> have a few patches that really count as "waiting on author" that I
> know I need to do additional work on before they need more review but
> which aren't currently my top priority. I should probably simply
> withdraw and re-register them. My justification was that I'll lose
> them if I don't keep them in the commitfest app. But, I could just,
> you know, save them somewhere myself instead of polluting the
> commitfest app with them. I don't know if others are in this
> situation. Anyway, I'm definitely currently guilty of parking.

It's also nice that the CF app will run CI for you, so at least
you can keep the patch building if you're so inclined.

David J. had a suggestion for this too upthread, which was to create a
separate slot for WIP patches that isn't one of the dated CF slots.

It's hard to argue that such patches need to be in "the CF app" at
all, if you're not actively looking for review.  But the CI runs
and the handy per-author patch status list make the app very tempting
infrastructure for parked patches.  Maybe we could have a not-the-CF
app that offers those amenities?

regards, tom lane




Re: GUC names in messages

2024-05-16 Thread Tom Lane
Daniel Gustafsson  writes:
> On 16 May 2024, at 13:35, Peter Eisentraut  wrote:
>> - errmsg("WAL generated with full_page_writes=off was replayed "
>> + errmsg("WAL generated with \"full_page_writes=off\" was replayed "
 
> I'm not a fan of this syntax, but I at the same time can't offer a better idea
> so this isn't an objection but a hope that it can be made even better during
> the v18 cycle.

Yeah ... formally correct would be something like

errmsg("WAL generated with \"full_page_writes\"=\"off\" was replayed "

but that's a bit much for my taste.

regards, tom lane




Re: Why is citext/regress failing on hamerkop?

2024-05-16 Thread Tom Lane
Thomas Munro  writes:
> For citext_utf8, I pushed cff4e5a3.  Hamerkop runs infrequently, so
> here's hoping for 100% green on master by Tuesday or so.

Meanwhile, back at the ranch, it doesn't seem that changed anything:

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=hamerkop&dt=2024-05-16%2011%3A00%3A32

... and now that I look more closely, the reason why it didn't
change anything is that hamerkop is still building 0294df2
on HEAD.  All its other branches are equally stuck at the
end of March.  So this is a flat-out-broken animal, and I
plan to just ignore it until its owner un-sticks it.
(In particular, I think we shouldn't be in a hurry to push
the patch discussed downthread.)

Andrew: maybe the buildfarm server could be made to flag
animals building exceedingly old commits?  This is the second
problem of this sort that I've noticed this month, and you
really have to look closely to realize it's happening.

regards, tom lane




Re: commitfest.postgresql.org is no longer fit for purpose

2024-05-16 Thread Tom Lane
Daniel Gustafsson  writes:
>> On 16 May 2024, at 20:30, Robert Haas  wrote:
>> The original intent of CommitFests, and of commitfest.postgresql.org
>> by extension, was to provide a place where patches could be registered
>> to indicate that they needed to be reviewed, thus enabling patch
>> authors and patch reviewers to find each other in a reasonably
>> efficient way. I don't think it's working any more.

> But which part is broken though, the app, our commitfest process and workflow
> and its intent, or our assumption that we follow said process and workflow
> which may or may not be backed by evidence?  IMHO, from being CFM many times,
> there is a fair bit of the latter, which exacerbates the problem.  This is
> harder to fix with more or better software though. 

Yeah.  I think that Robert put his finger on a big part of the
problem, which is that punting a patch to the next CF is a lot
easier than rejecting it, particularly for less-senior CFMs
who may not feel they have the authority to say no (or at
least doubt that the patch author would accept it).  It's hard
even for senior people to get patch authors to take no for an
answer --- I know I've had little luck at it --- so maybe that
problem is inherent.  But a CF app full of patches that are
unlikely ever to go anywhere isn't helpful.

It's also true that some of us are abusing the process a bit.
I know I frequently stick things into the CF app even if I intend
to commit them pretty darn soon, because it's a near-zero-friction
way to run CI on them, and I'm too lazy to learn how to do that
otherwise.  I like David's suggestion of a "Pending Commit"
status, or maybe I should just put such patches into RfC state
immediately?  However, short-lived entries like that don't seem
like they're a big problem beyond possibly skewing the CF statistics
a bit.  It's the stuff that keeps hanging around that seems like
the core of the issue.

>> I spent a good deal of time going through the CommitFest this week

> And you deserve a big Thank You for that.

+ many

regards, tom lane




Re: race condition when writing pg_control

2024-05-16 Thread Tom Lane
Andres Freund  writes:
> On 2024-05-16 14:50:50 -0400, Tom Lane wrote:
>> The intention was certainly always that it be atomic.  If it isn't
>> we have got *big* trouble.

> We unfortunately do *know* that on several systems e.g. basebackup can read a
> partially written control file, while the control file is being
> updated.

Yeah, but can't we just retry that if we get a bad checksum?

What had better be atomic is the write to disk.  Systems that can't
manage POSIX semantics for concurrent reads and writes are annoying,
but not fatal ...
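
Roughly this shape, say (the helper names here are invented for the
sketch, not actual PostgreSQL functions):

	/*
	 * Re-read until the CRC checks out, tolerating a torn read while
	 * a concurrent writer is mid-update; the writer finishes quickly,
	 * so a bounded retry loop suffices.
	 */
	for (int attempt = 0; attempt < MAX_ATTEMPTS; attempt++)
	{
		read_control_file(&buf);		/* hypothetical helper */
		if (control_file_crc_ok(&buf))	/* hypothetical helper */
			break;
		pg_usleep(1000L);
	}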

        regards, tom lane




Re: race condition when writing pg_control

2024-05-16 Thread Tom Lane
Nathan Bossart  writes:
> I suspect it will be difficult to investigate this one too much further
> unless we can track down a copy of the control file with the bad checksum.
> Other than searching for any new code that isn't doing the appropriate
> locking, maybe we could search the buildfarm for any other occurrences.  I
> also see some threads concerning whether the way we are reading/writing
> the control file is atomic.

The intention was certainly always that it be atomic.  If it isn't
we have got *big* trouble.

        regards, tom lane




Re: [UNVERIFIED SENDER] pg_upgrade can result in early wraparound on databases with high transaction load

2024-05-16 Thread Tom Lane
Daniel Gustafsson  writes:
>> On 5 Jul 2022, at 18:59, Tom Lane  wrote:
>> Given the lack of field complaints, it's probably not worth trying
>> to do anything to restore that capability.  But we really ought to
>> update pg_upgrade's code and docs in pre-v15 branches to say that
>> the minimum supported source version is 9.0.

> (reviving an old thread from the TODO)

> Since we never got around to doing this we still refer to 8.4 as a possible
> upgrade path in v14 and older.

Oh, yeah, that seems to have fallen through a crack.

> The attached takes the conservative approach of raising the minimum supported
> version to 9.0 while leaving the code to handle 8.4 in place.  While it can be
> removed, the risk/reward tradeoff of gutting code in backbranches doesn't seem
> appealing since the code will be unreachable with this check anyways.

Yeah, it's not worth working harder than this.  I do see one typo
in your comment: s/supported then/supported when/.  LGTM otherwise.

regards, tom lane




Re: Docs: Always use true|false instead of sometimes on|off for the subscription options

2024-05-16 Thread Tom Lane
Peter Smith  writes:
> Yeah, I had a vested interest in this one place because I've been
> reviewing the other thread [1] that was mentioned above. If that other
> thread chooses "true|false" then putting "true|false" adjacent to
> another "on|off" will look silly. But if that other thread adopts the
> existing 'failover=on|off' values then it will diverge even further
> from being consistent with the CREATE SUBSCRIPTION page.

It's intentional that we allow more than one spelling of boolean
values.  I can see some value in being consistent within nearby
examples, but I don't agree at all that it should be uniform
across all our docs.
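
For example, a boolean GUC accepts all of these spellings, and boolean
relation and subscription options behave the same way:

	SET synchronize_seqscans = on;
	SET synchronize_seqscans = true;
	SET synchronize_seqscans = 'off';
	SET synchronize_seqscans = 0;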

regards, tom lane




Re: Why does pgindent's README say to download typedefs.list from the buildfarm?

2024-05-16 Thread Tom Lane
Peter Eisentraut  writes:
> In these cases, I think for
> NotificationHash
> ResourceOwnerData
> WalSyncMethod
> we can just get rid of the typedef.

I have no objection to dealing with NotificationHash as you have here.

> ReadBuffersFlags shouldn't be an enum at all, because its values are 
> used as flag bits.

Yeah, that was bothering me too, but I went for the minimum delta.
I did think that a couple of integer macros would be a better idea,
so +1 for what you did here.

> WaitEventExtension, I'm not sure, it's like, an extensible enum?  I 
> guess let's remove the typedef there, too.

I am also quite confused by that.  It seems to be kind of an enum
that is supposed to be extended at runtime, meaning that neither
of the existing enum member values ought to be used as such, although
either autoprewarm.c didn't get the memo or I misunderstand the
intended usage.  NUM_BUILTIN_WAIT_EVENT_EXTENSION is possibly the
most bizarre idea I've ever seen: what would a "built-in extension"
event be exactly?  I think the enum should be nuked altogether, but
it's a bit late to be redesigning that for v17 perhaps.

> Attached is a variant patch.

I'm good with this, with a mental note to look again at
WaitEventExtension later.

    regards, tom lane




Re: Why does pgindent's README say to download typedefs.list from the buildfarm?

2024-05-15 Thread Tom Lane
I wrote:
> This works for me.  One point that could stand discussion while we're
> here is whether the once-a-cycle run should use the verbatim buildfarm
> results or it's okay to editorialize on that typedefs list.  I did a
> little of the latter in da256a4a7, and I feel like we should either
> bless that practice in this document or decide that it was a bad idea.

> For reference, what I added to the buildfarm's list was

> +InjectionPointCacheEntry
> +InjectionPointCondition
> +InjectionPointConditionType
> +InjectionPointEntry
> +InjectionPointSharedState
> +NotificationHash
> +ReadBuffersFlags
> +ResourceOwnerData
> +WaitEventExtension
> +WalSyncMethod

I realized that the reason the InjectionPoint typedefs were missing
is that none of the buildfarm animals that contribute typedefs are
building with --enable-injection-points.  I rectified that on sifaka,
and now those are in the list available from the buildfarm.

As for the remainder, they aren't showing up because no variable
or field is declared using them, which means no debug symbol
table entry is made for them.  This means we could just drop those
typedefs and be little the worse off notationally.  I experimented
with a patch for that, as attached.  (In the case of NotificationHash,
I thought it better to arrange for there to be a suitable variable;
but it could certainly be done the other way too.)  Is this too anal?

regards, tom lane

diff --git a/src/backend/commands/async.c b/src/backend/commands/async.c
index d0891e3f0e..6861f028d2 100644
--- a/src/backend/commands/async.c
+++ b/src/backend/commands/async.c
@@ -2253,10 +2253,13 @@ AsyncExistsPendingNotify(Notification *n)
 	if (pendingNotifies->hashtab != NULL)
 	{
 		/* Use the hash table to probe for a match */
-		if (hash_search(pendingNotifies->hashtab,
-		&n,
-		HASH_FIND,
-		NULL))
+		NotificationHash *ent;
+
+		ent = hash_search(pendingNotifies->hashtab,
+		  &n,
+		  HASH_FIND,
+		  NULL);
+		if (ent)
 			return true;
 	}
 	else
diff --git a/src/backend/utils/resowner/resowner.c b/src/backend/utils/resowner/resowner.c
index ab9343bc5c..3bde0eba4d 100644
--- a/src/backend/utils/resowner/resowner.c
+++ b/src/backend/utils/resowner/resowner.c
@@ -107,7 +107,7 @@ StaticAssertDecl(RESOWNER_HASH_MAX_ITEMS(RESOWNER_HASH_INIT_SIZE) >= RESOWNER_AR
 /*
  * ResourceOwner objects look like this
  */
-typedef struct ResourceOwnerData
+struct ResourceOwnerData
 {
 	ResourceOwner parent;		/* NULL if no parent (toplevel owner) */
 	ResourceOwner firstchild;	/* head of linked list of children */
@@ -155,7 +155,7 @@ typedef struct ResourceOwnerData
 
 	/* The local locks cache. */
 	LOCALLOCK  *locks[MAX_RESOWNER_LOCKS];	/* list of owned locks */
-} ResourceOwnerData;
+};
 
 
 /*
@@ -415,7 +415,7 @@ ResourceOwnerCreate(ResourceOwner parent, const char *name)
 	ResourceOwner owner;
 
 	owner = (ResourceOwner) MemoryContextAllocZero(TopMemoryContext,
-   sizeof(ResourceOwnerData));
+   sizeof(*owner));
 	owner->name = name;
 
 	if (parent)
diff --git a/src/include/access/xlog.h b/src/include/access/xlog.h
index 76787a8267..1a1f11a943 100644
--- a/src/include/access/xlog.h
+++ b/src/include/access/xlog.h
@@ -19,14 +19,14 @@
 
 
 /* Sync methods */
-typedef enum WalSyncMethod
+enum WalSyncMethod
 {
 	WAL_SYNC_METHOD_FSYNC = 0,
 	WAL_SYNC_METHOD_FDATASYNC,
 	WAL_SYNC_METHOD_OPEN,		/* for O_SYNC */
 	WAL_SYNC_METHOD_FSYNC_WRITETHROUGH,
 	WAL_SYNC_METHOD_OPEN_DSYNC	/* for O_DSYNC */
-} WalSyncMethod;
+};
 extern PGDLLIMPORT int wal_sync_method;
 
 extern PGDLLIMPORT XLogRecPtr ProcLastRecPtr;
diff --git a/src/include/storage/bufmgr.h b/src/include/storage/bufmgr.h
index 42211bfec4..edb7011743 100644
--- a/src/include/storage/bufmgr.h
+++ b/src/include/storage/bufmgr.h
@@ -107,14 +107,14 @@ typedef struct BufferManagerRelation
 #define BMR_REL(p_rel) ((BufferManagerRelation){.rel = p_rel})
 #define BMR_SMGR(p_smgr, p_relpersistence) ((BufferManagerRelation){.smgr = p_smgr, .relpersistence = p_relpersistence})
 
-typedef enum ReadBuffersFlags
+enum ReadBuffersFlags
 {
 	/* Zero out page if reading fails. */
 	READ_BUFFERS_ZERO_ON_ERROR = (1 << 0),
 
 	/* Call smgrprefetch() if I/O necessary. */
 	READ_BUFFERS_ISSUE_ADVICE = (1 << 1),
-} ReadBuffersFlags;
+};
 
 struct ReadBuffersOperation
 {
diff --git a/src/include/utils/wait_event.h b/src/include/utils/wait_event.h
index 080e92d1cf..72c4d60930 100644
--- a/src/include/utils/wait_event.h
+++ b/src/include/utils/wait_event.h
@@ -53,11 +53,11 @@ extern PGDLLIMPORT uint32 *my_wait_event_info;
  *
  * The ID retrieved can be used with pgstat_report_wait_start() or equivalent.
  */
-typedef enum
+enum WaitEventExtension
 {
 	WAIT_EVENT_EXTENSION = PG_WAIT_EXTENSION,
 	WAIT_EVENT_EXTENSION_FIRST_USER_DEFINED,
-} WaitEventExtension;
+};

Re: Adding the extension name to EData / log_line_prefix

2024-05-15 Thread Tom Lane
Andres Freund  writes:
> What portability issues do you forsee? We already look up the same symbol in
> all the shared libraries ("Pg_magic_func"), so we know that we can deal with
> duplicate function names. Are you thinking that somehow we'd end up with
> symbol interposition or something?

No, it's the dependence on the physical library file name that's
bothering me.  Maybe that won't be an issue, but I foresee requests
like "would you please case-fold it" or "the extension-trimming rule
isn't quite right", etc.

regards, tom lane




Re: Why does pgindent's README say to download typedefs.list from the buildfarm?

2024-05-15 Thread Tom Lane
Nathan Bossart  writes:
> On Wed, May 15, 2024 at 04:07:18PM -0400, Robert Haas wrote:
>> How's this?

> I compared this with my v1, and the only bit of information there that I
> see missing in v3 is that validation step 4 only applies in the
> once-per-cycle run (or if you forget to pgindent before committing a
> patch).  This might be why I was struggling to untangle the two types of
> pgindent runs in my attempt.  Perhaps it's worth adding a note to that step
> about when it is required.

Oh ... another problem is that the VALIDATION steps really apply to
both kinds of indent runs, but it's not clear to me that that's
obvious in v3.  Maybe the steps should be rearranged to be
(1) base case, (2) VALIDATION, (3) ONCE PER CYCLE.

At this point my OCD got the better of me and I did a little
additional wordsmithing.  How about the attached?

        regards, tom lane

diff --git a/src/tools/pgindent/README b/src/tools/pgindent/README
index b649a21f59..369f120eb0 100644
--- a/src/tools/pgindent/README
+++ b/src/tools/pgindent/README
@@ -1,8 +1,9 @@
 pgindent'ing the PostgreSQL source tree
 ===
 
-We run this process at least once in each development cycle,
-to maintain uniform layout style in our C and Perl code.
+pgindent is used to maintain uniform layout style in our C and Perl code,
+and should be run for every commit. There are additional code beautification
+tasks which should be performed at least once per release cycle.
 
 You might find this blog post interesting:
 http://adpgtech.blogspot.com/2015/05/running-pgindent-on-non-core-code-or.html
@@ -25,45 +26,31 @@ PREREQUISITES:
Or if you have cpanm installed, you can just use:
cpanm https://cpan.metacpan.org/authors/id/S/SH/SHANCOCK/Perl-Tidy-20230309.tar.gz
 
-DOING THE INDENT RUN:
 
-1) Change directory to the top of the source tree.
-
-2) Download the latest typedef file from the buildfarm:
-
-	wget -O src/tools/pgindent/typedefs.list https://buildfarm.postgresql.org/cgi-bin/typedefs.pl
+DOING THE INDENT RUN BEFORE A NORMAL COMMIT:
 
-   (See https://buildfarm.postgresql.org/cgi-bin/typedefs.pl?show_list for a full
-   list of typedef files, if you want to indent some back branch.)
+1) Change directory to the top of the source tree.
 
-3) Run pgindent on the C files:
+2) Run pgindent on the C files:
 
 	src/tools/pgindent/pgindent .
 
If any files generate errors, restore their original versions with
"git checkout", and see below for cleanup ideas.
 
-4) Indent the Perl code using perltidy:
-
-	src/tools/pgindent/pgperltidy .
-
-   If you want to use some perltidy version that's not in your PATH,
-   first set the PERLTIDY environment variable to point to it.
-
-5) Reformat the bootstrap catalog data files:
-
-	./configure # "make" will not work in an unconfigured tree
-	cd src/include/catalog
-	make reformat-dat-files
-	cd ../../..
-
-VALIDATION:
-
-1) Check for any newly-created files using "git status"; there shouldn't
+3) Check for any newly-created files using "git status"; there shouldn't
be any.  (pgindent leaves *.BAK files behind if it has trouble, while
perltidy leaves *.LOG files behind.)
 
-2) Do a full test build:
+4) If pgindent wants to change anything your commit wasn't touching,
+   stop and figure out why.  If it is making ugly whitespace changes
+   around typedefs your commit adds, you need to add those typedefs
+   to src/tools/pgindent/typedefs.list.
+
+5) If you have the patience, it's worth eyeballing the "git diff" output
+   for any egregiously ugly changes.  See below for cleanup ideas.
+
+6) Do a full test build:
 
 	make -s clean
 	make -s all	# look for unexpected warnings, and errors of course
@@ -75,14 +62,38 @@ VALIDATION:
header files that get copied into ecpg output; if so, adjust the
expected-files to match.
 
-3) If you have the patience, it's worth eyeballing the "git diff" output
-   for any egregiously ugly changes.  See below for cleanup ideas.
 
+AT LEAST ONCE PER RELEASE CYCLE:
+
+1) Download the latest typedef file from the buildfarm:
+
+	wget -O src/tools/pgindent/typedefs.list https://buildfarm.postgresql.org/cgi-bin/typedefs.pl
+
+   This step resolves any differences between the incrementally updated
+   version of the file and a clean, autogenerated one.
+   (See https://buildfarm.postgresql.org/cgi-bin/typedefs.pl?show_list for a full
+   list of typedef files, if you want to indent some back branch.)
+
+2) Run pgindent as above.
+
+3) Indent the Perl code using perltidy:
+
+	src/tools/pgindent/pgperltidy .
+
+   If you want to use some perltidy version that's not in your PATH,
+   first set the PERLTIDY environment variable to point to it.
+
+4) Reformat the bootstrap catalog data files:
+
+	./configure # "make" will not work in an unconfigured tree
+	cd src/include/catalog
+	make reformat-dat-files
+	cd ../../..

Re: Why does pgindent's README say to download typedefs.list from the buildfarm?

2024-05-15 Thread Tom Lane
Robert Haas  writes:
> On Wed, May 15, 2024 at 3:30 PM Nathan Bossart  
> wrote:
>> This is much cleaner, thanks.  The only thing that stands out to me is that
>> the "once per release cycle" section should probably say to do an indent
>> run after downloading the typedef file.

> How's this?

This works for me.  One point that could stand discussion while we're
here is whether the once-a-cycle run should use the verbatim buildfarm
results or it's okay to editorialize on that typedefs list.  I did a
little of the latter in da256a4a7, and I feel like we should either
bless that practice in this document or decide that it was a bad idea.

For reference, what I added to the buildfarm's list was

+InjectionPointCacheEntry
+InjectionPointCondition
+InjectionPointConditionType
+InjectionPointEntry
+InjectionPointSharedState
+NotificationHash
+ReadBuffersFlags
+ResourceOwnerData
+WaitEventExtension
+WalSyncMethod

I believe all of these must have been added manually during v17.
If I took any one of them out there was some visible disimprovement
in pgindent's results, so I kept them.  Was that the right decision?
If so we should explain it here.

regards, tom lane




Re: [PATCH] Add --syntax to postgres for SQL syntax checking

2024-05-15 Thread Tom Lane
"David G. Johnston"  writes:
> Now, in my ideal world something like this could be made as an extension so
> that it can work on older versions and not have to be maintained by core.
> And likely grow more features over time.  Is there anything fundamental
> about this that prevents it being implemented in an extension and, if so,
> what can we add to core in v18 to alleviate that limitation?

It'd be pretty trivial to create a function that takes a string
and runs it through raw_parser --- I've got such things laying
about for microbenchmarking purposes, in fact.  But the API that'd
present for tools such as editors is enormously different from
the proposed patch: there would need to be a running server and
they'd need to be able to log into it, plus there are more minor
concerns like having to wrap the string in valid quoting.
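
To be concrete, here's a minimal sketch of such a function (the name
pg_check_syntax and its packaging as a C extension are invented for
the example, and it assumes the v14+ raw_parser() signature):

	#include "postgres.h"
	#include "fmgr.h"
	#include "parser/parser.h"
	#include "utils/builtins.h"

	PG_MODULE_MAGIC;

	PG_FUNCTION_INFO_V1(pg_check_syntax);

	/* CREATE FUNCTION pg_check_syntax(text) RETURNS bool
	   AS 'MODULE_PATHNAME' LANGUAGE C STRICT; */
	Datum
	pg_check_syntax(PG_FUNCTION_ARGS)
	{
		char	   *query = text_to_cstring(PG_GETARG_TEXT_PP(0));

		/* grammar check only: no catalog access happens in here */
		(void) raw_parser(query, RAW_PARSE_DEFAULT);

		PG_RETURN_BOOL(true);	/* reached only if parsing succeeded */
	}

Any syntax error comes back through the usual ereport() machinery,
so the caller sees a normal error with position information.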

Now on the plus side, once you'd bought into that environment,
it'd be equally trivial to offer alternatives like "run raw
parsing and parse analysis, but don't run the query".  I continue
to maintain that that's the set of checks you'd really want in a
lot of use-cases.

    regards, tom lane




Re: [PATCH] Add --syntax to postgres for SQL syntax checking

2024-05-15 Thread Tom Lane
Robert Haas  writes:
> On Wed, May 15, 2024 at 2:39 PM Tom Lane  wrote:
>> The thing that was bothering me most about this is that I don't
>> understand why that's a useful check. ...

> But I don't understand the idea that the concept doesn't make sense.

Sorry: "make sense" was a poorly chosen phrase there.  What I was
doubting, and continue to doubt, is that 100% checking of what
you can check without catalog access and 0% checking of the rest
is a behavior that end users will find especially useful.

> I think it is perfectly reasonable to imagine a world in which the
> initial parsing takes care of reporting everything that can be
> determined by static analysis without knowing anything about catalog
> contents, and parse analysis checks all the things that require
> catalog access, and you can run the first set of checks and then
> decide whether to proceed further. I think if I were designing a new
> system from scratch, I'd definitely want to set it up that way, and I
> think moving our existing system in that direction would probably let
> us clean up a variety of warts along the way. Really, the only
> argument I see against it is that getting from where we are to there
> would be a gigantic amount of work for the value we'd derive.

I'm less enthusiatic about this than you are.  I think it would likely
produce a slower and less maintainable system.  Slower because we'd
need more passes over the query: what parse analysis does today would
have to be done in at least two separate steps.  Less maintainable
because knowledge about certain things would end up being more spread
around the system.  Taking your example of getting syntax checking to
recognize invalid EXPLAIN keywords: right now there's just one piece
of code that knows what those options are, but this'd require two.
Also, "run the first set of checks and then decide whether to proceed
further" seems like optimizing the system for broken queries over
valid ones, which I don't think is an appropriate goal.

Now, I don't say that there'd be *no* benefit to reorganizing the
system that way.  But it wouldn't be an unalloyed win, and so I
share your bottom line that the costs would be out of proportion
to the benefits.

regards, tom lane




Re: add function argument names to regex* functions.

2024-05-15 Thread Tom Lane
Robert Haas  writes:
> On Thu, Apr 4, 2024 at 9:55 AM jian he  wrote:
>> changing "N" to lower-case would be misleading for regexp_replace?
>> so I choose "count".

> I don't see why that would be confusing for regexp_replace
> specifically, but I think N => count is a reasonable change to make.
> However, I don't think this quite works:
> + then the count'th match of the pattern

I think the origin of the problem here is not wanting to use "N"
as the actual name of the parameter, because then users would have
to double-quote it to write "regexp_replace(..., "N" => 42, ...)".

However ... is that really so awful?  It's still fewer keystrokes
than "count".  It's certainly a potential gotcha for users who've
not internalized when they need double quotes, but I think we
could largely address that problem just by making sure to provide
a documentation example that shows use of "N".
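
Something like the following would probably do (hypothetical, since it
assumes the patch settles on "N" as the parameter name):

	-- replace only the second match; the quotes keep N from being
	-- case-folded to n
	SELECT regexp_replace('one two three two', 'two', '2', "N" => 2);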

> An English speaker is more likely to understand what is meant by
> "N'th" than what is meant by "count'th".

+1 ... none of the proposals make that bit read more clearly
than it does now.

regards, tom lane




Re: [PATCH] Add --syntax to postgres for SQL syntax checking

2024-05-15 Thread Tom Lane
walt...@technowledgy.de writes:
> Tom Lane:
>> BTW, if you do feel that a pure grammar check is worthwhile, you
>> should look at the ecpg preprocessor, which does more or less that
>> with the SQL portions of its input.

> Would working with ecpg allow to get back a parse tree of the query to 
> do stuff with that?

No, ecpg isn't interested in building a syntax tree.

> This is really what is missing for the ecosystem. A libpqparser for 
> tools to use: Formatters, linters, query rewriters, simple syntax 
> checkers... they are all missing access to postgres' own parser.

To get to that, you'd need some kind of agreement on what the syntax
tree is.  I doubt our existing implementation would be directly useful
to very many tools, and even if it is, do they want to track constant
version-to-version changes?

    regards, tom lane




Re: Fix PGresult leak in pg_dump during binary upgrade

2024-05-15 Thread Tom Lane
Daniel Gustafsson  writes:
> While looking at pg_dump performance today I noticed that pg_dump fails to
> clear query results in binary_upgrade_set_pg_class_oids during binary upgrade
> mode.  9a974cbcba00 moved the query to the outer block, but left the PQclear
> and query buffer destruction in the is_index conditional, making it not always
> be executed.  353708e1fb2d fixed the leak of the query buffer but left the
> PGresult leak.  The attached fixes the PGresult leak which when upgrading
> large schemas can be non-trivial.

+1 --- in 353708e1f I was just fixing what Coverity complained about.
I wonder why it missed this; it does seem to understand that PGresult
leaks are a thing.  But anyway, I missed it too.

        regards, tom lane




Re: [PATCH] Add --syntax to postgres for SQL syntax checking

2024-05-15 Thread Tom Lane
Josef Šimánek  writes:
> I'm not sure everyone in this thread understands the reason for this
> patch, which is clearly my fault, since I have failed to explain. Main
> idea is to make a tool to validate query can be parsed, that's all.
> Similar to running "EXPLAIN query", but not caring about the result
> and not caring about the DB structure (ignoring missing tables, ...),
> just checking it was successfully executed. This definitely belongs to
> the server side and not to the client side, it is just a tool to
> validate that for this running PostgreSQL backend it is a "parseable"
> query.

The thing that was bothering me most about this is that I don't
understand why that's a useful check.  If I meant to type

UPDATE mytab SET mycol = 42;

and instead I type

UPDATEE mytab SET mycol = 42;

your proposed feature would catch that; great.  But if I type

UPDATE mytabb SET mycol = 42;

it won't.  How does that make sense?  I'm not entirely sure where
to draw the line about what a "syntax check" should catch, but this
seems a bit south of what I'd want in a syntax-checking editor.

BTW, if you do feel that a pure grammar check is worthwhile, you
should look at the ecpg preprocessor, which does more or less that
with the SQL portions of its input.  ecpg could be a better model
to follow because it doesn't bring all the dependencies of the server
and so is much more likely to appear in a client-side installation.
It's kind of an ugly, unmaintained mess at the moment, sadly.

The big knock on doing this client-side is that there might be
version skew compared to the server you're using --- but if you
are not doing anything beyond a grammar-level check then your
results are pretty approximate anyway, ISTM.  We've not heard
anything suggesting that version skew is a huge problem for
ecpg users.

regards, tom lane




Re: Adding the extension name to EData / log_line_prefix

2024-05-15 Thread Tom Lane
Andres Freund  writes:
> On 2024-05-15 12:54:45 -0400, Chapman Flack wrote:
>> But I'd bet, within the fraction of the population that does use it,
>> it is already a short string that looks a whole lot like the name
>> of the extension. So maybe enhancing the documentation and making it
>> easy to set up would achieve much of the objective here.

> The likely outcome would IMO be that some extensions will have the data,
> others not. Whereas inferring the information from our side will give you
> something reliable.
> But I also just don't think it's something that architecturally fits together
> that well. If we either had TEXTDOMAIN reliably set across extensions or it'd
> architecturally be pretty, I'd go for it, but imo it's neither.

There is one advantage over my suggestion of changing PG_MODULE_MAGIC:
if we tell people to write

   PG_MODULE_MAGIC;
   #undef TEXTDOMAIN
   #define TEXTDOMAIN PG_TEXTDOMAIN("hstore")

then that's 100% backwards compatible and they don't need any
version-testing ifdef's.

I still think that the kind of infrastructure Andres proposes
is way overkill compared to the value, plus it's almost certainly
going to have a bunch of platform-specific problems to solve.
So I think Peter's thought is worth pursuing.

regards, tom lane




Re: Adding the extension name to EData / log_line_prefix

2024-05-15 Thread Tom Lane
Peter Eisentraut  writes:
> On 14.05.24 01:11, Tom Lane wrote:
>> The mechanism that Andres describes for sourcing the name seems a bit
>> overcomplex though.  Why not just allow/require each extension to
>> specify its name as a constant string?  We could force the matter by
>> redefining PG_MODULE_MAGIC as taking an argument:
>> PG_MODULE_MAGIC("hstore");

> We kind of already have something like this, for NLS.  If you look for 
> pg_bindtextdomain(TEXTDOMAIN) and ereport_domain(), this information 
> already trickles into the vicinity of the error data.  Maybe the same 
> thing could just be used for this, by wiring up the macros a bit 
> differently.

Hmm, cute idea, but it'd only help for extensions that are
NLS-enabled.  Which I bet is a tiny fraction of the population.
So far as I can find, we don't even document how to set up
TEXTDOMAIN for an extension --- you have to cargo-cult the
macro definition from some in-core extension.

regards, tom lane




Re: Fixup a few 2023 copyright years

2024-05-15 Thread Tom Lane
David Rowley  writes:
> On Wed, 15 May 2024 at 17:32, Michael Paquier  wrote:
>> While running src/tools/copyright.pl, I have noticed that that a
>> newline was missing at the end of index_including.sql, as an effect of
>> the test added by you in a63224be49b8.  I've cleaned up that while on
>> it, as it was getting added automatically, and we tend to clean these
>> like in 3f1197191685 or more recently c2df2ed90a82.

> Thanks for fixing that.  I'm a little surprised that pgindent does not
> fix that sort of thing.

pgindent does not touch anything but .c and .h files.

I do recommend running "git diff --check" (with --staged if you
already git-added your changes) before you're ready to commit
something.  That does find generic whitespace issues, and I
believe it would've found this one.
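
In full:

	git diff --check            # unstaged changes
	git diff --check --staged   # changes already staged for commit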

regards, tom lane




Re: explain format json, unit for serialize and memory are different.

2024-05-14 Thread Tom Lane
Michael Paquier  writes:
> Perhaps Alvaro and Tom would like to chime in, as committers of
> respectively 5de890e3610d and 06286709ee06?

No objection here.  In a green field I might argue for
round-to-nearest instead of round-up, but it looks like we
have several precedents for round-up, so let's avoid changing
that existing behavior.

regards, tom lane




Re: An improved README experience for PostgreSQL

2024-05-14 Thread Tom Lane
Nathan Bossart  writes:
> On Tue, May 14, 2024 at 10:05:01AM +0200, Peter Eisentraut wrote:
>> My point is, in order to get that enhanced GitHub experience, you don't
>> actually have to commit these files into the individual source code
>> repository.  You can add them to the organization and they will apply to all
>> repositories under the organization.  This is explained at the above link.

> Oh, I apologize, my brain skipped over the word "organization" in your
> message.

FWIW, I'd vote against doing it that way, because then
maintaining/updating those files would only be possible for whoever
owns the github repo.  I don't have a position on whether we want
these additional files or not; but if we do, I think the best answer
is to stick 'em under .github/ where they are out of the way but yet
updatable by any committer.

regards, tom lane




Re: elog/ereport VS misleading backtrace_function function address

2024-05-14 Thread Tom Lane
Alvaro Herrera  writes:
> On 2024-May-14, Jakub Wartak wrote:
>> Reality is apparently mixed, at least from what I have checked:
>> - all RHEL 7.x/8.x (PGDG and our forks) do NOT come with
>> --enable-debug according to pg_config.

> Ooh, yeah, that's true according to 
> https://git.postgresql.org/gitweb/?p=pgrpms.git;a=blob;f=rpm/redhat/main/non-common/postgresql-16/main/postgresql-16.spec;h=ab2f6edc903f083e04b8f7a1d3bad8e1b7018791;hb=1a8b9fa7019d3f73322ca873b62dc0b33e73ed1d

>  507 %if %beta
>  508 --enable-debug \
>  509 --enable-cassert \
>  510 %endif

> Maybe a better approach for this whole thing is to change the specs so
> that --enable-debug is always given, not just for %beta.

My recollection from my time at Red Hat is that their standard policy
is to build everything with debug symbols all the time; so this is
violating that policy, and we should change it just on those grounds.
However, I'm not sure how much the change will help Joe Average User
with respect to the thread topic.  RH actually has infrastructure that
splits the debug symbol tables out into separate "debuginfo" RPMs,
which you have to install manually if you want to debug a particular
package.  This is good for disk space consumption, but it means that
most users are still only going to see the same backtrace they see
currently.
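
For reference, pulling those symbols in on a RHEL-family box is
typically a one-liner (the package name here is illustrative):

   dnf debuginfo-install postgresql16-server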

I don't know how much of that applies to, eg, Debian.

regards, tom lane




Re: I have an exporting need...

2024-05-14 Thread Tom Lane
Heikki Linnakangas  writes:
> On 13/05/2024 16:01, Juan Hernández wrote:
>> Do you consider it useful to add a parameter (for example,
>> --separatetables) so that, when used, the export process automatically
>> creates a separate tablename.sql file for each table in the database?

> It'd be tricky to restore from, as you need to restore the tables in the 
> right order. I think you'd still need a "main" sql file that includes 
> all the other files in the right order. And using the table names as 
> filenames gets tricky if the table names contain any funny characters.

It's a lot worse than that, as it's entirely possible to have circular
FK dependencies, meaning there is no "right order" if you think of
each table file as self-contained DDL plus data.  Other sorts of
circularities are possible too.

pg_dump deals with that hazard by splitting things up: first create
all the tables, then load all the data, then create all the indexes
and foreign keys.  You can tell it to just emit the parts relevant to
a particular table, but it's on your head whether that's actually
going to be useful in your context.  I doubt that it's widely enough
useful to justify creating a special mode beyond what we already
have.
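
To make the circularity hazard concrete, consider this minimal
(illustrative) pair of mutually referencing tables:

   CREATE TABLE a (id int PRIMARY KEY, b_id int);
   CREATE TABLE b (id int PRIMARY KEY, a_id int REFERENCES a);
   ALTER TABLE a ADD FOREIGN KEY (b_id) REFERENCES b;

Neither an "a.sql" nor a "b.sql" consisting of self-contained DDL plus
data could be restored first: each table's foreign key requires the
other table to already exist and be populated.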

regards, tom lane




Re: Underscore in positional parameters?

2024-05-14 Thread Tom Lane
Dean Rasheed  writes:
> On Tue, 14 May 2024 at 07:43, Michael Paquier  wrote:
>> On Tue, May 14, 2024 at 05:18:24AM +0200, Erik Wienhold wrote:
>>> Parameter $1_2 is taken as $1 because in rule {param} in scan.l we get
>>> the parameter number with atol which stops at the underscore.  That's a
>>> regression in faff8f8e47f.  Before that commit, $1_2 resulted in
>>> "ERROR: trailing junk after parameter".

> I'm sure that this wasn't intentional -- I think we just failed to
> notice that "param" also uses "decinteger" in the scanner. Taking a
> quick look, there don't appear to be any other uses of "decinteger",
> so at least it only affects params.

> Unless the spec explicitly says otherwise, I agree that we should
> reject this, as we used to do, and add a comment saying that it's
> intentionally not supported. I can't believe it would ever be useful,
> and the current behaviour is clearly broken.

+1, let's put this back the way it was.
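
For the archives, the underlying C behavior is simply that atol() stops
at the first non-digit, as a standalone snippet (illustrative, not
scanner code) shows:

   #include <stdio.h>
   #include <stdlib.h>

   int
   main(void)
   {
       /* atol() consumes leading digits only, so "1_2" parses as 1 */
       printf("%ld\n", atol("1_2"));    /* prints 1 */
       return 0;
   }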

regards, tom lane




Re: Why is citext/regress failing on hamerkop?

2024-05-14 Thread Tom Lane
Alexander Lakhin  writes:
> 13.05.2024 23:17, Tom Lane wrote:
>> 3. We don't know exactly why hamerkop suddenly started seeing these
>> failures, but a plausible theory emerges after noting that its
>> reported time for the successful "make check" step dropped pretty
>> substantially right when this started.  In the v13 branch, "make
>> check" was taking 2:18 or more in the several runs right before the
>> first isolationcheck failure, but 1:40 or less just after.  So it
>> looks like the animal was moved onto faster hardware.

> Yes, and one thing I can't explain yet, is why REL_14_STABLE+ timings
> substantially differ from REL_13_STABLE-, say, for the check stage:

As I mentioned in our off-list discussion, I have a lingering feeling
that this v14 commit could be affecting the results somehow:

Author: Tom Lane 
Branch: master Release: REL_14_BR [d5a9a661f] 2020-10-18 12:56:43 -0400

Update the Winsock API version requested by libpq.

According to Microsoft's documentation, 2.2 has been the current
version since Windows 98 or so.  Moreover, that's what the Postgres
backend has been requesting since 2004 (cf commit 4cdf51e64).
So there seems no reason for libpq to keep asking for 1.1.

I didn't believe at the time that that'd have any noticeable effect,
but maybe it somehow made Winsock play a bit nicer with the GSS
support?

regards, tom lane




Re: Why is parula failing?

2024-05-13 Thread Tom Lane
David Rowley  writes:
> I've not seen any recent failures from Parula that relate to this
> issue.  The last one seems to have been about 4 weeks ago.

> I'm now wondering if it's time to revert the debugging code added in
> 1db689715.  Does anyone think differently?

+1.  It seems like we wrote off the other issue as a compiler bug,
so I don't have much trouble assuming that this one was too.

    regards, tom lane




Re: Adding the extension name to EData / log_line_prefix

2024-05-13 Thread Tom Lane
Andres Freund  writes:
> On 2024-05-13 19:25:11 -0300, Fabrízio de Royes Mello wrote:
>> Hmmm, depending on the extension it can extensively call/use postgres code
>> so would be nice if we can differentiate if the code is called from
>> Postgres itself or from an extension.

> I think that's not realistically possible. It's also very fuzzy what that'd
> mean. If there's a planner hook and then the query executes normally, what do
> you report for an execution time error? And even the simpler case - should use
> of pg_stat_statements cause everything within to be logged as a
> pg_stat_statement message?

Not to mention that there could be more than one extension on the call
stack.  I think tying this statically to the ereport call site is
fine.

The mechanism that Andres describes for sourcing the name seems a bit
overcomplex though.  Why not just allow/require each extension to
specify its name as a constant string?  We could force the matter by
redefining PG_MODULE_MAGIC as taking an argument:

PG_MODULE_MAGIC("hstore");
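
Purely to illustrate the shape (Pg_extension_module_name and
PG_MODULE_MAGIC_BODY are hypothetical names here, not actual fmgr.h
contents):

   /* hypothetical: same magic block as today, plus a name string
    * the server can read back after loading the module */
   #define PG_MODULE_MAGIC(name) \
       PGDLLEXPORT const char *Pg_extension_module_name = (name); \
       PG_MODULE_MAGIC_BODY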

regards, tom lane




Re: Why is citext/regress failing on hamerkop?

2024-05-13 Thread Tom Lane
Thomas Munro  writes:
> For citext_utf8, I pushed cff4e5a3.  Hamerkop runs infrequently, so
> here's hoping for 100% green on master by Tuesday or so.

In the meantime, some off-list investigation by Alexander Lakhin
has turned up a good deal of information about why we're seeing
failures on hamerkop in the back branches.  Summarizing, it
appears that

1. In a GSS-enabled Windows build without any active Kerberos server,
libpq's pg_GSS_have_cred_cache() succeeds, allowing libpq to try to
open a GSSAPI connection, but then gss_init_sec_context() fails,
leading to client-side reports like this:

+Connection 2 failed: could not initiate GSSAPI security context: Unspecified 
GSS failure.  Minor code may provide more information: Credential cache is empty
+FATAL:  sorry, too many clients already

(The first of these lines comes out during the attempted GSS
connection, the second during the only-slightly-more-successful
non-GSS connection.)  So that's problem number 1: how is it that
gss_acquire_cred() succeeds but then gss_init_sec_context() disclaims
knowledge of any credentials?  Can we find a way to get this failure
to be detected during pg_GSS_have_cred_cache()?  It is mighty
expensive to launch a backend connection that is doomed to fail,
especially when this happens during *every single libpq connection
attempt*.

2. Once gss_init_sec_context() fails, libpq abandons the connection
and starts over; since it has already initiated a GSS handshake
on the connection, there's not much choice.  Although libpq faithfully
issues closesocket() on the abandoned connection, Alexander found
that the connected backend doesn't reliably see that: it may just
sit there until the AuthenticationTimeout elapses (1 minute by
default).  That backend is still consuming a postmaster child slot,
so if this happens on any sizable fraction of test connection
attempts, it's little surprise that we soon get "sorry, too many
clients already" failures.

3. We don't know exactly why hamerkop suddenly started seeing these
failures, but a plausible theory emerges after noting that its
reported time for the successful "make check" step dropped pretty
substantially right when this started.  In the v13 branch, "make
check" was taking 2:18 or more in the several runs right before the
first isolationcheck failure, but 1:40 or less just after.  So it
looks like the animal was moved onto faster hardware.  That feeds
into this problem because, with a successful isolationcheck run
taking more than a minute, there was enough time for some of the
earlier stuck sessions to time out and exit before their
postmaster-child slots were needed.

Alexander confirmed this theory by demonstrating that the main
regression tests in v15 would pass if he limited their speed enough
(by reducing the CPU allowed to a VM) but not at full speed.  So the
buildfarm results suggesting this is only an issue in <= v13 must
be just a timing artifact; the problem is still there.

Of course, backends waiting till timeout is not a good behavior
independently of what is triggering that, so we have two problems to
solve here, not one.  I have no ideas about the gss_init_sec_context()
failure, but I see a plausible theory about the failure to detect
socket closure in Microsoft's documentation about closesocket() [1]:

If the l_onoff member of the LINGER structure is zero on a stream
socket, the closesocket call will return immediately and does not
receive WSAEWOULDBLOCK whether the socket is blocking or
nonblocking. However, any data queued for transmission will be
sent, if possible, before the underlying socket is closed. This is
also called a graceful disconnect or close. In this case, the
Windows Sockets provider cannot release the socket and other
resources for an arbitrary period, thus affecting applications
that expect to use all available sockets. This is the default
behavior for a socket.

I'm not sure whether we've got unsent data pending in the problematic
condition, nor why it'd remain unsent if we do (shouldn't the backend
consume it anyway?).  But this has the right odor for an explanation.
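
If anyone does want to experiment, the knob that documentation is
pointing at is SO_LINGER; an abortive-close sketch (illustrative only,
certainly not a proposed fix) would be roughly:

   struct linger l;

   l.l_onoff = 1;      /* enable linger */
   l.l_linger = 0;     /* zero timeout => abortive close, sends RST */
   setsockopt(sock, SOL_SOCKET, SO_LINGER, (const char *) &l, sizeof(l));
   closesocket(sock);

Of course, an abortive close discards any queued data, which is exactly
what the "graceful disconnect" path is trying to avoid, so whether
that's acceptable for libpq is part of the question.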

I'm pretty hesitant to touch this area myself, because it looks an
awful lot like commits 6051857fc and ed52c3707, which eventually
had to be reverted.  I think we need a deeper understanding of
exactly what Winsock is doing or not doing before we try to fix it.
I wonder if there are any Microsoft employees around here with
access to the relevant source code.

In the short run, it might be a good idea to deprecate using
--with-gssapi on Windows builds.  A different stopgap idea
could be to drastically reduce the default AuthenticationTimeout,
perhaps only on Windows.

    regards, tom lane

[1] https://learn.microsoft.com/en-us/windows/win32/api/winsock/nf-winsock-closesocket




Re: Is there any chance to get some kind of a result set sifting mechanism in Postgres?

2024-05-13 Thread Tom Lane
aa  writes:
> If you call the action of "sifting" ordering, then yes. If you don't call
> it ordering, then no.
> In essence, it is the output of a filtering mechanism, done in a single result
> set pass. And this pass should be the same pass in charge of collecting the
> result set in the first place.

Sounds a lot like a WHERE clause to me.
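
That is, the rows are filtered during the same pass that produces them
(example illustrative):

   SELECT * FROM orders WHERE status = 'open';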

    regards, tom lane




Re: Fix out-of-bounds in the function GetCommandTagName

2024-05-13 Thread Tom Lane
David Rowley  writes:
> I've added a CF entry under your name for this:
> https://commitfest.postgresql.org/48/4927/

> If it was code new to PG17 I'd be inclined to go ahead with it now,
> but it does not seem to align with making the release mode stable.
> I'd bet others will feel differently about that.  Delaying seems a
> better default choice at least.

The security team's Coverity instance has started to show this
complaint now too.  So I'm going to go ahead and push this change
in HEAD.  It's probably unwise to change it in stable branches,
since there's at least a small chance some external code is using
COMMAND_TAG_NEXTTAG for the same purpose tag_behavior[] does.
But we aren't anywhere near declaring v17's API stable, so
I'd rather fix the issue than dismiss it in HEAD.

        regards, tom lane



