DISCARD ALL does not force re-planning of plpgsql functions/procedures

2024-05-26 Thread Jelte Fennema-Nio
I got a report on the PgBouncer repo[1] that running DISCARD ALL was
not sufficient between connection handoffs to force replanning of
stored procedures. It turns out that while DISCARD ALL and DISCARD
PLANS reset the plan cache, they do not reset the num_custom_plans
field of the existing CachedPlanSources. So while the generic plan is
re-planned after DISCARD ALL, the decision on whether or not to choose
it won't change. See below for a minimal reproducing example:


create table test_mode (a int);
insert into test_mode select 1 from generate_series(1,100) union
all select 2;
create index on test_mode (a);
analyze test_mode;

create function test_mode_func(int)
returns integer as $$
declare
v_count integer;
begin
select into v_count count(*) from test_mode where a = $1;
return v_count;
END;
$$ language plpgsql;

\timing on
-- trigger execution 5 times
SELECT test_mode_func(1);
SELECT test_mode_func(1);
SELECT test_mode_func(1);
SELECT test_mode_func(1);
SELECT test_mode_func(1);
DISCARD ALL;
-- slow because of bad plan, even after DISCARD ALL
SELECT test_mode_func(2);
\c
-- fast after re-connect, because of custom plan
SELECT test_mode_func(2);
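
For context, the plan-choice logic in plancache.c boils down to
roughly the following (a simplified sketch, not the exact source).
Because DISCARD ALL leaves num_custom_plans and total_custom_cost
untouched, this decision keeps being driven by statistics gathered
before the DISCARD:

#include "utils/plancache.h"

/* Simplified sketch of choose_custom_plan() in plancache.c */
static bool
choose_custom_plan_sketch(CachedPlanSource *plansource)
{
    double avg_custom_cost;

    /* The first five executions always build a custom plan. */
    if (plansource->num_custom_plans < 5)
        return true;

    avg_custom_cost = plansource->total_custom_cost /
                      plansource->num_custom_plans;

    /*
     * After that, prefer the generic plan if it is cheaper than the
     * average custom plan seen so far.  Nothing in DISCARD ALL resets
     * num_custom_plans/total_custom_cost, so this average still
     * reflects the pre-DISCARD workload (the a = 1 calls above).
     */
    if (plansource->generic_cost < avg_custom_cost)
        return false;

    return true;
}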



[1]: https://github.com/pgbouncer/pgbouncer/issues/912#issuecomment-2131250109




Re: Volatile write caches on macOS and Windows, redux

2024-05-25 Thread Jelte Fennema-Nio
On Tue, 18 Jul 2023 at 05:29, Thomas Munro  wrote:
> It's possible that fcntl(F_FULLFSYNC) might fail with ENOSUPP or other
> errors in obscure cases (eg unusual file systems).  In that case, you
> could manually lower fsync to just "on" and do your own research on
> whether power loss can toast your database, but that doesn't seem like
> a reason for us not to ship good solid defaults for typical users.

Is this the only reason why you're suggesting adding fsync=full,
instead of simply always using F_FULLFSYNC when fsync=true on macOS?
If so, I'm not sure we really gain anything from this tri-state. I
think people either care about data loss on power loss, or they don't.
I doubt many people want this third intermediate option, which afaict
basically means losing data on power loss less often than with
fsync=false, but still losing data most of the time.

If you're going to keep this tri-state for macOS, then it still seems
nicer to me to "fix" fsync=true on macOS and introduce a fsync=partial
or something. Then the defaults are the same across platforms, and
anyone currently setting fsync=yes in their postgresql.conf would get
the fixed behaviour on upgrade.
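
For reference, the macOS-specific behaviour under discussion is
roughly the following (a minimal sketch, not the actual pg_fsync()
code; the fallback policy at the end is exactly what the tri-state
debate is about):

#include <fcntl.h>
#include <unistd.h>

/* Flush a file descriptor to stable storage, macOS special case included. */
static int
flush_to_stable_storage(int fd)
{
#ifdef F_FULLFSYNC
    /*
     * On macOS, plain fsync() may leave the data in the drive's volatile
     * write cache.  fcntl(F_FULLFSYNC) additionally asks the drive to
     * flush that cache, but it can fail on unusual file systems.
     */
    if (fcntl(fd, F_FULLFSYNC, 0) == 0)
        return 0;
#endif
    /* Fall back to a plain fsync(); whether to do this silently is the
     * policy question. */
    return fsync(fd);
}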




Re: Add new protocol message to change GUCs for usage with future protocol-only GUCs

2024-05-25 Thread Jelte Fennema-Nio
(pressed send too early)

On Sat, 25 May 2024 at 12:39, Jelte Fennema-Nio  wrote:
> But again if I'm alone in this, then I don't

... mind budging on this to move this decision along. Using _pq_.xxx
parameters for all protocol changes would totally be acceptable to me.




Re: Add new protocol message to change GUCs for usage with future protocol-only GUCs

2024-05-25 Thread Jelte Fennema-Nio
On Thu, 23 May 2024 at 20:12, Tom Lane  wrote:
>
> Jelte Fennema-Nio  writes:
> > On Fri, 17 May 2024 at 21:24, Robert Haas  wrote:
> >> Perhaps these are all minority positions, but I can't tell you what
> >> everyone thinks, only what I think.
>
> > I'd love to hear some opinions from others on these design choices. So
> > far it seems like we're the only two that have opinions on this (which
> > seems hard to believe) and our opinions are clearly conflicting. And
> > above all I'd like to move forward with this, be it my way or yours
> > (although I'd prefer my way of course ;) )
>
> I got around to looking through this thread in preparation for next
> week's patch review session.  I have a couple of opinions to offer:
>
> 1. Protocol versions suck.  Bumping them is seldom a good answer,
> and should never be done if you have any finer-grained negotiation
> mechanism available.  My aversion to this is over thirty years old:
> I learned that lesson from watching the GIF87-to-GIF89 transition mess.
> Authors of GIF-writing tools tended to take the easy way out and write
> "GIF89" in the header whether they were actually using any of the new
> version's features or not.  This led to an awful lot of pictures that
> couldn't be read by available GIF-displaying tools, for no good reason
> whatsoever.  The PNG committee, a couple years later, reacted to that
> mess by designing PNG to have no version number whatsoever, and yet
> be extensible in a fine-grained way.  (Basically, a PNG file is made
> up of labeled chunks.  If a reader doesn't recognize a particular
> chunk code, it can still tell whether the chunk is "critical" or not,
> and thereby decide if it must give up or can proceed while ignoring
> that chunk.)
>
> So overall, I have a strong preference for using the _pq_.xxx
> mechanism instead of a protocol version bump.  I do not believe
> the latter has any advantage.

I'm not necessarily super opposed to only using the _pq_.xxx
mechanism. I mainly think it's silly to have a protocol version number
and then never use it. And I feel some of the proposed changes don't
really benefit from being able to be turned on and off individually.
My rule of thumb would be:
1. Things that a modern client/pooler would always request: version bump
2. Everything else: _pq_.xxx

Of the proposed changes so far on the mailing list, the only two that
would fall under 1 imho are:
1. The ParameterSet message
2. A longer-than-32-bit secret in BackendKeyData

I also don't think the GIF situation you describe translates fully to
this discussion. We have active protocol version negotiation, so if a
server doesn't support protocol 3.1 a client is expected to fall back
to the 3.0 protocol when communicating. Of course you can argue that a
badly behaved client will fail to connect when it gets a downgrade
request from the server, but that same argument can be made about a
server not reporting support for a _pq_.xxx parameter that every
modern client/pooler requests. So I don't think there's a practical
difference in the problem you're describing.



But again if I'm alone in this, then I don't




Re: Add new protocol message to change GUCs for usage with future protocol-only GUCs

2024-05-25 Thread Jelte Fennema-Nio
On Thu, 23 May 2024 at 20:40, Tom Lane  wrote:
>
> Jacob Champion  writes:
> > Would it be good to expand on that idea of criticality? IIRC one of
> > Jelte's complaints earlier was that middleware has to know all the
> > extension types anyway, to be able to figure out whether it has to do
> > something about them or not. HTTP has the concept of hop-by-hop vs
> > end-to-end headers for related reasons.
>
> Yeah, perhaps.  We'd need to figure out just which classes we need
> to divide protocol parameters into, and then think about a way for
> code to understand which class a parameter falls into even when
> it doesn't specifically know that parameter.

I think this class is so rare that it's not worth complicating the
discussion on new protocol features even more. AFAICT there is only
one proposed protocol change that does not need any pooler support
(apart from syncing the feature value when re-assigning the
connection): automatic binary encoding for a list of types.

All others need some support from poolers; at the very least they need
to handle the new message types without erroring out. But in many
cases more complex support is needed.




Re: Add new protocol message to change GUCs for usage with future protocol-only GUCs

2024-05-24 Thread Jelte Fennema-Nio
On Fri, 24 May 2024 at 15:28, Robert Haas  wrote:
>
> On Thu, May 23, 2024 at 4:00 PM Tom Lane  wrote:
> > I don't recall exactly what I thought earlier, but now I think we'd
> > be better off with separate infrastructure.  guc.c is unduly complex
> > already.  Perhaps there are bits of it that could be factored out
> > and shared, but I bet not a lot.
>
> OK. That seems fine to me, but I bet Jelte is going to disagree.

I indeed disagree. I think the effort needed to make guc.c handle
protocol parameters is extremely small. The 0011 patch contains all
the changes that are needed to achieve that, and it only adds 65
additional lines. And only 15 of those 65 lines do anything somewhat
unusual, namely handling the transactionality discrepancy between
protocol parameters and other GUCs. The other 50 lines are (imho)
really clean and fit perfectly with the way guc.c is currently
structured (i.e. they add PGC_PROTOCOL and PGC_SU_PROTOCOL in a really
obvious way).
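
To illustrate what that looks like, the change is essentially just
extending the existing GucContext enum (a sketch based on the patch
description above; the exact placement and naming are the patch's, not
mine):

/* guc.h sketch: existing contexts plus the two proposed protocol ones */
typedef enum
{
    PGC_INTERNAL,
    PGC_POSTMASTER,
    PGC_SIGHUP,
    PGC_SU_BACKEND,
    PGC_BACKEND,
    PGC_SUSET,
    PGC_USERSET,
    /* proposed: parameters settable only via protocol-level messages */
    PGC_PROTOCOL,
    PGC_SU_PROTOCOL
} GucContext;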

Separating it from the GUC infrastructure would mean we need to
duplicate a lot of that infrastructure. Assuming we don't care about
SHOW or pg_settings (which I agree are not super important), the
things that we would want protocol parameters to have, and that guc.c
gives us for free, are:
1. Reporting the value of the parameter to the client (done using
ParameterStatus)
2. Parsing and validation of the input (bool, int, enum, etc.), but
also check_hook and assign_hook support
3. Logic in all connection poolers to change GUC values to the
client's expected values whenever a server connection is handed off to
a client
4. Permission checking, if we want some protocol extensions to only be
configurable by a highly privileged user

All of those things would have to be duplicated/re-implemented if we
make protocol parameters their own dedicated thing. Doing that work
seems like a waste of time to me, and would imho add much more
complexity than the proposed 65 lines of code in 0011.




Re: libpq compression (part 3)

2024-05-21 Thread Jelte Fennema-Nio
On Mon, 20 May 2024 at 21:42, Jacob Champion
 wrote:
> As Andrey points out, there was prior work done that started to take
> this into account. I haven't reviewed it to see how good it is -- and
> I think there are probably many use cases in which queries and tables
> contain both private and attacker-controlled information -- but if we
> agree that they have to be separated, then the strategy can at least
> be improved upon.


To help get everyone on the same page I wanted to list all the
security concerns in one place:

1. Triggering excessive CPU usage before authentication, by asking for
very high compression levels
2. Triggering excessive memory/CPU usage before authentication, by a
client sending a zipbomb
3. Triggering excessive CPU after authentication, by asking for a very
high compression level
4. Triggering excessive memory/CPU after authentication due to
zipbombs (i.e. small amount of data extracting to lots of data)
5. CRIME style leakage of information about encrypted data

1 & 2 can easily be solved by not allowing any authentication packets
to be compressed. This also has benefits for 5.

3 & 4 are less of a concern than 1&2 imho. Once authenticated a client
deserves some level of trust. But having knobs to limit impact
definitely seems useful.

3 can be solved in two ways afaict:
a. Allow the server to choose the maximum compression level for each
compression method (using some GUC), and downgrade the level
transparently when a higher level is requested
b. Don't allow the client to choose the compression level that the server uses.

I'd prefer option a

4 would require some safety limit on the amount of data that a
(small) compressed message can be decompressed to, stopping
decompression of that message once that limit is hit (a rough sketch
of such a check follows below). What that limit should be seems hard
to choose though. A few ideas:
a. The size of the message reported by the uncompressed header. This
would mean that at most 4GB will be uncompressed, since the maximum
message length is 4GB (limited by the 32-bit message length field).
b. Allow servers to specify a maximum decompressed message length
lower than this 4GB, e.g. disallow messages that decompress to more
than 100MB.
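
To make idea 4 a bit more concrete, the decompression loop with such a
cap could look roughly like this (a sketch using zlib purely for
illustration; the 100MB cap and the function name are made up, and the
real patch additionally has to handle multiple algorithms and streams
that span messages):

#include <stddef.h>
#include <zlib.h>

#define MAX_DECOMPRESSED_MESSAGE (100 * 1024 * 1024)    /* hypothetical cap */

/*
 * Decompress one message, bailing out as soon as it expands past the cap.
 * The caller is assumed to have set up the z_stream with inflateInit().
 * Returns 0 on success, -1 on error or when the limit is exceeded.
 */
static int
decompress_message_with_limit(z_stream *zs, unsigned char *in, size_t inlen)
{
    unsigned char chunk[8192];
    size_t total_out = 0;
    int rc;

    zs->next_in = in;
    zs->avail_in = inlen;

    do
    {
        zs->next_out = chunk;
        zs->avail_out = sizeof(chunk);

        rc = inflate(zs, Z_NO_FLUSH);
        if (rc != Z_OK && rc != Z_STREAM_END)
            return -1;

        total_out += sizeof(chunk) - zs->avail_out;
        if (total_out > MAX_DECOMPRESSED_MESSAGE)
            return -1;          /* zipbomb protection: stop right here */

        /* ... hand the chunk to normal message processing here ... */
    } while (rc != Z_STREAM_END && (zs->avail_in > 0 || zs->avail_out == 0));

    return 0;
}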

I think 5 is the most complicated to deal with, especially as it
depends on the actual usage to know what is safe. I believe we should
let users have the freedom to make their own security tradeoffs, but
we should protect them against some of the most glaring issues
(especially ones that benefit little from compression anyway). As
already shown by Andrey, sending LDAP passwords in a compressed way
seems extremely dangerous. So I think we should disallow compressing
any authentication related packets. To reduce similar risks further we
can choose to compress only the message types that we expect to
benefit most from compression. IMHO those are the following (marked
with (B)ackend or (F)rontend to show who sends them):
- Query (F)
- Parse (F)
- Describe (F)
- Bind (F)
- RowDescription (B)
- DataRow (B)
- CopyData (B/F)

Then I think we should let users choose how they want to compress and
where they want their compression stream to restart. Something like
this:
a. compression_restart=query: Restart the stream after every query.
Recommended if queries across the same connection are triggered by
different end-users. I think this would be a sane default
b. compression_restart=message: Restart the stream for every message.
Recommended if the amount of correlation between rows of the same
query is a security concern.
c. compression_restart=manual: Don't restart the stream automatically,
but only when the client user calls a specific function. Recommended
only if the user is able to make these trade-offs themselves, or if no
encryption is used anyway.




Re: Multiple startup messages over the same connection

2024-05-19 Thread Jelte Fennema-Nio
On Sat, 18 May 2024 at 23:10, Vladimir Churyukin  wrote:
> I guess the process of passing control from child processes to the parent 
> could be a bit tricky for that one, but doable?
> Is there anything I'm missing that can be a no-go for this?

One seriously difficult/possibly impossible thing is passing SSL
session state between processes using shared memory.




Re: Add new protocol message to change GUCs for usage with future protocol-only GUCs

2024-05-17 Thread Jelte Fennema-Nio
On Fri, 17 May 2024 at 21:24, Robert Haas  wrote:
> I think the
> fear that we're going to run into cases where such complex handshaking
> is necessary is a major reason why I'm afraid of Jelte's ideas about
> ParameterSet: it seems much more opinionated to me than he seems to
> think it is.

I think that fear is valid, and I agree that we might want to add a
bespoke message for cases where ParameterSet is not enough. But as far
as I can tell ParameterSet would at least cover all the protocol
changes that have been suggested so far. Using an opinionated but
limited message type for 90% of the cases and a bespoke message for
the last 10% seems much better to me than having a bespoke one for
each (especially if currently none of the protocol proposals fall into
the last 10%).

> To the extent that I can wrap my head around what Jelte is proposing,
> and all signs point to that extent being quite limited, I suppose I
> was thinking of something like his option (2). That is, I assumed that
> a client would request all the optional features that said client
> might wish to use at connection startup time. But I can see how that
> assumption might be somewhat problematic in a connection-pooling
> environment.

To be clear, I'd also be totally fine with my option (2). I'm
personally slightly leaning towards my option (1), due to the reasons
listed before. But connection poolers could request all the protocol
extensions at the start just fine (using the default "disabled" value)
to check for support. So I think option (2) would probably be the most
conservative, i.e. we could always decide that option (1) is fine in
some future release.

> Still, at least to me, it seems better than trying to
> rely on GUC_REPORT. My opinion is (1) GUC_REPORT isn't a particularly
> well-designed mechanism so I dislike trying to double down on it

I agree that GUC_REPORT is not particularly well designed,
currently... But even in its current form it's already a very
effective mechanism for connection poolers to find out which value a
specific GUC is set to, and if something similar to patch 0014 were
merged, my main gripe with GUC_REPORT would be gone. Tracking GUC
settings by using ParameterSet would actually be harder, because if
ParameterSet errors out on the Postgres side, the connection pooler
would have to roll back its cache of that setting. With the GUC_REPORT
response from Postgres, on the other hand, the pooler can simply rely
on Postgres telling it that the GUC has changed; even rollbacks are
handled correctly this way.

> and
> (2) trying to mix these protocol-level parameters and the
> transactional GUCs we have together in a single mechanism seems
> potentially problematic

I don't understand what potential problems you're worried about here.
Could you clarify?

> and (3) I'm still not particularly happy about
> the idea of making protocol parameters into GUCs in the first place.

Similar to the above: Could you clarify why you're not happy about that?

> Perhaps these are all minority positions, but I can't tell you what
> everyone thinks, only what I think.

I'd love to hear some opinions from others on these design choices. So
far it seems like we're the only two that have opinions on this (which
seems hard to believe) and our opinions are clearly conflicting. And
above all I'd like to move forward with this, be it my way or yours
(although I'd prefer my way of course ;) )




Re: libpq compression (part 3)

2024-05-17 Thread Jelte Fennema-Nio
On Fri, 17 May 2024 at 23:10, Jacob Champion
 wrote:
> We're talking about a transport-level option, though -- I thought the
> proposal enabled compression before authentication completed? Or has
> that changed?

I think it would make sense to only compress messages after
authentication has completed. The gain of compressing authentication
related packets seems pretty limited.

> (I'm suspicious of arguments that begin "well you can already do bad
> things"

Once logged in, it's really easy to max out a CPU core with the
backend you're connected to. There are many trivial queries you can
use to do that. An example would be:
SELECT sum(i) from generate_series(1, 10) i;

So I don't think it makes sense to worry about an attacker using a
high compression level as a means to DoS the server. Sending a few of
the above queries seems much easier.




Re: libpq compression (part 3)

2024-05-17 Thread Jelte Fennema-Nio
On Fri, 17 May 2024 at 23:40, Robert Haas  wrote:
> To be clear, I am not arguing that it should be the receiver's choice.
> I'm arguing it should be the client's choice, which means the client
> decides what it sends and also tells the server what to send to it.
> I'm open to counter-arguments, but as I've thought about this more,
> I've come to the conclusion that letting the client control the
> behavior is the most likely to be useful and the most consistent with
> existing facilities. I think we're on the same page based on the rest
> of your email: I'm just clarifying.

+1

> I have some quibbles with the syntax but I agree with the concept.
> What I'd probably do is separate the server side thing into two GUCs,
> each with a list of algorithms, comma-separated, like we do for other
> lists in postgresql.conf. Maybe make the default 'all' meaning
> "everything this build of the server supports". On the client side,
> I'd allow both things to be specified using a single option, because
> wanting to do the same thing in both directions will be common, and
> you actually have to type in connection strings sometimes, so
> verbosity matters more.
>
> As far as the format of the value for that keyword, what do you think
> about either compression=DO_THIS_BOTH_WAYS or
> compression=DO_THIS_WHEN_SENDING/DO_THIS_WHEN_RECEIVING, with each "do
> this" being a specification of the same form already accepted for
> server-side compression e.g. gzip or gzip:level=9? If you don't like
> that, why do you think the proposal you made above is better, and why
> is that one now punctuated with slashes instead of semicolons?

+1




Re: commitfest.postgresql.org is no longer fit for purpose

2024-05-17 Thread Jelte Fennema-Nio
On Fri, 17 May 2024 at 17:59, Robert Haas  wrote:
> If they
> get attention, it's much more likely to be because somebody saw their
> email and wrote back than it is to be because somebody went through
> the CommitFest and found their entry and was like "oh, I should review
> this".

I think this is an important insight. I used the commitfest app to
find patches to review when I was just starting out in Postgres
development, since I wasn't subscribed to all of pgsql-hackers yet. I
do subscribe nowadays, but now I rarely look at the commitfest app;
instead I skim the titles of emails that go into my "Postgres" folder
in my mailbox. Going from such an email to a commitfest entry is near
impossible (at least I don't know how to do it efficiently). I guess
I'm not the only one doing this.

> Honestly, if we get to a situation where a patch author is sad
> because they have to click a link every 2 months to say "yeah, I'm
> still here, please review my patch," we've already lost the game. That
> person isn't sad because we asked them to click a link. They're sad
> it's already been N * 2 months and nothing has happened.

Maybe it wouldn't be so bad for an author to click the button every 2
months, if doing that actually gave their patch a higher chance of
being reviewed. And given the previous insight, that people don't look
at the commitfest app that often, it might be good if pressing this
button would also bump the item in people's mailboxes.




Re: commitfest.postgresql.org is no longer fit for purpose

2024-05-17 Thread Jelte Fennema-Nio
On Fri, 17 May 2024 at 14:19, Joe Conway  wrote:
>
> On 5/16/24 22:26, Robert Haas wrote:
> > For example, imagine that the CommitFest is FORCIBLY empty
> > until a week before it starts. You can still register patches in the
> > system generally, but that just means they get CI runs, not that
> > they're scheduled to be reviewed. A week before the CommitFest,
> > everyone who has a patch registered in the system that still applies
> > gets an email saying "click here if you think this patch should be
> > reviewed in the upcoming CommitFest -- if you don't care about the
> > patch any more or it needs more work before other people review it,
> > don't click here". Then, the CommitFest ends up containing only the
> > things where the patch author clicked there during that week.
>
> 100% agree. This is in line with what I suggested on an adjacent part of
> the thread.

Such a proposal would basically mean that no one who cares about
their patches getting reviews can go on holiday and leave work behind
during the week before a commitfest. That seems quite undesirable to
me.




Re: commitfest.postgresql.org is no longer fit for purpose

2024-05-17 Thread Jelte Fennema-Nio
On Fri, 17 May 2024 at 13:39, Robert Haas  wrote:
>
> On Fri, May 17, 2024 at 7:11 AM Tomas Vondra
>  wrote:
> > Yeah, I 100% agree with this. If a patch bitrots and no one cares enough
> > to rebase it once in a while, then sure - it's probably fine to mark it
> > RwF. But forcing all contributors to do a dance every 2 months just to
> > have a chance someone might take a look, seems ... not great.
>
> I don't think that clicking on a link that someone sends you that asks
> "hey, is this ready to be reviewed' qualifies as a dance.

If there's been any useful response to the patch since the last time
you pressed this button, then it might be okay. But it's definitely
not uncommon for items in the commitfest app to get no actual response
at all for half a year, i.e. multiple commitfests (except for the odd
request for a rebase, which the author then does within a week). I'd
most certainly get very annoyed if, for those patches where it already
seems as if I'm screaming into the void, I'd also be required to press
a button every two months just for the patch to even have a chance at
receiving a response.

> So I think the right interpretation of his comment is that
> managing the CommitFest has become about an order of magnitude more
> difficult than what it needs to be for the task to be done well.

+1

> > Long time ago there was a "rule" that people submitting patches are
> > expected to do reviews. Perhaps we should be more strict this.
>
> This sounds like it's just generating more manual work to add to a
> system that's already suffering from needing too much manual work. Who
> would keep track of how many reviews each person is doing, and how
> many patches they're submitting, and whether those reviews were
> actually any good, and what would they do about it?

+1




Re: commitfest.postgresql.org is no longer fit for purpose

2024-05-17 Thread Jelte Fennema-Nio
On Fri, 17 May 2024 at 11:02, Jelte Fennema-Nio  wrote:
>
> On Fri, 17 May 2024 at 10:47, Daniel Gustafsson  wrote:
> > One way to ensure we capture detail could be if the system would send an
> > automated email to the thread summarizing the entry when it's marked as
> > "committed"?
>
> This sounds great! Especially if

(oops pressed send too early)
**Especially if it contains the git commit hash**

> Going back from an archived thread
> to the commitfest entry or git commit is currently very hard. If I'll
> just be able to Ctrl+F on commitf...@postgresql.com that would be so
> helpful. I think it would even be useful to have an email be sent
> whenever a patch gets initially added to the commitfest, so that
> there's a back link to and it's easy to mark yourself as reviewer.
> Right now, I almost never take the time to do that because if I look
> at my inbox, I have no clue what the interesting email thread is
> called in the commitfest app.




Re: commitfest.postgresql.org is no longer fit for purpose

2024-05-17 Thread Jelte Fennema-Nio
On Fri, 17 May 2024 at 10:47, Daniel Gustafsson  wrote:
> One way to ensure we capture detail could be if the system would send an
> automated email to the thread summarizing the entry when it's marked as
> "committed"?

This sounds great! Especially if  Going back from an archived thread
to the commitfest entry or git commit is currently very hard. If I'll
just be able to Ctrl+F on commitf...@postgresql.com that would be so
helpful. I think it would even be useful to have an email be sent
whenever a patch gets initially added to the commitfest, so that
there's a back link to and it's easy to mark yourself as reviewer.
Right now, I almost never take the time to do that because if I look
at my inbox, I have no clue what the interesting email thread is
called in the commitfest app.




Re: commitfest.postgresql.org is no longer fit for purpose

2024-05-17 Thread Jelte Fennema-Nio
On Fri, 17 May 2024 at 06:58, Peter Eisentraut  wrote:
> Yeah, some fine-tuning might be required.  But I would be wary of
> over-designing too many new states at this point.  I think the key idea
> is that there ought to be a state that communicates "needs attention
> from someone other than author, reviewer, or committer".

+1 on adding a new state like this. I think it would make sense for
the author to request additional input, but I think it would need two
states, something like "Request for new reviewer" and "Request for new
committer". Just like authors disappear sometimes after a driveby
patch submission, it's fairly common too imho for reviewers or
committers to disappear after a driveby review. Having a way for an
author to mark a patch as such would be helpful, both to the author
and to reviewers/committers looking to help some patch out.

On Fri, 17 May 2024 at 09:33, Heikki Linnakangas  wrote:
> Things like "cfbot says
> this has bitrotted" or "Will review this after other patch this depends
> on". On the mailing list, notes like that are both noisy and easily lost
> in the threads. But as a little free-form text box on the commitfest, it
> would be handy.

+many on the free form text box




Re: Add new protocol message to change GUCs for usage with future protocol-only GUCs

2024-05-16 Thread Jelte Fennema-Nio
On Thu, 16 May 2024 at 18:57, Robert Haas  wrote:
> Ugh, it's so hard to communicate clearly about this stuff. I didn't
> really have any thought that we'd ever try to handle something as
> complicated as compression using ParameterSet.

Okay, then it's definitely very hard to communicate clearly about
this. Because being able to re-configure compression using
ParameterSet is exactly the type of thing I want to be able to do in
PgBouncer. Otherwise a connection from PgBouncer to Postgres cannot be
handed off to any client, because its compression state cannot be
changed on the fly to the state that a client expects: if the first
client wants lz4 compression and a second one wants zstd compression,
there's no way the same connection can be reused for both, unless you
decompress and recompress in PgBouncer, which is expensive to do.
Being able to reconfigure the stream to compress the messages in the
expected form is much cheaper.

> Does it send a NegotiateCompressionType message after
> the NegotiateProtocolVersion, for example? That seems like it could
> lead to the client having to be prepared for a lot of NegotiateX
> messages somewhere down the road.

Like Jacob explains, you'd want to allow the client to provide a list
of options in order of preference, and then have the server respond
with a ParameterStatus message saying what it ended up being. So no
new NegotiateXXX messages are needed, as long as we make sure any
_pq_.xxx parameter falls back to reporting its default value on
failure. This is exactly why I said I prefer option 1 of the options I
listed: we need _pq_.xxx parameters to report their current value to
the client.

To be clear, this is not special to compression. This applies to ALL
proposed protocol parameters. The server should fall back to some
"least common denominator" if it doesn't understand (part of) the
value for the protocol parameter that's provided by the client,
possibly falling back to disabling the protocol extension completely.

> Changing compression algorithms in mid-stream is tricky, too. If I
> tell the server "hey, turn on server-to-client compression!"

Yes it is tricky, but it's something we would need to support imho.
And Jacob actually implemented it this way, so I feel like we're
discussing a non-problem here.

> I don't know if we want to support changing compression algorithms in
> mid-stream. I don't think there's any reason we can't, but it might be
> a bunch of work for something that nobody really cares about.

Again, I guess I wasn't clear at all in my previous emails and/or
commit messages. Connection poolers care **very much** about this.
Poolers need to be able to re-configure any protocol parameter to be
able to pool the same server connection across clients with
differently configured protocol parameters. Again: This is the primary
reason for me wanting to introduce the ParameterSet message.




Re: Add new protocol message to change GUCs for usage with future protocol-only GUCs

2024-05-16 Thread Jelte Fennema-Nio
On Thu, 16 May 2024 at 17:29, Robert Haas  wrote:
> You're probably not going to like this answer, but I feel like this is
> another sign that you're trying to use the protocol extensibility
> facilities in the wrong way. In my first reply to the thread, I
> proposed having the client send _pq_.protocol_set=1 in that startup
> message. If the server accepts that message, then you can send
> whatever set of message types are associated with that option, which
> could include messages to list known settings, as well as messages to
> set them. Alternatively, if we used a wire protocol bump for this, you
> could request version 3.1 and everything that I just said still
> applies. In other words, I feel that if you had adopted the design
> that I proposed back in March, or some variant of it, the problem
> you're having now wouldn't exist.

I don't really understand the benefit of your proposal over option 2
that I proposed. Afaict you're proposing that for e.g. compression we
first set _pq_.supports_compression=1 in the StartupMessage and use
that to do feature detection, and then after we get the response we
send ParameterSet("compression", "gzip"). To me this is pretty much
identical to option 2, except that it introduces an extra round trip
for no benefit (as far as I can see). Why not go for option 2 and send
_pq_.compression=gzip in the StartupMessage directly?
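
In terms of what actually goes over the wire the difference is tiny:
the StartupMessage body is just a list of NUL-terminated key/value
strings, so both variants fit into it equally well (a sketch; the
parameter names are the ones from the discussion above, nothing that
exists today):

#include <string.h>

/* Append one "name\0value\0" pair to a startup-packet body. */
static size_t
append_startup_param(char *buf, size_t off,
                     const char *name, const char *value)
{
    memcpy(buf + off, name, strlen(name) + 1);
    off += strlen(name) + 1;
    memcpy(buf + off, value, strlen(value) + 1);
    off += strlen(value) + 1;
    return off;
}

/*
 * The two alternatives being compared (only one of them would be sent):
 *
 *   Flag only, value set later via ParameterSet (extra round trip):
 *     off = append_startup_param(buf, off, "_pq_.supports_compression", "1");
 *
 *   Option 2, using the value half of the pair directly (no round trip):
 *     off = append_startup_param(buf, off, "_pq_.compression", "gzip");
 */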

> IMHO, we need to negotiate the language that we're going to use to
> communicate before we start communicating. We should find out which
> protocol version we're using, and what protocol options are accepted,
> based on sending a StartupMessage and receiving a reply. Then, after
> that, having established a common language, we can do whatever. I
> think you're trying to meld those two steps into one, which is
> understandable from the point of view of saving a round trip, but I
> just don't see it working out well.

I think not increasing the number of needed round trips in the startup
of a connection is extremely important. I think it's so important that
I honestly don't think we should merge a protocol change that
introduces an extra round trip without a VERY good reason, and this
round trip should only be needed when actually using the feature.

> What we can do in the startup
> message is extremely limited, because any startup messages we generate
> can't break existing servers, and also because of the concerns I
> raised earlier about leaving room for more extension in the future.
> Once we get past the startup message negotiation, the sky's the limit!

Sure, what we can do in the StartupMessage is extremely limited, but
what it does allow is passing arbitrary key value pairs to the server.
But by only using _pq_.feature_name=1, we're effectively only using
the key part of the key value pair. Limiting ourselves even more, by
throwing half of our communication channel away, seems like a bad idea
to me. But maybe I'm just not understanding the problem you're seeing
with using the value too.




Re: PostgreSQL 17 Beta 1 release announcement draft

2024-05-16 Thread Jelte Fennema-Nio
On Thu, 16 May 2024 at 03:45, Jonathan S. Katz  wrote:
> Attached is a copy of the PostgreSQL 17 Beta 1 release announcement
> draft.

I think we can quickly mention c4ab7da6061 in the COPY paragraph; in
some benchmarks it improved perf by close to 2x. Something like this:
"has improved performance in PostgreSQL 17 when the source encoding
matches the destination encoding *and when sending large rows from
server to client*"

Also, I think it's a bit weird to put the current COPY paragraph under
Developer Experience. If you want to keep it there instead of moving
it to the perf section, we should put the line about IGNORE_ERROR
first instead of the perf improvements. Right now the IGNORE_ERROR
addition seems more of an afterthought.

s/IGNORE_ERROR/ON_ERROR

I think it would be good to clarify if the following applies when
upgrading from or to PostgreSQL 17:
"Starting with PostgreSQL 17, you no longer need to drop logical
replication slots when using pg_upgrade"

Finally, I personally would have included a lot more links for the new
items in this document. Some that would benefit from being a link
imho:
- pg_createsubscriber
- JSON_TABLE
- SQL/JSON constructor
- SQL/JSON query functions
- ON_ERROR
- sslnegotiation
- PQchangePassword
- pg_maintain




Re: First draft of PG 17 release notes

2024-05-16 Thread Jelte Fennema-Nio
On Thu, 16 May 2024 at 05:48, Andres Freund  wrote:
> We're having this debate every release. I think the ongoing reticence to note
> performance improvements in the release notes is hurting Postgres.
>
> For one, performance improvements are one of the prime reason users
> upgrade. Without them being noted anywhere more dense than the commit log,
> it's very hard to figure out what improved for users. A halfway widely
> applicable performance improvement is far more impactful than many of the
> feature changes we do list in the release notes.
>
> For another, it's also very frustrating for developers that focus on
> performance. The reticence to note their work, while noting other, far
> smaller, things in the release notes, pretty much tells us that our work isn't
> valued.

+1 to the general gist of listing every perf improvement **and memory
usage reduction** in the release notes. Most of them are already
grouped together in a dedicated "General performance" section anyway;
having that section be big would only be a good thing imho, to show
that we're committed to improving perf.

I think one thing that would make this a lot easier though is if
commits that knowingly impact perf would clearly say so in the commit
message, because right now it's sometimes hard to spot for someone not
deeply involved with the specific patch. E.g. c4ab7da606 doesn't
mention performance at all, so I'm not surprised it wasn't listed
initially. And while 667e65aac3 states that multiple rounds of heap
scanning are now extremely rare, it doesn't explicitly state what kind
of perf impact can be expected because of that.

Maybe something like introducing a common "Perf-Improvement: true"
marker in the commit message, and when doing so adding a clear
paragraph explaining the expected perf impact. Another option could be
to add a "User Impact" section to the commit message, where an author
could add their suggestion for a release note entry. So basically this
suggestion boils down to more clearly mentioning user impact in commit
messages, instead of mostly/only including technical/implementation
details.




Re: Add new protocol message to change GUCs for usage with future protocol-only GUCs

2024-05-16 Thread Jelte Fennema-Nio
On Fri, 5 Apr 2024 at 18:30, Dave Cramer  wrote:
> On Fri, 5 Apr 2024 at 12:09, Jelte Fennema-Nio  wrote:
>> I'll take a look at redesigning the protocol parameter stuff. To work
>> with dedicated functions instead.
>
> +1

It's been a while, but I now actually took the time to look into this.
And I ran into a problem that I'd like to get some feedback on before
continuing the implementation:

If we're not setting the protocol parameter in the StartupMessage,
there's currently no way for us to know if the protocol parameter is
supported by the server. If protocol parameters were unchangeable then
that would be fine, but the whole point of introducing ParameterSet is
to make it possible to change protocol parameters on an existing
connection. Having the function SupportsProtocolCompression return
false, even though you can enable compression just fine, only because
we didn't ask for compression when connecting, seems quite silly and
confusing.

I see five ways around this problem and would love some feedback on
which you think is best (or if you can think of any other/better
ones):
1. Have protocol parameters always be GUC_REPORT, so that the presence
of a ParameterStatus message during connection startup can be used as
a way of detecting support for the protocol parameter.
2. Make libpq always send each known protocol parameter in the
StartupMessage to check for their support, even if the connection
string does not contain the related parameters (set them to their
default value then). Then the non-presence of the parameter in the
NegotiateProtocolVersion message can be used reliably to determine
support for the feature. We could even disallow changing a protocol
parameter at the server side using ParameterSet if it was not
requested in the StartupMessage.
3. Very similar to 1, but require explicit user input in the
connection string to request the feature on connection startup, by
having the user explicitly provide its default value. If it's not
requested on connection startup, assume it's unsupported and disallow
usage of the feature (even if the server might actually support it).
4. Make SupportsProtocolCompression return a tri-state: SUPPORTED,
UNSUPPORTED, UNKNOWN. If it's UNKNOWN, people could send a
ParameterSet message themselves to check for feature support after
connection startup. We could even recognize this and change the state,
so that future calls of the SupportsProtocolCompression function
return SUPPORTED/UNSUPPORTED according to the server response.
5. Basically the same as 4 but automatically send a ParameterSet
message internally when calling SupportsProtocolCompression and the
state is UNKNOWN, so we'd only ever return SUPPORTED or UNSUPPORTED.

The above options are listed in my order of preference; below is some
reasoning why:

1 and 2 would increase the bandwidth used during connection handshake
slightly for each protocol parameter that we would add, but they have
the best user experience IMHO.

I slightly prefer 1 over 2 because there is another argument to be
made for always having protocol parameters be GUC_REPORT: these
parameters change what message types a client can send or receive. So
it makes sense to me to have the server tell the client what the
current value of such a parameter is. This might not be a strong
argument though, because this value would only ever change due to user
interaction. But still, one might imagine scenarios where the value
that the client sent is not exactly what the server would set the
parameter to on receiving that value from the client. E.g. for
protocol compression, maybe the client sends a list of preferred
compression methods and the server would send a ParameterStatus
containing only the specific compression method that it will use when
sending messages to the client.

3 also seems like an acceptable option to me. While having slightly worse
user experience than 2, it allows the user of the client to make the
decision if the extra bandwidth during connection startup is worth it
to be able to enable the feature later.

4 assumes that we want people to be able to trigger sending
ParameterSet messages for every protocol parameter. I'm not sure we'd
want to give that ability in all cases.

5 would require SupportsProtocolCompression to also have a
non-blocking version, which bloats our API more than I'd like. Also as
a user you wouldn't be able to know if SupportsProtocolCompression
will do a network request or not.

PS. This is only a problem for feature detection of features relying
on protocol parameters; feature support relying on protocol version
bumps is easy to detect based on the NegotiateProtocolVersion
message.




Re: Postgres and --config-file option

2024-05-15 Thread Jelte Fennema-Nio
On Wed, 15 May 2024 at 11:49, Peter Eisentraut  wrote:
> Yeah, some of this is becoming quite unwieldy, if we document and
> mention each spelling variant of each option everywhere.
>
> Maybe if the original problem is that the option --config-file is not
> explicitly in the --help output, let's add it to the --help output?

I definitely think it would be useful to list this --config-file
variant in more places; imho it's nicer than the -c variant.
Especially in the PGOPTIONS docs it would be useful. People are
already using it in the wild, and I regressed on support for it in
PgBouncer by accident:
https://github.com/pgbouncer/pgbouncer/pull/1064




Re: First draft of PG 17 release notes

2024-05-14 Thread Jelte Fennema-Nio
On Tue, 14 May 2024 at 02:56, Bruce Momjian  wrote:
>
> On Sat, May 11, 2024 at 10:24:39AM -0400, Joe Conway wrote:
> > On 5/11/24 09:57, Jelte Fennema-Nio wrote:
> > > The buffering change improved performance up to ~40% in some of the
> > > benchmarks. The case it improves mostly is COPY of large rows and
> > > streaming a base backup. That sounds user-visible enough to me to
> > > warrant an entry imho.
> >
> > +1
>
> Attached patch applied.

I think we shouldn't list this under the libpq changes and shouldn't
mention libpq in the description, since this patch changes
src/backend/libpq files instead of src/interfaces/libpq files. I think
it should be in the "General performance" section and describe the
change as something like the below:

Improve performance when transferring large blocks of data to a client

PS. I completely understand that this was not clear from the commit message.




Re: JIT compilation per plan node

2024-05-14 Thread Jelte Fennema-Nio
On Tue, 14 May 2024 at 10:19, David Rowley  wrote:
> Here's a plan where the total cost of a subnode is greater than the
> total cost of the top node:

Ah I didn't realize it was possible for that to happen. **reads up on
plan costs**

This actually makes me think that using the total_cost of the
sub-nodes is not enough to determine if a node should be JITed.
We wouldn't want to start JITing plans like this, i.e. introducing
all the JIT setup overhead for just a single row:

set max_parallel_workers_per_gather=0;
create table t0 as select a from generate_Series(1,100)a;
analyze t0;
explain select a+a*a+a*a+a from t0 limit 1;
                          QUERY PLAN
--------------------------------------------------------------
 Limit  (cost=0.00..0.03 rows=1 width=4)
   ->  Seq Scan on t0  (cost=0.00..26980.00 rows=100 width=4)

An easy way to work around that issue, I guess, is using the minimum
of all the total_costs from the current sub-node up to the root node.
The current minimum could be passed along as part of the walk's
context.
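
Roughly, the walk I have in mind would look something like this (only
a sketch: the per-node jit flag and the helper name are hypothetical,
real Plan nodes also have node-specific children besides
lefttree/righttree, and jit_above_cost is the existing GUC):

#include "nodes/plannodes.h"
#include "jit/jit.h"

static void
mark_nodes_for_jit(Plan *plan, Cost min_path_cost)
{
    if (plan == NULL)
        return;

    /* Track the cheapest total_cost on the path from the root down here. */
    if (plan->total_cost < min_path_cost)
        min_path_cost = plan->total_cost;

    /*
     * A node is only worth JITing if nothing above it caps the work, e.g.
     * the cheap Limit above the expensive Seq Scan in the plan above.
     */
    plan->jit = (min_path_cost >= jit_above_cost);   /* hypothetical flag */

    mark_nodes_for_jit(plan->lefttree, min_path_cost);
    mark_nodes_for_jit(plan->righttree, min_path_cost);
}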




Re: JIT compilation per plan node

2024-05-14 Thread Jelte Fennema-Nio
On Tue, 20 Feb 2024 at 06:38, David Rowley  wrote:
>
> On Tue, 20 Feb 2024 at 18:31, Tom Lane  wrote:
> > FWIW, I seriously doubt that an extra walk of the plan tree is even
> > measurable compared to the number of cycles JIT compilation will
> > expend if it's called.  So I don't buy your argument here.
> > We would be better off to do this in a way that's clean and doesn't
> > add overhead for non-JIT-enabled builds.
>
> The extra walk of the tree would need to be done for every plan, not
> just the ones where we do JIT. I'd rather find a way to not add this
> extra plan tree walk, especially since the vast majority of cases on
> an average instance won't be doing any JIT.

I'm not saying I'd prefer the extra walk, but I don't think you'd need
to do this extra walk for all plans. Afaict you could skip the extra
walk when top_plan->total_cost < jit_above_cost, i.e. only do the
extra walk to determine which exact nodes to JIT in cases where we
currently JIT all nodes. That would limit the extra walk's overhead to
cases where we currently already spend significant resources on JITing
stuff.
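
In other words, something along these lines (a sketch;
choose_jit_nodes is a hypothetical name for whatever ends up doing the
per-node walk, while jit_enabled and jit_above_cost are the existing
GUCs):

/* Only pay for the per-node walk when the query would currently be
 * JITed as a whole anyway. */
if (jit_enabled && top_plan->total_cost >= jit_above_cost)
    choose_jit_nodes(root, top_plan);    /* hypothetical helper */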




Re: Direct SSL connection with ALPN and HBA rules

2024-05-13 Thread Jelte Fennema-Nio
On Mon, 13 May 2024 at 18:14, Robert Haas  wrote:
> I disagree with Jacob's assertion that sslmode=require has no security
> benefits over sslmode=prefer. That seems like the kind of pessimism
> that makes people hate security professionals. There have got to be
> some attacks that are foreclosed by encrypting the connection, even if
> you don't stop MITM attacks or other things that are more
> sophisticated than running wireshark and seeing what goes by on the
> wire.

Like Jacob already said, I guess you meant me here. The main point I
was trying to make is that sslmode=require is extremely insecure too,
so if we're changing the default then I'd rather bite the bullet and
actually make the default a secure one this time. No one's browser
trusts self-signed certs by default, but currently lots of people
trust self-signed certs when connecting to their production database
without realizing it.

IMHO the only benefit that sslmode=require brings over sslmode=prefer
is detecting incorrectly configured servers, i.e. servers that are
supposed to support ssl but somehow don't due to a misconfigured
GUC/pg_hba. Such "incorrectly configured server" detection avoids
sending data to such a server, which an eavesdropper on the network
could see. That is definitely a security benefit, but it's an
extremely small one. In all other cases sslmode=prefer brings exactly
the same protection as sslmode=require, because sslmode=prefer
encrypts the connection unless Postgres actively tells the client to
downgrade to plaintext (which never happens when the server is
configured correctly).




Re: Direct SSL connection with ALPN and HBA rules

2024-05-13 Thread Jelte Fennema-Nio
On Mon, 13 May 2024 at 13:07, Heikki Linnakangas  wrote:
> "channel_binding=require sslmode=require" also protects from MITM attacks.

Cool, I didn't realize we had this connection option and that it could
be used like this. But I think there are a few security downsides to
channel_binding=require compared to sslmode=verify-full: if the client
relies on channel_binding to validate server authenticity, a leaked
server-side SCRAM hash is enough for an attacker to impersonate a
server, while normally a leaked SCRAM hash isn't really much of a
security concern (assuming long enough passwords). I also don't know
of many people rotating their SCRAM hashes, even though many rotate
TLS certs.

> I think these options should be designed from the user's point of view,
> so that the user can specify the risks they're willing to accept, and
> the details of how that's accomplished are handled by libpq. For
> example, I'm OK with (tick all that apply):
>
> [ ] passive eavesdroppers seeing all the traffic
> [ ] MITM being able to hijack the session
> [ ] connecting without verifying the server's identity
> [ ] divulging the plaintext password to the server
> [ ] ...

I think that sounds like a great idea, looking forward to the proposal.




Re: Direct SSL connection with ALPN and HBA rules

2024-05-13 Thread Jelte Fennema-Nio
On Mon, 13 May 2024 at 15:38, Heikki Linnakangas  wrote:
> Here's a patch to implement that.

+   if (conn->sslnegotiation[0] == 'd' &&
+   conn->sslmode[0] != 'r' && conn->sslmode[0] != 'v')

I think these checks should use strcmp instead of checking magic first
characters. I see this same clever trick is used in the recently added
init_allowed_encryption_methods, and I think that should be changed to
use strcmp too for readability.
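
I.e. something along these lines (assuming the first-character check
above is meant to match exactly the sslmode values require, verify-ca
and verify-full):

    if (strcmp(conn->sslnegotiation, "direct") == 0 &&
        strcmp(conn->sslmode, "require") != 0 &&
        strcmp(conn->sslmode, "verify-ca") != 0 &&
        strcmp(conn->sslmode, "verify-full") != 0)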




Re: Direct SSL connection with ALPN and HBA rules

2024-05-13 Thread Jelte Fennema-Nio
On Sun, 12 May 2024 at 23:39, Heikki Linnakangas  wrote:
> You might miss that by changing sslnegotiation to 'postgres', or by
> removing it altogether, you not only made it compatible with older
> server versions, but you also allowed falling back to a plaintext
> connection. Maybe you're fine with that, but maybe not. I'd like to
> nudge people to use sslmode=require, not rely on implicit stuff like
> this just to make connection strings a little shorter.

I understand your worry, but I'm not sure that this is actually much
of a security issue in practice. sslmode=prefer and sslmode=require
are equally insecure imho (i.e. extremely insecure). The
only reason sslmode=prefer would connect as non-ssl to a server that
supports ssl is if an attacker has write access to the network in the
middle (i.e. eavesdropping ability alone is not enough). Once an
attacker has this level of network access, it's trivial for this
attacker to read any data sent to Postgres by intercepting the TLS
handshake and doing TLS termination with some arbitrary cert (any cert
is trusted by sslmode=require).

So the only actual case I can think of where this is a security issue
is when an attacker has only eavesdropping ability on the network, and
somehow the Postgres server that the client tries to connect to is
configured incorrectly, so that no ssl is set up at all. Then a client
would drop to plaintext when connecting to this server, instead of
erroring, and the attacker could now read the traffic. But I don't
really see this scenario end up any differently when requiring people
to enter sslmode=require. The only action a user can take to connect
to a server that does not have ssl support is to remove
sslmode=require from the connection string. Except if they are also
the server operator, in which case they could enable ssl on the
server. But if ssl is not set up, then it was probably never set up,
and thus the only reason to provide sslnegotiation=direct would be to
test whether ssl works.

But if you disagree, I'm fine with erroring on plain sslnegotiation=direct.

> In v18, I'd like to make sslmode=require the default. Or maybe introduce
> a new setting like "encryption=ssl|gss|none", defaulting to 'ssl'. If we
> want to encourage encryption, that's the right way to do it. (I'd still
> recommend everyone to use an explicit sslmode=require in their
> connection strings for many years, though, because you might be using an
> older client without realizing it.)

I'm definitely a huge proponent of making the connection defaults more
secure. But as described above, sslmode=require is still extremely
insecure and I don't think we gain anything substantial by making it
the default. I think the only useful secure default would be to use
sslmode=verify-full (probably with some automatic fallback to
sslmode=prefer when connecting to hardcoded IPs or localhost). That
probably means that sslrootcert=system should also be made the
default, which would mean that ~/.postgresql/root.crt would not be the
default anymore. I personally think that's fine, but others likely
disagree.




Re: Direct SSL connection with ALPN and HBA rules

2024-05-11 Thread Jelte Fennema-Nio
On Fri, 10 May 2024 at 15:50, Heikki Linnakangas  wrote:
> New proposal:
>
> - Remove the "try both" mode completely, and rename "requiredirect" to
> just "direct". So there would be just two modes: "postgres" and
> "direct". On reflection, the automatic fallback mode doesn't seem very
> useful. It would make sense as the default, because then you would get
> the benefits automatically in most cases but still be compatible with
> old servers. But if it's not the default, you have to fiddle with libpq
> settings anyway to enable it, and then you might as well use the
> "requiredirect" mode when you know the server supports it. There isn't
> anything wrong with it as such, but given how much confusion there's
> been on how this all works, I'd prefer to cut this back to the bare
> minimum now. We can add it back in the future, and perhaps make it the
> default at the same time. This addresses points 2. and 3. above.
>
> and:
>
> - Only allow sslnegotiation=direct with sslmode=require or higher. This
> is what you, Jacob, wanted to do all along, and addresses point 1.
>
> Thoughts?

Sounds mostly good to me. But I think we'd want to automatically
increase sslmode to require if it is unset while sslnegotiation is set
to direct, similar to how we bump sslmode to verify-full if
sslrootcert is set to system but sslmode is unset. I.e. it seems
unnecessary/unwanted to throw an error if the connection string only
contains sslnegotiation=direct.




Re: First draft of PG 17 release notes

2024-05-11 Thread Jelte Fennema-Nio
On Fri, 10 May 2024 at 23:31, Tom Lane  wrote:
>
> Bruce Momjian  writes:
> > I looked at both of these.   In both cases I didn't see why the user
> > would need to know these changes were made:
>
> I agree that the buffering change is not likely interesting, but
> the fact that you can now control-C out of a psql "\c" command
> is user-visible.  People might have internalized the fact that
> it didn't work, or created complicated workarounds.

The buffering change improved performance up to ~40% in some of the
benchmarks. The case it improves mostly is COPY of large rows and
streaming a base backup. That sounds user-visible enough to me to
warrant an entry imho.




Re: First draft of PG 17 release notes

2024-05-10 Thread Jelte Fennema-Nio
On Thu, 9 May 2024 at 06:04, Bruce Momjian  wrote:
>
> I have committed the first draft of the PG 17 release notes;  you can
> see the results here:
>
> https://momjian.us/pgsql_docs/release-17.html

Great work!

There are two commits that I think would benefit from being listed
(but maybe they are already listed and I somehow missed them, or they
are left out on purpose for some reason):

- c4ab7da60617f020e8d75b1584d0754005d71830
- cafe1056558fe07cdc52b95205588fcd80870362




Re: Add new protocol message to change GUCs for usage with future protocol-only GUCs

2024-05-06 Thread Jelte Fennema-Nio
On Tue, 23 Apr 2024 at 19:39, Jacob Champion
 wrote:
>
> On Mon, Apr 22, 2024 at 2:20 PM Jelte Fennema-Nio  wrote:
> > 1. I strongly believe minor protocol version bumps after the initial
> > 3.1 one can be made painless for clients/poolers (so the ones to
> > 3.2/3.3/etc). Similar to how TLS 1.3 can be safely introduced, and not
> > having to worry about breaking TLS 1.2 communication.
>
> Apologies for focusing on a single portion of your argument, but this
> claim in particular stuck out to me. To my understanding, IETF worried
> a _lot_ about breaking TLS 1.2 implementations with the TLS 1.3
> change, to the point that TLS 1.3 clients and servers advertise
> themselves as TLS 1.2 and sneak the actual version used into a TLS
> extension (roughly analogous to the _pq_ stuff). I vaguely recall that
> the engineering work done for that update was pretty far from
> painless.

My bad... I guess TLS 1.3 was a bad example, due to it changing the
handshake itself so significantly.




Re: Add new protocol message to change GUCs for usage with future protocol-only GUCs

2024-05-06 Thread Jelte Fennema-Nio
On Tue, 23 Apr 2024 at 17:03, Robert Haas  wrote:
> If a client doesn't know about encrypted columns and sets that
> bit at random, that will break things, and formally I think that's a
> risk, because I don't believe we document anywhere that you shouldn't
> set unused bits in the format mask. But practically, it's not likely.
> (And also, maybe we should document that you shouldn't do that.)

Currently Postgres errors out when you set anything other than 0 or 1
in the format field. And it's already documented that these are the
only allowed values: "Each must presently be zero (text) or one
(binary)."

> Let's consider a hypothetical country much like Canada except that
> there are three official languages rather than two: English, French,
> and Robertish.

I don't really understand the point you are trying to make with this
analogy. Sure, an almost equivalent but unused language maybe shouldn't be
put on the signs, but having two different names (aka protocol
versions) for English and Robertish seems quite useful to avoid
confusion.

> And so here. If someone codes a connection pooler in the way you
> suppose, then it will break. But, first of all, they probably won't do
> that, both because it's not particularly likely that someone wants to
> gather that particular set of statistics and also because erroring out
> seems like an overreaction.

Is it really that much of an overreaction? Postgres happily errors on
any weirdness in the protocol including unknown format codes. Why
wouldn't a pooler/proxy in the middle be allowed to do the same? The
pooler has no clue what the new format means; maybe it requires some
session state to be passed on to the server to be interpreted
correctly. The pooler would be responsible for syncing that session
state but might not have synced it because it doesn't know that the
format code requires that. An example of this would be a compressed
value of which the compression algorithm is configured using a GUC.
Thus the pooler being strict in what it accepts would be a good way of
notifying the client that the pooler needs to be updated (or the
feature not used).

But I do agree that it seems quite unlikely that a pooler would
implement it like that. However, a pooler logging a warning
seems not that crazy. And in that case the connection might not be
closed, but the logs would be flooded.

> And secondly, let's imagine that we do
> bump the protocol version and think about whether and how that solves
> the problem. A client will request from the pooler a version 3.1
> connection and the pooler will say, sorry, no can do, I only
> understand 3.0. So the client will now say, oh ok, no problem, I'm
> going to refrain from setting that parameter format bit. Cool, right?
>
> Well, no, not really. First, now the client application is probably
> broken. If the client is varying its behavior based on the server's
> protocol version, that must mean that it cares about accessing
> encrypted columns, and that means that the bit in question is not an
> optional feature. So actually, the fact that the pooler can force the
> client to downgrade hasn't fixed anything at all.

It seems quite a lot nicer if the client can safely fallback to not
writing data using the new format code, instead of getting an error
from the pooler/flooding the pooler logs/having the pooler close the
connection.

> Third, applications, drivers, and connection poolers now all need to
> worry about handling downgrades smoothly. If a connection pooler
> requests a v3.1 connection to the server and gets v3.0, it had better
> make sure that it only advertises 3.0 to the client.

This seems quite straightforward to solve from a pooler perspective:
1. Connect to the server first to find out what the maximum version is
that it supports
2. If a client asks for a higher version, advertise the server version

> If the client
> requests v3.0, the pooler had better make sure to either request v3.0
> from the server. Or alternatively, the pooler can be prepared to
> translate between 3.0 and 3.1 wherever that's needed, in either
> direction. But it's not at all clear what that would look like for
> something like TCE. Will the pooler arrange to encrypt parameters
> destined for encrypted tables if the client doesn't do so? Will it
> arrange to decrypt values coming from encrypted tables if the client
> doesn't understand encryption?

This is a harder problem, and indeed one I hadn't considered much.
From a pooler perspective I really would like to be able to have just
one pool of connections to the server all initially connected with the
3.1 protocol and use those 3.1 server connections for clients that use
the 3.1 protocol but also for clients that use the 3.0 protocol.
Creating separate pools of server connections for different protocol
versions is super annoying to manage for operators (e.g. how should
these pools be sized). One of the reasons I'm proposing the
ParameterSet message is to avoid this problem for protocol 

Re: Reducing the log spam

2024-05-02 Thread Jelte Fennema-Nio
On Thu, 2 May 2024 at 13:08, Jelte Fennema-Nio  wrote:
> 2. Change the newly added check in errcode() to only set
> output_to_server to false when IsLogicalWorker() returns false.

Actually a third, and probably even better solution would be to only
apply this new GUC to non-backgroundworker processes. That seems quite
reasonable, since often the only way to access background worker
errors is through the logs.




Re: Reducing the log spam

2024-05-02 Thread Jelte Fennema-Nio
On Thu, 2 May 2024 at 12:47, Laurenz Albe  wrote:
> Yes.  But I'd argue that that is a shortcoming of logical replication:
> there should be a ways to get this information via SQL.  Having to look into
> the log file is not a very useful option.

Definitely agreed that accessing the error details using SQL would be
much better. But having no way at all (by default) to find the cause
of the failure is clearly much worse.

> The feature will become much less useful if unique voilations keep getting 
> logged.

Agreed. How about changing the patch so that the GUC is not applied to
logical replication apply workers (and documenting that accordingly)? I
can think of two ways of achieving that (but there might be
other/better ones):
1. Set the GUC to empty string when an apply worker is started.
2. Change the newly added check in errcode() to only set
output_to_server to false when IsLogicalWorker() returns false.




Re: Add new protocol message to change GUCs for usage with future protocol-only GUCs

2024-04-22 Thread Jelte Fennema-Nio
On Mon, 22 Apr 2024 at 16:26, Robert Haas  wrote:
> That's a fair point, but I'm still not seeing much practical
> advantage. It's unlikely that a client is going to set a random bit in
> a format parameter for no reason.

I think you're missing an important point of mine here. The client
wouldn't be "setting a random bit in a format parameter for no
reason". The client would decide it is allowed to set this bit,
because the PG version it connected to supports column encryption
(e.g. PG18). But this completely breaks protocol and application layer
separation.

It doesn't seem completely outside of the realm of possibility for a
pooler to gather some statistics on the amount of Bind messages that
use text vs binary query parameters. That's very easily doable now,
while looking only at the protocol layer. If a client then sets the
new format parameter bit, this pooler could then get confused and
close the connection.

> Perhaps this is the root of our disagreement, or at least part of it.
> I completely agree that it is important for human beings to be able to
> understand whether, and how, the wire protocol has changed from one
> release to another.

I think this is partially the reason for our disagreement, but I think
there are at least two other large reasons:

1. I strongly believe minor protocol version bumps after the initial
3.1 one can be made painless for clients/poolers (so the ones to
3.2/3.3/etc). Similar to how TLS 1.3 can be safely introduced, and not
having to worry about breaking TLS 1.2 communication. Once clients and
poolers implement version negotiation support for 3.1, there's no
reason for version negotiation support that works for 3.0 and 3.1 to then
suddenly break on the 3.2 bump. To be clear, I'm talking about the act
of bumping the version here, not the actual protocol changes. So
assuming zero/near-zero client implementation effort for the new
features (like never setting the newly supported bit in a format
parameter), then bumping the protocol version for these new features
can never have negative consequences.

2. I very much want to keep a clear split between the protocol layer
and the application layer of our communication. And these layers merge
whenever (like you say) "the wire protocol has changed from one
release to another", but no protocol version bump or protocol
extension is used to indicate that. When that happens the only way for
a client to know what valid wire protocol messages are according to
the server, is by checking the server version. This completely breaks
the separation between layers. So, while checking the server version
indeed works for direct client to postgres communication, it starts to
break down whenever you put a pooler inbetween (as explained in the
example earlier in this email). And it breaks down even more when
connecting to servers that implement the Postgres wire protocol, but
are not postgres at all, like CockroachDB. Right now libpq and other
postgres drivers can be used to talk to these other servers and
poolers, but if we start mixing protocol and application layer stuff
then eventually that will stop being the case.

Afaict from your responses, you disagree with 1. However, it's not at
all clear to me what exact problems you're worried about. It sounds
like you don't know either, and it's more that you're worried about
things breaking for not yet known reasons. I hoped to take away/reduce
those worries using some arguments in a previous email (quoted below),
but you didn't respond to those arguments, so I'm not sure if they
were able to change your mind.

On Thu, 18 Apr 2024 at 21:34, Jelte Fennema-Nio  wrote:
> When the server supports a lower version than the client, the client
> should disable certain features because it gets the
> ProtocolVersionNegotiation message. This is also true if we don't bump
> the version. Negotiating a lower version actually makes it clearer for
> the client what features to disable. Using the reported postgres
> version for this might not, because a connection pooler in the middle
> might not support the features that the client wants and thus throw an
> error (e.g. due to the client sending unknown messages) even if the
> backing Postgres server would support these features. Not to mention
> non-postgresql servers that implement the PostgreSQL protocol (of
> which there are more and more).
>
> When the server supports a higher version, the client never even
> notices this because the server will silently accept and only enable
> the features of the lower version. So this could never cause breakage,
> as from the client's perspective the server didn't bump their protocol
> version.




Re: Support a wildcard in backtrace_functions

2024-04-21 Thread Jelte Fennema-Nio
On Sat, 20 Apr 2024 at 01:19, Michael Paquier  wrote:
> Removing this GUC and making the backend react by default the same way
> as when this GUC was enabled sounds like a sensible route here.  This
> stuff is useful.

I definitely agree it's useful. But I feel like changing the default
of the GUC and removing the ability to disable it at the same time are
pretty radical changes that we should not be doing after a feature
freeze. I think we should at least have a way to turn this feature off
to be able to avoid log spam, especially given the fact that
extensions use elog much more freely than core, which afaict from
other threads[1] Tom also thinks we should normally be careful about.

Of the options to resolve the open item so far, I think there are only
three somewhat reasonable to do after the feature freeze:
1. Rename the GUC to something else (but keep behaviour the same)
2. Decide to keep the GUC as is
3. Revert a740b213d4

I hoped 1 was possible, but based on the discussion so far it doesn't
seem like we'll be able to get a quick agreement on a name. IMHO 2 is
just a version of 1, but with a GUC name that no-one thinks is a good
one. So I think we are left with option 3.

[1]: https://www.postgresql.org/message-id/flat/524751.1707240550%40sss.pgh.pa.us#59710fd4f3f186e642b8e6b886b2fdff




Re: Support a wildcard in backtrace_functions

2024-04-19 Thread Jelte Fennema-Nio
On Fri, 19 Apr 2024 at 10:58, Peter Eisentraut  wrote:
> This presupposes that there is consensus about how the future
> functionality should look like.  This topic has gone through half a
> dozen designs over a few months, and I think it would be premature to
> randomly freeze that discussion now and backport that design.

While there maybe isn't consensus on what a new design exactly looks
like, I do feel like there's consensus on this thread that the
backtrace_on_internal_error GUC is almost certainly not the design
that we want. I guess a more conservative approach to this problem
would be to revert the backtrace_on_internal_error commit and agree on
a better design for PG18. But I didn't think that would be necessary
if we could agree on the name for a more flexibly named GUC, which
seemed quite possible to me (after a little bit of bikeshedding).

> If a better, more comprehensive design arises for PG18, I think it would
> be pretty easy to either remove backtrace_on_internal_error or just
> internally remap it.

I think releasing a GUC (even if it's just meant for developers) in
PG17 and then deprecating it for a newer version in PG18 wouldn't be a
great look. And even if that's not a huge problem, it still seems
better not to have the problem at all. Renaming the GUC now seems to
have only upsides to me: worst case the new design turns out not to be
what we want either, and we have to deprecate the GUC. But in the best
case we don't need to deprecate anything.




Re: Add new protocol message to change GUCs for usage with future protocol-only GUCs

2024-04-18 Thread Jelte Fennema-Nio
On Thu, 18 Apr 2024 at 23:36, Jelte Fennema-Nio  wrote:
> To clarify: My proposed approach is to use a protocol extension
> parameter for to enable the new messages that the server can send
> (like Peter is doing now). And **in addition to that** gate the new
> Bind format type behind a feature switch.

ugh, correction: gate the new Bind format type behind a **protocol bump**




Re: Add new protocol message to change GUCs for usage with future protocol-only GUCs

2024-04-18 Thread Jelte Fennema-Nio
On Thu, 18 Apr 2024 at 22:17, Robert Haas  wrote:
> If we go with Peter's approach, every driver that supports his feature
> will work perfectly, and every driver that doesn't will work exactly
> as it does today. The risk of breaking anything is as near to zero as
> human developers can reasonably hope to achieve. Nobody who doesn't
> care about the feature will have to lift a single finger, today,
> tomorrow, or ever. That's absolutely brilliant.
>
> If we instead go with your approach, then anyone who wants to support
> 3.2 when it materializes will have to also support 3.1, which means
> they have to support this feature.

To clarify: My proposed approach is to use a protocol extension
parameter to enable the new messages that the server can send
(like Peter is doing now). And **in addition to that** gate the new
Bind format type behind a feature switch. There is literally nothing
clients will have to do to "support" that feature (except for
requesting a higher protocol version). Your suggestion of not bumping
the version but still allowing the new format type on version 3.0
doesn't have any advantage afaict, except secretly hiding from any
pooler in the middle that such a format type might be sent.

> Also, even just 3.1 is going to break
> something for somebody. There's just no way that we've left the
> protocol version unchanged for this long and the first change we make
> doesn't cause some collateral damage.

Sure, but the exact same argument holds for protocol extension
parameters. We've never set them, so they are bound to break something
the first time. My whole point is that once we bite that bullet, the
next protocol parameters and protocol version bumps won't cause such
breakage.

> Sure, those are minor downsides in the grand scheme of things. But
> AFAICS the only downside of Peter's approach that you've alleged is
> that doesn't involve bumping the version number. Of course, if bumping
> the version number is an intrinsic good, then no further justification
> is required, but I don't buy that. I do not believe that users or
> maintainers will throw us a pizza party when they find out that we've
> changed the version number. There's no reason for anyone to be happy
> about that for its own sake.

As a connection pooler maintainer I would definitely love it if every
protocol change required either a protocol version parameter or a
protocol version bump. That way I can easily check every release if
the protocol changed by looking at two things, instead of diffing the
protocol docs to see whether some tiny "supposedly irrelevant" change was made.




Re: Add new protocol message to change GUCs for usage with future protocol-only GUCs

2024-04-18 Thread Jelte Fennema-Nio
On Mon, 15 Apr 2024 at 21:47, Dave Cramer  wrote:
>> On Mon, 15 Apr 2024 at 19:52, Robert Haas  wrote:
>> > surely it can't be right to use protocol
>> > version 3.0 to refer to a bunch of different things. But at the same
>> > time, surely we don't want clients to start panicking and bailing out
>> > when everything would have been fine.
>>
>> I feel like the ProtocolVersionNegotiation should make sure people
>> don't panic and bail out. And if not, then feature flags won't help
>> with this either. Because clients would then still bail out if some
>> feature is not supported.
>
> I don't think a client should ever bail out. They may not support something 
> but IMO bailing out is not an option.


On Thu, 18 Apr 2024 at 21:01, Robert Haas  wrote:
> On Thu, Apr 18, 2024 at 1:49 PM Jelte Fennema-Nio  wrote:
> > IMHO that means that we should also bump the protocol version for this
> > change, because it's changing the wire protocol by adding a new
> > parameter format code. And it does so in a way that does not depend on
> > the new protocol extension.
>
> I think we're more or less covering the same ground we did on the
> other thread here -- in theory I don't love the fact that we never
> bump the protocol version when we change stuff, but in practice if we
> start bumping it every time we do anything I think it's going to just
> break a bunch of stuff without any real benefit.

(the second quoted message comes from Peter's column encryption
thread, but responding here to keep this discussion in one place)

I really don't understand what exactly you're worried about. What do
you expect will break when bumping the protocol version? As Dave said,
clients should never bail out due to protocol version differences.

When the server supports a lower version than the client, the client
should disable certain features because it gets the
ProtocolVersionNegotiation message. This is also true if we don't bump
the version. Negotiating a lower version actually makes it clearer for
the client what features to disable. Using the reported postgres
version for this might not, because a connection pooler in the middle
might not support the features that the client wants and thus throw an
error (e.g. due to the client sending unknown messages) even if the
backing Postgres server would support these features. Not to mention
non-postgresql servers that implement the PostgreSQL protocol (of
which there are more and more).

When the server supports a higher version, the client never even
notices this because the server will silently accept and only enable
the features of the lower version. So this could never cause breakage,
as from the client's perspective the server didn't bump their protocol
version.

So, I don't understand why you seem to view bumping the protocol
version with so much negativity. We're also bumping PG versions every
year. Afaik people only like that, partially because it's immediately
clear that certain features (e.g. MERGE) are not supported when
connecting to older servers. To me the same is true for bumping the
protocol version. There are no downsides to bumping it, only upsides.




Re: Transparent column encryption

2024-04-18 Thread Jelte Fennema-Nio
On Thu, 18 Apr 2024 at 18:46, Robert Haas  wrote:
> With regard to the Bind message, I suggest that we regard the protocol
> change as reserving a currently-unused bit in the message to indicate
> whether the value is pre-encrypted, without reference to the protocol
> extension. It could be legal for a client that can't understand
> encryption message from the server to supply an encrypted value to be
> inserted into a column. And I don't think we would ever want the bit
> that's being reserved here to be used by some other extension for some
> other purpose, even when this extension isn't used. So I don't see a
> need for this to be tied into the protocol extension.

I think this is an interesting idea. I can indeed see use cases for
e.g. inserting a new row based on another row (where the secret is the
same).

IMHO that means that we should also bump the protocol version for this
change, because it's changing the wire protocol by adding a new
parameter format code. And it does so in a way that does not depend on
the new protocol extension.




Re: Transparent column encryption

2024-04-18 Thread Jelte Fennema-Nio
On Thu, 18 Apr 2024 at 13:25, Peter Eisentraut  wrote:
> Hopefully, the reason for key rotation is mainly that policies require
> key rotation, not that keys get compromised all the time.

These key rotation policies are generally in place to reduce the
impact of a key compromise by limiting the time a compromised key is
valid.

> This
> seems pretty standard to me.  For example, I can change the password on
> my laptop's file system encryption, which somehow wraps a lower-level
> key, but I can't reencrypt the actual file system in place.

I think the threat model for this proposal and a laptop's file system
encryption are different enough that the same choices/tradeoffs don't
automatically translate. Specifically in this proposal the unencrypted
CEK is present on all servers that need to read/write those encrypted
values. And a successful attacker would then be able to read the
encrypted values forever with this key, because it effectively cannot
be rotated. That is a much bigger attack surface and risk than a
laptop's disk encryption. So, I feel quite strongly that shipping the
proposed feature without being able to re-encrypt columns in an online
fashion would be a mistake.

> That's the
> reason for having this two-tier key system in the first place.

If we allow for online-rotation of the actual encryption key, then
maybe we don't even need this two-tier system ;)

Not having this two tier system would have a few benefits in my opinion:
1. We wouldn't need to be sending encrypted key material from the
server to every client. Which seems nice from a security, bandwidth
and client implementation perspective.
2. Asymmetric encryption of columns is suddenly an option. Allowing
certain clients to enter encrypted data into the database but not read
it.


> Two problems here: One, for deterministic encryption, everyone needs to
> agree on the representation, otherwise equality comparisons won't work.
>   Two, if you give clients the option of storing text or binary, then
> clients also get back a mix of text or binary, and it will be a mess.
> Just giving the option of storing the payload in binary wouldn't be that
> hard, but it's not clear what you can sensibly do with that in the end.

How about defining at column creation time if the underlying value
should be binary or not? Something like:

CREATE TABLE t(
mytime timestamp ENCRYPTED WITH (column_encryption_key = cek1, binary=true)
);

> Even if the identifiers
> somehow were global (but OIDs can also change and are not guaranteed
> unique forever),

OIDs of existing rows can't just change while a connection is active,
right? (all I know is upgrades can change them but that seems fine)
Also they are unique within a catalog table, right?

> the state of which keys have already been sent is
> session state.

I agree that this is the case. But it's state that can be tracked
fairly easily by a transaction pooler. Similar to how prepared
statements can be tracked. And this is easier to do when the IDs of
the same keys are the same across each session to the server, because
if they differ then you need to do re-mapping of IDs.

> This is kind of like SASL or TLS can add new methods dynamically without
> requiring a new version.  I mean, as we are learning, making new
> protocol versions is kind of hard, so the point was to avoid it.

Fair enough

> I guess you could do that, but wouldn't that making the decoding of
> these messages much more complicated?  You would first have to read the
> "short" variant, decode the format, and then decide to read the rest.
> Seems weird.

I see your point. But with the current approach even for queries that
don't return any encrypted columns, these useless fields would be part
of the RowDescription. It seems quite annoying to add extra network
and parsing overhead to all of your queries even if only a small
percentage use the encryption feature. Maybe we should add a new
message type instead like EncryptedRowDescription, or add some flag
field at the start of RowDescription that can be used to indicate that
there is encryption info for some of the columns.

> Yes, that's what would happen, and that's the intention, so that for
> example you can use pg_dump to back up encrypted columns without having
> to decrypt them.

Okay, makes sense. But I think it would be good to document that.

> > A related question to this is that currently libpq throws an error if
> > e.g. a master key realm is not defined but another one is. Is that
> > really what we want? Is not having one of the realms really that
> > different from not providing any realms at all?
>
> Can you provide a more concrete example of what scenario you have a
> concern about?

A server has tables A and B. A is encrypted with a master key realm X
and B is encrypted with master key realm Y. If libpq is only given a
key for realm X, and it then tries to read table B, an error is
thrown. While if you don't provide any realm at all, you can read from
table B just 

Re: Support a wildcard in backtrace_functions

2024-04-18 Thread Jelte Fennema-Nio
On Thu, 18 Apr 2024 at 09:02, Michael Paquier  wrote:
>
> On Fri, Apr 12, 2024 at 09:36:36AM +0900, Michael Paquier wrote:
> > log_backtrace speaks a bit more to me as a name for this stuff because
> > it logs a backtrace.  Now, there is consistency on HEAD as well
> > because these GUCs are all prefixed with "backtrace_".  Would
> > something like a backtrace_mode where we have an enum rather than a
> > boolean be better?

I guess it depends on what we want consistency with: naming
consistency with all the other LOGGING_WHAT GUCs, or naming
consistency with the current backtrace_functions GUC. I personally
like log_backtrace slightly better, but I don't have a super strong
opinion on this either. Another option could be plain "backtrace".

> > One thing would be to redesign the existing GUC as
> > having two values on HEAD as of:
> > - "none", to log nothing.
> > - "internal", to log backtraces for internal errors.

If we go for backtrace_mode or backtrace, then I think I'd prefer
"disabled"/"off" and "internal_error" for these values.


> The rest of the proposals had better happen as a v18 discussion, where
> extending this GUC is a benefit.

agreed




Re: Support a wildcard in backtrace_functions

2024-04-18 Thread Jelte Fennema-Nio
On Thu, 18 Apr 2024 at 10:50, Peter Eisentraut  wrote:
> Why exactly is this an open item?  Is there anything wrong with the
> existing feature?

The name of the GUC backtrace_on_internal_error is so specific that
it's impossible to extend our backtrace behaviour in future releases
without adding yet another backtrace GUC. You started the discussion
on renaming it upthread:

On Fri, 8 Mar 2024 at 15:51, Peter Eisentraut  wrote:
> What is the relationship of these changes with the recently added
> backtrace_on_internal_error?  We had similar discussions there, I feel
> like we are doing similar things here but slightly differently.  Like,
> shouldn't backtrace_functions_min_level also affect
> backtrace_on_internal_error?  Don't you really just want
> backtrace_on_any_error?  You are sneaking that in through the backdoor
> via backtrace_functions.  Can we somehow combine all these use cases
> more elegantly?  backtrace_on_error = {all|internal|none}?




Re: Speed up clean meson builds by ~25%

2024-04-17 Thread Jelte Fennema-Nio
On Wed, 17 Apr 2024 at 16:00, Tom Lane  wrote:
> Break gram.y (say, misspell some token in a production) and
> see what happens.  The behavior's likely to be timing sensitive
> though.

Thanks for clarifying. It took me a little while to break gram.y in
such a way that I was able to consistently reproduce the problem, but I managed in
the end using the attached small diff.
And then running ninja in non-parallel mode:

ninja -C build all -j1

As I expected this problem was indeed fairly easy to address by still
building "backend/parser" before "interfaces". See attached patch.


v1-0001-Speed-up-clean-parallel-meson-builds-a-lot.patch
Description: Binary data


trigger-confusing-build-error.diff
Description: Binary data


Re: Speed up clean meson builds by ~25%

2024-04-17 Thread Jelte Fennema-Nio
On Wed, 17 Apr 2024 at 13:41, Peter Eisentraut  wrote:
>
> On 10.04.24 17:33, Tom Lane wrote:
> > The immediate question then is do we want to take Jelte's patch
> > as a way to ameliorate the pain meanwhile.  I'm kind of down on
> > it, because AFAICS what would happen if you break the core
> > grammar is that (most likely) the failure would be reported
> > against ecpg first.  That seems pretty confusing.
>
> Yeah that would be confusing.

How can I test if this actually happens? Because it sounds like if
that indeed happens it should be fixable fairly easily.




Re: Add new protocol message to change GUCs for usage with future protocol-only GUCs

2024-04-15 Thread Jelte Fennema-Nio
On Mon, 15 Apr 2024 at 19:43, Robert Haas  wrote:
>
> On Sat, Apr 6, 2024 at 6:14 PM Jelte Fennema-Nio  wrote:
> > I think for clients/drivers, the work would generally be pretty
> > minimal. For almost all proposed changes, clients can "support" the
> > protocol version update by simply not using the new features, ...
>
> I mean, I agree if a particular protocol version bump does nothing
> other than signal the presence of some optional, ignorable feature,
> then it doesn't cause a problem if we force clients to support it. But
> that seems a bit like saying that eating wild mushrooms is fine
> because some of them aren't poisonous. The point is that if we roll
> out two protocol changes, A and B, each of which requires the client
> to make some change in order to work with the newer protocol version,
> then using version numbers as the gating mechanism requires that the
> client can't support the newer of those two changes without also
> supporting the older one. Using feature flags doesn't impose that
> constraint, which I think is a plus.

I think we're in agreement here, i.e. it depends on the situation whether a
feature flag or version bump is more appropriate. I think the
guidelines could be as follows:
1. For protocol changes that require "extremely minimal" work from
clients & poolers: bump the protocol version.
2. For "niche" features that require some work from clients and/or
poolers: use a protocol parameter feature flag.
3. For anything in between, let's discuss on the thread for that
specific protocol change on the tradeoffs.

On Mon, 15 Apr 2024 at 19:52, Robert Haas  wrote:
> surely it can't be right to use protocol
> version 3.0 to refer to a bunch of different things. But at the same
> time, surely we don't want clients to start panicking and bailing out
> when everything would have been fine.

I feel like the ProtocolVersionNegotiation should make sure people
don't panic and bail out. And if not, then feature flags won't help
with this either. Because clients would then still bail out if some
feature is not supported.

> I'm unconvinced that we should let ParameterSet change
> non-PGC_PROTOCOL GUCs. The pooler can agree on a list of protocol GUCs
> with the end client that differs from what the server agreed with the
> pooler - e.g., it can add pgbouncer.pool_mode to the list. But for
> truly non-protocol GUCs, we have a lot of ways to set those already.

I feel like you're glossing over something fairly important here. How
exactly would the client know about pgbouncer.pool_mode? Are you
envisioning a list of GUCs which can be changed using ParameterSet,
which the server then sends to the client during connection startup
(using presumably some new protocol message)? If so, then I feel this
same problem still exists. How would the client know which of those
GUCs change wire-protocol behaviour and which don't? It still would
need a hardcoded list (now including pgbouncer.pool_mode and maybe
more) of things that a user is allowed to change using ParameterSet.
So I think a well-known prefix would still be applicable.

To be clear, imho the well-known prefix discussion is separate from
the discussion about whether Postgres should throw an ERROR when
ParameterSet is used to change any non-PGC_PROTOCOL GUC. I'd be fine
with disallowing that if that seems better/safer/clearer to you
(although I'd love to hear your exact concerns about this). But I'd
still want a well-known prefix for protocol parameters. Because that
prefix is not for the benefit of the server, it's for the benefit of
the client and pooler. So the client/pooler can error if any dangerous
GUC is being changed, because the server would accept it and change
the wire-protocol accordingly.




Re: psql: Greatly speed up "\d tablename" when not using regexes

2024-04-11 Thread Jelte Fennema-Nio
On Wed, 10 Apr 2024 at 23:51, Tom Lane  wrote:
> Is it really necessary for Citus' filter to be a security qual rather
> than a plain ol' filter condition?

No, it's not. I think using security quals simply required the least
amount of code (and it worked just fine if you didn't have lots of
tables). I created a PR for Citus to address this issue[1] by changing
to a normal filter condition. Thanks a lot for pointing me in the
right direction to fix this.

> That is, as long as the derived condition is leakproof, there's no
> reason not to let it go before the security qual.  We're probably
> failing to consider generating derived quals for anything that isn't
> qualified to become an indexqual, and this example shows that that's
> leaving money on the table.

I think that even though my immediate issue is fixed, this would be a
good improvement anyway.

[1]: https://github.com/citusdata/citus/pull/7577




Re: psql: Greatly speed up "\d tablename" when not using regexes

2024-04-10 Thread Jelte Fennema-Nio
On Wed, 10 Apr 2024 at 22:11, Tom Lane  wrote:
> There may be an argument for psql to do what you suggest,
> but so far it seems like duplicative complication.
>
> If there's a case you can demonstrate where "\d foo" doesn't optimize
> into an indexscan, we should look into exactly why that's happening,
> because I think the cause must be more subtle than this.

Hmm, okay so I took a closer look and you're completely right: It's
quite a lot more subtle than I initially thought. The query from "\d
foo" is fast as long as you don't have Citus installed. It turns out
that Citus breaks this regex index search optimization somehow by
adding "NOT relation_is_a_known_shard(c.oid)" to the securityQuals of
the rangeTableEntry for pg_class in its planner hook. Citus does this
to filter out the underlying shards of a table for every query on
pg_class. The reason is that these underlying shards cluttered the
output of \d and PgAdmin etc. Users also tended to get confused by
them, sometimes badly enough to remove them (and thus requiring
restore from backup).
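
To make it concrete: Citus doesn't change the SQL text, but because of
the security qual the planner effectively has to treat the "\d foo"
catalog query roughly like this (a sketch, not the literal query psql
sends), with the shard filter evaluated before the name condition:

select c.oid, c.relname
from pg_catalog.pg_class c
where not relation_is_a_known_shard(c.oid)  -- the security qual, checked first
  and c.relname operator(pg_catalog.~) '^(foo)$' collate pg_catalog.default;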

We have a GUC to turn this filtering off for advanced users:
SET citus.show_shards_for_app_name_prefixes = '*';

If you set that the index is used and the query is fast again. Just
like what is happening for you. Not using the regex search also worked
as a way to trigger an index scan.

I'll think/research a bit tomorrow and try some stuff out to see if
this is fixable in Citus. That would definitely be preferable to me as
it would solve this issue on all Postgres/psql versions that citus
supports.

If I cannot think of a way to address this in Citus, would it be
possible to still consider merging this patch (assuming comments
explaining that Citus is the reason)? Because this planner issue that
Citus's behaviour introduces is fixed by the change I proposed in my
patch too (I'm not yet sure how exactly).




Re: psql: Greatly speed up "\d tablename" when not using regexes

2024-04-10 Thread Jelte Fennema-Nio
On Wed, 10 Apr 2024 at 20:21, Kirill Reshke  wrote:
> Do we need to force Collaction here like in other branches?
> if (PQserverVersion(conn) >= 12)
>appendPQExpBufferStr(buf, " COLLATE pg_catalog.default");

According to the commit and code comment that introduced the COLLATE,
it was specifically added for correct regex matching (e.g. \w). So I
don't think it's necessary, and I'm pretty sure adding it will cause
the index scan not to be used anymore.




Re: psql: Greatly speed up "\d tablename" when not using regexes

2024-04-10 Thread Jelte Fennema-Nio
On Wed, 10 Apr 2024 at 20:06, Tom Lane  wrote:
> Really?  ISTM this argument is ignoring an optimization the backend
> has understood for a long time.

Interesting. I didn't know about that optimization. I can't check
right now, but probably the COLLATE breaks that optimization.




psql: Greatly speed up "\d tablename" when not using regexes

2024-04-10 Thread Jelte Fennema-Nio
Running "\d tablename" from psql could take multiple seconds when
running on a system with 100k+ tables. The reason for this was that
a sequence scan on pg_class takes place, due to regex matching being
used.

Regex matching is obviously unnecessary when we're looking for an exact
match. This checks for this (common) case and starts using plain
equality in that case.
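
In other words, for a pattern without any regex special characters the
relevant part of the generated catalog query changes roughly like this
(a sketch, not the full query psql sends; 'tablename' stands for
whatever was typed):

-- before: always an anchored regex match, which (together with the
-- COLLATE clause) can end up as a sequential scan over pg_class
select c.oid from pg_catalog.pg_class c
where c.relname operator(pg_catalog.~) '^(tablename)$' collate pg_catalog.default;

-- after: plain equality for exact matches, which can use the index on
-- pg_class (relname, relnamespace)
select c.oid from pg_catalog.pg_class c
where c.relname = 'tablename';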


v1-0001-psql-Greatly-speed-up-d-tablename-when-not-using-.patch
Description: Binary data


Re: Transparent column encryption

2024-04-10 Thread Jelte Fennema-Nio
On Wed, 10 Apr 2024 at 12:13, Peter Eisentraut  wrote:
>
> To kick some things off for PG18, here is an updated version of the
> patch for automatic client-side column-level encryption.

I only read the docs and none of the code, but here is my feedback on
the current design:

> (The CEK can't be rotated easily, since
> that would require reading out all the data from a table/column and
> reencrypting it.  We could/should add some custom tooling for that,
> but it wouldn't be a routine operation.)

This seems like something that requires some more thought because CEK
rotation seems just as important as CMK rotation (often both would be
compromised at the same time). As far as I can tell the only way to
rotate a CEK is by re-encrypting the column for all rows in a single
go at the client side, thus taking a write-lock on all rows of the
table. That seems quite problematic, because that makes key rotation
an operation that requires application downtime. Allowing online
rotation is important, otherwise almost no-one will do it preventatively
at a regular interval.

One way to allow online CEK rotation is by allowing a column to be
encrypted by one of several keys and/or allow a key to have multiple
versions. And then for each row we would store which key/version it
was encrypted with. That way for new insertions/updates clients would
use the newest version. But clients would still be able to decrypt
both old rows with the old key and new rows encrypted with the new
key, because the server would give them both keys and tell which row
was encrypted with which. Then the old rows can be rewritten by a
client in small batches, so that writes to the table can keep working
while this operation takes place.
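
To sketch what I mean (the syntax here is completely made up, it's only
meant to illustrate the flow):

-- hypothetical syntax: new writes start using version 2 of the key
alter column encryption key cek1 add version 2;

-- old rows stay readable because the server tells the client per row
-- which key/version was used; a client-side tool can then rewrite them
-- in small batches, re-encrypting each value with version 2:
update t set mycol = $1 where id = $2;

-- once no row references version 1 anymore it can be dropped
alter column encryption key cek1 drop version 1;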

This could even be used to allow encrypting previously unencrypted
columns using something like "ALTER COLUMN mycol ENCRYPTION KEY cek1".
Then unencrypted rows could be indicated by e.g. returning something
like NULL for the CEK.

+The plaintext inside
+the ciphertext is always in text format, but this is invisible to the
+protocol.

It seems like it would be useful to have a way of storing the
plaintext in binary form too. I'm not saying this should be part of
the initial version, but it would be good to keep that in mind with
the design.

+ The session-specific identifier of the key.

Is it necessary for this identifier to be session-specific? Why not
use a global identifier like an oid? Anything session specific makes
the job of transaction poolers quite a bit harder. If this identifier
would be global, then the message can be forwarded as is to the client
instead of re-mapping identifiers between clients and servers (like is
needed for prepared statements).

+   Additional algorithms may be added to this protocol specification without a
+   change in the protocol version number.

What's the reason for not requiring a version bump for this?

+  If the protocol extension _pq_.column_encryption is
+  enabled (see ), then
+  there is also the following for each parameter:

It seems a little bit wasteful to include these for all columns, even
the ones that don't require encryption. How about only adding these
fields when the format code is 0x11?

Finally, I'm trying to figure out if this really needs to be a
protocol extension or if a protocol version bump would work as well
without introducing a lot of work for clients/poolers that don't care
about it (possibly with some modifications to the proposed protocol
changes). What makes this a bit difficult for me is that there's not
much written in the documentation on what is supposed to happen for
encrypted columns when the protocol extension is not enabled. Is the
content just returned/written like it would be with a bytea? Or is
writing disallowed because the format code would never be set to 0x11?

A related question to this is that currently libpq throws an error if
e.g. a master key realm is not defined but another one is. Is that
really what we want? Is not having one of the realms really that
different from not providing any realms at all?

But regardless of these behavioural details, I think it would be fairly
easy to add minimal "non-support" for this feature while supporting
the new protocol messages. All they would need to do is understand
what the new protocol messages/fields mean and either ignore them or
throw a clear error. For poolers it's a different story however. For
transaction pooling there's quite a bit of work to be done. I already
mentioned the session-specific ID being a problem, but even assuming
we change that to a global ID there are still difficulties. Key
information is only sent by the server if it wasn't sent before in the
session[1], so a pooler would need to keep its own cache and send
keys to clients that haven't received them yet.

So yeah, I think it would make sense to put this behind a protocol
extension feature flag, since it's fairly niche and would require
significant work at the pooler side to support.



Re: Support a wildcard in backtrace_functions

2024-04-10 Thread Jelte Fennema-Nio
On Fri, 22 Mar 2024 at 11:09, Jelte Fennema-Nio  wrote:
> On Thu, 21 Mar 2024 at 15:41, Jelte Fennema-Nio  wrote:
> > Attached is a patch that implements this. Since the more I think about
> > it, the more I like this approach.
>
> I now added a 3rd patch to this patchset which changes the
> log_backtrace default to "internal", because it seems quite useful to
> me if user error reports of internal errors included a backtrace.

I think patch 0002 should be considered an Open Item for PG17, since
it proposes changing the name of the newly introduced
backtrace_on_internal_error GUC. If we want to change it in this way,
we should do it before the release and preferably before the beta.

I added it to the Open Items list[1] so we don't forget to at least
decide on this.

[1]: https://wiki.postgresql.org/wiki/PostgreSQL_17_Open_Items




Re: Add trim_trailing_whitespace to editorconfig file

2024-04-09 Thread Jelte Fennema-Nio
On Thu, 4 Apr 2024 at 16:58, Jelte Fennema-Nio  wrote:
> > ISTM that with a small shell script, .editorconfig could be generated
> > from .gitattributes?
>
> Honestly, I don't think building such automation is worth the effort.

Okay, I spent the time to add a script to generate the editorconfig
based on .gitattributes after all. So attached is a patch that adds
that.


v5-0001-Add-script-to-keep-.editorconfig-in-sync-with-.gi.patch
Description: Binary data


Re: PostgreSQL 17 Release Management Team & Feature Freeze

2024-04-08 Thread Jelte Fennema-Nio
On Mon, 8 Apr 2024 at 20:15, Tomas Vondra  wrote:
> I 100% understand how frustrating the lack of progress can be, and I
> agree we need to do better. I tried to move a number of stuck patches
> this CF, and I hope (and plan) to do more of this in the future.
>
> But I don't quite see how would this rule modification change the
> situation for non-committers.

The problem that I feel I'm seeing is that committers mostly seem to
materially review large patchsets in the last two commit fests. This
might not be based in reality, but it is definitely how it feels to
me. If someone has stats on this, feel free to share.

I'll sketch a situation: There's a big patch that some non-committer
submitted that has been sitting on the mailinglist for 6 months or
more, only being reviewed by other non-committers, which the submitter
quickly addresses. Then in the final commit fest it is finally
reviewed by a committer, and they say it requires significant changes.
Right now, the submitter can decide to quickly address those changes,
and hope to keep the attention of this committer, to hopefully get it
merged before the deadline after probably a few more back and forths.
But this new rule would mean that those significant changes would be a
reason not to put it in the upcoming release. Which I expect would
first of all really feel like a slap in the face to the submitter,
because it's not their fault that those changes were only requested in
the last commit fest. This would then likely result in the submitter
not following up quickly (why make time right at this moment, if
you're suddenly going to have another full year), which would then
cause the committer to lose context of the patch and thus interest in
the patch. And then finally getting into the exact same situation next
year in the final commit fest, when some other committer didn't agree
with the redesign of the first one and requests a new one, pushing it
back another year.

So yeah, I definitely agree with Matthias. I definitely feel like his
rule would seriously impact contributions made by non-committers.

Maybe a better solution to this problem would be to spread impactful
reviews by committers more evenly throughout the year. Then there
wouldn't be such a rush to address them in the last commit fest.




Re: Flushing large data immediately in pqcomm

2024-04-08 Thread Jelte Fennema-Nio
On Sun, 7 Apr 2024 at 11:34, David Rowley  wrote:
> That seems to require modifying the following function signatures:
> secure_write(), be_tls_write(), be_gssapi_write().  That's not an area
> I'm familiar with, however.

Attached is a new patchset where 0003 does exactly that. The only
place where we need to cast to non-const is for GSS, but that seems
fine (commit message has more details).

I also added patch 0002, which is a small addition to the function
comment of internal_flush_buffer that seemed useful to me to
differentiate it from internal_flush (feel free to ignore/rewrite).


v9-0001-Make-a-few-variables-size_t-in-pqcomm.c.patch
Description: Binary data


v9-0002-Expand-comment-of-internal_flush_buffer.patch
Description: Binary data


v9-0003-Make-backend-libpq-write-functions-take-const-poi.patch
Description: Binary data


Re: Flushing large data immediately in pqcomm

2024-04-08 Thread Jelte Fennema-Nio
On Sun, 7 Apr 2024 at 14:41, David Rowley  wrote:
> Looking at the code in socket_putmessage_noblock(), I don't understand
> why it's ok for PqSendBufferSize to be int but "required" must be
> size_t.  There's a line that does "PqSendBufferSize = required;". It
> kinda looks like they both should be size_t.  Am I missing something
> that you've thought about?


You and Ranier are totally right (I missed this assignment). Attached is v8.


v8-0001-Make-a-few-variables-size_t-in-pqcomm.c.patch
Description: Binary data


Re: Speed up clean meson builds by ~25%

2024-04-08 Thread Jelte Fennema-Nio
On Mon, 8 Apr 2024 at 10:02, Peter Eisentraut  wrote:
> I have tested this with various compilers, and I can confirm that this
> shaves off about 5 seconds from the build wall-clock time, which
> represents about 10%-20% of the total time.  I think this is a good patch.

Great to hear.

> Interestingly, if I apply the analogous changes to the makefiles, I
> don't get any significant improvements.  (Depends on local
> circumstances, of course.)  But I would still suggest to keep the
> makefiles aligned.

Attached is a patch that also updates the Makefiles, but indeed I
don't get any perf gains there either.

On Mon, 8 Apr 2024 at 07:23, Michael Paquier  wrote:
> Well, this is also painful with ./configure.  So, even if we are going
> to move away from it at this point, we still need to support it for a
> couple of years.  It looks to me that letting the clang folks know
> about the situation is the best way forward.

I reported the issue to the clang folks:
https://github.com/llvm/llvm-project/issues/87973

But even if my patch doesn't help for ./configure, it still seems like
a good idea to me to reduce compile times when using meson while
we wait for the clang folks to address the issue.


v2-0001-Speed-up-clean-parallel-meson-builds-a-lot.patch
Description: Binary data


Re: Speed up clean meson builds by ~25%

2024-04-08 Thread Jelte Fennema-Nio
On Mon, 8 Apr 2024 at 10:00, Alexander Lakhin  wrote:
> As I wrote in [1], I didn't observe the issue with clang-18, so maybe it
> is fixed already.
> Perhaps it's worth rechecking...

Using the attached script I got these timings. Clang is significantly
slower in all of them. But especially with -Og the difference
is huge.

gcc 11.4.0: 7.276s
clang 18.1.3: 17.216s
gcc 11.4.0 --debug: 7.441s
clang 18.1.3 --debug: 18.164s
gcc 11.4.0 --debug -Og: 2.418s
clang 18.1.3 --debug -Og: 14.864s

I reported this same issue to the LLVM project here:
https://github.com/llvm/llvm-project/issues/87973
#!/bin/bash
set -exo pipefail
compile() {
ninja -C build src/interfaces/ecpg/preproc/ecpg.p/meson-generated_.._preproc.c.o
rm build/src/interfaces/ecpg/preproc/ecpg.p/meson-generated_.._preproc.c.o
time ninja -C build src/interfaces/ecpg/preproc/ecpg.p/meson-generated_.._preproc.c.o -v
}

CC=gcc CC_LD=lld meson setup --reconfigure build --wipe > /dev/null
compile
CC=clang-18 CC_LD=lld meson setup --reconfigure build --wipe > /dev/null
compile
CC=gcc CC_LD=lld meson setup --reconfigure --debug build --wipe > /dev/null
compile
CC=clang-18 CC_LD=lld meson setup --reconfigure --debug build --wipe > /dev/null
compile
CC=gcc CC_LD=lld meson setup --reconfigure --debug -Dc_args="-Og" build --wipe > /dev/null
compile
CC=clang-18 CC_LD=lld meson setup --reconfigure --debug -Dc_args="-Og" build --wipe > /dev/null
compile



Re: Flushing large data immediately in pqcomm

2024-04-07 Thread Jelte Fennema-Nio
On Sun, 7 Apr 2024 at 03:39, Andres Freund  wrote:
> Changing the global vars to size_t seems mildly bogus to me. All it's
> achieving is to use slightly more memory. It also just seems unrelated to the
> change.

I took a closer look at this. I agree that changing PqSendBufferSize
to size_t is unnecessary: given the locations where it is used I see no
risk of overflow anywhere. Changing the type of PqSendPointer and
PqSendStart is needed though, because (as described by Heikki and David
upthread) the argument type of internal_flush_buffer is size_t*. So if
you actually pass int* there and the sizes are not the same, then you
will start writing out of bounds. And because internal_flush_buffer is
introduced in this patch, it is related to this change.

This is what David just committed too.

However, the "required" var actually should be of size_t to avoid
overflow if len is larger than int even without this change. So
attached is a tiny patch that does that.


v7-0001-Avoid-possible-overflow-in-socket_putmessage_nonb.patch
Description: Binary data


Re: Flushing large data immediately in pqcomm

2024-04-06 Thread Jelte Fennema-Nio
On Sat, 6 Apr 2024 at 22:21, Andres Freund  wrote:
> The small regression for small results is still kinda visible, I haven't yet
> tested the patch downthread.

Thanks a lot for the faster test script, I'm also impatient. I still
saw the small regression with David his patch. Here's a v6 where I
think it is now gone. I added inline to internal_put_bytes too. I
think that helped especially because for two calls to
internal_put_bytes len is a constant (1 and 4) that is smaller than
PqSendBufferSize. So for those calls the compiler can now statically
eliminate the new codepath because "len >= PqSendBufferSize" is known
to be false at compile time.

Also I incorporated all of Ranier's comments.


v6-0001-Faster-internal_putbytes.patch
Description: Binary data


Re: Add new protocol message to change GUCs for usage with future protocol-only GUCs

2024-04-06 Thread Jelte Fennema-Nio
On Fri, 5 Apr 2024 at 18:48, Robert Haas  wrote:
> Maybe we'd be better off adding a libpq connection
> option that forces the use of a specific minor protocol version, but
> then we'll need backward-compatibility code in libpq basically
> forever. But maybe we need that anyway to avoid older and newer
> servers being unable to communicate.

I think this would be good because it makes testing easy and, like you
said, I think we'll need this backward-compatibility code in libpq
anyway to be able to connect to old servers. To have even better and
more realistic test coverage though, I think we might also want to
actually test new libpq against old postgres servers and vice-versa in
a buildfarm animal.

> Plus, you've got all of the consequences for non-core drivers, which
> have to both add support for the new wire protocol - if they don't
> want to seem outdated and eventually obsolete - and also test that
> they're still compatible with all supported server versions.

I think for clients/drivers, the work would generally be pretty
minimal. For almost all proposed changes, clients can "support" the
protocol version update by simply not using the new features, e.g. a
client can "support" the ParameterSet feature, by simply never sending
the ParameterSet message. So binding it to a protocol version bump
doesn't make it any harder for that client to support that protocol
version. I'm not saying that is the case for all protocol changes, but
based on what's being proposed so far that's definitely a very common
theme. Overall, I think this is something to discuss for each protocol
change in isolation: i.e. how to make supporting the new feature as
painless as possible for clients/drivers.

> Connection poolers have the same set of problems.

For connection poolers this is indeed a bigger hassle, because they at
least need to be able to handle all the new message types that a
client can send and maybe do something special for them. But I think
if we're careful to keep connection poolers in mind when designing the
features themselves then this isn't necessarily a problem. And
probably indeed for the features that we think are hard for connection
poolers to implement, we should be using protocol extension parameter
feature flags. But I think a lot of protocol would be fairly trivial
for a connection pooler to support.

> The whole thing is
> almost a hole with no bottom. Keeping up with core changes in this
> area could become a massive undertaking for lots and lots of people,
> some of whom may be the sole maintainer of some important driver that
> now needs a whole bunch of work.

I agree with Dave here: if you want to benefit from new features,
there's some expectation to keep up with the changes. But to be clear,
we'd still support old protocol versions too, so we wouldn't break
connections from those clients; they simply wouldn't benefit from some
of the new features. I think that's acceptable.

> I'm not sure how much it improves things if we imagine adding feature
> flags to the existing protocol versions, rather than whole new
> protocol versions, but at least it cuts down on the assumption that
> adopting new features is mandatory, and that such features are
> cumulative. If a driver wants to support TDE but not protocol
> parameters or protocol parameters but not TDE, who are we to say no?
> If in supporting those things we bump the protocol version to 3.2, and
> then 3.3 fixes a huge performance problem, are drivers going to be
> required to add support for features they don't care about to get the
> performance fixes?

I think there's an important trade-off here. On one side we don't want
to make maintainers of clients/poolers do lots of work to support
features they don't care about. And on the other side it seems quite
useful to limit the number of feature combinations that are used in
the wild (both for users and for us), e.g. the combinations of
backwards-compatibility testing you were talking about would explode
if every protocol change was a feature flag. I think this trade-off is
something we should decide on based on the specific protocol change.
But if the work needed to "support" the feature is "minimal" (exactly
what counts as minimal is to be defined), I think making it part of a
protocol version bump is reasonable.

> I see some benefit in bumping the protocol version
> for major changes, or for changes that we have an important reason to
> make mandatory, or to make previously-optional features for which
> support has become in practical terms universal part of the base
> feature set. But I'm very skeptical of the idea that we should just
> handle as many things as possible via a protocol version bump. We've
> been avoiding protocol version bumps like the plague since forever,
> and swinging all the way to the other extreme doesn't sound like the
> right idea to me.

I think there's two parts to a protocol version bump:
1. The changes that cause us to consider a protocol 

Re: Flushing large data immediately in pqcomm

2024-04-06 Thread Jelte Fennema-Nio
On Sat, 6 Apr 2024 at 03:34, David Rowley  wrote:
> Does anyone else want to try the attached script on the v5 patch to
> see if their numbers are better?

On my machine (i9-10900X, in Ubuntu 22.04 on WSL on Windows) v5
consistently beats master by ~0.25 seconds:

master:
Run 100 100 500: 1.948975205
Run 1024 10240 20: 3.039986587
Run 1024 1048576 2000: 2.444176276
Run 1048576 1048576 1000: 2.475328596

v5:
Run 100 100 500: 1.997170909
Run 1024 10240 20: 3.057802598
Run 1024 1048576 2000: 2.199449857
Run 1048576 1048576 1000: 2.210328762

The first two runs are pretty much equal, and I ran your script a few
more times and this seems like just random variance (sometimes v5 wins
those, sometimes master does, always quite close to each other). But
v5 consistently wins the last two runs.

Weird that on your machines you don't see a difference. Are you sure
you didn't make a silly mistake, like not restarting postgres or
something?




Re: Add new protocol message to change GUCs for usage with future protocol-only GUCs

2024-04-05 Thread Jelte Fennema-Nio
On Fri, 5 Apr 2024 at 18:43, Tom Lane  wrote:
> I don't buy that argument, actually.  libpq, and pretty much every
> other client AFAIK, has provisions to let higher code levels insert
> random options into the startup packet.  So to make this work libpq
> would have to filter or at least inspect such options, which is
> logic that doesn't exist and doesn't seem nice to need.

libpq actually doesn't support doing this (only by putting them in the
"options" parameter, but _pq_ parameters would not be allowed there),
but many other clients do allow this and likely don't have logic to
filter/disallow _pq_-prefixed parameters.

This seems very easy to address though: Only parse _pq_ options when
protocol version 3.1 is requested by the client, and otherwise always
report them as "not supported". Then clients that upgrade to 3.1
should filter/disallow arbitrary _pq_ parameters from being set by
their users. I don't think that's hard or ugly to add, it's literally
a prefix check for the "_pq_." string.

> The other problem with adding these things in the startup packet
> is that when you send that packet, you don't know what the server
> version is and hence don't know if it will take these options.

(imho) the whole point of the _pq_ options is that they don't trigger
an error when they are requested by the client, but not supported by
the server. So I don't understand your problem here.

> What's so bad about insisting that these options must be sent in a
> separate message?

To not require an additional roundtrip waiting for the server to respond.




Re: Add new protocol message to change GUCs for usage with future protocol-only GUCs

2024-04-05 Thread Jelte Fennema-Nio
On Fri, 5 Apr 2024 at 18:30, Dave Cramer  wrote:
>> > I really intended the _pq_ prefix as a way of taking something out of
>> > the GUC namespace, not as a part of the GUC namespace that users would
>> > see. And I'm reluctant to go back on that. If we want to make
>> > pg_protocol.${NAME} mean a wire protocol parameter, well maybe there's
>> > something to that idea [insert caveats here]. But doesn't _pq_ look
>> > like something that was intended to be internal? That's certainly how
>> > I intended it.
>
>
> Is this actually used in practice? If so, how ?

No, it's not used for anything at the moment. This whole thread is
basically about trying to agree on how we want to make protocol
changes in the future in a somewhat standardized way. But using the
tools available that we have to not break connecting to old postgres
servers: ProtocolVersionNegotation messages, minor version numbers,
and _pq_ parameters in the startup message. All of those have so far
been completely theoretical and have not appeared in any client-server
communication.




Re: Speed up clean meson builds by ~25%

2024-04-05 Thread Jelte Fennema-Nio
On Fri, 5 Apr 2024 at 17:24, Andres Freund  wrote:
> I recommend opening a bug report for clang, best with an already preprocessed
> input file.

> We're going to need to do something about this from our side as well, I
> suspect. The times aren't great with gcc either, even if not as bad as with
> clang.

Agreed. While not a full solution, I think this patch is still good to
address some of the pain: Waiting 10 seconds at the end of my build
with only 1 of my 10 cores doing anything.

So while this doesn't decrease CPU time spent, it does decrease
wall-clock time significantly in some cases, with afaict no downsides.




Re: Add new protocol message to change GUCs for usage with future protocol-only GUCs

2024-04-05 Thread Jelte Fennema-Nio
On Fri, 5 Apr 2024 at 16:02, Robert Haas  wrote:
> Often?
>
> I kind of hope that the protocol starts to evolve a bit more than it
> has, but I don't want a continuous stream of changes. That will be
> very hard to test and verify correctness, and a hassle for drivers to
> keep up with, and a mess for compatibility.

I definitely think protocol changes require a lot more scrutiny than
many other things, given their hard/impossible to change nature.

But I do think that we shouldn't be at all averse to the act of
bumping the protocol version itself. If we have a single small
protocol change in one release, then imho it's no problem to bump the
protocol version. Bumping the version should be painless. So we
shouldn't be inclined to push an already agreed upon protocol change
to the next release, because there are some more protocol changes in
the pipeline that won't make it in the current one.

I don't think this would be any harder for drivers to keep up with
than if we bulked all changes together. If driver developers only want
to support changes in bulk, they can simply choose not to support 3.1
at all and wait until 3.2 to support everything at once.

> > So, what approach of checking feature support are you envisioning
> > instead? A function for every feature?
> > Something like SupportsSetProtocolParameter, that returns an error
> > message if it's not supported and NULL otherwise. And then add such a
> > support function for every feature?
>
> Yeah, something like that.
> ...
>
> > I'm also not sure why you're saying a user is not entitled to this
> > information. We already provide users of libpq a way to see the full
> > Postgres version, and the major protocol version. I think allowing a
> > user to access this information is only a good thing. But I agree that
> > providing easy to use feature support functions is a better user
> > experience in some cases.
>
> I guess that's a fair point. But I'm worried that if we expose too
> much of the internals, we won't be able to change things later.

I'll take a look at redesigning the protocol parameter stuff. To work
with dedicated functions instead.
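
To sketch the direction I have in mind (a hypothetical libpq addition;
of the names below only PQprotocolVersion() exists today):

#include <libpq-fe.h>

/*
 * Returns NULL if ParameterSet can be used on this connection,
 * otherwise a static string explaining why it can't.
 */
static const char *
SupportsSetProtocolParameter(const PGconn *conn)
{
    if (PQprotocolVersion(conn) < 3)
        return "server only supports protocol 2.0";

    /*
     * A real implementation would also check the negotiated minor
     * version and/or a _pq_.parameter_set protocol extension here.
     */
    return NULL;
}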

> I really intended the _pq_ prefix as a way of taking something out of
> the GUC namespace, not as a part of the GUC namespace that users would
> see. And I'm reluctant to go back on that. If we want to make
> pg_protocol.${NAME} mean a wire protocol parameter, well maybe there's
> something to that idea [insert caveats here]. But doesn't _pq_ look
> like something that was intended to be internal? That's certainly how
> I intended it.

I agree that _pq_ does look internal and doesn't clearly indicate that
it's a protocol-related change. But sadly the _pq_ prefix is the only
one that doesn't error in startup packets, and waiting another 5 years
until pg_protocol is allowed in the startup packet doesn't seem like a
reasonable solution either.

How about naming the GUCs pg_protocol.${NAME}, but still requiring the
_pq_ prefix in the StartupPacket? That way only client libraries would
have to see this internal prefix and they could remap it somehow. I
see two options for that:
1. At the server replace the _pq_ prefix with pg_protocol. So
_pq_.${NAME} would map to pg_protocol.${name}
2. At the server replace the _pq_.pg_protocol prefix with pg_protocol.
So _pq_.pg_protocol.${NAME} would map to pg_protocol.${name}.
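
A minimal sketch of what option 1 could look like (hypothetical
helper, not actual code):

#include <stdio.h>
#include <string.h>

/*
 * Map a startup-packet option name to the GUC name by replacing the
 * "_pq_." prefix with "pg_protocol." (option 1 above).
 */
static void
map_startup_option(const char *name, char *out, size_t outlen)
{
    if (strncmp(name, "_pq_.", 5) == 0)
        snprintf(out, outlen, "pg_protocol.%s", name + 5);
    else
        snprintf(out, outlen, "%s", name);
}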

I guess you prefer option 2, because that would still leave lots of
space to do something with the rest of the _pq_ space, i.e.
_pq_.magic_pixie_dust can still be used for something different than a
GUC.

Bikeshedding: I think I prefer protocol.${NAME} over
pg_protocol.${NAME}: it's shorter, and it seems obvious that protocol
means the postgres protocol in this context.

This should be a fairly simple change to make.

> Wouldn't libpq already know what value it last set? Or is this needed
> because it doesn't know what the other side's default is?

libpq could/should indeed know this, but for debugging/testing
purposes it is quite useful to have a facility to read the server-side
value. I think defaults should always be whatever the behaviour was
before the parameter existed, so knowing the server default is not
something the client needs to worry about (i.e. the default is defined
as part of the protocol spec).

> Hmm, OK. I guess if the PGC_PROTOCOL flag makes it so that the GUC can
> only be set using ParameterSet, and it also makes them
> non-transactional, then it's fine. So to be clear, I can't set these
> in postgresql.conf, or postgresql.auto.conf, or via ALTER $ANYTHING,
> or via SET, or in any other way than by sending ParameterStatus
> messages. And when I send a ParameterStatus message, it doesn't matter
> if I'm in a good transaction, an aborted transaction, or no
> transaction at all, and the setting change takes effect regardless of
> that and regardless of any subsequent rollbacks. Is that right?
>
> I feel like maybe it's not, because you seem to be 

Re: Add new protocol message to change GUCs for usage with future protocol-only GUCs

2024-04-05 Thread Jelte Fennema-Nio
On Fri, 5 Apr 2024 at 16:04, Robert Haas  wrote:
>
> On Thu, Apr 4, 2024 at 1:10 PM Jelte Fennema-Nio  wrote:
> > Attached is a rebased patchset
>
> We should keep talking about this, but I think we're far too close to
> the wire at this point to think about committing anything for v17 at
> this point. These are big changes, they haven't been thoroughly
> reviewed by anyone AFAICT, and we don't have consensus on what we
> ought to be doing. I know that's probably not what you want to hear,
> but realistically, I think that's the only reasonable decision at this
> point.

Agreed on not considering this for PG17 or for this commitfest
anymore. I changed the commitfest entry accordingly.




Re: Speed up clean meson builds by ~25%

2024-04-05 Thread Jelte Fennema-Nio
On Fri, 5 Apr 2024 at 00:45, Jelte Fennema-Nio  wrote:
> It improved clean build times on my machine (10 cores/20 threads) from ~40
> seconds to ~30 seconds.

After discussing this off-list with Bilal, I realized that this gain
only happens for clang builds on my system, because those take a long
time, as was also recently discussed in [1]. My builds don't take
nearly as long as reported there though: I tried with clang 15 through
18 and they all took 10-22 seconds to run. My clang comes from
apt.llvm.org on Ubuntu 22.04.

[1]: 
https://www.postgresql.org/message-id/flat/CA%2BhUKGLvJ7-%3DfS-J9kN%3DaZWrpyqykwqCBbxXLEhUa9831dPFcg%40mail.gmail.com




Speed up clean meson builds by ~25%

2024-04-04 Thread Jelte Fennema-Nio
Building the generated ecpg preproc file can take a long time. You can
check how long using:

ninja -C build src/interfaces/ecpg/preproc/ecpg.p/meson-generated_.._preproc.c.o

This moves that file much closer to the top of our build order, so
building it can be pipelined much better with other files.

It improved clean build times on my machine (10 cores/20 threads) from ~40
seconds to ~30 seconds.

You can check improvements for yourself with:

ninja -C build clean && ninja -C build all


v1-0001-Speed-up-clean-parallel-meson-builds-a-lot.patch
Description: Binary data


Re: WIP Incremental JSON Parser

2024-04-04 Thread Jelte Fennema-Nio
On Thu, 4 Apr 2024 at 23:52, Jelte Fennema-Nio  wrote:
> Fair enough, I guess I'll change my invocation to include the ninja
> "test" target too:
> ninja -C build test install-quiet && meson test -C build

Actually that doesn't do what I want either because that actually runs
the tests too... And I prefer to do that using meson directly, so I
can add some filters to it. I'll figure something out when I'm
fresher tomorrow.




Re: WIP Incremental JSON Parser

2024-04-04 Thread Jelte Fennema-Nio
On Thu, 4 Apr 2024 at 23:34, Jacob Champion
 wrote:
>
> On Thu, Apr 4, 2024 at 1:42 PM Jelte Fennema-Nio  wrote:
> > Maybe that's something worth addressing too. I expected that
> > install/install-quiet was a strict superset of a plain ninja
> > invocation.
>
> Maybe that's the intent, but I hope not, because I don't see any
> reason for `ninja install` to worry about test-only binaries that
> won't be installed. For the tests, there's a pretty clear rationale
> for the dependency.

Fair enough, I guess I'll change my invocation to include the ninja
"test" target too:
ninja -C build test install-quiet && meson test -C build




Re: Security lessons from liblzma

2024-04-04 Thread Jelte Fennema-Nio
On Thu, 4 Apr 2024 at 23:06, Daniel Gustafsson  wrote:
>
> > On 4 Apr 2024, at 23:02, Jelte Fennema-Nio  wrote:
>
> > How about we make it meson/make targets, so they are simply cached
> > just like any of our other build artefacts are cached. Then only clean
> > builds are impacted, not every test run.
>
> They already are (well, make not meson yet), they're just not hooked up to any
> top-level commands.  Running "make ssfiles-clean ssfiles" in src/test/ssl
> regenerates all the files from the base config files that define their
> contents.

Okay, it turns out that even generating them in parallel isn't any
faster than running it sequentially. I guess that's because the strong
random generation is the slow part. The command I used was the
following, and it took ~5 seconds on my machine:

make -C src/test/ssl sslfiles-clean && make -C src/test/ssl sslfiles -j20

And I think that's actually a good thing, because it means total
build time is pretty much not impacted if we include it as part of the
regular clean build. Building these certs is bottlenecked on
randomness, not on CPU (as pretty much all of our other build
artifacts are), so they should pipeline very well with the other
items, assuming build concurrency is set slightly higher than the
number of cores.

As a proof of concept I ran the above command in a simple bash loop constantly:

while true; do make -C src/test/ssl sslfiles-clean && make -C
src/test/ssl sslfiles -j20; done

And then ran a clean (parallel) build in another shell:

ninja -C build clean && ninja -C build

And total build time went from 41 to 43 seconds. To be clear, that's
while constantly running the ssl file creation. If I only run the
creation once, there's no noticeable increase in build time at all.




Re: Security lessons from liblzma

2024-04-04 Thread Jelte Fennema-Nio
On Thu, 4 Apr 2024 at 22:56, Daniel Gustafsson  wrote:
>
> > On 4 Apr 2024, at 22:47, Tom Lane  wrote:
> >
> > Robert Haas  writes:
> >> On Thu, Apr 4, 2024 at 4:25 PM Daniel Gustafsson  wrote:
> >>> I don't disagree, like I said that very email: it's non-trivial and I 
> >>> wish we
> >>> could make it better somehow, but I don't hav an abundance of good ideas.
> >
> >> Is the basic issue that we can't rely on the necessary toolchain to be
> >> present on every machine where someone might try to build PostgreSQL?
> >
> > IIUC, it's not really that, but that regenerating these files is
> > expensive; multiple seconds even on fast machines.  Putting that
> > into tests that are run many times a day is unappetizing.
>
> That's one aspect of it.  We could cache the results of course to amortize the
> cost over multiple test-runs but at the end of the day it will add time to
> test-runs regardless of what we do.

How about we make it meson/make targets, so they are simply cached
just like any of our other build artefacts are cached. Then only clean
builds are impacted, not every test run.




Re: WIP Incremental JSON Parser

2024-04-04 Thread Jelte Fennema-Nio
On Thu, 4 Apr 2024 at 20:48, Jacob Champion
 wrote:
>
> On Thu, Apr 4, 2024 at 11:12 AM Jacob Champion
>  wrote:
> > What's in the `...`? I wouldn't expect to find the test binary in your
> > tmp_install.
>
> Oh, I wonder if this is just a build dependency thing? I typically run
> a bare `ninja` right before testing, because I think most of those
> dependencies are missing for the tests at the moment. (For example, if
> I manually remove the `libpq_pipeline` executable and then try to run
> `meson test` without a rebuild, it fails.)
>
> We can certainly fix that (see attached patch as a first draft) but I
> wonder if there was a reason we decided not to during the Meson port?

Yeah, that patch fixed my problem. (as well as running plain ninja)

To clarify: I did do a rebuild. But it turns out that install-quiet
doesn't build everything... The full command I was having problems
with was:

ninja -C build install-quiet && meson test -C build

Maybe that's something worth addressing too. I expected that
install/install-quiet was a strict superset of a plain ninja
invocation.




Re: WIP Incremental JSON Parser

2024-04-04 Thread Jelte Fennema-Nio
On Thu, 4 Apr 2024 at 18:13, Andrew Dunstan  wrote:
> Argh. You get out of the habit when you're running with meson :-(

I'm having some issues with meson too actually. "meson test -C build"
is now failing with this for me:

Command 'test_json_parser_incremental' not found in
/home/jelte/opensource/postgres/build/tmp_install//home/jelte/.pgenv/pgsql-17beta9/bin,
...




Re: Add new protocol message to change GUCs for usage with future protocol-only GUCs

2024-04-04 Thread Jelte Fennema-Nio
On Thu, 4 Apr 2024 at 14:50, Peter Eisentraut  wrote:
> It appears there are several different perspectives about this.  My
> intuition was that a protocol version change indicates something that we
> eventually want all client libraries to support.  Whereas protocol
> extensions are truly opt-in.
>
> For example, if we didn't have the ParameterStatus message and someone
> wanted to add it, then this ought to be a protocol version change, so
> that we eventually get everyone to adopt it.
>
> But, for example, the column encryption feature is not something I'd
> expect all client libraries to implement, so it can be opt-in.

I think that is a good way of deciding between version bump vs
protocol extension parameter. But I'd like to make one
clarification/addition to this logic: if the protocol feature already
requires opt-in from the client because of how the feature works, then
there's no need for a protocol extension parameter. E.g. if your
column-encryption feature requires the client to send a
ColumnDecrypt message before the server would exhibit any behaviour
relating to the column-encryption feature, then the client can simply
"support" the feature by never sending the ColumnDecrypt message.
Thus, in such cases a protocol extension parameter would not be
necessary, even if the feature is considered opt-in.

> (I believe we have added a number of new protocol messages since the
> beginning of the 3.0 protocol, without any version change, so "new
> protocol message" might not always be a good example to begin with.)

Personally, I feel like this is something we should change. IMHO, we
should get to a state where protocol minor version bumps are so
low-risk that we can do them whenever we add message types. Right now
there are basically multiple versions of the 3.0 protocol, which is
very confusing to anyone implementing it. Different servers
implementing the 3.0 protocol without the NegotiateProtocolVersion
message is a good example of that.

> I fear that if we prefer protocol extensions over version increases,
> then we'd get a very fragmented landscape of different client libraries
> supporting different combinations of things.

+1




Re: Add new protocol message to change GUCs for usage with future protocol-only GUCs

2024-04-04 Thread Jelte Fennema-Nio
Ugh, I seem to have somehow missed this response completely.

On Thu, 14 Mar 2024 at 21:33, Robert Haas  wrote:
> While I think that will be a common
> pattern, I do not think it will be a universal one. I do agree,
> however, that every possible variation of the protocol is either
> Boolean-valued (i.e. are we doing X or not?) or something more
> complicated (i.e. how exactly are doing X?).

Agreed

> This is definitely not how I was thinking about it. I was imagining
> that we wanted to reserve bumping the protocol version for more
> significant changes, and that we'd use _pq_ parameters for relatively
> minor new functionality whether Boolean-valued or otherwise.

Yeah, we definitely think differently here then. To me bumping the
minor protocol version shouldn't be a thing that we would need to
carefully consider. It should be easy to do, and probably done often.

> The user is entitled to
> know whether PQsetProtocolParameter() will work or not, and the user
> is entitled to know whether it has a chance of working next time if it
> didn't work this time, and when it fails, the user is entitled to a
> good error message explaining the reason for the failure. But the user
> is not entitled to know what negotiation took place over the wire to
> figure that out. They shouldn't need to know that the _pq_ namespace
> exists, and they shouldn't need to know whether we negotiated the
> availability or unavailability of PQsetProtocolParameter() using [a]
> the protocol minor version number, [b] the protocol major version
> number, [c] a protocol option called parameter_set, or [d] a protocol
> option called bananas_foster. Those are all things that might change
> in the future.

So, what approach of checking feature support are you envisioning
instead? A function for every feature?
Something like SupportsSetProtocolParameter, that returns an error
message if it's not supported and NULL otherwise. And then add such a
support function for every feature?

I think I might agree with you that it would be nice for features
that depend on a protocol extension parameter, because in the future
we might make those required and then clients wouldn't have to update
their logic. But for features only depending on the protocol version I
honestly don't see the point. A protocol version check is always going
to continue working.

I'm also not sure why you're saying a user is not entitled to this
information. We already provide users of libpq a way to see the full
Postgres version, and the major protocol version. I think allowing a
user to access this information is only a good thing. But I agree that
providing easy to use feature support functions is a better user
experience in some cases.

> But this is another example of a problem that can *easily* be fixed
> without using up the entirety of the _pq_ namespace. You can introduce
> _pq_.dont_error_on_unknown_gucs=1 or
> _pq_.dont_error_on_these_gucs='foo, bar, baz'. The distinction between
> the startup packet containing whizzbang=frob and instead containing
> _pq_.whizzbang=frob shouldn't be just whether an error is thrown if we
> don't know anything about whizzbang.

Nice idea, but that sadly won't work well in practice, because old PG
versions don't know about _pq_.dont_error_on_unknown_gucs. That would
mean we'd have to wait another 5 years (and probably more) until we
could safely set non-_pq_-prefixed GUCs in the startup message using
this approach.

Two side-notes:
1. I realized there is a second benefit to using _pq_ for all GUCs
that change the protocol level that I didn't mention in my previous
email: It allows clients to assume that _pq_ prefixed GUCs will change
the wire-protocol and that they should not allow a user of the client
to set them willy-nilly. I'll go into this benefit more in the rest of
this email.
2. To clarify: I'm suggesting that a startup packet containing
_pq_.whizzbang would actually set the _pq_.whizzbang GUC, and not the
whizzbang GUC. i.e. the _pq_ prefix is not stripped-off when parsing
the startup packet.

> I guess I'm in the same position as you are -- your argument doesn't
> really make any sense to me. That also has the unfortunate
> disadvantage of making it difficult for me to explain why I don't
> agree with you, but let me just tick off a few things that I'm
> thinking about here:

Thanks for listing these thoughts. That makes it much easier to
discuss something concrete.

> 1. Connection poolers. If I'm talking to pgpool and pgpool is talking
> to the server, and pgpool and I agree to use compression, that's
> completely separate from whether pgpool and the server are using
> compression. If I have to interrogate the compression state by
> executing "SHOW some_compression_guc", then I'm going to get the wrong
> answer unless pgpool runs a full SQL parser on every command that I
> execute and intercepts the ones that touch protocol parameters. That's
> bound to be expensive an unreliable -- consider something like SELECT
> 

Re: Add trim_trailing_whitespace to editorconfig file

2024-04-04 Thread Jelte Fennema-Nio
On Thu, 4 Apr 2024 at 17:23, Peter Eisentraut  wrote:
>  git diff-tree --check $(git hash-object -t tree /dev/null) HEAD
>
> That's what I was hoping for for editorconfig-check, but as I said, the
> experience wasn't good.

Ah, I wasn't able to find that git incantation. I definitely think it
would be good if there was an official CLI tool like that for
editorconfig, but the JavaScript one was the closest I could find. I
haven't tried the Go one.





Re: Add trim_trailing_whitespace to editorconfig file

2024-04-04 Thread Jelte Fennema-Nio
On Thu, 4 Apr 2024 at 15:25, Peter Eisentraut  wrote:
> Everybody has git.  Everybody who edits .gitattributes can use git to
> check what they did.

What CLI command do you use to fix .gitattributes violations on all
existing files? Afaict there's no command to actually remove the
trailing whitespace that git add complains about. If you don't have
such a command, then afaict updating .gitattributes is also
essentially blind-updating.

> But I don't want users of a common tool to bear the
> burden of blindly updating files for a much-less-common tool.

It's used quite a bit. Many editors/IDEs have built-in support (Vim,
Visual Studio, IntelliJ), and the ones that don't have an
easy-to-install plugin. It's not meant to be used as a command-line
tool; as the name suggests, it's meant as editor integration.
> ISTM that with a small shell script, .editorconfig could be generated
> from .gitattributes?

Honestly, I don't think building such automation is worth the effort.
Changing the .editorconfig file to match is pretty trivial if you look
at the existing examples; the editorconfig syntax is much more
straightforward to me than the gitattributes one. Also, .gitattributes
is changed very rarely (only 15 times in the 10 years since its
creation in our repo), which makes any automation around it probably
not worth the investment.

This whole comment really seems to be only about 0004. We already
have an outdated editorconfig file in the repo, and it's severely
annoying me whenever I'm writing any docs for postgres because it
doesn't trim my trailing spaces. If we didn't have this editorconfig
file in the repo at all, it would actually be better for me, because I
could maintain my own file locally. But now, because there's an
incorrect file, I'd have to git stash/pop all the time. Is there any
chance the other commits can at least be merged?




Re: Flushing large data immediately in pqcomm

2024-04-04 Thread Jelte Fennema-Nio
On Thu, 4 Apr 2024 at 13:08, Melih Mutlu  wrote:
> I changed internal_flush() to an inline function, results look better this 
> way.

It seems you also changed internal_flush_buffer to be inline (but
only in the actual function definition, not in the declaration at the
top). I don't think inlining internal_flush_buffer should be necessary
to avoid the perf regressions; internal_flush, on the other hand, adds
extra indirection compared to master and is only a single line, so
that one makes sense to inline.

Other than that the code looks good to me.

The new results look great.

One thing that is quite interesting about these results is that
increasing the buffer size results in even better performance (by
quite a bit). I don't think we can easily choose a perfect number, as
it seems to be a trade-off between memory usage and perf. But allowing
people to configure it through a GUC like in your second patchset
would be quite useful I think, especially because larger buffers could
be configured for connections that would benefit most from it (e.g.
replication connections or big COPYs).

I think your add-pq_send_buffer_size-GUC.patch is essentially what we
would need there but it would need some extra changes to actually be
merge-able:
1. needs docs
2. rename PQ_SEND_BUFFER_SIZE (at least make it not UPPER_CASE, but
maybe also remove the PQ_ prefix)
3. It's marked as PGC_USERSET, but there's no logic to grow/shrink it
after initial allocation




Re: psql not responding to SIGINT upon db reconnection

2024-04-04 Thread Jelte Fennema-Nio
On Wed, 3 Apr 2024 at 21:49, Robert Haas  wrote:
> It seems to me that 0001 should either remove the pg_unreachable()
> call or change the break to a return, but not both.

Updated the patch to only remove the pg_unreachable call (and keep the breaks).

> but there's not much value in omitting
> pg_unreachable() from unreachable places, either, so I'm not
> convinced.

But I don't agree with this. Having pg_unreachable in places where it
brings no perf benefit has two main downsides:

1. It's extra lines of code
2. If a programmer/reviewer is not careful about maintaining this
unreachable invariant (now or in the future), undefined behavior can
be introduced quite easily.

Also, I'd expect any optimizing compiler to know that the code after
the loop is already unreachable if there are no break/goto statements
in its body.
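
A tiny standalone illustration of that point (not the psql code):

#include <stdio.h>

static int
wait_for_quit(void)
{
    for (;;)
    {
        if (getchar() == 'q')
            return 0;           /* only returns, never breaks */
    }
    /* nothing after the loop is reachable, and the compiler can see
     * that, so an unreachability hint here wouldn't improve codegen */
}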

> I agree with Tristan's analysis of 0002.

Updated 0002 to only change the comment. But honestly I don't think
using "default" really introduces many chances for future bugs in this
case, since it seems very unlikely we'll ever add new variants to the
PostgresPollingStatusType enum.
From b0ccfb6fca3e4ae9e1e62ac3b6a4c5f428b56e04 Mon Sep 17 00:00:00 2001
From: Jelte Fennema-Nio 
Date: Wed, 3 Apr 2024 15:19:04 +0200
Subject: [PATCH v14 1/2] Fix actually reachable pg_unreachable call

In cafe1056558fe07cdc52b95205588fcd80870362 a call to pg_unreachable was
introduced that was actually reachable, because there were break
statements in the loop. This addresses that by removing the call to
pg_unreachable, which seemed of dubious necessity anyway.
---
 src/bin/psql/command.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c
index c005624e9c3..479f9f2be59 100644
--- a/src/bin/psql/command.c
+++ b/src/bin/psql/command.c
@@ -3815,8 +3815,6 @@ wait_until_connected(PGconn *conn)
 pg_unreachable();
 		}
 	}
-
-	pg_unreachable();
 }
 
 void

base-commit: 936e3fa3787a51397280c1081587586e83c20399
-- 
2.34.1

From 799eb6989d481ab617c473b8acbbfc1a405c3c7d Mon Sep 17 00:00:00 2001
From: Jelte Fennema-Nio 
Date: Wed, 3 Apr 2024 15:21:52 +0200
Subject: [PATCH v14 2/2] Change comment on PGRES_POLLING_ACTIVE

Updates the comment on PGRES_POLLING_ACTIVE, since we're stuck with
it forever due to ABI compatibility guarantees.
---
 src/interfaces/libpq/libpq-fe.h | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/src/interfaces/libpq/libpq-fe.h b/src/interfaces/libpq/libpq-fe.h
index 8d3c5c6f662..c184e853889 100644
--- a/src/interfaces/libpq/libpq-fe.h
+++ b/src/interfaces/libpq/libpq-fe.h
@@ -91,8 +91,7 @@ typedef enum
 	PGRES_POLLING_READING,		/* These two indicate that one may	  */
 	PGRES_POLLING_WRITING,		/* use select before polling again.   */
 	PGRES_POLLING_OK,
-	PGRES_POLLING_ACTIVE		/* unused; keep for awhile for backwards
- * compatibility */
+	PGRES_POLLING_ACTIVE		/* unused; keep for backwards compatibility */
 } PostgresPollingStatusType;
 
 typedef enum
-- 
2.34.1



Re: [EXTERNAL] Re: Add non-blocking version of PQcancel

2024-04-04 Thread Jelte Fennema-Nio
On Thu, 4 Apr 2024 at 10:45, Alvaro Herrera  wrote:
> Yeah, this seems to work and I no longer get that complaint from
> headerscheck.

patch LGTM




Re: psql not responding to SIGINT upon db reconnection

2024-04-03 Thread Jelte Fennema-Nio
On Wed, 3 Apr 2024 at 16:55, Tristan Partin  wrote:
> Removing from the switch statement causes a warning:
>
> > [1/2] Compiling C object src/bin/psql/psql.p/command.c.o
> > ../src/bin/psql/command.c: In function ‘wait_until_connected’:
> > ../src/bin/psql/command.c:3803:17: warning: enumeration value 
> > ‘PGRES_POLLING_ACTIVE’ not handled in switch [-Wswitch]
> >  3803 | switch (PQconnectPoll(conn))
> >   | ^~


Of course... fixed now
From 29100578b5f0b4cec68ee1b8c572c4f280335da3 Mon Sep 17 00:00:00 2001
From: Jelte Fennema-Nio 
Date: Wed, 3 Apr 2024 15:19:04 +0200
Subject: [PATCH v13 1/2] Fix actually reachable pg_unreachable call

In cafe1056558fe07cdc52b95205588fcd80870362 a call to pg_unreachable was
introduced that was actually reachable, because there were break
statements in the loop. This addresses that by both changing the break
statements to return and removing the call to pg_unreachable, which
seemed of dubious necessity anyway.
---
 src/bin/psql/command.c | 6 ++
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c
index c005624e9c3..dc3cc7c10b7 100644
--- a/src/bin/psql/command.c
+++ b/src/bin/psql/command.c
@@ -3771,7 +3771,7 @@ wait_until_connected(PGconn *conn)
 		 * user has requested a cancellation.
 		 */
 		if (cancel_pressed)
-			break;
+			return;
 
 		/*
 		 * Do not assume that the socket remains the same across
@@ -3779,7 +3779,7 @@ wait_until_connected(PGconn *conn)
 		 */
 		sock = PQsocket(conn);
 		if (sock == -1)
-			break;
+			return;
 
 		/*
 		 * If the user sends SIGINT between the cancel_pressed check, and
@@ -3815,8 +3815,6 @@ wait_until_connected(PGconn *conn)
 pg_unreachable();
 		}
 	}
-
-	pg_unreachable();
 }
 
 void

base-commit: 936e3fa3787a51397280c1081587586e83c20399
-- 
2.34.1

From 76cef25162b44adc20172afee47836ca765d3b5c Mon Sep 17 00:00:00 2001
From: Jelte Fennema-Nio 
Date: Wed, 3 Apr 2024 15:21:52 +0200
Subject: [PATCH v13 2/2] Remove PGRES_POLLING_ACTIVE case

This enum variant has been unused for at least 21 years
44aba280207740d0956160c0288e61f28f024a71. It's been there only for
backwards compatibility of the ABI, so no need to check for it.

Also update the comment on PGRES_POLLING_ACTIVE, since we're stuck with
it forever due to ABI compatibility guarantees.
---
 src/bin/psql/command.c  | 7 ++-
 src/interfaces/libpq/libpq-fe.h | 3 +--
 2 files changed, 3 insertions(+), 7 deletions(-)

diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c
index dc3cc7c10b7..5b31d08bf70 100644
--- a/src/bin/psql/command.c
+++ b/src/bin/psql/command.c
@@ -3802,17 +3802,14 @@ wait_until_connected(PGconn *conn)
 
 		switch (PQconnectPoll(conn))
 		{
-			case PGRES_POLLING_OK:
-			case PGRES_POLLING_FAILED:
-return;
 			case PGRES_POLLING_READING:
 forRead = true;
 continue;
 			case PGRES_POLLING_WRITING:
 forRead = false;
 continue;
-			case PGRES_POLLING_ACTIVE:
-pg_unreachable();
+			default:
+return;
 		}
 	}
 }
diff --git a/src/interfaces/libpq/libpq-fe.h b/src/interfaces/libpq/libpq-fe.h
index 8d3c5c6f662..c184e853889 100644
--- a/src/interfaces/libpq/libpq-fe.h
+++ b/src/interfaces/libpq/libpq-fe.h
@@ -91,8 +91,7 @@ typedef enum
 	PGRES_POLLING_READING,		/* These two indicate that one may	  */
 	PGRES_POLLING_WRITING,		/* use select before polling again.   */
 	PGRES_POLLING_OK,
-	PGRES_POLLING_ACTIVE		/* unused; keep for awhile for backwards
- * compatibility */
+	PGRES_POLLING_ACTIVE		/* unused; keep for backwards compatibility */
 } PostgresPollingStatusType;
 
 typedef enum
-- 
2.34.1



Re: psql not responding to SIGINT upon db reconnection

2024-04-03 Thread Jelte Fennema-Nio
On Wed, 3 Apr 2024 at 16:31, Tom Lane  wrote:
> If we do the latter, we will almost certainly get pushback from
> distros who check for library ABI breaks.  I fear the comment
> suggesting that we could remove it someday is too optimistic.

Alright, changed patch 0002 to keep the variant, but removed it from
the recently added switch/case. And also updated the comment to remove
the "for awhile".
From 29100578b5f0b4cec68ee1b8c572c4f280335da3 Mon Sep 17 00:00:00 2001
From: Jelte Fennema-Nio 
Date: Wed, 3 Apr 2024 15:19:04 +0200
Subject: [PATCH v12 1/2] Fix actually reachable pg_unreachable call

In cafe1056558fe07cdc52b95205588fcd80870362 a call to pg_unreachable was
introduced that was actually reachable, because there were break
statements in the loop. This addresses that by both changing the break
statements to return and removing the call to pg_unreachable, which
seemed of dubious necessity anyway.
---
 src/bin/psql/command.c | 6 ++
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c
index c005624e9c3..dc3cc7c10b7 100644
--- a/src/bin/psql/command.c
+++ b/src/bin/psql/command.c
@@ -3771,7 +3771,7 @@ wait_until_connected(PGconn *conn)
 		 * user has requested a cancellation.
 		 */
 		if (cancel_pressed)
-			break;
+			return;
 
 		/*
 		 * Do not assume that the socket remains the same across
@@ -3779,7 +3779,7 @@ wait_until_connected(PGconn *conn)
 		 */
 		sock = PQsocket(conn);
 		if (sock == -1)
-			break;
+			return;
 
 		/*
 		 * If the user sends SIGINT between the cancel_pressed check, and
@@ -3815,8 +3815,6 @@ wait_until_connected(PGconn *conn)
 pg_unreachable();
 		}
 	}
-
-	pg_unreachable();
 }
 
 void

base-commit: 936e3fa3787a51397280c1081587586e83c20399
-- 
2.34.1

From f71634361a0c9ba3416c75fc3ff6d55536f134c4 Mon Sep 17 00:00:00 2001
From: Jelte Fennema-Nio 
Date: Wed, 3 Apr 2024 15:21:52 +0200
Subject: [PATCH v12 2/2] Remove PGRES_POLLING_ACTIVE case

This enum variant has been unused for at least 21 years
44aba280207740d0956160c0288e61f28f024a71. It's been there only for
backwards compatibility of the ABI, so no need to check for it.

Also update the comment on PGRES_POLLING_ACTIVE, since we're stuck with
it forever due to ABI compatibility guarantees.
---
 src/bin/psql/command.c  | 2 --
 src/interfaces/libpq/libpq-fe.h | 3 +--
 2 files changed, 1 insertion(+), 4 deletions(-)

diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c
index dc3cc7c10b7..9a163cf22cd 100644
--- a/src/bin/psql/command.c
+++ b/src/bin/psql/command.c
@@ -3811,8 +3811,6 @@ wait_until_connected(PGconn *conn)
 			case PGRES_POLLING_WRITING:
 forRead = false;
 continue;
-			case PGRES_POLLING_ACTIVE:
-pg_unreachable();
 		}
 	}
 }
diff --git a/src/interfaces/libpq/libpq-fe.h b/src/interfaces/libpq/libpq-fe.h
index 8d3c5c6f662..c184e853889 100644
--- a/src/interfaces/libpq/libpq-fe.h
+++ b/src/interfaces/libpq/libpq-fe.h
@@ -91,8 +91,7 @@ typedef enum
 	PGRES_POLLING_READING,		/* These two indicate that one may	  */
 	PGRES_POLLING_WRITING,		/* use select before polling again.   */
 	PGRES_POLLING_OK,
-	PGRES_POLLING_ACTIVE		/* unused; keep for awhile for backwards
- * compatibility */
+	PGRES_POLLING_ACTIVE		/* unused; keep for backwards compatibility */
 } PostgresPollingStatusType;
 
 typedef enum
-- 
2.34.1



Re: psql not responding to SIGINT upon db reconnection

2024-04-03 Thread Jelte Fennema-Nio
On Tue, 2 Apr 2024 at 16:33, Robert Haas  wrote:
> Committed it, I did. My thanks for working on this issue, I extend.

Looking at the committed version of this patch, the pg_unreachable
calls seemed weird to me. One is actually incorrect, thus possibly
resulting in undefined behaviour. And for the other call an imho
better fix would be to remove the enum variant that has now been
unused for 21 years, instead of introducing its only reference in the
whole codebase.

Attached are two trivial patches, feel free to remove both of the
pg_unreachable calls.
From 7f4279bb939e2d2707fdd0f471893d734959ab91 Mon Sep 17 00:00:00 2001
From: Jelte Fennema-Nio 
Date: Wed, 3 Apr 2024 15:21:52 +0200
Subject: [PATCH v11 2/2] Remove PGRES_POLLING_ACTIVE

This enum variant has been unused for at least 21 years
44aba280207740d0956160c0288e61f28f024a71. It was left for backwards
compatibility for "awhile". I think that time has come.
---
 src/bin/psql/command.c  | 2 --
 src/interfaces/libpq/libpq-fe.h | 2 --
 2 files changed, 4 deletions(-)

diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c
index dc3cc7c10b7..9a163cf22cd 100644
--- a/src/bin/psql/command.c
+++ b/src/bin/psql/command.c
@@ -3811,8 +3811,6 @@ wait_until_connected(PGconn *conn)
 			case PGRES_POLLING_WRITING:
 forRead = false;
 continue;
-			case PGRES_POLLING_ACTIVE:
-pg_unreachable();
 		}
 	}
 }
diff --git a/src/interfaces/libpq/libpq-fe.h b/src/interfaces/libpq/libpq-fe.h
index 8d3c5c6f662..2459b0cf5e1 100644
--- a/src/interfaces/libpq/libpq-fe.h
+++ b/src/interfaces/libpq/libpq-fe.h
@@ -91,8 +91,6 @@ typedef enum
 	PGRES_POLLING_READING,		/* These two indicate that one may	  */
 	PGRES_POLLING_WRITING,		/* use select before polling again.   */
 	PGRES_POLLING_OK,
-	PGRES_POLLING_ACTIVE		/* unused; keep for awhile for backwards
- * compatibility */
 } PostgresPollingStatusType;
 
 typedef enum
-- 
2.34.1

From 29100578b5f0b4cec68ee1b8c572c4f280335da3 Mon Sep 17 00:00:00 2001
From: Jelte Fennema-Nio 
Date: Wed, 3 Apr 2024 15:19:04 +0200
Subject: [PATCH v11 1/2] Fix actually reachable pg_unreachable call

In cafe1056558fe07cdc52b95205588fcd80870362 a call to pg_unreachable was
introduced that was actually reachable, because there were break
statements in the loop. This addresses that by both changing the break
statements to return and removing the call to pg_unreachable, which
seemed of dubious necessity anyway.
---
 src/bin/psql/command.c | 6 ++
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c
index c005624e9c3..dc3cc7c10b7 100644
--- a/src/bin/psql/command.c
+++ b/src/bin/psql/command.c
@@ -3771,7 +3771,7 @@ wait_until_connected(PGconn *conn)
 		 * user has requested a cancellation.
 		 */
 		if (cancel_pressed)
-			break;
+			return;
 
 		/*
 		 * Do not assume that the socket remains the same across
@@ -3779,7 +3779,7 @@ wait_until_connected(PGconn *conn)
 		 */
 		sock = PQsocket(conn);
 		if (sock == -1)
-			break;
+			return;
 
 		/*
 		 * If the user sends SIGINT between the cancel_pressed check, and
@@ -3815,8 +3815,6 @@ wait_until_connected(PGconn *conn)
 pg_unreachable();
 		}
 	}
-
-	pg_unreachable();
 }
 
 void

base-commit: 936e3fa3787a51397280c1081587586e83c20399
-- 
2.34.1



Re: Support prepared statement invalidation when result types change

2024-04-03 Thread Jelte Fennema-Nio
On Sun, 7 Jan 2024 at 07:55, vignesh C  wrote:
> One of the test has aborted in CFBot at [1] with:

Rebased the patchset and removed patch 0003 since it was causing the
CI issue reported by vignesh and it seems much less useful and more
controversial to me anyway (due to the extra required roundtrip).


On Sun, 7 Jan 2024 at 09:17, Shay Rojansky  wrote:
> Just to point out, FWIW, that the .NET Npgsql driver does indeed cache 
> RowDescriptions... The whole point of preparation is to optimize things as 
> much as possible for repeated execution of the query; I get that the value 
> there is much lower than e.g. doing another network roundtrip, but that's 
> still extra work that's better off being cut if it can be.

Hmm, interesting. I totally agree that it's always good to do less
when possible. The problem is that in the face of server side prepared
statement invalidations due to DDL changes to the table or search path
changes, the row types might change. Or the server needs to constantly
throw an error, like it does now, but that seems worse imho.

I'm wondering though if we can create a middle ground, where a client
can still cache the RowDescription client-side when no DDL or
search_path changes are happening, but still tell the client about a
new RowDescription when they do happen.

The only way of doing that I can think of is changing the Postgres
protocol in a way similar to this: Allow Execute to return a
RowDescription too, but have the server only do so when the previously
received RowDescription for this prepared statement is now invalid.
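
Roughly, the client-side bookkeeping could then look something like
this (a hedged sketch of a hypothetical message flow, this is not an
existing protocol feature):

#include <stdlib.h>
#include <string.h>

typedef struct CachedStmt
{
    char   *name;           /* prepared statement name */
    char   *row_desc;       /* last RowDescription payload, opaque bytes */
    size_t  row_desc_len;
} CachedStmt;

/*
 * Called when an Execute unexpectedly yields a new RowDescription:
 * replace the cached copy so later result decoding uses the new row
 * types.
 */
static void
update_row_description(CachedStmt *stmt, const char *payload, size_t len)
{
    char   *copy = malloc(len);

    if (copy == NULL)
        abort();            /* out of memory; a real client would error */
    memcpy(copy, payload, len);
    free(stmt->row_desc);
    stmt->row_desc = copy;
    stmt->row_desc_len = len;
}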

This would definitely require some additional tracking at PgBouncer to
make it work, i.e. per client and per server it should now keep track
of the last RowDescription for each prepared statement. But that's
definitely something we could do.

This would make this patch much more involved though, as now it would
suddenly involve an actual protocol change, and that basically depends
on this patch moving forward[1].

[1]: 
https://www.postgresql.org/message-id/flat/cageczqtg2hcmb5gau53uuwcdc7gcnjfll6mnw0wnhwhgq9u...@mail.gmail.com


v6-0002-Completely-remove-fixed_result-from-CachedPlanSou.patch
Description: Binary data


v6-0001-Support-prepared-statement-invalidation-when-resu.patch
Description: Binary data


Re: [EXTERNAL] Re: Add non-blocking version of PQcancel

2024-03-29 Thread Jelte Fennema-Nio
On Thu, 28 Mar 2024 at 19:03, Tom Lane  wrote:
>
> Alvaro Herrera  writes:
> > It doesn't fail when it's too fast -- it's just that it doesn't cover
> > the case we want to cover.
>
> That's hardly better, because then you think you have test
> coverage but maybe you don't.

Honestly, that seems quite a lot better. Instead of having randomly
failing builds, you have a test that creates coverage 80+% of the
time. And that also seems a lot better than having no coverage at all
(which is what we had for the last 7 years since the introduction of
cancellations to postgres_fdw). It would be good to expand the comment
in the test, though, to say that the test might not always cover the
intended code path due to timing problems.

> Could we make this test bulletproof by using an injection point?
> If not, I remain of the opinion that we're better off without it.

Possibly, and if so, I agree that would be better than the currently
added test. But I honestly don't feel like spending the time on
creating such a test. And given 7 years have passed without someone
adding any test for this codepath at all, I don't expect anyone else
will either.

If you both feel we're better off without the test, feel free to
remove it. This was just some small missing test coverage that I
noticed while working on this patch, that I thought I'd quickly
address. I don't particularly care a lot about the specific test.




Re: [EXTERNAL] Re: Add non-blocking version of PQcancel

2024-03-28 Thread Jelte Fennema-Nio
On Thu, 28 Mar 2024 at 17:43, Alvaro Herrera  wrote:
> Hah, you're right, I can reproduce with a smaller timeout, and using SET
> LOCAL works as a fix.  If we're doing that, why not reduce the timeout
> to 1ms?  We don't need to wait extra 9ms ...

I think we don't really want to make the timeout too short. Otherwise
the query might get cancelled before we push any query down to the
FDW. I guess that means that for some slow machines even 10ms is not
enough to make the test serve its intended purpose. I'd keep it at 10ms,
which seems long enough for normal systems, while still being pretty
short.




Re: [EXTERNAL] Re: Add non-blocking version of PQcancel

2024-03-28 Thread Jelte Fennema-Nio
On Thu, 28 Mar 2024 at 17:34, Alvaro Herrera  wrote:
>
> Eh, kestrel has also failed[1], apparently every query after the large
> JOIN that this commit added as test fails with a statement timeout error.
>
> [1] 
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=kestrel=2024-03-28%2016%3A01%3A14

Ugh that's annoying, the RESET is timing out too I guess. That can
hopefully be easily fixed by changing the new test to:

BEGIN;
SET LOCAL statement_timeout = '10ms';
select count(*) from ft1 CROSS JOIN ft2 CROSS JOIN ft4 CROSS JOIN ft5;
-- this takes very long
ROLLBACK;




Re: Possibility to disable `ALTER SYSTEM`

2024-03-28 Thread Jelte Fennema-Nio
On Thu, 28 Mar 2024 at 12:57, Robert Haas  wrote:
> I disagree with a lot of these changes. I think the old version was
> mostly better. But I can live with a lot of it if it makes other
> people happy.

I'd have been fine with many of the previous versions of the docs too.
(I'm not a native English speaker though, so that might be part of it)

> However:

Attached is a patch with your last bit of feedback addressed.


v12-0001-Add-allow_alter_system-GUC.patch
Description: Binary data


Re: Possibility to disable `ALTER SYSTEM`

2024-03-28 Thread Jelte Fennema-Nio
On Thu, 28 Mar 2024 at 10:24, Jelte Fennema-Nio  wrote:
> I addressed them all I think. Mostly the small changes that were
> suggested, but I rewrote the sentence with "might be discarded". And I
> added references to the new GUC in both places suggested by David.

Changed the error hint to use "external tool" too. And removed a
duplicate header definition of AllowAlterSystem (I moved it to guc.h
so it was together with other definitions a few patches ago, but
apparently forgot to remove it from guc_tables.h)


v11-0001-Add-allow_alter_system-GUC.patch
Description: Binary data


Re: [EXTERNAL] Re: Add non-blocking version of PQcancel

2024-03-28 Thread Jelte Fennema-Nio
On Wed, 27 Mar 2024 at 19:27, Alvaro Herrera  wrote:
> I ended up reducing the two PG_TRY blocks to a single one.  I see no
> reason to split them up, and this way it looks more legible.

I definitely agree this looks better. Not sure why I hadn't done that,
maybe it wasn't possible in one of the earlier iterations of the API.




Re: [EXTERNAL] Re: Add non-blocking version of PQcancel

2024-03-28 Thread Jelte Fennema-Nio
On Wed, 27 Mar 2024 at 19:46, Alvaro Herrera  wrote:
>
> On 2024-Mar-27, Alvaro Herrera wrote:
>
> > I changed it so that the error messages are returned as translated
> > phrases, and was bothered by the fact that if errors happen repeatedly,
> > the memory for them might be leaked.  Maybe this is fine depending on
> > the caller's memory context, but since it's only at most one string each
> > time, it's quite easy to just keep track of it so that we can release it
> > on the next.
>
> (Actually this sounds clever but fails pretty obviously if the caller
> does free the string, such as in a memory context reset.  So I guess we
> have to just accept the potential leakage.)

Your changes look good, apart from the prverror stuff indeed. If you
remove the prverror stuff again I think this is ready to commit.




Re: Possibility to disable `ALTER SYSTEM`

2024-03-28 Thread Jelte Fennema-Nio
On Thu, 28 Mar 2024 at 01:43, David G. Johnston
 wrote:
>
> On Wed, Mar 27, 2024 at 5:17 PM Bruce Momjian  wrote:
>>
>> 

I addressed them all I think. Mostly the small changes that were
suggested, but I rewrote the sentence with "might be discarded". And I
added references to the new GUC in both places suggested by David.


v10-0001-Add-allow_alter_system-GUC.patch
Description: Binary data


Re: Possibility to disable `ALTER SYSTEM`

2024-03-27 Thread Jelte Fennema-Nio
On Wed, 27 Mar 2024 at 23:06, Bruce Momjian  wrote:
> > > > > +some outside mechanism. In such environments, using 
> > > > > ALTER
> > > > > +SYSTEM to make configuration changes might appear 
> > > > > to work,
> > > > > +but then may be discarded at some point in the future when 
> > > > > that outside
> > > >
> > > > "might"
> > >
> > > This does not seem like a mistake to me. I'm not sure why you think it is.
> >
> > I also think the original sentence was correct, but I don't think it
> > read very naturally. Changed it now in hopes to improve that.
>
> So, might means "possibility" while "may" means permission, so "might"
> is clearer here.

Aaah, I misunderstood your original feedback then. I thought you
didn't like the use of "might" in "might appear to work". But I now
realize you meant "may be discarded" should be changed to "might be
discarded". I addressed that in my latest version of the patch.



