Re: POC: Carefully exposing information without authentication

2026-03-24 Thread Greg Sabino Mullane
Thank you for looking over this. New version attached.

On Tue, Feb 17, 2026 at 2:58 PM Andres Freund wrote:

> What about direct TLS connections?


Not handled.

> How can a cluster coordinator trust unauthenticated plain text
> communication that can just be man-in-the-middled?

They cannot. But that's why this is only exposing non-critical information.
Right now the security scanners that are banging on port 5432 and scraping
the returned error lines are not worried about man-in-the-middle. :)
Obviously, if your threat model is people capturing and modifying
non-encrypted traffic to your Postgres server, you would not use this.

> It's not obvious that it's a good idea to expose this on the same socket as
> normal client connections.  IMO you'd want to limit this to a smaller set
> of interfaces than normal client connections.

I'm not entirely clear what that smaller set would mean in practice.


> IIRC the socket is in blocking mode at this point (that's only changed in
> pq_init()), therefore this might actually block?  While it's unlikely, I
> don't see any guarantee that a single receive would actually get the whole
> message from the client either, so this seems like it might fail spuriously.
>

Yes, there are some very unlikely edge cases, but this is meant to be good
enough, not a perfectly bulletproof HTTP server. Clients should retry on
failure; if failures do occur with this trivial amount of traffic, that
probably indicates much bigger problems.

> If we were to do this, I'd recommend a single expose GUC that has the
> different values as a comma separated list, instead of a growing list of GUCs.

Done - see attached for a new version which consolidates the bools into a
single comma-separated GUC called "expose_information". I also added some
docs, and changed the "replica" endpoint to return "REPLICA" instead of
"RECOVERY". I like the latter better, but replica lines up better with
existing tools.
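
For illustration only, here is a minimal sketch of how a single comma-separated value could be decoded into flags. It is not taken from the attached patch: the keyword names (replica, sysid, version), the EXPOSE_* bit values, and parse_expose_list() are all assumptions, and a real GUC would go through a check hook and the server's existing list-parsing utilities rather than hand-rolled parsing.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <string.h>
#include <strings.h>

/* Hypothetical flag bits; the real patch may use different names. */
#define EXPOSE_REPLICA  (1 << 0)
#define EXPOSE_SYSID    (1 << 1)
#define EXPOSE_VERSION  (1 << 2)

/*
 * Parse a value such as "replica, version" into a bitmask.
 * Returns false (leaving *flags untouched) on an unknown keyword.
 * An empty or all-whitespace string is valid and yields no flags.
 */
bool
parse_expose_list(const char *value, int *flags)
{
    static const struct { const char *name; int bit; } keywords[] = {
        {"replica", EXPOSE_REPLICA},
        {"sysid", EXPOSE_SYSID},
        {"version", EXPOSE_VERSION},
    };
    int         result = 0;
    const char *p = value;

    assert(value != NULL && flags != NULL);

    while (*p != '\0')
    {
        const char *start;
        size_t      len;
        bool        found = false;

        /* Skip separators and whitespace before the next keyword. */
        while (*p == ',' || isspace((unsigned char) *p))
            p++;
        if (*p == '\0')
            break;

        /* The keyword runs up to the next comma or end of string. */
        start = p;
        while (*p != '\0' && *p != ',')
            p++;
        len = (size_t) (p - start);

        /* Trim trailing whitespace. */
        while (len > 0 && isspace((unsigned char) start[len - 1]))
            len--;

        /* Case-insensitive comparison against the known keywords. */
        for (size_t i = 0; i < sizeof(keywords) / sizeof(keywords[0]); i++)
        {
            if (strlen(keywords[i].name) == len &&
                strncasecmp(start, keywords[i].name, len) == 0)
            {
                result |= keywords[i].bit;
                found = true;
                break;
            }
        }
        if (!found)
            return false;
    }

    *flags = result;
    return true;
}
```

With this sketch, a setting of "replica, version" would enable two of the three bits, and a misspelled keyword would be rejected as a whole rather than silently ignored.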

-- 
Cheers,
Greg


0006-Allow-specific-information-to-be-output-directly-by-Postgres.patch
Description: Binary data


Re: POC: Carefully exposing information without authentication

2026-02-17 Thread Andres Freund
Hi,

On 2026-02-17 14:42:48 -0500, Greg Sabino Mullane wrote:
> Subject: [PATCH] Allow specific information to be output directly by Postgres.

I strongly encourage you to include a justification for why this is desirable,
so a casual reviewer doesn't have to reread the thread.


> @@ -148,6 +172,14 @@ BackendInitialize(ClientSocket *client_sock, CAC_state cac)
>      StringInfoData ps_data;
>      MemoryContext oldcontext;
>  
> +    /*
> +     * Scan for a simple GET / HEAD request. If this is detected and
> +     * handled, we are done and can immediately exit
> +     */
> +    if ((expose_recovery || expose_sysid || expose_version)
> +        && ExposeInformation(client_sock->sock))
> +        _exit(0); /* Safe to use exit: no state or resources created yet */
> +
>      /* Tell fd.c about the long-lived FD associated with the client_sock */
>      ReserveExternalFD();
>

What about direct TLS connections?

How can a cluster coordinator trust unauthenticated plain text communication
that can just be man-in-the-middled?


It's not obvious that it's a good idea to expose this on the same socket as
normal client connections.  IMO you'd want to limit this to a smaller set of
interfaces than normal client connections.



> +/*
> + * ExposeInformation
> + *
> + * Handle early socket probe before full backend startup.
> + * Responds to small set of predefined endpoints (e.g. GET /info)
> + *
> + * Requires at least one "expose_" GUC to be true.
> + *
> + * Returns true if any endpoint is recognized.
> + */
> +
> +static bool
> +ExposeInformation(pgsocket fd)
> +{
> +    static endpoint_action endpoint_actions[] =
> +    {
> +        {
> +            "HEAD /replica", &expose_recovery, EXPOSE_HEAD_REPLICA
> +        },
> +        {
> +            "GET /replica", &expose_recovery, EXPOSE_GET_REPLICA
> +        },
> +        {
> +            "GET /sysid", &expose_sysid, EXPOSE_GET_SYSID
> +        },
> +        {
> +            "GET /version", &expose_version, EXPOSE_GET_VERSION
> +        },
> +        {
> +            "GET /info", NULL, EXPOSE_GET_ALL
> +        }
> +    };
> +
> +    ssize_t n;
> +    char buf[EXPOSE_MAX_QUERY + 1];
> +    ExposeReturnType type;
> +
> +    Assert(expose_recovery || expose_sysid || expose_version);
> +
> +    do
> +    {
> +        n = recv(fd, buf, EXPOSE_MAX_QUERY, MSG_PEEK);
> +    } while (n < 0 && errno == EINTR);
>
> +    /*
> +     * Leave as soon as possible if no chance we are interested.
> +     * (we also leave on partial reads from slow clients)
> +     * We also simply return false for n == -1
> +     */
> +    if (n < EXPOSE_MIN_QUERY)
> +        return false;

IIRC the socket is in blocking mode at this point (that's only changed in
pq_init()), therefore this might actually block?  While it's unlikely, I don't
see any guarantee that a single receive would actually get the whole message
from the client either, so this seems like it might fail spuriously.
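
To make the concern concrete, here is a rough sketch of one way to bound the wait and to tolerate a request that arrives in several pieces. It is not the patch's code: read_request_line(), its timeout parameter, and the use of poll() with MSG_DONTWAIT are assumptions for illustration. Unlike the patch, it consumes the data instead of peeking, so it would only be appropriate once the caller has decided the connection is not a regular client; the backend would also more likely use its own wait machinery.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/*
 * Read from fd until a '\n' is seen, the buffer is full, or the peer
 * closes the connection. Each wait for more data is limited to
 * timeout_ms (a per-wait limit, not a total deadline).
 *
 * Returns the number of bytes stored (buf is always NUL-terminated on
 * success), or -1 on error or timeout.
 */
ssize_t
read_request_line(int fd, char *buf, size_t bufsize, int timeout_ms)
{
    size_t      used = 0;

    assert(buf != NULL && bufsize > 0);

    while (used < bufsize - 1)
    {
        struct pollfd pfd = {.fd = fd, .events = POLLIN};
        int         rc = poll(&pfd, 1, timeout_ms);
        ssize_t     n;

        if (rc < 0 && errno == EINTR)
            continue;           /* interrupted, wait again */
        if (rc <= 0)
            return -1;          /* poll error or timeout */

        /* MSG_DONTWAIT keeps this call from blocking on a blocking socket. */
        n = recv(fd, buf + used, bufsize - 1 - used, MSG_DONTWAIT);
        if (n < 0 &&
            (errno == EINTR || errno == EAGAIN || errno == EWOULDBLOCK))
            continue;
        if (n < 0)
            return -1;          /* hard error */
        if (n == 0)
            break;              /* peer closed the connection */

        used += (size_t) n;

        /* Stop once the newly received chunk contains a line terminator. */
        if (memchr(buf + used - n, '\n', (size_t) n) != NULL)
            break;
    }

    buf[used] = '\0';
    return (ssize_t) used;
}
```

The loop is what addresses the partial-read case: a slow client that delivers "GET /rep" and then "lica" in separate segments is still handled, while a client that sends nothing is dropped after the timeout instead of holding the process indefinitely.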




> diff --git a/src/backend/utils/misc/guc_parameters.dat b/src/backend/utils/misc/guc_parameters.dat
> index 271c033952e..3e99d9f6b7c 100644
> --- a/src/backend/utils/misc/guc_parameters.dat
> +++ b/src/backend/utils/misc/guc_parameters.dat
> @@ -1010,6 +1010,25 @@
>    boot_val => 'false',
>  },
>  
> +{ name => 'expose_recovery', type => 'bool', context => 'PGC_SIGHUP', group => 'CONN_AUTH_AUTH',
> +  short_desc => 'Exposes if the server is in recovery mode without a login.',
> +  variable => 'expose_recovery',
> +  boot_val => 'false',
> +},
> +
> +{ name => 'expose_sysid', type => 'bool', context => 'PGC_SIGHUP', group => 'CONN_AUTH_AUTH',
> +  short_desc => 'Exposes the system identifier without a login.',
> +  variable => 'expose_sysid',
> +  boot_val => 'false',
> +},
> +
> +{ name => 'expose_version', type => 'bool', context => 'PGC_SIGHUP', group => 'CONN_AUTH_AUTH',
> +  short_desc => 'Exposes the server version without a login.',
> +  variable => 'expose_version',
> +  boot_val => 'false',
> +},
> +

If we were to do this, I'd recommend a single expose GUC that has the
different values as a comma separated list, instead of a growing list of GUCs.

Greetings,

Andres Freund




Re: POC: Carefully exposing information without authentication

2026-02-17 Thread Greg Sabino Mullane
Please find attached a rebased and lightly reworked version of this patch.
The most significant change is the test file now uses IO::Socket::INET via
$node->raw_connect. Also changed to allow case-insensitive calls, moved to
a better docs group, moved the defines and typedefs up, and changed the
exit to just a simple _exit().
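
As an illustration of what case-insensitive matching of the request line could look like, here is a small sketch. The endpoint strings are the ones quoted from the earlier patch in this thread; match_endpoint(), its return convention, and the rule about which character may follow the path are assumptions, not the patch's actual logic.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <strings.h>

/* Endpoint strings as they appear in the quoted patch. */
static const char *const endpoints[] = {
    "HEAD /replica",
    "GET /replica",
    "GET /sysid",
    "GET /version",
    "GET /info",
};

/*
 * Return the index of the endpoint that the request line starts with,
 * compared case-insensitively, or -1 if none matches. The path must be
 * followed by a space, a line terminator, or the end of the string, so
 * that "GET /replicas" is not mistaken for "GET /replica".
 */
int
match_endpoint(const char *request)
{
    assert(request != NULL);

    for (size_t i = 0; i < sizeof(endpoints) / sizeof(endpoints[0]); i++)
    {
        size_t      len = strlen(endpoints[i]);

        if (strncasecmp(request, endpoints[i], len) == 0)
        {
            /* Safe to read: the first len bytes matched non-NUL bytes. */
            char        next = request[len];

            if (next == '\0' || next == ' ' || next == '\r' || next == '\n')
                return (int) i;
        }
    }
    return -1;
}
```

A request such as "get /Replica HTTP/1.1" would match the same entry as "GET /replica", while any other verb or path falls through to normal connection handling.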

Cheers,
Greg


0005-Allow-specific-information-to-be-output-directly-by-Postgres.patch
Description: Binary data


Re: POC: Carefully exposing information without authentication

2026-01-09 Thread Greg Sabino Mullane
On Fri, Jan 9, 2026 at 8:56 AM Antonin Houska wrote:

> 1. Add a new field to the PGconn structure


This kind of defeats one of the major strengths of this patch, which is
allowing systems that don't speak the protocol to get at this information.


> Regarding configuration, I'd prefer a single GUC. The value can be a
> comma-separated list of keywords, each representing particular piece of
> information to be exposed.
>

Yes, I could see some advantages to that, although I still like the
simplicity of separate boolean values. I've no strong feelings either way.
Let's see if others weigh in.

Thanks for looking over this patch!

Cheers,
Greg


Re: POC: Carefully exposing information without authentication

2026-01-09 Thread Antonin Houska
Greg Sabino Mullane wrote:

> Version 4 attached, rebased to account for new tests, plus a new intra-test
> check to make sure LWP::UserAgent is available before running.

I'm still not sure it's necessary to handle the problem at socket level. I
imagine it can be implemented this way:

1. Add a new field to the PGconn structure, indicating that the client is only
requesting the server status information, and adjust pg_isready so it sets
this option.

2. Adjust libpq frontend (pqBuildStartupPacket3) so it adds the corresponding
option to the startup packet.

3. On the server, if ProcessStartupPacket() sees that option, call ereport(FATAL)
with a specific error code, and let the appropriate GUCs control the contents
of the error message. pg_isready would then just print out the message.

I haven't tried to write any code, so it's possible that I'm missing
something.

Regarding configuration, I'd prefer a single GUC. The value can be a
comma-separated list of keywords, each representing particular piece of
information to be exposed.

-- 
Antonin Houska
Web: https://www.cybertec-postgresql.com




Re: POC: Carefully exposing information without authentication

2025-10-23 Thread Greg Sabino Mullane
Version 4 attached, rebased to account for new tests, plus a new
intra-test check to make sure LWP::UserAgent is available before running.

Cheers,
Greg


0004-Allow-specific-information-to-be-output-directly-by-Postgres.patch
Description: Binary data


Re: POC: Carefully exposing information without authentication

2025-10-02 Thread Greg Sabino Mullane
Please find attached version 3, rebased for PG 19 and now featuring some
tests.

Cheers,
Greg


0003-Allow-specific-information-to-be-output-directly-by-Postgres.patch
Description: Binary data


Re: POC: Carefully exposing information without authentication

2025-05-30 Thread Greg Sabino Mullane
On Fri, May 30, 2025 at 9:34 PM Tom Lane wrote:

> I think calling it in the postmaster is a nonstarter.


Thanks for the feedback. Please find attached version two, which moves the
code to the very start of BackendInitialize in
tcop/backend_startup.c. If we handle the request, we simply proc_exit and
avoid all the other backend startup stuff. So still a big win. I also made
a first rough pass at the documentation.

Cheers,
Greg

--
Crunchy Data - https://www.crunchydata.com
Enterprise Postgres Software Products & Tech Support


0002-Allow-specific-information-to-be-output-directly-by-Postgres.patch
Description: Binary data


Re: POC: Carefully exposing information without authentication

2025-05-30 Thread Tom Lane
Greg Sabino Mullane writes:
> Good question. Forking is expensive, and there is also a lot of
> housekeeping associated with it that is simply not needed here. We want
> this to be lightweight, and simple. No need to fork if we are just going to
> do a few strncmp() calls and a send().

send() can block.  I think calling it in the postmaster is a
nonstarter.  For comparison, we make an effort to not do any
communication with incoming clients until after forking a child
to do the communication.  The one exception is if we have to
report fork failure --- but we don't make any strong guarantees
about that report succeeding.  (IIRC, we put the port into nonblock
mode and try only once.)  That's probably not a behavior you want
to adopt for non-edge-case usages.

Another point is that you'll recall that there's a lot of
interest in switching to a threaded model.  The argument that
"fork is too expensive" may not have a long shelf life.

I'm not taking a position on whether $SUBJECT is a good idea
in the first place.

regards, tom lane




Re: POC: Carefully exposing information without authentication

2025-05-30 Thread Greg Sabino Mullane
On Fri, May 30, 2025 at 11:02 AM Antonin Houska wrote:

> Why is it important not to fork?


Good question. Forking is expensive, and there is also a lot of
housekeeping associated with it that is simply not needed here. We want
this to be lightweight, and simple. No need to fork if we are just going to
do a few strncmp() calls and a send(). However, I'm not highly opposed to
fork-first, as I understand that we want to not slow down postmaster. My
testing showed a barely measurable impact, but I will defer to whatever
the elder Postgres gods decide.


> My understanding is that pg_isready also tries to start a regular
> connection, i.e. forks a new backend.


Yep. I consider pg_isready a spiritual cousin to this feature, but it's not
something that can really do what this does.

Cheers,
Greg



Re: POC: Carefully exposing information without authentication

2025-05-30 Thread Antonin Houska
Greg Sabino Mullane wrote:

> Proposal: Allow a carefully curated selection of information to be shown
> without authentication.
>
> A common task for an HA system or a load balancer is to quickly determine
> which of your Postgres clusters is the primary, and which are the replicas.
> The canonical way to do this is to log in to each server with a valid
> username and password, and then run pg_is_in_recovery(). That's a lot of
> work to determine if a server is a replica or not, and it struck me that
> this true/false information about a running cluster is not super-sensitive
> information. In other words, would it really be wrong if there was a way to
> advertise that information without having to log in? I toyed with the idea
> of Postgres maintaining some sort of signal file, but then I realized that
> we already have a process, listening on a known port, that has that
> information available to us.
>
> Thus, this POC (proof of concept), which lets the postmaster scan for
> incoming requests and quickly handle them *before* doing forking and
> authenticating. We scan for a simple trigger string, and immediately return
> the information to the client.

Why is it important not to fork?  My understanding is that pg_isready also
tries to start a regular connection, i.e. forks a new backend. I think this
functionality would fit into libpq. (I've got no strong opinion on the amount
of information to be revealed this way. In any case, a GUC to enable the
feature only if the DBA wants it makes sense.)

-- 
Antonin Houska
Web: https://www.cybertec-postgresql.com