POC: Carefully exposing information without authentication

Greg Sabino Mullane Thu, 29 May 2025 07:33:54 -0700

Proposal: Allow a carefully curated selection of information to be shown
without authentication.


A common task for an HA system or a load balancer is to quickly determine
which of your Postgres clusters is the primary, and which are the replicas.
The canonical way to do this is to log in to each server with a valid
username and password, and then run pg_is_in_recovery().
That's a lot of work to determine if a server is a replica or not, and it
struck me that this true/false information about a running cluster is not
super-sensitive information. In other words, would it really be wrong if
there was a way to advertise that information without having to log in? I
toyed with the idea of Postgres maintaining some sort of signal file, but
then I realized that we already have a process, listening on a known port,
that has that information available to us.

Thus, this POC (proof of concept), which lets the postmaster scan for
incoming requests and quickly handle them *before* doing forking and
authenticating. We scan for a simple trigger string, and immediately return
the information to the client.

It also occured to me that since we are going to need to provide a
non-Postgres-protocol special trigger string, and we might as well do
something like "GET /info" to allow existing programs to treat Postgres as
a mini http server. To that end, we end
up with something like this:

$ psql -p 5432 -tc 'select pg_is_in_recovery()'
t

$ curl http://localhost:5432/foobar
curl: (52) Empty reply from server

$ curl http://localhost:5432/info
RECOVERY: 1

To accomplish this, we have boolean GUC flags (defaulting to false) that
control which information is exposed. For the example above, the
expose_recovery boolean has been set to true. If any of these GUCs are
true, we take a slight detour right after we accept() but before we
actually fork. We use recv with the MSG_PEEK flag to take a quick scan of
the incoming data, and use strncmp to see if it matches. If it does, we
send() some information and move on without forking. If it doesn't, we
simply move on as if we were never there, and proceed to the next step of
forking a new backend to start the authentication process.

There are three pieces of information that can be exposed with this patch.
There may be more in the future, but these are all simple, global, and not
(IMO) security leaks. The GUCs are expose_recovery, expose_sysid, and
expose_version. Each one adds a line to the output in a KEY: VALUE format
for the GET /info endpoint. The raw value is output for the direct
endpoints:

* GET /replica
* GET /sysid
* GET /version

The expose_recovery GUC uses RecoveryInProgress() to return a 1 or a 0.
This is returned by the GET /replica endpoint.

The expose_sysid GUC returns GetSystemIdentifier(). Since this can be
thought of as a fingerprint for the server, it's a nice way for external
programs to determine if the cluster is the same one it saw last time, or
for leader/replica matching.

The purpose of the expose_version GUC is to output PG_VERSION_NUM. This
will allow external tools - particularly security scanners - to know the
exact version of Postgres that is running. While some may consider this
privileged information, tools are already taking advantage of our debug
loophole to make an educated guess about the version. See my old post about
this:
https://www.endpointdev.com/blog/2010/05/finding-postgresql-version-without/
Note that this guess by security scanners is sometimes wrong, or only able
to cover a range of versions. Thus, we should give them the correct answer,
rather than providing a dubious one via some trickery.

Here's some example output with all three enabled:

$ psql -c 'alter system set expose_recovery=on'
$ psql -c 'alter system set expose_version=on'
$ psql -c 'alter system set expose_sysid=on'
$ psql -c 'select pg_reload_conf()'

$ curl http://localhost:5432/version
180000

$ curl http://localhost:5432/info
RECOVERY: 1
SYSID: 7504513530771111839
VERSION: 180000

But wait! We can do more. For the recovery, we don't even need a string
that spells out "RECOVERY:", we only need to know if the server is in
recovery or not. In short, a boolean. Patroni does this in its API with a
call of HEAD /replica. It returns a different HTTP code if the server is a
replica (200) or not a replica (503). We can do the same thing! What's
more, we can do it in a way that will allow existing calls to simply point
to the postgres server instead of a Patroni process, and get the same
result back, but faster. Here's an example of what that looks like:

## Calling Patroni
$ curl -s -w "%{http_code}" -o /dev/null -I http://localhost:8008/replica
200

## Calling Postgres directly
$ curl -s -w "%{http_code}" -o /dev/null -I http://localhost:5432/replica
200

Here's a simple Python program showing how easy it is to grab this
information:

import socket

try:
  with socket.create_connection(('localhost', 5432), timeout=1) as s:
    s.sendall(b'GET /sysid')
    print(s.recv(200).split(b'\r\n\r\n',1)[1].decode())
except Exception as e:
  print(f"Error: {e}")


That's the basic idea: proof of concept patch is attached. Additional
things to do:

* handling socket quirks (esp. Win32)
* docs (once details are hashed out)
* moving things around (everything is in one function right now for reading
ease)

Cheers,
Greg

--
Crunchy Data - https://www.crunchydata.com
Enterprise Postgres Software Products & Tech Support

0001-Allow-specific-information-to-be-output-directly-by-Postgres.patch
Description: Binary data

POC: Carefully exposing information without authentication

Reply via email to