Re: Better shared data structure management and resizable shared data structures

Robert Haas Wed, 25 Mar 2026 11:37:46 -0700

On Sat, Mar 21, 2026 at 8:14 PM Heikki Linnakangas <[email protected]> wrote:
> I see your point too, the options will indeed easily start to split
> between constant parts and parts that are set later.


Yeah. That definitely seems worth trying to fix.

> I don't like the idea of just using the string name as the handle. I
> think it will come handy to have a backend-private descriptor or handle
> to the shared memory region. But perhaps the "name" and "ptr" should be
> regular arguments, for example. If some parameters are passed as
> arguments and some in the struct, that also splits things in an awkward
> way, though.

I agree that having a handle object seems potentially useful. Putting
arguments that are required to be constant in the struct and passing
other things as arguments or in some other way seems like a good idea.

> I wonder if we should set aside a small amount of memory, like 10-20 kB,
> in the fixed shmem segment for small structs like that. Currently, such
> allocations will usually succeed because we leave some wiggle room, but
> the  can also be consumed by other things like locks. But we could
> reserve a small amount of memory specifically for small allocations in
> extensions like this.

Yeah, I don't really understand why we let the lock table use up that
space. I mean, I think it would be useful to have a way to let the
lock table expand without a server restart, and I also suspect that we
could come up with a less-silly data structure than the PROCLOCK hash,
but also if the only thing keeping you from running out of lock space
is the wiggle room, maybe you just need to bump up
max_locks_per_transaction. Like, you could easily burn through the
wiggle room, get an error anyway, and then later find that you also
now can't load certain extensions without a server restart.

> Shmem callbacks
> ---------------
>
> I separated the request/init/fn callbacks from the structs. There's now
> a concept of "shmem callbacks", which you register in _PG_init(). For
> example:
>
> static void pgss_shmem_request(void *arg);
> static void pgss_shmem_init(void *arg);
>
> static const ShmemCallbacks pgss_shmem_callbacks = {
>      .request_fn = pgss_shmem_request,
>      .init_fn = pgss_shmem_init,
>      .attach_fn = NULL, /* no special attach actions needed */
> };

What's the advantage of coupling the functions together this way, vs.
just registering each callback individually?

Also, I don't understand what "arg" is supposed to be doing. It
doesn't seem to be getting used for anything.

I wonder if we should think about just adjusting the timing of the
existing callbacks instead of adding new things. I mean, maybe that's
too likely to cause silent breakage, in which case we could also
replace them with new callbacks with slightly different names. But
there's something to be said for a hard cutover -- or at least not
letting both ways coexist for more than a few releases.

> This is a similar to the old shmem_request and shmem_startup hooks. It's
> not a one-to-one replacement though, the callbacks are called at
> different stages than the old hooks (to make things convenient with the
> new facility):
>
> * The request_fn callback is called in postmaster startup, at the same
> stage as the old shmem_request callback was. But in EXEC_BACKEND mode,
> it's *also* called in each backend.
>
> * The init_fn callback is only called in postmaster startup, when it's
> time to initialize the area. (Ignoring the special "after startup"
> allocations for now).

Trying to improve the timing of the callbacks makes sense to me as a
concept, without taking a position on these particular choices (and
definitely hoping for some after-startup improvements).

Looking at v7-0018 (any chance you have a public branch where I can
look at this without having to download and apply all the patches one
by one?) it seems like this gets rid of the need for a bunch of
IsUnderPostmaster checks in individual subsystems, and some if
(found)/if (!found) checks, which I definitely like.

> Shmem requests
> --------------
>
> To register a shmem area, you call ShmemRequestStruct() or
> ShmmeRequestHash() from the request callback function. For example:
>
> static void
> pgss_shmem_request(void *arg)
> {
>      static ShmemHashDesc pgssSharedHashDesc = {
>         .name = "pg_stat_statements hash",
>          .ptr = &pgss_hash,
>          .hash_info.keysize = sizeof(pgssHashKey),
>          .hash_info.entrysize = sizeof(pgssEntry),
>          .hash_flags = HASH_ELEM | HASH_BLOBS,
>      };
>      static ShmemStructDesc pgssSharedStateDesc = {
>          .name = "pg_stat_statements",
>          .size = sizeof(pgssSharedState),
>          .ptr = (void **) &pgss,
>      };
>
>      pgssSharedHashDesc.init_size = pgss_max;
>      pgssSharedHashDesc.max_size = pgss_max;
>      ShmemRequestHash(&pgssSharedHashDesc);
>      ShmemRequestStruct(&pgssSharedStateDesc);
> }
>
> Initialization happens in the init callback, which is called after all
> the pointers (pgss, pgss_hash) have been set.

I think this is not bad. I suppose it lets you get a complete list of
all of the descriptors someplace. It seems to avoid double-computing
the request size, or any possibility of drift between the requested
size and the allocated size. Having to create the struct and then set
the size afterward in some cases is a tiny bit awkward-looking, but
it's not awful. I do wonder if some coding pattern might creep in
where people create the struct and register it and then try to change
the size, or other fields, afterward. I'm tempted to propose making
the struct non-static and having the registration functions copy it,
so that there can be absolutely no question of getting away with such
antisocial behavior.

-- 
Robert Haas
EDB: http://www.enterprisedb.com

Re: Better shared data structure management and resizable shared data structures

Reply via email to