Re: Refactoring backend fork+exec code
On 18/05/2024 08:24, Thomas Munro wrote: Nitpicking from UBSan with EXEC_BACKEND on Linux (line numbers may be a bit off, from a branch of mine): ../src/backend/postmaster/launch_backend.c:772:2: runtime error: null pointer passed as argument 2, which is declared to never be null ==13303==Using libbacktrace symbolizer. #0 0x564b0202 in save_backend_variables ../src/backend/postmaster/launch_backend.c:772 #1 0x564b0242 in internal_forkexec ../src/backend/postmaster/launch_backend.c:311 #2 0x564b0bdd in postmaster_child_launch ../src/backend/postmaster/launch_backend.c:244 #3 0x564b3121 in StartChildProcess ../src/backend/postmaster/postmaster.c:3928 #4 0x564b933a in PostmasterMain ../src/backend/postmaster/postmaster.c:1357 #5 0x562de4ad in main ../src/backend/main/main.c:197 #6 0x7667ad09 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x23d09) #7 0x55e34279 in _start (/tmp/cirrus-ci-build/build/tmp_install/usr/local/pgsql/bin/postgres+0x8e0279) This silences it: - memcpy(param->startup_data, startup_data, startup_data_len); + if (startup_data_len > 0) + memcpy(param->startup_data, startup_data, startup_data_len); Fixed, thanks! On 17/06/2024 21:36, Nathan Bossart wrote: While looking into [0], I noticed that main() still only checks for the --fork prefix, but IIUC commit aafc05d removed all --fork* options except for --forkchild. I've attached a patch to strengthen the check in main(). This is definitely just a nitpick. Fixed this too, thanks! -- Heikki Linnakangas Neon (https://neon.tech)
Re: Refactoring backend fork+exec code
While looking into [0], I noticed that main() still only checks for the --fork prefix, but IIUC commit aafc05d removed all --fork* options except for --forkchild. I've attached a patch to strengthen the check in main(). This is definitely just a nitpick. [0] https://postgr.es/m/CAKAnmmJkZtZAiSryho%3DgYpbvC7H-HNjEDAh16F3SoC9LPu8rqQ%40mail.gmail.com -- nathan diff --git a/src/backend/main/main.c b/src/backend/main/main.c index bfd0c5ed65..4672aab837 100644 --- a/src/backend/main/main.c +++ b/src/backend/main/main.c @@ -185,7 +185,7 @@ main(int argc, char *argv[]) else if (argc > 1 && strcmp(argv[1], "--boot") == 0) BootstrapModeMain(argc, argv, false); #ifdef EXEC_BACKEND - else if (argc > 1 && strncmp(argv[1], "--fork", 6) == 0) + else if (argc > 1 && strncmp(argv[1], "--forkchild", 11) == 0) SubPostmasterMain(argc, argv); #endif else if (argc > 1 && strcmp(argv[1], "--describe-config") == 0)
Re: Refactoring backend fork+exec code
On Mon, Mar 18, 2024 at 10:41 PM Heikki Linnakangas wrote: > Committed, with some final cosmetic cleanups. Thanks everyone! Nitpicking from UBSan with EXEC_BACKEND on Linux (line numbers may be a bit off, from a branch of mine): ../src/backend/postmaster/launch_backend.c:772:2: runtime error: null pointer passed as argument 2, which is declared to never be null ==13303==Using libbacktrace symbolizer. #0 0x564b0202 in save_backend_variables ../src/backend/postmaster/launch_backend.c:772 #1 0x564b0242 in internal_forkexec ../src/backend/postmaster/launch_backend.c:311 #2 0x564b0bdd in postmaster_child_launch ../src/backend/postmaster/launch_backend.c:244 #3 0x564b3121 in StartChildProcess ../src/backend/postmaster/postmaster.c:3928 #4 0x564b933a in PostmasterMain ../src/backend/postmaster/postmaster.c:1357 #5 0x562de4ad in main ../src/backend/main/main.c:197 #6 0x7667ad09 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x23d09) #7 0x55e34279 in _start (/tmp/cirrus-ci-build/build/tmp_install/usr/local/pgsql/bin/postgres+0x8e0279) This silences it: - memcpy(param->startup_data, startup_data, startup_data_len); + if (startup_data_len > 0) + memcpy(param->startup_data, startup_data, startup_data_len); (I found that out by testing EXEC_BACKEND on CI. I also learned that the Mac and FreeBSD tasks fail with EXEC_BACKEND because of SysV shmem bleating. We probably should go and crank up the relevant sysctls in the .cirrus.tasks.yml...)
Re: Refactoring backend fork+exec code
On 28.04.2024 22:36, Heikki Linnakangas wrote: Peter E noticed and Michael fixed them in commit 768ceeeaa1 already. Didn't check that is already fixed in the current master. Sorry! Thanks for pointing this out! With the best wishes, -- Anton A. Melnikov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Re: Refactoring backend fork+exec code
On 27/04/2024 11:27, Anton A. Melnikov wrote: Hello! Maybe add PGDLLIMPORT to extern bool LoadedSSL; and extern struct ClientSocket *MyClientSocket; definitions in the src/include/postmaster/postmaster.h ? Peter E noticed and Michael fixed them in commit 768ceeeaa1 already. -- Heikki Linnakangas Neon (https://neon.tech)
Re: Refactoring backend fork+exec code
Hello! Maybe add PGDLLIMPORT to extern bool LoadedSSL; and extern struct ClientSocket *MyClientSocket; definitions in the src/include/postmaster/postmaster.h ? With the best regards, -- Anton A. Melnikov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Re: Refactoring backend fork+exec code
On Wed, 20 Mar 2024 at 08:16, Heikki Linnakangas wrote: > Yeah, it's not a very valuable assertion. Removed, thanks! How about we add it as a static assert instead of removing it, like we have for many other similar arrays. v1-0001-Add-child_process_kinds-static-assert.patch Description: Binary data
Re: Refactoring backend fork+exec code
On 20/03/2024 07:37, Tom Lane wrote: A couple of buildfarm animals don't like these tests: Assert(child_type >= 0 && child_type < lengthof(child_process_kinds)); for example ayu | 2024-03-19 13:08:05 | launch_backend.c:211:39: warning: comparison of constant 16 with expression of type 'BackendType' (aka 'enum BackendType') is always true [-Wtautological-constant-out-of-range-compare] ayu | 2024-03-19 13:08:05 | launch_backend.c:233:39: warning: comparison of constant 16 with expression of type 'BackendType' (aka 'enum BackendType') is always true [-Wtautological-constant-out-of-range-compare] I'm not real sure why it's moaning about the "<" check but not the ">= 0" check, which ought to be equally tautological given the assumption that the input is a valid member of the enum. But in any case, exactly how much value do these assertions carry? If you're intent on keeping them, perhaps casting child_type to int here would suppress the warnings. But personally I think I'd lose the Asserts. Yeah, it's not a very valuable assertion. Removed, thanks! -- Heikki Linnakangas Neon (https://neon.tech)
Re: Refactoring backend fork+exec code
Heikki Linnakangas writes: > Committed, with some final cosmetic cleanups. Thanks everyone! A couple of buildfarm animals don't like these tests: Assert(child_type >= 0 && child_type < lengthof(child_process_kinds)); for example ayu | 2024-03-19 13:08:05 | launch_backend.c:211:39: warning: comparison of constant 16 with expression of type 'BackendType' (aka 'enum BackendType') is always true [-Wtautological-constant-out-of-range-compare] ayu | 2024-03-19 13:08:05 | launch_backend.c:233:39: warning: comparison of constant 16 with expression of type 'BackendType' (aka 'enum BackendType') is always true [-Wtautological-constant-out-of-range-compare] I'm not real sure why it's moaning about the "<" check but not the ">= 0" check, which ought to be equally tautological given the assumption that the input is a valid member of the enum. But in any case, exactly how much value do these assertions carry? If you're intent on keeping them, perhaps casting child_type to int here would suppress the warnings. But personally I think I'd lose the Asserts. regards, tom lane
Re: Refactoring backend fork+exec code
On 13/03/2024 09:30, Heikki Linnakangas wrote: Attached is a new version of the remaining patches. Committed, with some final cosmetic cleanups. Thanks everyone! -- Heikki Linnakangas Neon (https://neon.tech)
Re: Refactoring backend fork+exec code
On Mon Mar 4, 2024 at 3:05 AM CST, Heikki Linnakangas wrote: I've now completed many of the side-quests, here are the patches that remain. The first three patches form a logical unit. They move the initialization of the Port struct from postmaster to the backend process. Currently, that work is split between the postmaster and the backend process so that postmaster fills in the socket and some other fields, and the backend process fills the rest after reading the startup packet. With these patches, there is a new much smaller ClientSocket struct that is passed from the postmaster to the child process, which contains just the fields that postmaster initializes. The Port struct is allocated in the child process. That makes the backend startup easier to understand. I plan to commit those three patches next if there are no objections. That leaves the rest of the patches. I think they're in pretty good shape too, and I've gotten some review on those earlier and have addressed the comments I got so far, but would still appreciate another round of review. - * *MyProcPort, because ConnCreate() allocated that space with malloc() - * ... else we'd need to copy the Port data first. Also, subsidiary data - * such as the username isn't lost either; see ProcessStartupPacket(). + * *MyProcPort, because that space is allocated in stack ... else we'd + * need to copy the Port data first. Also, subsidiary data such as the + * username isn't lost either; see ProcessStartupPacket(). s/allocated in/allocated on the The first 3 patches seem good to go, in my opinion. @@ -225,14 +331,13 @@ internal_forkexec(int argc, char *argv[], ClientSocket *client_sock, BackgroundW return -1; } -/* Make sure caller set up argv properly */ -Assert(argc >= 3); -Assert(argv[argc] == NULL); -Assert(strncmp(argv[1], "--fork", 6) == 0); -Assert(argv[2] == NULL); - -/* Insert temp file name after --fork argument */ +/* set up argv properly */ +argv[0] = "postgres"; +snprintf(forkav, MAXPGPATH, "--forkchild=%s", child_kind); +argv[1] = forkav; +/* Insert temp file name after --forkchild argument */ argv[2] = tmpfilename; +argv[3] = NULL; Should we use postgres_exec_path instead of the naked "postgres" here? +/* in postmaster, fork failed ... */ +ereport(LOG, +(errmsg("could not fork worker process: %m"))); +/* undo what assign_backendlist_entry did */ +ReleasePostmasterChildSlot(rw->rw_child_slot); +rw->rw_child_slot = 0; +pfree(rw->rw_backend); +rw->rw_backend = NULL; +/* mark entry as crashed, so we'll try again later */ +rw->rw_crashed_at = GetCurrentTimestamp(); +return false; I think the error message should include the word "background." It would be more consistent with the log message above it. +typedef struct +{ +intsyslogFile; +intcsvlogFile; +intjsonlogFile; +} syslogger_startup_data; It would be nice if all of these startup data structs were named similarly. For instance, a previous one was BackendStartupInfo. It would help with greppability. I noticed there were a few XXX comments left that you created. I'll highlight them here for more visibility. +/* XXX: where does this belong? */ +extern bool LoadedSSL; Perhaps near the My* variables or maybe in the Port struct? +#ifdef EXEC_BACKEND + +/* + * Need to reinitialize the SSL library in the backend, since the context + * structures contain function pointers and cannot be passed through the + * parameter file. + * + * If for some reason reload fails (maybe the user installed broken key + * files), soldier on without SSL; that's better than all connections + * becoming impossible. + * + * XXX should we do this in all child processes? For the moment it's + * enough to do it in backend children. XXX good question indeed + */ +#ifdef USE_SSL +if (EnableSSL) +{ +if (secure_initialize(false) == 0) +LoadedSSL = true; +else +ereport(LOG, +(errmsg("SSL configuration could not be loaded in child process"))); +} +#endif +#endif Here you added the "good question indeed." I am not sure what the best answer is either! :) +/* XXX: translation? */ +ereport(LOG, +(errmsg("could not fork %s process: %m", PostmasterChildName(type; I assume you are referring to the child name here?
Re: Refactoring backend fork+exec code
On 05/03/2024 11:44, Richard Guo wrote: I noticed that there are still three places in backend_status.c where pgstat_get_beentry_by_backend_id() is referenced. I think we should replace them with pgstat_get_beentry_by_proc_number(). Fixed, thanks! -- Heikki Linnakangas Neon (https://neon.tech)
Re: Refactoring backend fork+exec code
On Mon, Mar 4, 2024 at 1:40 AM Heikki Linnakangas wrote: > On 22/02/2024 02:37, Heikki Linnakangas wrote: > > Here's another patch version that does that. Yeah, I agree it's nicer in > > the end. > > > > I'm pretty happy with this now. I'll read through these patches myself > > again after sleeping over it and try to get this committed by the end of > > the week, but another pair of eyes wouldn't hurt. > > And pushed. Thanks for the reviews! I noticed that there are still three places in backend_status.c where pgstat_get_beentry_by_backend_id() is referenced. I think we should replace them with pgstat_get_beentry_by_proc_number(). Thanks Richard
Re: Refactoring backend fork+exec code
On 22/02/2024 02:37, Heikki Linnakangas wrote: On 15/02/2024 07:09, Robert Haas wrote: On Thu, Feb 15, 2024 at 3:07 AM Andres Freund wrote: I think the last remaining question here is about the 0- vs 1-based indexing of BackendIds. Is it a good idea to switch to 0-based indexing? And if we do it, should we reserve PGPROC 0. I'm on the fence on this one. I lean towards it being a good idea. Having two internal indexing schemes was bad enough so far, but at least one would fairly quickly notice if one used the wrong one. If they're just offset by 1, it might end up taking longer, because that'll often also be a valid id. Yeah, I think making everything 0-based is probably the best way forward long term. It might require more cleanup work to get there, but it's just a lot simpler in the end, IMHO. Here's another patch version that does that. Yeah, I agree it's nicer in the end. I'm pretty happy with this now. I'll read through these patches myself again after sleeping over it and try to get this committed by the end of the week, but another pair of eyes wouldn't hurt. And pushed. Thanks for the reviews! -- Heikki Linnakangas Neon (https://neon.tech)
Re: Refactoring backend fork+exec code
On Thu, Feb 15, 2024 at 3:07 AM Andres Freund wrote: > > I think the last remaining question here is about the 0- vs 1-based indexing > > of BackendIds. Is it a good idea to switch to 0-based indexing? And if we do > > it, should we reserve PGPROC 0. I'm on the fence on this one. > > I lean towards it being a good idea. Having two internal indexing schemes was > bad enough so far, but at least one would fairly quickly notice if one used > the wrong one. If they're just offset by 1, it might end up taking longer, > because that'll often also be a valid id. Yeah, I think making everything 0-based is probably the best way forward long term. It might require more cleanup work to get there, but it's just a lot simpler in the end, IMHO. -- Robert Haas EDB: http://www.enterprisedb.com
Re: Refactoring backend fork+exec code
Hi, On 2024-02-08 13:19:53 +0200, Heikki Linnakangas wrote: > > > - /* > > > - * Assign the ProcSignalSlot for an auxiliary process. Since it doesn't > > > - * have a BackendId, the slot is statically allocated based on the > > > - * auxiliary process type (MyAuxProcType). Backends use slots indexed > > > in > > > - * the range from 1 to MaxBackends (inclusive), so we use MaxBackends + > > > - * AuxProcType + 1 as the index of the slot for an auxiliary process. > > > - * > > > - * This will need rethinking if we ever want more than one of a > > > particular > > > - * auxiliary process type. > > > - */ > > > - ProcSignalInit(MaxBackends + MyAuxProcType + 1); > > > + ProcSignalInit(); > > > > Now that we don't need the offset here, we could move ProcSignalInit() into > > BsaeInit() I think? > > Hmm, doesn't feel right to me. BaseInit() is mostly concerned with setting > up backend-private structures, and it's also called for a standalone > backend. It already initializes a lot of shared subsystems (pgstat, replication slots and arguable things like the buffer pool, temporary file access and WAL). And note that it already requires that MyProc is already set (but it's not yet "added" to the procarray, i.e. doesn't do visibility stuff at that stage). I don't think that BaseInit() being called by standalone backends really poses a problem? So is InitPostgres(), which does call ProcSignalInit() in standalone processes. My mental model is that BaseInit() is for stuff that's shared between processes that do attach to databases and those that don't. Right now the initialization flow is something like this ascii diagram: standalone: \ /-> StartupXLOG() \ -> InitProcess() -\ /-> ProcArrayAdd() -> SharedInvalBackendInit() -> ProcSignalInit()- -> pgstat_beinit() -> attach to db -> pgstat_bestart() normal backend: /\ / -> BaseInit() - aux process:InitAuxiliaryProcess() -/ \-- -> ProcSignalInit()-> pgstat_beinit() -> pgstat_bestart() The only reason ProcSignalInit() happens kinda late is that historically we used BackendIds as the index, which were only assigned in SharedInvalBackendInit() for normal processes. But that doesn't make sense anymore after your changes. Similarly, we do pgstat_beinit() quite late, but that's again only because it uses MyBackendId, which today is only assigned during SharedInvalBackendInit(). I don't think we can do pgstat_bestart() earlier though, which is a shame, given the four calls to it inside InitPostgres(). > I feel the process initialization codepaths could use some cleanup in > general. Not sure what exactly. Very much agreed. > > > +/* > > > + * BackendIdGetProc -- get a backend's PGPROC given its backend ID > > > + * > > > + * The result may be out of date arbitrarily quickly, so the caller > > > + * must be careful about how this information is used. NULL is > > > + * returned if the backend is not active. > > > + */ > > > +PGPROC * > > > +BackendIdGetProc(int backendID) > > > +{ > > > + PGPROC *result; > > > + > > > + if (backendID < 1 || backendID > ProcGlobal->allProcCount) > > > + return NULL; > > > > Hm, doesn't calling BackendIdGetProc() with these values a bug? That's not > > about being out of date or such. > > Perhaps. I just followed the example of the old implementation, which also > returns NULL on bogus inputs. Fair enough. Makes it harder to not notice bugs, but that's not on this patchset to fix... > I think the last remaining question here is about the 0- vs 1-based indexing > of BackendIds. Is it a good idea to switch to 0-based indexing? And if we do > it, should we reserve PGPROC 0. I'm on the fence on this one. I lean towards it being a good idea. Having two internal indexing schemes was bad enough so far, but at least one would fairly quickly notice if one used the wrong one. If they're just offset by 1, it might end up taking longer, because that'll often also be a valid id. But I think you have the author's prerogative on this one. If we do so, I think it might be better to standardize on MyProcNumber instead of MyBackendId. That'll force looking at code where indexing shifts by 1 - and it also seems more descriptive, as inside postgres it's imo clearer what a "proc number" is than what a "backend id" is. Particularly because the latter is also used for things that aren't backends... The only exception are SQL level users, for those I think it might make sense to keep the current 1 based indexing, there's just a few functions where we'd need to translate. > @@ -791,6 +792,7 @@ ProcArrayEndTransactionInternal(PGPROC *proc, > TransactionId latestXid) > static vo
Re: Refactoring backend fork+exec code
On 07/02/2024 20:25, Andres Freund wrote: On 2024-01-30 02:08:36 +0200, Heikki Linnakangas wrote: From 54f22231bb2540fc5957c14005956161e6fc9dac Mon Sep 17 00:00:00 2001 From: Heikki Linnakangas Date: Wed, 24 Jan 2024 23:15:55 +0200 Subject: [PATCH v8 1/5] Remove superfluous 'pgprocno' field from PGPROC It was always just the index of the PGPROC entry from the beginning of the proc array. Introduce a macro to compute it from the pointer instead. Hm. The pointer math here is bit more expensive than in some other cases, as the struct is fairly large and sizeof(PGPROC) isn't a power of two. Adding more math into loops like in TransactionGroupUpdateXidStatus() might end up showing up. I added a MyProcNumber global variable that is set to GetNumberFromPGProc(MyProc). I'm not really concerned about the extra math, but with MyProcNumber it should definitely not be an issue. The few GetNumberFromPGProc() invocations that remain are in less performance-critical paths. (In later patch, I switch backend ids to 0-based indexing, which replaces MyProcNumber references with MyBackendId) Is this really related to the rest of the series? It's not strictly necessary, but it felt prudent to remove it now, since I'm removing the backendID field too. You can now convert a backend ID into an index into the PGPROC array simply by subtracting 1. We still use 0-based "pgprocnos" in various places, for indexes into the PGPROC array, but the only difference now is that backend IDs start at 1 while pgprocnos start at 0. Why aren't we using 0-based indexing for both? InvalidBackendId is -1, so there'd not be a conflict, right? Correct. I was being conservative and didn't dare to change the old convention. The backend ids are visible in a few places like "pg_temp_0" schema names, and pg_stat_get_*() functions. One alternative would be to reserve and waste allProcs[0]. Then pgprocno and backend ID could both be direct indexes to the array, but 0 would not be used. If we switch to 0-based indexing, it begs the question: why don't we merge the concepts of "pgprocno" and "BackendId" completely and call it the same thing everywhere? It probably would be best in the long run. It feels like a lot of churn though. Anyway, I switched to 0-based indexing in the attached new version, to see what it looks like. @@ -457,7 +442,7 @@ MarkAsPreparingGuts(GlobalTransaction gxact, TransactionId xid, const char *gid, Assert(LWLockHeldByMeInMode(TwoPhaseStateLock, LW_EXCLUSIVE)); Assert(gxact != NULL); - proc = &ProcGlobal->allProcs[gxact->pgprocno]; + proc = GetPGProcByNumber(gxact->pgprocno); /* Initialize the PGPROC entry */ MemSet(proc, 0, sizeof(PGPROC)); This set of changes is independent of this commit, isn't it? Yes. It's just for symmetry, now that we use GetNumberFromPGProc() to get the pgprocno. diff --git a/src/backend/postmaster/auxprocess.c b/src/backend/postmaster/auxprocess.c index ab86e802f21..39171fea06b 100644 --- a/src/backend/postmaster/auxprocess.c +++ b/src/backend/postmaster/auxprocess.c @@ -107,17 +107,7 @@ AuxiliaryProcessMain(AuxProcType auxtype) BaseInit(); - /* -* Assign the ProcSignalSlot for an auxiliary process. Since it doesn't -* have a BackendId, the slot is statically allocated based on the -* auxiliary process type (MyAuxProcType). Backends use slots indexed in -* the range from 1 to MaxBackends (inclusive), so we use MaxBackends + -* AuxProcType + 1 as the index of the slot for an auxiliary process. -* -* This will need rethinking if we ever want more than one of a particular -* auxiliary process type. -*/ - ProcSignalInit(MaxBackends + MyAuxProcType + 1); + ProcSignalInit(); Now that we don't need the offset here, we could move ProcSignalInit() into BsaeInit() I think? Hmm, doesn't feel right to me. BaseInit() is mostly concerned with setting up backend-private structures, and it's also called for a standalone backend. I feel the process initialization codepaths could use some cleanup in general. Not sure what exactly. +/* + * BackendIdGetProc -- get a backend's PGPROC given its backend ID + * + * The result may be out of date arbitrarily quickly, so the caller + * must be careful about how this information is used. NULL is + * returned if the backend is not active. + */ +PGPROC * +BackendIdGetProc(int backendID) +{ + PGPROC *result; + + if (backendID < 1 || backendID > ProcGlobal->allProcCount) + return NULL; Hm, doesn't calling BackendIdGetProc() with these values a bug? That's not about being out of date or such. Perhaps. I just followed the example of the old implementation, which also returns NULL on bogus inputs. +/* + * BackendIdGetTransactionIds -- get a backend's transaction status + * + * Get the xid, xmin, nsubxid and overflow status of the backend. Th
Re: Refactoring backend fork+exec code
Hi, On 2024-01-30 02:08:36 +0200, Heikki Linnakangas wrote: > I spent some more time on the 'lastBackend' optimization in sinvaladt.c. I > realized that it became very useless with these patches, because aux > processes are allocated pgprocno's after all the slots for regular backends. > There are always aux processes running, so lastBackend would always have a > value close to the max anyway. I replaced that with a dense 'pgprocnos' > array that keeps track of the exact slots that are in use. I'm not 100% sure > this is worth the effort; does any real world workload send shared > invalidations so frequently that this matters? In any case, this should > avoid the regression if such a workload exists. > > New patch set attached. I think these are ready to be committed, but would > appreciate a final review. > From 54f22231bb2540fc5957c14005956161e6fc9dac Mon Sep 17 00:00:00 2001 > From: Heikki Linnakangas > Date: Wed, 24 Jan 2024 23:15:55 +0200 > Subject: [PATCH v8 1/5] Remove superfluous 'pgprocno' field from PGPROC > > It was always just the index of the PGPROC entry from the beginning of > the proc array. Introduce a macro to compute it from the pointer > instead. Hm. The pointer math here is bit more expensive than in some other cases, as the struct is fairly large and sizeof(PGPROC) isn't a power of two. Adding more math into loops like in TransactionGroupUpdateXidStatus() might end up showing up. I've been thinking that we likely should pad PGPROC to some more convenient boundary, but... Is this really related to the rest of the series? > From 4e0121e064804b73ef8a5dc10be27b85968ea1af Mon Sep 17 00:00:00 2001 > From: Heikki Linnakangas > Date: Mon, 29 Jan 2024 23:50:34 +0200 > Subject: [PATCH v8 2/5] Redefine backend ID to be an index into the proc > array. > > Previously, backend ID was an index into the ProcState array, in the > shared cache invalidation manager (sinvaladt.c). The entry in the > ProcState array was reserved at backend startup by scanning the array > for a free entry, and that was also when the backend got its backend > ID. Things becomes slightly simpler if we redefine backend ID to be > the index into the PGPROC array, and directly use it also as an index > to the ProcState array. This uses a little more memory, as we reserve > a few extra slots in the ProcState array for aux processes that don't > need them, but the simplicity is worth it. > Aux processes now also have a backend ID. This simplifies the > reservation of BackendStatusArray and ProcSignal slots. > > You can now convert a backend ID into an index into the PGPROC array > simply by subtracting 1. We still use 0-based "pgprocnos" in various > places, for indexes into the PGPROC array, but the only difference now > is that backend IDs start at 1 while pgprocnos start at 0. Why aren't we using 0-based indexing for both? InvalidBackendId is -1, so there'd not be a conflict, right? > One potential downside of this patch is that the ProcState array might > get less densely packed, as we we don't try so hard to assign > low-numbered backend ID anymore. If it's less densely packed, > lastBackend will stay at a higher value, and SIInsertDataEntries() and > SICleanupQueue() need to scan over more unused entries. I think that's > fine. They are performance critical enough to matter, and there was no > guarantee on dense packing before either: If you launched a lot of > backends concurrently, and kept the last one open, lastBackend would > also stay at a high value. It's perhaps worth noting here that there's a future patch that also addresses this to some degree? > @@ -457,7 +442,7 @@ MarkAsPreparingGuts(GlobalTransaction gxact, > TransactionId xid, const char *gid, > Assert(LWLockHeldByMeInMode(TwoPhaseStateLock, LW_EXCLUSIVE)); > > Assert(gxact != NULL); > - proc = &ProcGlobal->allProcs[gxact->pgprocno]; > + proc = GetPGProcByNumber(gxact->pgprocno); > > /* Initialize the PGPROC entry */ > MemSet(proc, 0, sizeof(PGPROC)); This set of changes is independent of this commit, isn't it? > diff --git a/src/backend/postmaster/auxprocess.c > b/src/backend/postmaster/auxprocess.c > index ab86e802f21..39171fea06b 100644 > --- a/src/backend/postmaster/auxprocess.c > +++ b/src/backend/postmaster/auxprocess.c > @@ -107,17 +107,7 @@ AuxiliaryProcessMain(AuxProcType auxtype) > > BaseInit(); > > - /* > - * Assign the ProcSignalSlot for an auxiliary process. Since it doesn't > - * have a BackendId, the slot is statically allocated based on the > - * auxiliary process type (MyAuxProcType). Backends use slots indexed > in > - * the range from 1 to MaxBackends (inclusive), so we use MaxBackends + > - * AuxProcType + 1 as the index of the slot for an auxiliary process. > - * > - * This will need rethinking if we ever want more than one of a > particular > - * auxiliary process type. > - */ > - ProcSignalInit(MaxBackends + MyAuxProcTy
Re: Refactoring backend fork+exec code
On 30/01/2024 02:08, Heikki Linnakangas wrote: On 29/01/2024 17:54, reid.thomp...@crunchydata.com wrote: On Thu, 2024-01-25 at 01:51 +0200, Heikki Linnakangas wrote: And here we go. BackendID is now a 1-based index directly into the PGPROC array. Would it be worthwhile to also note in this comment FIRST_AUX_PROC's and IsAuxProcess()'s dependency on B_ARCHIVER and it's location in the enum table? Yeah, that might be in order. Although looking closer, it's only used in IsAuxProcess, which is only used in one sanity check in AuxProcessMain(). And even that gets refactored away by the later patches in this thread. So on second thoughts, I'll just remove it altogether. I spent some more time on the 'lastBackend' optimization in sinvaladt.c. I realized that it became very useless with these patches, because aux processes are allocated pgprocno's after all the slots for regular backends. There are always aux processes running, so lastBackend would always have a value close to the max anyway. I replaced that with a dense 'pgprocnos' array that keeps track of the exact slots that are in use. I'm not 100% sure this is worth the effort; does any real world workload send shared invalidations so frequently that this matters? In any case, this should avoid the regression if such a workload exists. New patch set attached. I think these are ready to be committed, but would appreciate a final review. contrib/amcheck 003_cic_2pc.pl test failures revealed a bug that required some reworking: In a PGPROC entry for a prepared xact, the PGPROC's backendID needs to be the original backend's ID, because the prepared xact is holding the lock on the original virtual transaction id. When a transaction's ownership is moved from the original backend's PGPROC entry to the prepared xact PGPROC entry, the backendID needs to be copied over. My patch removed the field altogether, so it was not copied over, which made it look like it the original VXID lock was released at prepare. I fixed that by adding back the backendID field. For regular backends, it's always equal to pgprocno + 1, but for prepared xacts, it's the original backend's ID. To make that less confusing, I moved the backendID and lxid fields together under a 'vxid' struct. The two fields together form the virtual transaction ID, and that's the only context where the 'backendID' field should now be looked at. I also squashed the 'lastBackend' changes in sinvaladt.c to the main patch. -- Heikki Linnakangas Neon (https://neon.tech) From 96c583b32db843fb07d38fd78f1e205882a78b01 Mon Sep 17 00:00:00 2001 From: Heikki Linnakangas Date: Wed, 24 Jan 2024 23:15:55 +0200 Subject: [PATCH v9 1/4] Remove superfluous 'pgprocno' field from PGPROC It was always just the index of the PGPROC entry from the beginning of the proc array. Introduce a macro to compute it from the pointer instead. Reviewed-by: XXX Discussion: https://www.postgresql.org/message-id/8171f1aa-496f-46a6-afc3-c46fe7a9b407%40iki.fi --- src/backend/access/transam/clog.c | 4 ++-- src/backend/access/transam/twophase.c | 11 +-- src/backend/access/transam/xlog.c | 2 +- src/backend/postmaster/bgwriter.c | 2 +- src/backend/postmaster/pgarch.c | 2 +- src/backend/postmaster/walsummarizer.c| 2 +- src/backend/storage/buffer/bufmgr.c | 6 +++--- src/backend/storage/ipc/procarray.c | 6 +++--- src/backend/storage/lmgr/condition_variable.c | 12 ++-- src/backend/storage/lmgr/lwlock.c | 6 +++--- src/backend/storage/lmgr/predicate.c | 2 +- src/backend/storage/lmgr/proc.c | 1 - src/include/storage/lock.h| 2 +- src/include/storage/proc.h| 6 +- 14 files changed, 29 insertions(+), 35 deletions(-) diff --git a/src/backend/access/transam/clog.c b/src/backend/access/transam/clog.c index f6e7da7ffc9..7550309c25a 100644 --- a/src/backend/access/transam/clog.c +++ b/src/backend/access/transam/clog.c @@ -458,7 +458,7 @@ TransactionGroupUpdateXidStatus(TransactionId xid, XidStatus status, * less efficiently. */ if (nextidx != INVALID_PGPROCNO && - ProcGlobal->allProcs[nextidx].clogGroupMemberPage != proc->clogGroupMemberPage) + GetPGProcByNumber(nextidx)->clogGroupMemberPage != proc->clogGroupMemberPage) { /* * Ensure that this proc is not a member of any clog group that @@ -473,7 +473,7 @@ TransactionGroupUpdateXidStatus(TransactionId xid, XidStatus status, if (pg_atomic_compare_exchange_u32(&procglobal->clogGroupFirst, &nextidx, - (uint32) proc->pgprocno)) + (uint32) GetNumberFromPGProc(proc))) break; } diff --git a/src/backend/access/transam/twophase.c b/src/backend/access/transam/twophase.c index 8426458f7f5..234c8d08ebc 100644 --- a/src/backend/access/transam/twophase.c +++ b/src/backend/access/transam/twophase.c @@ -284,7 +284,7 @@ Tw
Re: Refactoring backend fork+exec code
On 29/01/2024 17:54, reid.thomp...@crunchydata.com wrote: On Thu, 2024-01-25 at 01:51 +0200, Heikki Linnakangas wrote: And here we go. BackendID is now a 1-based index directly into the PGPROC array. Would it be worthwhile to also note in this comment FIRST_AUX_PROC's and IsAuxProcess()'s dependency on B_ARCHIVER and it's location in the enum table? Yeah, that might be in order. Although looking closer, it's only used in IsAuxProcess, which is only used in one sanity check in AuxProcessMain(). And even that gets refactored away by the later patches in this thread. So on second thoughts, I'll just remove it altogether. I spent some more time on the 'lastBackend' optimization in sinvaladt.c. I realized that it became very useless with these patches, because aux processes are allocated pgprocno's after all the slots for regular backends. There are always aux processes running, so lastBackend would always have a value close to the max anyway. I replaced that with a dense 'pgprocnos' array that keeps track of the exact slots that are in use. I'm not 100% sure this is worth the effort; does any real world workload send shared invalidations so frequently that this matters? In any case, this should avoid the regression if such a workload exists. New patch set attached. I think these are ready to be committed, but would appreciate a final review. -- Heikki Linnakangas Neon (https://neon.tech) From 54f22231bb2540fc5957c14005956161e6fc9dac Mon Sep 17 00:00:00 2001 From: Heikki Linnakangas Date: Wed, 24 Jan 2024 23:15:55 +0200 Subject: [PATCH v8 1/5] Remove superfluous 'pgprocno' field from PGPROC It was always just the index of the PGPROC entry from the beginning of the proc array. Introduce a macro to compute it from the pointer instead. Reviewed-by: XXX Discussion: https://www.postgresql.org/message-id/8171f1aa-496f-46a6-afc3-c46fe7a9b407%40iki.fi --- src/backend/access/transam/clog.c | 4 ++-- src/backend/access/transam/twophase.c | 11 +-- src/backend/access/transam/xlog.c | 2 +- src/backend/postmaster/bgwriter.c | 2 +- src/backend/postmaster/pgarch.c | 2 +- src/backend/postmaster/walsummarizer.c| 2 +- src/backend/storage/buffer/bufmgr.c | 6 +++--- src/backend/storage/ipc/procarray.c | 6 +++--- src/backend/storage/lmgr/condition_variable.c | 12 ++-- src/backend/storage/lmgr/lwlock.c | 6 +++--- src/backend/storage/lmgr/predicate.c | 2 +- src/backend/storage/lmgr/proc.c | 1 - src/include/storage/lock.h| 2 +- src/include/storage/proc.h| 6 +- 14 files changed, 29 insertions(+), 35 deletions(-) diff --git a/src/backend/access/transam/clog.c b/src/backend/access/transam/clog.c index f6e7da7ffc9..7550309c25a 100644 --- a/src/backend/access/transam/clog.c +++ b/src/backend/access/transam/clog.c @@ -458,7 +458,7 @@ TransactionGroupUpdateXidStatus(TransactionId xid, XidStatus status, * less efficiently. */ if (nextidx != INVALID_PGPROCNO && - ProcGlobal->allProcs[nextidx].clogGroupMemberPage != proc->clogGroupMemberPage) + GetPGProcByNumber(nextidx)->clogGroupMemberPage != proc->clogGroupMemberPage) { /* * Ensure that this proc is not a member of any clog group that @@ -473,7 +473,7 @@ TransactionGroupUpdateXidStatus(TransactionId xid, XidStatus status, if (pg_atomic_compare_exchange_u32(&procglobal->clogGroupFirst, &nextidx, - (uint32) proc->pgprocno)) + (uint32) GetNumberFromPGProc(proc))) break; } diff --git a/src/backend/access/transam/twophase.c b/src/backend/access/transam/twophase.c index 8426458f7f5..234c8d08ebc 100644 --- a/src/backend/access/transam/twophase.c +++ b/src/backend/access/transam/twophase.c @@ -284,7 +284,7 @@ TwoPhaseShmemInit(void) TwoPhaseState->freeGXacts = &gxacts[i]; /* associate it with a PGPROC assigned by InitProcGlobal */ - gxacts[i].pgprocno = PreparedXactProcs[i].pgprocno; + gxacts[i].pgprocno = GetNumberFromPGProc(&PreparedXactProcs[i]); /* * Assign a unique ID for each dummy proc, so that the range of @@ -461,7 +461,6 @@ MarkAsPreparingGuts(GlobalTransaction gxact, TransactionId xid, const char *gid, /* Initialize the PGPROC entry */ MemSet(proc, 0, sizeof(PGPROC)); - proc->pgprocno = gxact->pgprocno; dlist_node_init(&proc->links); proc->waitStatus = PROC_WAIT_STATUS_OK; if (LocalTransactionIdIsValid(MyProc->lxid)) @@ -780,7 +779,7 @@ pg_prepared_xact(PG_FUNCTION_ARGS) while (status->array != NULL && status->currIdx < status->ngxacts) { GlobalTransaction gxact = &status->array[status->currIdx++]; - PGPROC *proc = &ProcGlobal->allProcs[gxact->pgprocno]; + PGPROC *proc = GetPGProcByNumber(gxact->pgprocno); Datum values[5] = {0}; bool nulls[5] = {0}; HeapTuple tuple; @@ -935,7 +934,7 @@ TwoPhaseGetDummyProc(Transaction
Re: Refactoring backend fork+exec code
On Thu, 2024-01-25 at 01:51 +0200, Heikki Linnakangas wrote: > > And here we go. BackendID is now a 1-based index directly into the > PGPROC array. > Would it be worthwhile to also note in this comment FIRST_AUX_PROC's and IsAuxProcess()'s dependency on B_ARCHIVER and it's location in the enum table? /* ¦* Auxiliary processes. These have PGPROC entries, but they are not ¦* attached to any particular database. There can be only one of each of ¦* these running at a time. ¦* ¦* If you modify these, make sure to update NUM_AUXILIARY_PROCS and the ¦* glossary in the docs. ¦*/ B_ARCHIVER, B_BG_WRITER, B_CHECKPOINTER, B_STARTUP, B_WAL_RECEIVER, B_WAL_SUMMARIZER, B_WAL_WRITER, } BackendType; #define BACKEND_NUM_TYPES (B_WAL_WRITER + 1) extern PGDLLIMPORT BackendType MyBackendType; #define FIRST_AUX_PROC B_ARCHIVER #define IsAuxProcess(type) (MyBackendType >= FIRST_AUX_PROC)
Re: Refactoring backend fork+exec code
On 23/01/2024 21:50, Andres Freund wrote: On 2024-01-23 21:07:08 +0200, Heikki Linnakangas wrote: On 22/01/2024 23:07, Andres Freund wrote: diff --git a/src/backend/utils/activity/backend_status.c b/src/backend/utils/activity/backend_status.c index 1a1050c8da1..92f24db4e18 100644 --- a/src/backend/utils/activity/backend_status.c +++ b/src/backend/utils/activity/backend_status.c @@ -257,17 +257,16 @@ pgstat_beinit(void) else { /* Must be an auxiliary process */ - Assert(MyAuxProcType != NotAnAuxProcess); + Assert(IsAuxProcess(MyBackendType)); /* * Assign the MyBEEntry for an auxiliary process. Since it doesn't * have a BackendId, the slot is statically allocated based on the -* auxiliary process type (MyAuxProcType). Backends use slots indexed -* in the range from 0 to MaxBackends (exclusive), so we use -* MaxBackends + AuxProcType as the index of the slot for an auxiliary -* process. +* auxiliary process type. Backends use slots indexed in the range +* from 0 to MaxBackends (exclusive), and aux processes use the slots +* after that. */ - MyBEEntry = &BackendStatusArray[MaxBackends + MyAuxProcType]; + MyBEEntry = &BackendStatusArray[MaxBackends + MyBackendType - FIRST_AUX_PROC]; } Hm, this seems less than pretty. It's probably ok for now, but it seems like a better fix might be to just start assigning backend ids to aux procs or switch to indexing by pgprocno? Using pgprocno is a good idea. Come to think of it, why do we even have a concept of backend ID that's separate from pgprocno? backend ID is used to index the ProcState array, but AFAICS we could use pgprocno as the index to that, too. I think we should do that. There are a few processes not participating in sinval, but it doesn't make enough of a difference to make sinval slower. And I think there'd be far bigger efficiency improvements to sinvaladt than not having a handful more entries. And here we go. BackendID is now a 1-based index directly into the PGPROC array. -- Heikki Linnakangas Neon (https://neon.tech) From 87db35c877f4492b98848dfcd0baa49cc8100c1b Mon Sep 17 00:00:00 2001 From: Heikki Linnakangas Date: Wed, 24 Jan 2024 23:15:55 +0200 Subject: [PATCH v7 1/4] Remove superfluous 'pgprocno' field from PGPROC It was always just the index of the PGPROC entry from the beginning of the proc array. Introduce a macro to compute it from the pointer instead. Reviewed-by: XXX Discussion: https://www.postgresql.org/message-id/8171f1aa-496f-46a6-afc3-c46fe7a9b407%40iki.fi --- src/backend/access/transam/clog.c | 4 ++-- src/backend/access/transam/twophase.c | 11 +-- src/backend/access/transam/xlog.c | 2 +- src/backend/postmaster/bgwriter.c | 2 +- src/backend/postmaster/pgarch.c | 2 +- src/backend/postmaster/walsummarizer.c| 2 +- src/backend/storage/buffer/bufmgr.c | 6 +++--- src/backend/storage/ipc/procarray.c | 6 +++--- src/backend/storage/lmgr/condition_variable.c | 12 ++-- src/backend/storage/lmgr/lwlock.c | 6 +++--- src/backend/storage/lmgr/predicate.c | 2 +- src/backend/storage/lmgr/proc.c | 1 - src/include/storage/lock.h| 2 +- src/include/storage/proc.h| 6 +- 14 files changed, 29 insertions(+), 35 deletions(-) diff --git a/src/backend/access/transam/clog.c b/src/backend/access/transam/clog.c index f6e7da7ffc9..7550309c25a 100644 --- a/src/backend/access/transam/clog.c +++ b/src/backend/access/transam/clog.c @@ -458,7 +458,7 @@ TransactionGroupUpdateXidStatus(TransactionId xid, XidStatus status, * less efficiently. */ if (nextidx != INVALID_PGPROCNO && - ProcGlobal->allProcs[nextidx].clogGroupMemberPage != proc->clogGroupMemberPage) + GetPGProcByNumber(nextidx)->clogGroupMemberPage != proc->clogGroupMemberPage) { /* * Ensure that this proc is not a member of any clog group that @@ -473,7 +473,7 @@ TransactionGroupUpdateXidStatus(TransactionId xid, XidStatus status, if (pg_atomic_compare_exchange_u32(&procglobal->clogGroupFirst, &nextidx, - (uint32) proc->pgprocno)) + (uint32) GetNumberFromPGProc(proc))) break; } diff --git a/src/backend/access/transam/twophase.c b/src/backend/access/transam/twophase.c index 8426458f7f5..234c8d08ebc 100644 --- a/src/backend/access/transam/twophase.c +++ b/src/backend/access/transam/twophase.c @@ -284,7 +284,7 @@ TwoPhaseShmemInit(void) TwoPhaseState->freeGXacts = &gxacts[i]; /* associate it with a PGPROC assigned by InitProcGlobal */ - gxacts[i].pgprocno = PreparedXactProcs[i].pgprocno; + gxacts[i].pgprocno = GetNumberFromPG
Re: Refactoring backend fork+exec code
Hi, On 2024-01-23 21:07:08 +0200, Heikki Linnakangas wrote: > On 22/01/2024 23:07, Andres Freund wrote: > > > diff --git a/src/backend/utils/activity/backend_status.c > > > b/src/backend/utils/activity/backend_status.c > > > index 1a1050c8da1..92f24db4e18 100644 > > > --- a/src/backend/utils/activity/backend_status.c > > > +++ b/src/backend/utils/activity/backend_status.c > > > @@ -257,17 +257,16 @@ pgstat_beinit(void) > > > else > > > { > > > /* Must be an auxiliary process */ > > > - Assert(MyAuxProcType != NotAnAuxProcess); > > > + Assert(IsAuxProcess(MyBackendType)); > > > /* > > >* Assign the MyBEEntry for an auxiliary process. > > > Since it doesn't > > >* have a BackendId, the slot is statically allocated > > > based on the > > > - * auxiliary process type (MyAuxProcType). Backends use slots > > > indexed > > > - * in the range from 0 to MaxBackends (exclusive), so we use > > > - * MaxBackends + AuxProcType as the index of the slot for an > > > auxiliary > > > - * process. > > > + * auxiliary process type. Backends use slots indexed in the > > > range > > > + * from 0 to MaxBackends (exclusive), and aux processes use the > > > slots > > > + * after that. > > >*/ > > > - MyBEEntry = &BackendStatusArray[MaxBackends + MyAuxProcType]; > > > + MyBEEntry = &BackendStatusArray[MaxBackends + MyBackendType - > > > FIRST_AUX_PROC]; > > > } > > > > Hm, this seems less than pretty. It's probably ok for now, but it seems > > like a > > better fix might be to just start assigning backend ids to aux procs or > > switch > > to indexing by pgprocno? > > Using pgprocno is a good idea. Come to think of it, why do we even have a > concept of backend ID that's separate from pgprocno? backend ID is used to > index the ProcState array, but AFAICS we could use pgprocno as the index to > that, too. I think we should do that. There are a few processes not participating in sinval, but it doesn't make enough of a difference to make sinval slower. And I think there'd be far bigger efficiency improvements to sinvaladt than not having a handful more entries. Greetings, Andres Freund
Re: Refactoring backend fork+exec code
On 22/01/2024 23:07, Andres Freund wrote: On 2024-01-10 14:35:52 +0200, Heikki Linnakangas wrote: @@ -5344,31 +5344,31 @@ StartChildProcess(AuxProcType type) errno = save_errno; switch (type) { - case StartupProcess: + case B_STARTUP: ereport(LOG, (errmsg("could not fork startup process: %m"))); break; - case ArchiverProcess: + case B_ARCHIVER: ereport(LOG, (errmsg("could not fork archiver process: %m"))); break; - case BgWriterProcess: + case B_BG_WRITER: ereport(LOG, (errmsg("could not fork background writer process: %m"))); break; - case CheckpointerProcess: + case B_CHECKPOINTER: ereport(LOG, (errmsg("could not fork checkpointer process: %m"))); break; - case WalWriterProcess: + case B_WAL_WRITER: ereport(LOG, (errmsg("could not fork WAL writer process: %m"))); break; - case WalReceiverProcess: + case B_WAL_RECEIVER: ereport(LOG, (errmsg("could not fork WAL receiver process: %m"))); break; - case WalSummarizerProcess: + case B_WAL_SUMMARIZER: ereport(LOG, (errmsg("could not fork WAL summarizer process: %m"))); break; Seems we should replace this with something slightly more generic one of these days... The later patches in this thread will turn these into ereport(LOG, (errmsg("could not fork %s process: %m", PostmasterChildName(type; diff --git a/src/backend/utils/activity/backend_status.c b/src/backend/utils/activity/backend_status.c index 1a1050c8da1..92f24db4e18 100644 --- a/src/backend/utils/activity/backend_status.c +++ b/src/backend/utils/activity/backend_status.c @@ -257,17 +257,16 @@ pgstat_beinit(void) else { /* Must be an auxiliary process */ - Assert(MyAuxProcType != NotAnAuxProcess); + Assert(IsAuxProcess(MyBackendType)); /* * Assign the MyBEEntry for an auxiliary process. Since it doesn't * have a BackendId, the slot is statically allocated based on the -* auxiliary process type (MyAuxProcType). Backends use slots indexed -* in the range from 0 to MaxBackends (exclusive), so we use -* MaxBackends + AuxProcType as the index of the slot for an auxiliary -* process. +* auxiliary process type. Backends use slots indexed in the range +* from 0 to MaxBackends (exclusive), and aux processes use the slots +* after that. */ - MyBEEntry = &BackendStatusArray[MaxBackends + MyAuxProcType]; + MyBEEntry = &BackendStatusArray[MaxBackends + MyBackendType - FIRST_AUX_PROC]; } Hm, this seems less than pretty. It's probably ok for now, but it seems like a better fix might be to just start assigning backend ids to aux procs or switch to indexing by pgprocno? Using pgprocno is a good idea. Come to think of it, why do we even have a concept of backend ID that's separate from pgprocno? backend ID is used to index the ProcState array, but AFAICS we could use pgprocno as the index to that, too. -- Heikki Linnakangas Neon (https://neon.tech)
Re: Refactoring backend fork+exec code
Hi, On 2024-01-10 14:35:52 +0200, Heikki Linnakangas wrote: > Here's a patch that gets rid of AuxProcType. It's independent of the other > patches in this thread; if this is committed, I'll rebase the rest of the > patches over this and get rid of the new PMC_* enum. > > Three patches, actually. The first one fixes an existing comment that I > noticed to be incorrect while working on this. I'll push that soon, barring > objections. The second one gets rid of AuxProcType, and the third one > replaces IsBackgroundWorker, IsAutoVacuumLauncherProcess() and > IsAutoVacuumWorkerProcess() with checks on MyBackendType. So MyBackendType > is now the primary way to check what kind of a process the current process > is. > > 'am_walsender' would also be fairly straightforward to remove and replace > with AmWalSenderProcess(). I didn't do that because we also have > am_db_walsender and am_cascading_walsender which cannot be directly replaced > with MyBackendType. Given that, it might be best to keep am_walsender for > symmetry. > @@ -561,13 +561,13 @@ static void ShmemBackendArrayAdd(Backend *bn); > static void ShmemBackendArrayRemove(Backend *bn); > #endif /* EXEC_BACKEND > */ > > -#define StartupDataBase()StartChildProcess(StartupProcess) > -#define StartArchiver() > StartChildProcess(ArchiverProcess) > -#define StartBackgroundWriter() StartChildProcess(BgWriterProcess) > -#define StartCheckpointer() StartChildProcess(CheckpointerProcess) > -#define StartWalWriter() StartChildProcess(WalWriterProcess) > -#define StartWalReceiver() StartChildProcess(WalReceiverProcess) > -#define StartWalSummarizer() StartChildProcess(WalSummarizerProcess) > +#define StartupDataBase()StartChildProcess(B_STARTUP) > +#define StartArchiver() StartChildProcess(B_ARCHIVER) > +#define StartBackgroundWriter() StartChildProcess(B_BG_WRITER) > +#define StartCheckpointer() StartChildProcess(B_CHECKPOINTER) > +#define StartWalWriter() StartChildProcess(B_WAL_WRITER) > +#define StartWalReceiver() StartChildProcess(B_WAL_RECEIVER) > +#define StartWalSummarizer() StartChildProcess(B_WAL_SUMMARIZER) Not for this commit, but we ought to rip out these macros - all they do is to make it harder to understand the code. > @@ -5344,31 +5344,31 @@ StartChildProcess(AuxProcType type) > errno = save_errno; > switch (type) > { > - case StartupProcess: > + case B_STARTUP: > ereport(LOG, > (errmsg("could not fork startup > process: %m"))); > break; > - case ArchiverProcess: > + case B_ARCHIVER: > ereport(LOG, > (errmsg("could not fork > archiver process: %m"))); > break; > - case BgWriterProcess: > + case B_BG_WRITER: > ereport(LOG, > (errmsg("could not fork > background writer process: %m"))); > break; > - case CheckpointerProcess: > + case B_CHECKPOINTER: > ereport(LOG, > (errmsg("could not fork > checkpointer process: %m"))); > break; > - case WalWriterProcess: > + case B_WAL_WRITER: > ereport(LOG, > (errmsg("could not fork WAL > writer process: %m"))); > break; > - case WalReceiverProcess: > + case B_WAL_RECEIVER: > ereport(LOG, > (errmsg("could not fork WAL > receiver process: %m"))); > break; > - case WalSummarizerProcess: > + case B_WAL_SUMMARIZER: > ereport(LOG, > (errmsg("could not fork WAL > summarizer process: %m"))); > break; Seems we should replace this with something slightly more generic one of these days... > diff --git a/src/backend/utils/activity/backend_status.c > b/src/backend/utils/activity/backend_status.c > index 1a1050c8da1..92f24db4e18 100644 > --- a/src/backend/utils/activity/backend_status.c > +++ b/src/backend/utils/activity/backend_status.c > @@ -257,17 +257,16 @@ pgstat_beinit(void) > else > { > /* Must be an auxiliary process */ > - Assert(MyAuxP
Re: Refactoring backend fork+exec code
On 08/12/2023 14:33, Heikki Linnakangas wrote: + [PMC_AV_LAUNCHER] = {"autovacuum launcher", AutoVacLauncherMain, true}, + [PMC_AV_WORKER] = {"autovacuum worker", AutoVacWorkerMain, true}, + [PMC_BGWORKER] = {"bgworker", BackgroundWorkerMain, true}, + [PMC_SYSLOGGER] = {"syslogger", SysLoggerMain, false}, + + [PMC_STARTUP] = {"startup", StartupProcessMain, true}, + [PMC_BGWRITER] = {"bgwriter", BackgroundWriterMain, true}, + [PMC_ARCHIVER] = {"archiver", PgArchiverMain, true}, + [PMC_CHECKPOINTER] = {"checkpointer", CheckpointerMain, true}, + [PMC_WAL_WRITER] = {"wal_writer", WalWriterMain, true}, + [PMC_WAL_RECEIVER] = {"wal_receiver", WalReceiverMain, true}, +}; It feels like we have too many different ways of documenting the type of a process. This new PMC_ stuff, enum AuxProcType, enum BackendType. Agreed. And "am_walsender" and such variables. Here's a patch that gets rid of AuxProcType. It's independent of the other patches in this thread; if this is committed, I'll rebase the rest of the patches over this and get rid of the new PMC_* enum. Three patches, actually. The first one fixes an existing comment that I noticed to be incorrect while working on this. I'll push that soon, barring objections. The second one gets rid of AuxProcType, and the third one replaces IsBackgroundWorker, IsAutoVacuumLauncherProcess() and IsAutoVacuumWorkerProcess() with checks on MyBackendType. So MyBackendType is now the primary way to check what kind of a process the current process is. 'am_walsender' would also be fairly straightforward to remove and replace with AmWalSenderProcess(). I didn't do that because we also have am_db_walsender and am_cascading_walsender which cannot be directly replaced with MyBackendType. Given that, it might be best to keep am_walsender for symmetry. -- Heikki Linnakangas Neon (https://neon.tech) From a0ecc54763252de74907fd0e4aed30fdb753ff86 Mon Sep 17 00:00:00 2001 From: Heikki Linnakangas Date: Wed, 10 Jan 2024 13:46:23 +0200 Subject: [PATCH v6 1/3] Fix incorrect comment on how BackendStatusArray is indexed The comment was copy-pasted from the call to ProcSignalInit() in AuxiliaryProcessMain(), which uses a similar scheme of having reserved slots for aux processes after MaxBackends slots for backends. However, ProcSignalInit() indexing starts from 1, whereas BackendStatusArray starts from 0. The code is correct, but the comment was wrong. Reviewed-by: XXX Discussion: XXX --- src/backend/utils/activity/backend_status.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/src/backend/utils/activity/backend_status.c b/src/backend/utils/activity/backend_status.c index 3dac79c30eb..1a1050c8da1 100644 --- a/src/backend/utils/activity/backend_status.c +++ b/src/backend/utils/activity/backend_status.c @@ -263,9 +263,9 @@ pgstat_beinit(void) * Assign the MyBEEntry for an auxiliary process. Since it doesn't * have a BackendId, the slot is statically allocated based on the * auxiliary process type (MyAuxProcType). Backends use slots indexed - * in the range from 1 to MaxBackends (inclusive), so we use - * MaxBackends + AuxBackendType + 1 as the index of the slot for an - * auxiliary process. + * in the range from 0 to MaxBackends (exclusive), so we use + * MaxBackends + AuxProcType as the index of the slot for an auxiliary + * process. */ MyBEEntry = &BackendStatusArray[MaxBackends + MyAuxProcType]; } -- 2.39.2 From 68b8c662af229f93244b45282f3112d9a5d809b9 Mon Sep 17 00:00:00 2001 From: Heikki Linnakangas Date: Wed, 10 Jan 2024 14:00:09 +0200 Subject: [PATCH v6 2/3] Remove MyAuxProcType, use MyBackendType instead MyAuxProcType was redundant with MyBackendType. Reviewed-by: XXX Discussion: XXX --- src/backend/postmaster/auxprocess.c | 68 +--- src/backend/postmaster/postmaster.c | 36 +-- src/backend/utils/activity/backend_status.c | 11 ++-- src/include/miscadmin.h | 71 ++--- src/include/postmaster/auxprocess.h | 2 +- src/tools/pgindent/typedefs.list| 1 - 6 files changed, 74 insertions(+), 115 deletions(-) diff --git a/src/backend/postmaster/auxprocess.c b/src/backend/postmaster/auxprocess.c index ab86e802f21..bd3815b63d0 100644 --- a/src/backend/postmaster/auxprocess.c +++ b/src/backend/postmaster/auxprocess.c @@ -38,14 +38,6 @@ static void ShutdownAuxiliaryProcess(int code, Datum arg); -/* - * global variables - * - */ - -AuxProcType MyAuxProcType = NotAnAuxProcess; /* declared in miscadmin.h */ - - /* * AuxiliaryProcessMain * @@ -55,39 +47,13 @@ AuxProcType MyAuxProcType = NotAnAuxProcess; /* declared in miscadmin.h */ * This code is here just because of historical reasons. */ void -AuxiliaryProcessMain(AuxProcType auxtype) +AuxiliaryProcessMain(BackendType auxtype) { Assert(IsUnderPostmaster
Re: Refactoring backend fork+exec code
On 02/12/2023 05:00, Alexander Lakhin wrote: What bothered me additionally, is an error detected after server start. I couldn't see it without the patches applied. I mean, on HEAD I now see `make check` passing, but with the patches it fails: ... # parallel group (20 tests): interval date numerology polygon box macaddr8 macaddr multirangetypes line timestamp timetz timestamptz time circle strings lseg inet md5 path point not ok 22 + strings 1048 ms # (test process exited with exit code 2) not ok 23 + md5 1052 ms # (test process exited with exit code 2) ... src/test/regress/log/postmaster.log contains: ==00:00:00:30.730 1713480== Syscall param write(buf) points to uninitialised byte(s) ==00:00:00:30.730 1713480== at 0x5245A37: write (write.c:26) ==00:00:00:30.730 1713480== by 0x51BBF6C: _IO_file_write@@GLIBC_2.2.5 (fileops.c:1180) ==00:00:00:30.730 1713480== by 0x51BC84F: new_do_write (fileops.c:448) ==00:00:00:30.730 1713480== by 0x51BC84F: _IO_new_file_xsputn (fileops.c:1254) ==00:00:00:30.730 1713480== by 0x51BC84F: _IO_file_xsputn@@GLIBC_2.2.5 (fileops.c:1196) ==00:00:00:30.730 1713480== by 0x51B1056: fwrite (iofwrite.c:39) ==00:00:00:30.730 1713480== by 0x5540CF: internal_forkexec (postmaster.c:4526) ==00:00:00:30.730 1713480== by 0x5543C0: bgworker_forkexec (postmaster.c:5624) ==00:00:00:30.730 1713480== by 0x555477: do_start_bgworker (postmaster.c:5665) ==00:00:00:30.730 1713480== by 0x555738: maybe_start_bgworkers (postmaster.c:5928) ==00:00:00:30.730 1713480== by 0x556072: process_pm_pmsignal (postmaster.c:5080) ==00:00:00:30.730 1713480== by 0x556610: ServerLoop (postmaster.c:1761) ==00:00:00:30.730 1713480== by 0x557BE2: PostmasterMain (postmaster.c:1469) ==00:00:00:30.730 1713480== by 0x47216B: main (main.c:198) ==00:00:00:30.730 1713480== Address 0x1ffeffd8c0 is on thread 1's stack ==00:00:00:30.730 1713480== in frame #4, created by internal_forkexec (postmaster.c:4482) ==00:00:00:30.730 1713480== ... Ack, I see this too. I fixed it by adding MCXT_ALLOC_ZERO to the allocation of the BackendWorker struct. That's a little heavy-handed, like with the previous failures the uninitialized padding bytes are written to the file and read back, but not accessed after that. But it seems like the simplest fix. This isn't performance critical after all. I also renamed the 'has_worker' field to 'has_bgworker', per Tristan's suggestion. Pushed with those changes. Thanks for the reviews! 2023-12-02 05:14:30.751 MSK client backend[1713740] pg_regress/rangetypes FATAL: terminating connection due to unexpected postmaster exit 2023-12-02 05:14:31.033 MSK client backend[1713734] pg_regress/numeric FATAL: postmaster exited during a parallel transaction TRAP: failed Assert("!IsTransactionOrTransactionBlock()"), File: "pgstat.c", Line: 591, PID: 1713734 I haven't looked deeper yet, but it seems that we see two issues here (and Assert is not directly caused by the patches set.) I have not been able to reproduce this one. -- Heikki Linnakangas Neon (https://neon.tech)
Re: Refactoring backend fork+exec code
Hello Heikki, 01.12.2023 23:44, Heikki Linnakangas wrote: With memset(param, 0, sizeof(*param)); added at the beginning of save_backend_variables(), server starts successfully, but then `make check` fails with other Valgrind error. That's actually a pre-existing issue, I'm seeing that even on unpatched 'master'. Thank you for fixing that! Yes, I had discovered it before, but yesterday I decided to check whether your patches improve the situation... What bothered me additionally, is an error detected after server start. I couldn't see it without the patches applied. I mean, on HEAD I now see `make check` passing, but with the patches it fails: ... # parallel group (20 tests): interval date numerology polygon box macaddr8 macaddr multirangetypes line timestamp timetz timestamptz time circle strings lseg inet md5 path point not ok 22 + strings 1048 ms # (test process exited with exit code 2) not ok 23 + md5 1052 ms # (test process exited with exit code 2) ... src/test/regress/log/postmaster.log contains: ==00:00:00:30.730 1713480== Syscall param write(buf) points to uninitialised byte(s) ==00:00:00:30.730 1713480== at 0x5245A37: write (write.c:26) ==00:00:00:30.730 1713480== by 0x51BBF6C: _IO_file_write@@GLIBC_2.2.5 (fileops.c:1180) ==00:00:00:30.730 1713480== by 0x51BC84F: new_do_write (fileops.c:448) ==00:00:00:30.730 1713480== by 0x51BC84F: _IO_new_file_xsputn (fileops.c:1254) ==00:00:00:30.730 1713480== by 0x51BC84F: _IO_file_xsputn@@GLIBC_2.2.5 (fileops.c:1196) ==00:00:00:30.730 1713480== by 0x51B1056: fwrite (iofwrite.c:39) ==00:00:00:30.730 1713480== by 0x5540CF: internal_forkexec (postmaster.c:4526) ==00:00:00:30.730 1713480== by 0x5543C0: bgworker_forkexec (postmaster.c:5624) ==00:00:00:30.730 1713480== by 0x555477: do_start_bgworker (postmaster.c:5665) ==00:00:00:30.730 1713480== by 0x555738: maybe_start_bgworkers (postmaster.c:5928) ==00:00:00:30.730 1713480== by 0x556072: process_pm_pmsignal (postmaster.c:5080) ==00:00:00:30.730 1713480== by 0x556610: ServerLoop (postmaster.c:1761) ==00:00:00:30.730 1713480== by 0x557BE2: PostmasterMain (postmaster.c:1469) ==00:00:00:30.730 1713480== by 0x47216B: main (main.c:198) ==00:00:00:30.730 1713480== Address 0x1ffeffd8c0 is on thread 1's stack ==00:00:00:30.730 1713480== in frame #4, created by internal_forkexec (postmaster.c:4482) ==00:00:00:30.730 1713480== ... 2023-12-02 05:14:30.751 MSK client backend[1713740] pg_regress/rangetypes FATAL: terminating connection due to unexpected postmaster exit 2023-12-02 05:14:31.033 MSK client backend[1713734] pg_regress/numeric FATAL: postmaster exited during a parallel transaction TRAP: failed Assert("!IsTransactionOrTransactionBlock()"), File: "pgstat.c", Line: 591, PID: 1713734 I haven't looked deeper yet, but it seems that we see two issues here (and Assert is not directly caused by the patches set.) Best regards, Alexander
Re: Refactoring backend fork+exec code
"Tristan Partin" writes: > On Fri Dec 1, 2023 at 2:44 PM CST, Heikki Linnakangas wrote: >> Committed a fix with memset(). I'm not sure what our policy with >> backpatching this kind of issues is. This goes back to all supported >> versions, but given the lack of complaints, I chose to not backpatch. > Seems like a harmless think to backpatch. It is conceivable that someone > might want to run Valgrind on something other than HEAD. FWIW, I agree with Heikki's conclusion. EXEC_BACKEND on non-Windows is already a niche developer-only setup, and given the lack of complaints, nobody's that interested in running Valgrind with it. Fixing it on HEAD seems like plenty. regards, tom lane
Re: Refactoring backend fork+exec code
On Fri Dec 1, 2023 at 2:44 PM CST, Heikki Linnakangas wrote: On 01/12/2023 16:00, Alexander Lakhin wrote: > Maybe you could look at issues with running `make check` under Valgrind > when server built with CPPFLAGS="-DUSE_VALGRIND -DEXEC_BACKEND": > # +++ regress check in src/test/regress +++ > # initializing database system by copying initdb template > # postmaster failed, examine ".../src/test/regress/log/postmaster.log" for the reason > Bail out!make[1]: *** > > ... > 2023-12-01 16:48:39.136 MSK postmaster[1307988] LOG: listening on Unix socket "/tmp/pg_regress-pPFNk0/.s.PGSQL.55312" > ==00:00:00:01.692 1259396== Syscall param write(buf) points to uninitialised byte(s) > ==00:00:00:01.692 1259396== at 0x5245A37: write (write.c:26) > ==00:00:00:01.692 1259396== by 0x51BBF6C: _IO_file_write@@GLIBC_2.2.5 (fileops.c:1180) > ==00:00:00:01.692 1259396== by 0x51BC84F: new_do_write (fileops.c:448) > ==00:00:00:01.692 1259396== by 0x51BC84F: _IO_new_file_xsputn (fileops.c:1254) > ==00:00:00:01.692 1259396== by 0x51BC84F: _IO_file_xsputn@@GLIBC_2.2.5 (fileops.c:1196) > ==00:00:00:01.692 1259396== by 0x51B1056: fwrite (iofwrite.c:39) > ==00:00:00:01.692 1259396== by 0x552E21: internal_forkexec (postmaster.c:4518) > ==00:00:00:01.692 1259396== by 0x5546A1: postmaster_forkexec (postmaster.c:) > ==00:00:00:01.692 1259396== by 0x55471C: StartChildProcess (postmaster.c:5275) > ==00:00:00:01.692 1259396== by 0x557B61: PostmasterMain (postmaster.c:1454) > ==00:00:00:01.692 1259396== by 0x472136: main (main.c:198) > ==00:00:00:01.692 1259396== Address 0x1ffeffdc11 is on thread 1's stack > ==00:00:00:01.692 1259396== in frame #4, created by internal_forkexec (postmaster.c:4482) > ==00:00:00:01.692 1259396== > > With memset(param, 0, sizeof(*param)); added at the beginning of > save_backend_variables(), server starts successfully, but then > `make check` fails with other Valgrind error. That's actually a pre-existing issue, I'm seeing that even on unpatched 'master'. In a nutshell, the problem is that the uninitialized padding bytes in BackendParameters are written to the file. When we read the file back, we don't access the padding bytes, so that's harmless. But Valgrind doesn't know that. On Windows, the file is created with CreateFileMapping(INVALID_HANDLE_VALUE, ...) and we write the variables directly to the mapping. If I understand the Windows API docs correctly, it is guaranteed to be initialized to zeros, so we don't have this problem on Windows, only on other platforms with EXEC_BACKEND. I think it makes sense to clear the memory on other platforms too, since that's what we do on Windows. Committed a fix with memset(). I'm not sure what our policy with backpatching this kind of issues is. This goes back to all supported versions, but given the lack of complaints, I chose to not backpatch. Seems like a harmless think to backpatch. It is conceivable that someone might want to run Valgrind on something other than HEAD. -- Tristan Partin Neon (https://neon.tech)
Re: Refactoring backend fork+exec code
On 01/12/2023 16:00, Alexander Lakhin wrote: Maybe you could look at issues with running `make check` under Valgrind when server built with CPPFLAGS="-DUSE_VALGRIND -DEXEC_BACKEND": # +++ regress check in src/test/regress +++ # initializing database system by copying initdb template # postmaster failed, examine ".../src/test/regress/log/postmaster.log" for the reason Bail out!make[1]: *** ... 2023-12-01 16:48:39.136 MSK postmaster[1307988] LOG: listening on Unix socket "/tmp/pg_regress-pPFNk0/.s.PGSQL.55312" ==00:00:00:01.692 1259396== Syscall param write(buf) points to uninitialised byte(s) ==00:00:00:01.692 1259396== at 0x5245A37: write (write.c:26) ==00:00:00:01.692 1259396== by 0x51BBF6C: _IO_file_write@@GLIBC_2.2.5 (fileops.c:1180) ==00:00:00:01.692 1259396== by 0x51BC84F: new_do_write (fileops.c:448) ==00:00:00:01.692 1259396== by 0x51BC84F: _IO_new_file_xsputn (fileops.c:1254) ==00:00:00:01.692 1259396== by 0x51BC84F: _IO_file_xsputn@@GLIBC_2.2.5 (fileops.c:1196) ==00:00:00:01.692 1259396== by 0x51B1056: fwrite (iofwrite.c:39) ==00:00:00:01.692 1259396== by 0x552E21: internal_forkexec (postmaster.c:4518) ==00:00:00:01.692 1259396== by 0x5546A1: postmaster_forkexec (postmaster.c:) ==00:00:00:01.692 1259396== by 0x55471C: StartChildProcess (postmaster.c:5275) ==00:00:00:01.692 1259396== by 0x557B61: PostmasterMain (postmaster.c:1454) ==00:00:00:01.692 1259396== by 0x472136: main (main.c:198) ==00:00:00:01.692 1259396== Address 0x1ffeffdc11 is on thread 1's stack ==00:00:00:01.692 1259396== in frame #4, created by internal_forkexec (postmaster.c:4482) ==00:00:00:01.692 1259396== With memset(param, 0, sizeof(*param)); added at the beginning of save_backend_variables(), server starts successfully, but then `make check` fails with other Valgrind error. That's actually a pre-existing issue, I'm seeing that even on unpatched 'master'. In a nutshell, the problem is that the uninitialized padding bytes in BackendParameters are written to the file. When we read the file back, we don't access the padding bytes, so that's harmless. But Valgrind doesn't know that. On Windows, the file is created with CreateFileMapping(INVALID_HANDLE_VALUE, ...) and we write the variables directly to the mapping. If I understand the Windows API docs correctly, it is guaranteed to be initialized to zeros, so we don't have this problem on Windows, only on other platforms with EXEC_BACKEND. I think it makes sense to clear the memory on other platforms too, since that's what we do on Windows. Committed a fix with memset(). I'm not sure what our policy with backpatching this kind of issues is. This goes back to all supported versions, but given the lack of complaints, I chose to not backpatch. -- Heikki Linnakangas Neon (https://neon.tech)
Re: Refactoring backend fork+exec code
On Fri Dec 1, 2023 at 6:10 AM CST, Heikki Linnakangas wrote: On 30/11/2023 20:44, Tristan Partin wrote: > Patches 1-3 seem committable as-is. Thanks for the review! I'm focusing on patches 1-3 now, and will come back to the rest after committing 1-3. There was one test failure with EXEC_BACKEND from patch 2, in 'test_shm_mq'. In restore_backend_variables() I checked if 'bgw_name' is empty to decide if the BackgroundWorker struct is filled in or not, but it turns out that 'test_shm_mq' doesn't fill in bgw_name. It probably should, I think that's an oversight in 'test_shm_mq', but that's a separate issue. I did some more refactoring of patch 2, to fix that and to improve it in general. The BackgroundWorker struct is now passed through the fork-related functions similarly to the Port struct. That seems more consistent. Attached is new version of these patches. For easier review, I made the new refactorings compared in a new commit 0003. I will squash that before pushing, but this makes it easier to see what changed. Barring any new feedback or issues, I will commit these. My only thought is that perhaps has_bg_worker is a better name than has_worker, but I agree that having a flag is better than checking bgw_name. -- Tristan Partin Neon (https://neon.tech)
Re: Refactoring backend fork+exec code
Hello Heikki, 01.12.2023 15:10, Heikki Linnakangas wrote: Attached is new version of these patches. For easier review, I made the new refactorings compared in a new commit 0003. I will squash that before pushing, but this makes it easier to see what changed. Barring any new feedback or issues, I will commit these. Maybe you could look at issues with running `make check` under Valgrind when server built with CPPFLAGS="-DUSE_VALGRIND -DEXEC_BACKEND": # +++ regress check in src/test/regress +++ # initializing database system by copying initdb template # postmaster failed, examine ".../src/test/regress/log/postmaster.log" for the reason Bail out!make[1]: *** ... 2023-12-01 16:48:39.136 MSK postmaster[1307988] LOG: listening on Unix socket "/tmp/pg_regress-pPFNk0/.s.PGSQL.55312" ==00:00:00:01.692 1259396== Syscall param write(buf) points to uninitialised byte(s) ==00:00:00:01.692 1259396== at 0x5245A37: write (write.c:26) ==00:00:00:01.692 1259396== by 0x51BBF6C: _IO_file_write@@GLIBC_2.2.5 (fileops.c:1180) ==00:00:00:01.692 1259396== by 0x51BC84F: new_do_write (fileops.c:448) ==00:00:00:01.692 1259396== by 0x51BC84F: _IO_new_file_xsputn (fileops.c:1254) ==00:00:00:01.692 1259396== by 0x51BC84F: _IO_file_xsputn@@GLIBC_2.2.5 (fileops.c:1196) ==00:00:00:01.692 1259396== by 0x51B1056: fwrite (iofwrite.c:39) ==00:00:00:01.692 1259396== by 0x552E21: internal_forkexec (postmaster.c:4518) ==00:00:00:01.692 1259396== by 0x5546A1: postmaster_forkexec (postmaster.c:) ==00:00:00:01.692 1259396== by 0x55471C: StartChildProcess (postmaster.c:5275) ==00:00:00:01.692 1259396== by 0x557B61: PostmasterMain (postmaster.c:1454) ==00:00:00:01.692 1259396== by 0x472136: main (main.c:198) ==00:00:00:01.692 1259396== Address 0x1ffeffdc11 is on thread 1's stack ==00:00:00:01.692 1259396== in frame #4, created by internal_forkexec (postmaster.c:4482) ==00:00:00:01.692 1259396== With memset(param, 0, sizeof(*param)); added at the beginning of save_backend_variables(), server starts successfully, but then `make check` fails with other Valgrind error. Best regards, Alexander
Re: Refactoring backend fork+exec code
On 30/11/2023 20:44, Tristan Partin wrote: Patches 1-3 seem committable as-is. Thanks for the review! I'm focusing on patches 1-3 now, and will come back to the rest after committing 1-3. There was one test failure with EXEC_BACKEND from patch 2, in 'test_shm_mq'. In restore_backend_variables() I checked if 'bgw_name' is empty to decide if the BackgroundWorker struct is filled in or not, but it turns out that 'test_shm_mq' doesn't fill in bgw_name. It probably should, I think that's an oversight in 'test_shm_mq', but that's a separate issue. I did some more refactoring of patch 2, to fix that and to improve it in general. The BackgroundWorker struct is now passed through the fork-related functions similarly to the Port struct. That seems more consistent. Attached is new version of these patches. For easier review, I made the new refactorings compared in a new commit 0003. I will squash that before pushing, but this makes it easier to see what changed. Barring any new feedback or issues, I will commit these. -- Heikki Linnakangas Neon (https://neon.tech) From 6dd44d7057e535eb441c9ab8fda1bbdd8079f2fb Mon Sep 17 00:00:00 2001 From: Heikki Linnakangas Date: Thu, 30 Nov 2023 00:01:25 +0200 Subject: [PATCH v4 1/4] Refactor CreateSharedMemoryAndSemaphores For clarity, have separate functions for *creating* the shared memory and semaphores at postmaster or single-user backend startup, and for *attaching* to existing shared memory structures in EXEC_BACKEND case. CreateSharedMemoryAndSemaphores() is now called only at postmaster startup, and a new AttachSharedMemoryStructs() function is called at backend startup in EXEC_BACKEND mode. Reviewed-by: Tristan Partin, Andres Freund Discussion: https://www.postgresql.org/message-id/7a59b073-5b5b-151e-7ed3-8b01ff7ce...@iki.fi --- src/backend/postmaster/postmaster.c | 20 +-- src/backend/replication/walreceiver.c | 3 +- src/backend/storage/ipc/ipci.c| 178 +++--- src/backend/storage/lmgr/proc.c | 2 +- src/include/storage/ipc.h | 3 + 5 files changed, 117 insertions(+), 89 deletions(-) diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c index 7a5cd06c5c9..b03e88e6702 100644 --- a/src/backend/postmaster/postmaster.c +++ b/src/backend/postmaster/postmaster.c @@ -4909,11 +4909,11 @@ SubPostmasterMain(int argc, char *argv[]) /* Restore basic shared memory pointers */ InitShmemAccess(UsedShmemSegAddr); - /* Need a PGPROC to run CreateSharedMemoryAndSemaphores */ + /* Need a PGPROC to run AttachSharedMemoryStructs */ InitProcess(); /* Attach process to shared data structures */ - CreateSharedMemoryAndSemaphores(); + AttachSharedMemoryStructs(); /* And run the backend */ BackendRun(&port); /* does not return */ @@ -4927,11 +4927,11 @@ SubPostmasterMain(int argc, char *argv[]) /* Restore basic shared memory pointers */ InitShmemAccess(UsedShmemSegAddr); - /* Need a PGPROC to run CreateSharedMemoryAndSemaphores */ + /* Need a PGPROC to run AttachSharedMemoryStructs */ InitAuxiliaryProcess(); /* Attach process to shared data structures */ - CreateSharedMemoryAndSemaphores(); + AttachSharedMemoryStructs(); auxtype = atoi(argv[3]); AuxiliaryProcessMain(auxtype); /* does not return */ @@ -4941,11 +4941,11 @@ SubPostmasterMain(int argc, char *argv[]) /* Restore basic shared memory pointers */ InitShmemAccess(UsedShmemSegAddr); - /* Need a PGPROC to run CreateSharedMemoryAndSemaphores */ + /* Need a PGPROC to run AttachSharedMemoryStructs */ InitProcess(); /* Attach process to shared data structures */ - CreateSharedMemoryAndSemaphores(); + AttachSharedMemoryStructs(); AutoVacLauncherMain(argc - 2, argv + 2); /* does not return */ } @@ -4954,11 +4954,11 @@ SubPostmasterMain(int argc, char *argv[]) /* Restore basic shared memory pointers */ InitShmemAccess(UsedShmemSegAddr); - /* Need a PGPROC to run CreateSharedMemoryAndSemaphores */ + /* Need a PGPROC to run AttachSharedMemoryStructs */ InitProcess(); /* Attach process to shared data structures */ - CreateSharedMemoryAndSemaphores(); + AttachSharedMemoryStructs(); AutoVacWorkerMain(argc - 2, argv + 2); /* does not return */ } @@ -4972,11 +4972,11 @@ SubPostmasterMain(int argc, char *argv[]) /* Restore basic shared memory pointers */ InitShmemAccess(UsedShmemSegAddr); - /* Need a PGPROC to run CreateSharedMemoryAndSemaphores */ + /* Need a PGPROC to run AttachSharedMemoryStructs */ InitProcess(); /* Attach process to shared data structures */ - CreateSharedMemoryAndSemaphores(); + AttachSharedMemoryStructs(); /* Fetch MyBgworkerEntry from shared memory */ shmem_slot = atoi(argv[1] + 15); diff --git a/src/backend/replication/walreceiver.c b/src/backend/replication/walreceiver.c index 2398167f495..26ded928a71 100644 --- a/src/backend/replication/walreceiver.c +++ b/src/backend/replicati
Re: Refactoring backend fork+exec code
Hi, On 2023-12-01 01:36:13 +0200, Heikki Linnakangas wrote: > On 30/11/2023 22:26, Andres Freund wrote: > > On 2023-11-30 01:36:25 +0200, Heikki Linnakangas wrote: > > > From a96b6e92fdeaa947bf32774c425419b8f987b8e2 Mon Sep 17 00:00:00 2001 > > > From: Heikki Linnakangas > > > Date: Thu, 30 Nov 2023 00:01:25 +0200 > > > Subject: [PATCH v3 1/7] Refactor CreateSharedMemoryAndSemaphores > > > > > > For clarity, have separate functions for *creating* the shared memory > > > and semaphores at postmaster or single-user backend startup, and > > > for *attaching* to existing shared memory structures in EXEC_BACKEND > > > case. CreateSharedMemoryAndSemaphores() is now called only at > > > postmaster startup, and a new AttachSharedMemoryStructs() function is > > > called at backend startup in EXEC_BACKEND mode. > > > > I assume CreateSharedMemoryAndSemaphores() is still called during crash > > restart? > > Yes. > > > I wonder if it shouldn't split three ways: > > 1) create > > 2) initialize > > 3) attach > > Why? What would be the difference between create and initialize phases? Mainly because I somehow mis-remembered how we deal with the shared memory allocation when crashing. I somehow had remembered that we reused the same allocation across restarts, but reinitialized it from scratch. There's a kernel of truth to that, because we can end up re-attaching to an existing sysv shared memory segment. But not more. Perhaps I was confusing things with the listen sockets? Andres
Re: Refactoring backend fork+exec code
On 30/11/2023 22:26, Andres Freund wrote: On 2023-11-30 01:36:25 +0200, Heikki Linnakangas wrote: From a96b6e92fdeaa947bf32774c425419b8f987b8e2 Mon Sep 17 00:00:00 2001 From: Heikki Linnakangas Date: Thu, 30 Nov 2023 00:01:25 +0200 Subject: [PATCH v3 1/7] Refactor CreateSharedMemoryAndSemaphores For clarity, have separate functions for *creating* the shared memory and semaphores at postmaster or single-user backend startup, and for *attaching* to existing shared memory structures in EXEC_BACKEND case. CreateSharedMemoryAndSemaphores() is now called only at postmaster startup, and a new AttachSharedMemoryStructs() function is called at backend startup in EXEC_BACKEND mode. I assume CreateSharedMemoryAndSemaphores() is still called during crash restart? Yes. I wonder if it shouldn't split three ways: 1) create 2) initialize 3) attach Why? What would be the difference between create and initialize phases? -- Heikki Linnakangas Neon (https://neon.tech)
Re: Refactoring backend fork+exec code
On 30/11/2023 22:26, Andres Freund wrote: Aside: Somewhat odd that InitAuxiliaryProcess() doesn't call InitLWLockAccess(). Yeah that caught my eye too. It seems to have been an oversight in commit 1c6821be31f. Before that, in 9.4, the lwlock stats were printed for aux processes too, on shutdown. Committed a fix for that to master. -- Heikki Linnakangas Neon (https://neon.tech)
Re: Refactoring backend fork+exec code
On Fri, Dec 1, 2023 at 9:31 AM Andres Freund wrote: > On 2023-11-30 12:44:33 -0600, Tristan Partin wrote: > > > +/* > > > + * Set reference point for stack-depth checking. This might > > > seem > > > + * redundant in !EXEC_BACKEND builds; but it's not because the > > > postmaster > > > + * launches its children from signal handlers, so we might be > > > running on > > > + * an alternative stack. XXX still true? > > > + */ > > > +(void) set_stack_base(); > > > > Looks like there is still this XXX left. Can't say I completely understand > > the second sentence either. > > We used to start some child processes of postmaster in signal handlers. That > was fixed in > > commit 7389aad6366 > Author: Thomas Munro > Date: 2023-01-12 12:34:23 +1300 > > Use WaitEventSet API for postmaster's event loop. > > > In some cases signal handlers run on a separate stack, which meant that the > set_stack_base() we did in postmaster would yield a completely bogus stack > depth estimation. So this comment should likely have been removed. Thomas? Right, I should delete that comment in master and 16. While wondering what to write instead, my first thought is that it is better to leave the actual call there though, because otherwise there is a small difference in stack reference point between EXEC_BACKEND and !EXEC_BACKEND builds, consumed by a few postmaster stack frames. So the new comment would just say that. (I did idly wonder if there is a longjmp trick to zap those frames post-fork, not looked into and probably doesn't really improve anything we care about...)
Re: Refactoring backend fork+exec code
Hi, On 2023-11-30 12:44:33 -0600, Tristan Partin wrote: > > +/* > > + * Set reference point for stack-depth checking. This might seem > > + * redundant in !EXEC_BACKEND builds; but it's not because the > > postmaster > > + * launches its children from signal handlers, so we might be > > running on > > + * an alternative stack. XXX still true? > > + */ > > +(void) set_stack_base(); > > Looks like there is still this XXX left. Can't say I completely understand > the second sentence either. We used to start some child processes of postmaster in signal handlers. That was fixed in commit 7389aad6366 Author: Thomas Munro Date: 2023-01-12 12:34:23 +1300 Use WaitEventSet API for postmaster's event loop. In some cases signal handlers run on a separate stack, which meant that the set_stack_base() we did in postmaster would yield a completely bogus stack depth estimation. So this comment should likely have been removed. Thomas? Greetings, Andres Freund
Re: Refactoring backend fork+exec code
Hi, On 2023-11-30 01:36:25 +0200, Heikki Linnakangas wrote: > - patch 1 splits CreateSharedMemoryAndSemaphores into two functions: > CreateSharedMemoryAndSemaphores is now only called at postmaster startup, > and a new function called AttachSharedMemoryStructs() is called in backends > in EXEC_BACKEND mode. I extracted the common parts of those functions to a > new static function. (Some of this refactoring used to be part of the 3rd > patch in the series, but it seems useful on its own, so I split it out.) I like that idea. > From a96b6e92fdeaa947bf32774c425419b8f987b8e2 Mon Sep 17 00:00:00 2001 > From: Heikki Linnakangas > Date: Thu, 30 Nov 2023 00:01:25 +0200 > Subject: [PATCH v3 1/7] Refactor CreateSharedMemoryAndSemaphores > > For clarity, have separate functions for *creating* the shared memory > and semaphores at postmaster or single-user backend startup, and > for *attaching* to existing shared memory structures in EXEC_BACKEND > case. CreateSharedMemoryAndSemaphores() is now called only at > postmaster startup, and a new AttachSharedMemoryStructs() function is > called at backend startup in EXEC_BACKEND mode. I assume CreateSharedMemoryAndSemaphores() is still called during crash restart? I wonder if it shouldn't split three ways: 1) create 2) initialize 3) attach > From 3478cafcf74a5c8d649e0287e6c72669a29c0e70 Mon Sep 17 00:00:00 2001 > From: Heikki Linnakangas > Date: Thu, 30 Nov 2023 00:02:03 +0200 > Subject: [PATCH v3 2/7] Pass BackgroundWorker entry in the parameter file in > EXEC_BACKEND mode > > This makes it possible to move InitProcess later in SubPostmasterMain > (in next commit), as we no longer need to access shared memory to read > background worker entry. > static void read_backend_variables(char *id, Port *port); > @@ -4831,7 +4833,7 @@ SubPostmasterMain(int argc, char *argv[]) > strcmp(argv[1], "--forkavlauncher") == 0 || > strcmp(argv[1], "--forkavworker") == 0 || > strcmp(argv[1], "--forkaux") == 0 || > - strncmp(argv[1], "--forkbgworker=", 15) == 0) > + strncmp(argv[1], "--forkbgworker", 14) == 0) > PGSharedMemoryReAttach(); > else > PGSharedMemoryNoReAttach(); > @@ -4962,10 +4964,8 @@ SubPostmasterMain(int argc, char *argv[]) > > AutoVacWorkerMain(argc - 2, argv + 2); /* does not return */ > } > - if (strncmp(argv[1], "--forkbgworker=", 15) == 0) > + if (strncmp(argv[1], "--forkbgworker", 14) == 0) Now that we don't need to look at parameters anymore, these should probably be just a strcmp(), like the other cases? > From 0d071474e12a70ff8113c7b0731c5b97fec45007 Mon Sep 17 00:00:00 2001 > From: Heikki Linnakangas > Date: Wed, 29 Nov 2023 23:47:25 +0200 > Subject: [PATCH v3 3/7] Refactor how InitProcess is called > > The order of process initialization steps is now more consistent > between !EXEC_BACKEND and EXEC_BACKEND modes. InitProcess() is called > at the same place in either mode. We can now also move the > AttachSharedMemoryStructs() call into InitProcess() itself. This > reduces the number of "#ifdef EXEC_BACKEND" blocks. Yay. > diff --git a/src/backend/storage/lmgr/proc.c b/src/backend/storage/lmgr/proc.c > index cdfdd6fbe1d..6c708777dde 100644 > --- a/src/backend/storage/lmgr/proc.c > +++ b/src/backend/storage/lmgr/proc.c > @@ -461,6 +461,12 @@ InitProcess(void) >*/ > InitLWLockAccess(); > InitDeadLockChecking(); > + > +#ifdef EXEC_BACKEND > + /* Attach process to shared data structures */ > + if (IsUnderPostmaster) > + AttachSharedMemoryStructs(); > +#endif > } > > /* > @@ -614,6 +620,12 @@ InitAuxiliaryProcess(void) >* Arrange to clean up at process exit. >*/ > on_shmem_exit(AuxiliaryProcKill, Int32GetDatum(proctype)); > + > +#ifdef EXEC_BACKEND > + /* Attach process to shared data structures */ > + if (IsUnderPostmaster) > + AttachSharedMemoryStructs(); > +#endif > } Aside: Somewhat odd that InitAuxiliaryProcess() doesn't call InitLWLockAccess(). I think a short comment explaining why we can attach to shmem structs after already accessing shared memory earlier in the function would be worthwhile. > From ce51876f87f1e4317e25baf64184749448fcd033 Mon Sep 17 00:00:00 2001 > From: Heikki Linnakangas > Date: Thu, 30 Nov 2023 00:07:34 +0200 > Subject: [PATCH v3 7/7] Refactor postmaster child process launching > > - Move code related to launching backend processes to new source file, > launch_backend.c > > - Introduce new postmaster_child_launch() function that deals with the > differences between EXEC_BACKEND and fork mode. > > - Refactor the mechanism of passing information from the parent to > child process. Instead of using different command-line arguments when > launching the child process in EXEC_BACKEND mode, pass a > variable-length blob of data along with all the global variables. The > contents of that blob depend o
Re: Refactoring backend fork+exec code
On Wed Nov 29, 2023 at 5:36 PM CST, Heikki Linnakangas wrote: On 11/10/2023 14:12, Heikki Linnakangas wrote: Here's another rebased patch set. Compared to previous version, I did a little more refactoring around CreateSharedMemoryAndSemaphores and InitProcess: - patch 1 splits CreateSharedMemoryAndSemaphores into two functions: CreateSharedMemoryAndSemaphores is now only called at postmaster startup, and a new function called AttachSharedMemoryStructs() is called in backends in EXEC_BACKEND mode. I extracted the common parts of those functions to a new static function. (Some of this refactoring used to be part of the 3rd patch in the series, but it seems useful on its own, so I split it out.) - patch 3 moves the call to AttachSharedMemoryStructs() to InitProcess(), reducing the boilerplate code a little. The patches are incrementally useful, so if you don't have time to review all of them, a review on a subset would be useful too. From 8886db1ed6bae21bf6d77c9bb1230edbb55e24f9 Mon Sep 17 00:00:00 2001 From: Heikki Linnakangas Date: Thu, 30 Nov 2023 00:04:22 +0200 Subject: [PATCH v3 4/7] Pass CAC as argument to backend process For me, being new to the code, it would be nice to have more of an explanation as to why this is "better." I don't doubt it; it would just help me and future readers of this commit in the future. More of an explanation in the commit message would suffice. My other comment on this commit is that we now seem to have lost the context on what CAC stands for. Before we had the member variable to explain it. A comment on the enum would be great or changing cac named variables to canAcceptConnections. I did notice in patch 7 that there are still some variables named canAcceptConnections around, so I'll leave this comment up to you. From 98f8397b32a0b36e221475b32697c9c5bbca86a0 Mon Sep 17 00:00:00 2001 From: Heikki Linnakangas Date: Wed, 11 Oct 2023 13:38:06 +0300 Subject: [PATCH v3 5/7] Remove ConnCreate and ConnFree, and allocate Port in stack I like it separate. From 79aab42705a8cb0e16e61c33052fc56fdd4fca76 Mon Sep 17 00:00:00 2001 From: Heikki Linnakangas Date: Wed, 11 Oct 2023 13:38:10 +0300 Subject: [PATCH v3 6/7] Introduce ClientSocket, rename some funcs +static intBackendStartup(ClientSocket *port); s/port/client_sock -port->remote_hostname = strdup(remote_host); +port->remote_hostname = pstrdup(remote_host); +MemoryContextSwitchTo(oldcontext); Something funky with the whitespace here, but my eyes might also be playing tricks on me. Mixing spaces in tabs like what do in this codebase makes it difficult to review :). From ce51876f87f1e4317e25baf64184749448fcd033 Mon Sep 17 00:00:00 2001 From: Heikki Linnakangas Date: Thu, 30 Nov 2023 00:07:34 +0200 Subject: [PATCH v3 7/7] Refactor postmaster child process launching +entry_kinds[child_type].main_fn(startup_data, startup_data_len); +Assert(false); Seems like you want the pg_unreachable() macro here instead of Assert(false). Similar comment at the end of SubPostmasterMain(). +if (fwrite(param, paramsz, 1, fp) != 1) +{ +ereport(LOG, +(errcode_for_file_access(), + errmsg("could not write to file \"%s\": %m", tmpfilename))); +FreeFile(fp); +return -1; +} + +/* Release file */ +if (FreeFile(fp)) +{ +ereport(LOG, +(errcode_for_file_access(), + errmsg("could not write to file \"%s\": %m", tmpfilename))); +return -1; +} Two pieces of feedback here. I generally find write(2) more useful than fwrite(3) because write(2) will report a useful errno, whereas fwrite(2) just uses ferror(3). The additional errno information might be valuable context in the log message. Up to you if you think it is also valuable. The log message if FreeFile() fails doesn't seem to make sense to me. I didn't see any file writing in that code path, but it is possible that I missed something. +/* + * Set reference point for stack-depth checking. This might seem + * redundant in !EXEC_BACKEND builds; but it's not because the postmaster + * launches its children from signal handlers, so we might be running on + * an alternative stack. XXX still true? + */ +(void) set_stack_base(); Looks like there is still this XXX left. Can't say I completely understand the second sentence either. +/* + * make sure stderr is in binary mode before anything can possibly be + * written to it, in case it's actually the syslogger pipe, so the pipe + * chunking protocol isn't disturbed. Non-logpipe data gets translated on +
Re: Use FD_CLOEXEC on ListenSockets (was Re: Refactoring backend fork+exec code)
On Fri, Oct 06, 2023 at 10:27:22AM +0300, Heikki Linnakangas wrote: > I did a quick test with syslogger enabled before committing, but didn't > notice the segfault. I missed it because syslogger gets restarted and then > it worked. Thanks, Heikki. -- Michael signature.asc Description: PGP signature
Re: Use FD_CLOEXEC on ListenSockets (was Re: Refactoring backend fork+exec code)
On 06/10/2023 09:50, Michael Paquier wrote: On Fri, Oct 06, 2023 at 02:30:16PM +0900, Michael Paquier wrote: Okay, the backtrace is not that useful. I'll see if I can get something better, still it seems like this has broken the way the syslogger closes these ports. And here you go: Program terminated with signal SIGSEGV, Segmentation fault. #0 GetMemoryChunkMethodID (pointer=0x0) at mcxt.c:196 196 header = *((const uint64 *) ((const char *) pointer - sizeof(uint64))); (gdb) bt #0 GetMemoryChunkMethodID (pointer=0x0) at mcxt.c:196 #1 0x557d04176d59 in pfree (pointer=0x0) at mcxt.c:1463 #2 0x557d03e8eab3 in ClosePostmasterPorts (am_syslogger=true) at postmaster.c:2571 #3 0x557d03e93ac2 in SysLogger_Start () at syslogger.c:686 #4 0x557d03e8c5b7 in PostmasterMain (argc=3, argv=0x557d0471ed00) at postmaster.c:1148 #5 0x557d03d48e34 in main (argc=3, argv=0x557d0471ed00) at main.c:198 (gdb) up 2 #2 0x557d03e8eab3 in ClosePostmasterPorts (am_syslogger=true) at postmaster.c:2571 2571 pfree(ListenSockets); (gdb) p ListenSockets $1 = (pgsocket *) 0x0 Fixed, thanks! I did a quick test with syslogger enabled before committing, but didn't notice the segfault. I missed it because syslogger gets restarted and then it worked. -- Heikki Linnakangas Neon (https://neon.tech)
Re: Use FD_CLOEXEC on ListenSockets (was Re: Refactoring backend fork+exec code)
On Fri, Oct 06, 2023 at 02:30:16PM +0900, Michael Paquier wrote: > Okay, the backtrace is not that useful. I'll see if I can get > something better, still it seems like this has broken the way the > syslogger closes these ports. And here you go: Program terminated with signal SIGSEGV, Segmentation fault. #0 GetMemoryChunkMethodID (pointer=0x0) at mcxt.c:196 196 header = *((const uint64 *) ((const char *) pointer - sizeof(uint64))); (gdb) bt #0 GetMemoryChunkMethodID (pointer=0x0) at mcxt.c:196 #1 0x557d04176d59 in pfree (pointer=0x0) at mcxt.c:1463 #2 0x557d03e8eab3 in ClosePostmasterPorts (am_syslogger=true) at postmaster.c:2571 #3 0x557d03e93ac2 in SysLogger_Start () at syslogger.c:686 #4 0x557d03e8c5b7 in PostmasterMain (argc=3, argv=0x557d0471ed00) at postmaster.c:1148 #5 0x557d03d48e34 in main (argc=3, argv=0x557d0471ed00) at main.c:198 (gdb) up 2 #2 0x557d03e8eab3 in ClosePostmasterPorts (am_syslogger=true) at postmaster.c:2571 2571 pfree(ListenSockets); (gdb) p ListenSockets $1 = (pgsocket *) 0x0 -- Michael signature.asc Description: PGP signature
Re: Use FD_CLOEXEC on ListenSockets (was Re: Refactoring backend fork+exec code)
On Thu, Oct 05, 2023 at 03:08:37PM +0300, Heikki Linnakangas wrote: > This seems pretty uncontroversial, and I heard no objections, so I went > ahead and committed that. It looks like e29c4643951 is causing issues here. While doing benchmarking on a cluster compiled with -O2, I got a crash: LOG: system logger process (PID 27924) was terminated by signal 11: Segmentation fault Program terminated with signal SIGSEGV, Segmentation fault. #0 0x55ef3b9aed20 in pfree () (gdb) bt #0 0x55ef3b9aed20 in pfree () #1 0x55ef3b7e0e41 in ClosePostmasterPorts () #2 0x55ef3b7e6649 in SysLogger_Start () #3 0x55ef3b7e4413 in PostmasterMain () Okay, the backtrace is not that useful. I'll see if I can get something better, still it seems like this has broken the way the syslogger closes these ports. -- Michael signature.asc Description: PGP signature
Re: Use FD_CLOEXEC on ListenSockets (was Re: Refactoring backend fork+exec code)
On 29/08/2023 09:58, Heikki Linnakangas wrote: On 29/08/2023 09:21, Heikki Linnakangas wrote: Thinking about this some more, the ListenSockets array is a bit silly anyway. We fill the array starting from index 0, always append to the end, and never remove entries from it. It would seem more straightforward to keep track of the used size of the array. Currently we always loop through the unused parts too, and e.g. ConfigurePostmasterWaitSet() needs to walk the array to count how many elements are in use. Like this. This seems pretty uncontroversial, and I heard no objections, so I went ahead and committed that. -- Heikki Linnakangas Neon (https://neon.tech)
Re: Use FD_CLOEXEC on ListenSockets (was Re: Refactoring backend fork+exec code)
On 29/08/2023 09:21, Heikki Linnakangas wrote: Thinking about this some more, the ListenSockets array is a bit silly anyway. We fill the array starting from index 0, always append to the end, and never remove entries from it. It would seem more straightforward to keep track of the used size of the array. Currently we always loop through the unused parts too, and e.g. ConfigurePostmasterWaitSet() needs to walk the array to count how many elements are in use. Like this. -- Heikki Linnakangas Neon (https://neon.tech) From 796280f07dd5dbf50897c9895715ab5e2dbf187b Mon Sep 17 00:00:00 2001 From: Heikki Linnakangas Date: Tue, 29 Aug 2023 09:53:08 +0300 Subject: [PATCH 1/1] Refactor ListenSocket array. Keep track of the used size of the array. That avoids looping through the whole array in a few places. It doesn't matter from a performance point of view since the array is small anyway, but this feels less surprising and is a little less code. Now that we have an explicit NumListenSockets variable that is statically initialized to 0, we don't need the loop to initialize the array. Allocate the array in PostmasterContext. The array isn't needed in child processes, so this allows reusing that memory. We could easily make the array resizable now, but we haven't heard any complaints about the current 64 sockets limit. --- src/backend/libpq/pqcomm.c | 18 +++ src/backend/postmaster/postmaster.c | 76 +++-- src/include/libpq/libpq.h | 2 +- 3 files changed, 36 insertions(+), 60 deletions(-) diff --git a/src/backend/libpq/pqcomm.c b/src/backend/libpq/pqcomm.c index 8d217b0645..48ae7704fb 100644 --- a/src/backend/libpq/pqcomm.c +++ b/src/backend/libpq/pqcomm.c @@ -311,8 +311,9 @@ socket_close(int code, Datum arg) * specified. For TCP ports, hostName is either NULL for all interfaces or * the interface to listen on, and unixSocketDir is ignored (can be NULL). * - * Successfully opened sockets are added to the ListenSocket[] array (of - * length MaxListen), at the first position that isn't PGINVALID_SOCKET. + * Successfully opened sockets are appended to the ListenSockets[] array. The + * current size of the array is *NumListenSockets, it is updated to reflect + * the opened sockets. MaxListen is the allocated size of the array. * * RETURNS: STATUS_OK or STATUS_ERROR */ @@ -320,7 +321,7 @@ socket_close(int code, Datum arg) int StreamServerPort(int family, const char *hostName, unsigned short portNumber, const char *unixSocketDir, - pgsocket ListenSocket[], int MaxListen) + pgsocket ListenSockets[], int *NumListenSockets, int MaxListen) { pgsocket fd; int err; @@ -335,7 +336,6 @@ StreamServerPort(int family, const char *hostName, unsigned short portNumber, struct addrinfo *addrs = NULL, *addr; struct addrinfo hint; - int listen_index = 0; int added = 0; char unixSocketPath[MAXPGPATH]; #if !defined(WIN32) || defined(IPV6_V6ONLY) @@ -401,12 +401,7 @@ StreamServerPort(int family, const char *hostName, unsigned short portNumber, } /* See if there is still room to add 1 more socket. */ - for (; listen_index < MaxListen; listen_index++) - { - if (ListenSocket[listen_index] == PGINVALID_SOCKET) -break; - } - if (listen_index >= MaxListen) + if (*NumListenSockets == MaxListen) { ereport(LOG, (errmsg("could not bind to all requested addresses: MAXLISTEN (%d) exceeded", @@ -573,7 +568,8 @@ StreamServerPort(int family, const char *hostName, unsigned short portNumber, (errmsg("listening on %s address \"%s\", port %d", familyDesc, addrDesc, (int) portNumber))); - ListenSocket[listen_index] = fd; + ListenSockets[*NumListenSockets] = fd; + (*NumListenSockets)++; added++; } diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c index d7bfb28ff3..2659329b82 100644 --- a/src/backend/postmaster/postmaster.c +++ b/src/backend/postmaster/postmaster.c @@ -226,7 +226,8 @@ int ReservedConnections; /* The socket(s) we're listening to. */ #define MAXLISTEN 64 -static pgsocket ListenSocket[MAXLISTEN]; +static int NumListenSockets = 0; +static pgsocket *ListenSockets = NULL; /* still more option variables */ bool EnableSSL = false; @@ -587,7 +588,6 @@ PostmasterMain(int argc, char *argv[]) int status; char *userDoption = NULL; bool listen_addr_saved = false; - int i; char *output_config_variable = NULL; InitProcessGlobals(); @@ -1141,17 +1141,6 @@ PostmasterMain(int argc, char *argv[]) errmsg("could not remove file \"%s\": %m", LOG_METAINFO_DATAFILE))); - /* - * Initialize input sockets. - * - * Mark them all closed, and set up an on_proc_exit function that's - * charged with closing the sockets again at postmaster shutdown. - */ - for (i = 0; i < MAXLISTEN; i++) - ListenSocket[i] = PGINVALID_SOCKET; - - on_proc_exit(CloseServerPorts, 0); - /* * If enabled, start up syslogger collection subp
Re: Use FD_CLOEXEC on ListenSockets (was Re: Refactoring backend fork+exec code)
On 29/08/2023 01:28, Michael Paquier wrote: In case you've not noticed: https://www.postgresql.org/message-id/zovvuqe0rdj2s...@paquier.xyz But it does not really matter now ;) Ah sorry, missed that thread. Yes, so it seems. Syslogger is started before the ListenSockets array is initialized, so its still all zeros. When syslogger is forked, the child process tries to close all the listen sockets, which are all zeros. So syslogger calls close(0) MAXLISTEN (64) times. Attached patch moves the array initialization earlier. Yep, I've reached the same conclusion. Wouldn't it be cleaner to move the callback registration of CloseServerPorts() closer to the array initialization, though? Ok, pushed that way. I checked the history of this: it goes back to commit 9a86f03b4e in version 13. The SysLogger_Start() call used to be later, after setting p ListenSockets, but that commit moved it. So I backpatched this to v13. The first close(0) actually does have an effect: it closes stdin, which is fd 0. That is surely accidental, but I wonder if we should indeed close stdin in child processes? The postmaster process doesn't do anything with stdin either, although I guess a library might try to read a passphrase from stdin before starting up, for example. We would have heard about that, wouldn't we? I may be missing something of course, but on HEAD, the array initialization is done before starting any child processes, and the syslogger is the first one forked. Yes, syslogger is the only process that closes stdin. After moving the initialization, it doesn't close it either. Thinking about this some more, the ListenSockets array is a bit silly anyway. We fill the array starting from index 0, always append to the end, and never remove entries from it. It would seem more straightforward to keep track of the used size of the array. Currently we always loop through the unused parts too, and e.g. ConfigurePostmasterWaitSet() needs to walk the array to count how many elements are in use. -- Heikki Linnakangas Neon (https://neon.tech)
Re: Use FD_CLOEXEC on ListenSockets (was Re: Refactoring backend fork+exec code)
On Mon, Aug 28, 2023 at 11:52:15PM +0300, Heikki Linnakangas wrote: > On 28/08/2023 18:55, Jeff Janes wrote: >> Since this commit, I'm getting a lot (63 per restart) of messages: >> >> LOG: could not close client or listen socket: Bad file descriptor >> All I have to do to get the message is turn logging_collector = on and >> restart. >> >> The close failure condition existed before the commit, it just wasn't >> logged before. So, did the extra logging added here just uncover a >> pre-existing bug? In case you've not noticed: https://www.postgresql.org/message-id/zovvuqe0rdj2s...@paquier.xyz But it does not really matter now ;) > Yes, so it seems. Syslogger is started before the ListenSockets array is > initialized, so its still all zeros. When syslogger is forked, the child > process tries to close all the listen sockets, which are all zeros. So > syslogger calls close(0) MAXLISTEN (64) times. Attached patch moves the > array initialization earlier. Yep, I've reached the same conclusion. Wouldn't it be cleaner to move the callback registration of CloseServerPorts() closer to the array initialization, though? > The first close(0) actually does have an effect: it closes stdin, which is > fd 0. That is surely accidental, but I wonder if we should indeed close > stdin in child processes? The postmaster process doesn't do anything with > stdin either, although I guess a library might try to read a passphrase from > stdin before starting up, for example. We would have heard about that, wouldn't we? I may be missing something of course, but on HEAD, the array initialization is done before starting any child processes, and the syslogger is the first one forked. -- Michael signature.asc Description: PGP signature
Re: Use FD_CLOEXEC on ListenSockets (was Re: Refactoring backend fork+exec code)
On 28/08/2023 18:55, Jeff Janes wrote: Since this commit, I'm getting a lot (63 per restart) of messages: LOG: could not close client or listen socket: Bad file descriptor All I have to do to get the message is turn logging_collector = on and restart. The close failure condition existed before the commit, it just wasn't logged before. So, did the extra logging added here just uncover a pre-existing bug? Yes, so it seems. Syslogger is started before the ListenSockets array is initialized, so its still all zeros. When syslogger is forked, the child process tries to close all the listen sockets, which are all zeros. So syslogger calls close(0) MAXLISTEN (64) times. Attached patch moves the array initialization earlier. The first close(0) actually does have an effect: it closes stdin, which is fd 0. That is surely accidental, but I wonder if we should indeed close stdin in child processes? The postmaster process doesn't do anything with stdin either, although I guess a library might try to read a passphrase from stdin before starting up, for example. -- Heikki Linnakangas Neon (https://neon.tech) diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c index 41bccb46a87..6b9e10bffa0 100644 --- a/src/backend/postmaster/postmaster.c +++ b/src/backend/postmaster/postmaster.c @@ -596,6 +596,9 @@ PostmasterMain(int argc, char *argv[]) IsPostmasterEnvironment = true; + for (i = 0; i < MAXLISTEN; i++) + ListenSocket[i] = PGINVALID_SOCKET; + /* * Start our win32 signal implementation */ @@ -1176,12 +1179,9 @@ PostmasterMain(int argc, char *argv[]) /* * Establish input sockets. * - * First, mark them all closed, and set up an on_proc_exit function that's - * charged with closing the sockets again at postmaster shutdown. + * Set up an on_proc_exit function that's charged with closing the sockets + * again at postmaster shutdown. */ - for (i = 0; i < MAXLISTEN; i++) - ListenSocket[i] = PGINVALID_SOCKET; - on_proc_exit(CloseServerPorts, 0); if (ListenAddresses)
Re: Use FD_CLOEXEC on ListenSockets (was Re: Refactoring backend fork+exec code)
On Thu, Aug 24, 2023 at 10:05 AM Heikki Linnakangas wrote: > On 24/08/2023 15:48, Thomas Munro wrote: > > LGTM. I vaguely recall thinking that it might be better to keep > > EXEC_BACKEND and !EXEC_BACKEND working the same which might be why I > > didn't try this one, but it looks fine with the comment to explain, as > > you have it. (It's a shame we can't use O_CLOFORK.) > > Yeah, O_CLOFORK would be nice.. > > Committed, thanks! > > Since this commit, I'm getting a lot (63 per restart) of messages: LOG: could not close client or listen socket: Bad file descriptor All I have to do to get the message is turn logging_collector = on and restart. The close failure condition existed before the commit, it just wasn't logged before. So, did the extra logging added here just uncover a pre-existing bug? The LOG message is sent to the terminal, not to the log file. Cheers, Jeff
Re: Use FD_CLOEXEC on ListenSockets (was Re: Refactoring backend fork+exec code)
On 24/08/2023 15:48, Thomas Munro wrote: LGTM. I vaguely recall thinking that it might be better to keep EXEC_BACKEND and !EXEC_BACKEND working the same which might be why I didn't try this one, but it looks fine with the comment to explain, as you have it. (It's a shame we can't use O_CLOFORK.) Yeah, O_CLOFORK would be nice.. Committed, thanks! -- Heikki Linnakangas Neon (https://neon.tech)
Re: Use FD_CLOEXEC on ListenSockets (was Re: Refactoring backend fork+exec code)
On Thu, Aug 24, 2023 at 11:41 PM Heikki Linnakangas wrote: > On 11/07/2023 01:50, Andres Freund wrote: > >> From: Heikki Linnakangas > >> Date: Mon, 12 Jun 2023 16:33:20 +0300 > >> Subject: [PATCH 4/9] Use FD_CLOEXEC on ListenSockets > >> > >> We went through some effort to close them in the child process. Better to > >> not hand them down to the child process in the first place. > > > > I think Thomas has a larger version of this patch: > > https://postgr.es/m/CA%2BhUKGKPNFcfBQduqof4-7C%3DavjcSfdkKBGvQoRuAvfocnvY0A%40mail.gmail.com > > Hmm, no, that's a little different. Thomas added the FD_CLOEXEC option > to the *accepted* socket in commit 1da569ca1f. That was part of that > thread. This patch adds the option to the *listen* sockets. That was not > discussed in that thread, but it's certainly in the same vein. > > Thomas: What do you think of the attached? LGTM. I vaguely recall thinking that it might be better to keep EXEC_BACKEND and !EXEC_BACKEND working the same which might be why I didn't try this one, but it looks fine with the comment to explain, as you have it. (It's a shame we can't use O_CLOFORK.) There was some question in the other thread about whether doing that to the server socket might affect accepted sockets too on some OS, but I can at least confirm that your patch works fine on FreeBSD in an EXEC_BACKEND build. I think there were some historical disagreements about which socket properties were inherited, but not that.
Use FD_CLOEXEC on ListenSockets (was Re: Refactoring backend fork+exec code)
Focusing on this one patch in this series: On 11/07/2023 01:50, Andres Freund wrote: From: Heikki Linnakangas Date: Mon, 12 Jun 2023 16:33:20 +0300 Subject: [PATCH 4/9] Use FD_CLOEXEC on ListenSockets We went through some effort to close them in the child process. Better to not hand them down to the child process in the first place. I think Thomas has a larger version of this patch: https://postgr.es/m/CA%2BhUKGKPNFcfBQduqof4-7C%3DavjcSfdkKBGvQoRuAvfocnvY0A%40mail.gmail.com Hmm, no, that's a little different. Thomas added the FD_CLOEXEC option to the *accepted* socket in commit 1da569ca1f. That was part of that thread. This patch adds the option to the *listen* sockets. That was not discussed in that thread, but it's certainly in the same vein. Thomas: What do you think of the attached? On 11/07/2023 00:07, Tristan Partin wrote: @@ -831,7 +834,8 @@ StreamConnection(pgsocket server_fd, Port *port) void StreamClose(pgsocket sock) { - closesocket(sock); + if (closesocket(sock) != 0) + elog(LOG, "closesocket failed: %m"); } /* Do you think WARNING would be a more appropriate log level? No, WARNING is for messages that you expect the client to receive. This failure is unexpected at the system level, the message is for the administrator. The distinction isn't always very clear, but LOG seems more appropriate in this case. -- Heikki Linnakangas Neon (https://neon.tech) From 95bd11720c8e88a1f65b9d600f50d71662241ba9 Mon Sep 17 00:00:00 2001 From: Heikki Linnakangas Date: Thu, 24 Aug 2023 13:53:33 +0300 Subject: [PATCH v2 1/1] Use FD_CLOEXEC on ListenSockets It's good hygiene if e.g. an extension launches a subprogram when being loaded. And we went through some effort to close them in the child process in EXEC_BACKEND mode; better to not hand them down to the child process in the first place. We still need to close them after fork when !EXEC_BACKEND, but it's a little simpler. In the passing, LOG a message if closing the client connection or listen socket fails. Shouldn't happen, but if it does, would be nice to know. Reviewed-by: Tristan Partin, Andres Freund Discussion: https://www.postgresql.org/message-id/7a59b073-5b5b-151e-7ed3-8b01ff7ce...@iki.fi --- src/backend/libpq/pqcomm.c | 6 +- src/backend/postmaster/postmaster.c | 14 ++ 2 files changed, 11 insertions(+), 9 deletions(-) diff --git a/src/backend/libpq/pqcomm.c b/src/backend/libpq/pqcomm.c index 1bdcbe4f636..8d217b06455 100644 --- a/src/backend/libpq/pqcomm.c +++ b/src/backend/libpq/pqcomm.c @@ -458,6 +458,9 @@ StreamServerPort(int family, const char *hostName, unsigned short portNumber, } #ifndef WIN32 + /* Don't give the listen socket to any subprograms we execute. */ + if (fcntl(fd, F_SETFD, FD_CLOEXEC) < 0) + elog(FATAL, "fcntl(F_SETFD) failed on socket: %m"); /* * Without the SO_REUSEADDR flag, a new postmaster can't be started @@ -831,7 +834,8 @@ StreamConnection(pgsocket server_fd, Port *port) void StreamClose(pgsocket sock) { - closesocket(sock); + if (closesocket(sock) != 0) + elog(LOG, "could not close client or listen socket: %m"); } /* diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c index 07d376d77ec..41bccb46a87 100644 --- a/src/backend/postmaster/postmaster.c +++ b/src/backend/postmaster/postmaster.c @@ -496,7 +496,6 @@ typedef struct Port port; InheritableSocket portsocket; char DataDir[MAXPGPATH]; - pgsocket ListenSocket[MAXLISTEN]; int32 MyCancelKey; int MyPMChildSlot; #ifndef WIN32 @@ -2545,8 +2544,6 @@ ConnFree(Port *port) void ClosePostmasterPorts(bool am_syslogger) { - int i; - /* Release resources held by the postmaster's WaitEventSet. */ if (pm_wait_set) { @@ -2573,8 +2570,12 @@ ClosePostmasterPorts(bool am_syslogger) /* * Close the postmaster's listen sockets. These aren't tracked by fd.c, * so we don't call ReleaseExternalFD() here. + * + * The listen sockets are marked as FD_CLOEXEC, so this isn't needed in + * EXEC_BACKEND mode. */ - for (i = 0; i < MAXLISTEN; i++) +#ifndef EXEC_BACKEND + for (int i = 0; i < MAXLISTEN; i++) { if (ListenSocket[i] != PGINVALID_SOCKET) { @@ -2582,6 +2583,7 @@ ClosePostmasterPorts(bool am_syslogger) ListenSocket[i] = PGINVALID_SOCKET; } } +#endif /* * If using syslogger, close the read side of the pipe. We don't bother @@ -6038,8 +6040,6 @@ save_backend_variables(BackendParameters *param, Port *port, strlcpy(param->DataDir, DataDir, MAXPGPATH); - memcpy(¶m->ListenSocket, &ListenSocket, sizeof(ListenSocket)); - param->MyCancelKey = MyCancelKey; param->MyPMChildSlot = MyPMChildSlot; @@ -6271,8 +6271,6 @@ restore_backend_variables(BackendParameters *param, Port *port) SetDataDir(param->DataDir); - memcpy(&ListenSocket, ¶m->ListenSocket, sizeof(ListenSocket)); - MyCancelKey = param->MyCancelKey; MyPMChildSlot = param->MyPMChildSlot; -- 2.39.2
Re: Refactoring backend fork+exec code
Hi, On 2023-06-18 14:22:33 +0300, Heikki Linnakangas wrote: > I started to look at the code in postmaster.c related to launching child > processes. I tried to reduce the difference between EXEC_BACKEND and > !EXEC_BACKEND code paths, and put the code that needs to differ behind a > better abstraction. I started doing this to help with implementing > multi-threading, but it doesn't introduce anything thread-related yet and I > think this improves readability anyway. Yes please! This code is absolutely awful. > From 0cb6f8d665980d30a5d2a29013000744f16bf813 Mon Sep 17 00:00:00 2001 > From: Heikki Linnakangas > Date: Sun, 18 Jun 2023 11:00:21 +0300 > Subject: [PATCH 3/9] Refactor CreateSharedMemoryAndSemaphores. > > Moves InitProcess calls a little later in EXEC_BACKEND case. What's the reason for this part? ISTM that we'd really want to get away from plastering duplicated InitProcess() etc everywhere. I think this might be easier to understand if you just changed did the CreateSharedMemoryAndSemaphores() -> AttachSharedMemoryAndSemaphores() piece in this commit, and the rest later. > +void > +AttachSharedMemoryAndSemaphores(void) > +{ > + /* InitProcess must've been called already */ Perhaps worth an assertion to make it easier to see that the order is wrong? > From 1d89eec53c7fefa7a4a8c011c9f19e3df64dc436 Mon Sep 17 00:00:00 2001 > From: Heikki Linnakangas > Date: Mon, 12 Jun 2023 16:33:20 +0300 > Subject: [PATCH 4/9] Use FD_CLOEXEC on ListenSockets > > We went through some effort to close them in the child process. Better to > not hand them down to the child process in the first place. I think Thomas has a larger version of this patch: https://postgr.es/m/CA%2BhUKGKPNFcfBQduqof4-7C%3DavjcSfdkKBGvQoRuAvfocnvY0A%40mail.gmail.com > From 65384b9a6cfb3b9b589041526216e0f64d64bea5 Mon Sep 17 00:00:00 2001 > From: Heikki Linnakangas > Date: Sun, 18 Jun 2023 13:56:44 +0300 > Subject: [PATCH 8/9] Introduce ClientSocket, rename some funcs > > - Move more of the work on a client socket to the child process. > > - Reduce the amount of data that needs to be passed from postmaster to > child. (Used to pass a full Port struct, although most of the fields were > empty. Now we pass the much slimmer ClientSocket.) I think there might be extensions accessing Port. Not sure if it's worth worrying about, but ... > --- a/src/backend/postmaster/autovacuum.c > +++ b/src/backend/postmaster/autovacuum.c > @@ -476,8 +476,8 @@ AutoVacLauncherMain(int argc, char *argv[]) > pqsignal(SIGCHLD, SIG_DFL); > > /* > - * Create a per-backend PGPROC struct in shared memory. We must do > - * this before we can use LWLocks. > + * Create a per-backend PGPROC struct in shared memory. We must do this > + * before we can use LWLocks. >*/ > InitProcess(); > Don't think this was intentional? > From b33cfeb28a5419045acb659a01410b2b463bea3e Mon Sep 17 00:00:00 2001 > From: Heikki Linnakangas > Date: Sun, 18 Jun 2023 13:59:48 +0300 > Subject: [PATCH 9/9] Refactor postmaster child process launching > > - Move code related to launching backend processes to new source file, > process_start.c I think you might have renamed this to launch_backend.c? > - Introduce new postmaster_child_launch() function that deals with the > differences between EXEC_BACKEND and fork mode. > > - Refactor the mechanism of passing informaton from the parent to > child process. Instead of using different command-line arguments > when launching the child process in EXEC_BACKEND mode, pass a > variable-length blob of data along with all the global > variables. The contents of that blob depends on the kind of child > process being launched. In !EXEC_BACKEND mode, we use the same blob, > but it's simply inherited from the parent to child process. > +constPMChildEntry entry_kinds[] = { > + {"backend", BackendMain, true}, > + > + {"autovacuum launcher", AutoVacLauncherMain, true}, > + {"autovacuum worker", AutoVacWorkerMain, true}, > + {"bgworker", BackgroundWorkerMain, true}, > + {"syslogger", SysLoggerMain, false}, > + > + {"startup", StartupProcessMain, true}, > + {"bgwriter", BackgroundWriterMain, true}, > + {"archiver", PgArchiverMain, true}, > + {"checkpointer", CheckpointerMain, true}, > + {"wal_writer", WalWriterMain, true}, > + {"wal_receiver", WalReceiverMain, true}, > +}; I'd assign them with the PostmasterChildType as index, so there's no danger of getting out of order. constPMChildEntry entry_kinds = { [PMC_AV_LAUNCHER] = {"autovacuum launcher", AutoVacLauncherMain, true}, ... } or such should work. I'd also use designated initializers for the fields, it's otherwise hard to know what true means etc. I think it might be good to put more into array. If we e.g. knew whether a particular child type is a backend-like, and aux process or syslogger, we could avoid the duplicated InitAuxiliary
Re: Refactoring backend fork+exec code
> From 1d89eec53c7fefa7a4a8c011c9f19e3df64dc436 Mon Sep 17 00:00:00 2001 > From: Heikki Linnakangas > Date: Mon, 12 Jun 2023 16:33:20 +0300 > Subject: [PATCH 4/9] Use FD_CLOEXEC on ListenSockets > @@ -831,7 +834,8 @@ StreamConnection(pgsocket server_fd, Port *port) > void > StreamClose(pgsocket sock) > { > - closesocket(sock); > + if (closesocket(sock) != 0) > + elog(LOG, "closesocket failed: %m"); > } > > /* Do you think WARNING would be a more appropriate log level? > From 2f518be9e96cfed1a1a49b4af8f7cb4a837aa784 Mon Sep 17 00:00:00 2001 > From: Heikki Linnakangas > Date: Mon, 12 Jun 2023 18:07:54 +0300 > Subject: [PATCH 5/9] Move "too many clients already" et al. checks from > ProcessStartupPacket. This seems like a change you could push already (assuming another maintainer agrees with you), which makes reviews for this patchset even easier. > From c25b67c045018a2bf05e6ff53819d26e561fc83f Mon Sep 17 00:00:00 2001 > From: Heikki Linnakangas > Date: Sun, 18 Jun 2023 14:11:16 +0300 > Subject: [PATCH 6/9] Pass CAC as argument to backend process. Could you expand a bit more on why it is better to pass it as a separate argument? Does it not fit well conceptually in struct Port? > @@ -4498,15 +4510,19 @@ postmaster_forkexec(int argc, char *argv[]) > * returns the pid of the fork/exec'd process, or -1 on failure > */ > static pid_t > -backend_forkexec(Port *port) > +backend_forkexec(Port *port, CAC_state cac) > { > - char *av[4]; > + char *av[5]; > int ac = 0; > + charcacbuf[10]; > > av[ac++] = "postgres"; > av[ac++] = "--forkbackend"; > av[ac++] = NULL;/* filled in by > internal_forkexec */ > > + snprintf(cacbuf, sizeof(cacbuf), "%d", (int) cac); > + av[ac++] = cacbuf; Might be worth a sanity check that there wasn't any truncation into cacbuf, which is an impossibility as the code is written, but still useful for catching a future developer error. Is it worth adding a command line option at all instead of having the naked positional argument? It would help anybody who might read the command line what the seemingly random integer stands for. > @@ -4910,7 +4926,10 @@ SubPostmasterMain(int argc, char *argv[]) > /* Run backend or appropriate child */ > if (strcmp(argv[1], "--forkbackend") == 0) > { > - Assert(argc == 3); /* shouldn't be any more args > */ > + CAC_state cac; > + > + Assert(argc == 4); > + cac = (CAC_state) atoi(argv[3]); Perhaps an assert or full error checking that atoi succeeds would be useful for similar reasons to my previous comment. > From 658cba5cdb2e5c45faff84566906d2fcaa8a3674 Mon Sep 17 00:00:00 2001 > From: Heikki Linnakangas > Date: Mon, 12 Jun 2023 18:03:03 +0300 > Subject: [PATCH 7/9] Remove ConnCreate and ConnFree, and allocate Port in > stack. Again, seems like another patch that could be pushed separately assuming others don't have any comments. > From 65384b9a6cfb3b9b589041526216e0f64d64bea5 Mon Sep 17 00:00:00 2001 > From: Heikki Linnakangas > Date: Sun, 18 Jun 2023 13:56:44 +0300 > Subject: [PATCH 8/9] Introduce ClientSocket, rename some funcs > @@ -1499,7 +1499,7 @@ CloseServerPorts(int status, Datum arg) > { > if (ListenSocket[i] != PGINVALID_SOCKET) > { > - StreamClose(ListenSocket[i]); > + closesocket(ListenSocket[i]); > ListenSocket[i] = PGINVALID_SOCKET; > } > } I see you have been adding log messages in the case of closesocket() failing. Do you think it is worth doing here as well? One strange part about this patch is that in patch 4, you edit StreamClose() to emit a log message in the case of closesocket() failure, but then this patch just completely removes it. > @@ -4407,11 +4420,11 @@ BackendInitialize(Port *port, CAC_state cac) > * Doesn't return at all. > */ > static void > -BackendRun(Port *port) > +BackendRun(void) > { > /* > -* Create a per-backend PGPROC struct in shared memory. We must do > -* this before we can use LWLocks (in > AttachSharedMemoryAndSemaphores). > +* Create a per-backend PGPROC struct in shared memory. We must do > this > +* before we can use LWLocks (in AttachSharedMemoryAndSemaphores). > */ > InitProcess(); This comment reflow probably fits better in the patch that added the AttachSharedMemoryAndSemaphores function. > From b33cfeb28a5419045acb659a01410b2b463bea3e Mon Sep 17 00:00:00 2001 > From: Heikki Linnakangas > Date: Sun, 18 Jun 2023 13:59:48 +0300 > Subject: [PATCH 9/9] Refactor postmaster child process launching > - Move code related to launching backend processes to new source file, > process_start.c Since this seems pretty self-co