Re: [HACKERS] logical replication launcher crash on buildfarm
On Sat, Apr 15, 2017 at 12:01 AM, Tom Lane wrote: > Maybe we need to rethink the division of labor between parallel.c > and execParallel.c, but that would depend on somebody explaining > what the difference is in the first place. parallel.c handles general concerns related to any sort of parallelism, particularly those related to transaction management, hence it's in src/backend/access/transam. execParallel.c handles concerns that are specific to the executor, hence it's in src/backend/executor. The header comment for execParallel.c is fairly informative on this point; now that I look at it, parallel.c has only a stub comment, which really ought to be expanded. Note that while parallel *query* uses both, a parallel utility command will likely use only the facilities in parallel.c, so the distinction is not academic. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] logical replication launcher crash on buildfarm
Petr Jelinek writes: > On 15/04/17 06:01, Tom Lane wrote: >> I've had more than enough of seeing buildfarm failures from culicidae, >> so I whacked this around until I was happy with it and pushed it. >> Further adjustments welcome of course. > Thanks. Seems like culicidae is finally happy with master. Well, only sort of happy, but I see how we can fix that. >> I don't really understand why 9.6 needs a significantly different >> patch. AFAICS, it simply does not work (with any reliability) >> for a loadable module to call CreateParallelContext directly in 9.6. > I think the problem is that if somebody was using CreateParallelContext > it may have worked on unix when just normally forking if the extension > was loaded via shared_preload_libraries. And lot of extensions are linux > only. So we might break working setup for somebody if we change the > function signature. Ah, I didn't think of shared_preload_libraries. OK, we can't change the signature in released branches. > We may need to keep CreateParallelContext as is in back branches and add > some CreateParallelContextInternal which would do what > CreateParallelContext does in master (and is used by postgres) and then > make CreateParallelContextForExternalFunction simple wrapper around > that. It's somewhat ugly though. Yeah, but ugly compatibility hacks in back branches are pretty common. But rather than inventing "CreateParallelContextInternal", I suggest we just have the core code call CreateParallelContextForExternalFunction. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] logical replication launcher crash on buildfarm
On 15/04/17 06:01, Tom Lane wrote: > Petr Jelinek writes: >> So this is what I came up with on plane. Generalized the loading >> functionality into load_library_function which that can load either >> known postgres functions or call load_external_function. > > I've had more than enough of seeing buildfarm failures from culicidae, > so I whacked this around until I was happy with it and pushed it. > Further adjustments welcome of course. Thanks. Seems like culicidae is finally happy with master. > >> I am not quite sure if fmgr.c is best place to put it, but I didn't want >> to include stuff from executor in bgworker.c. > > I really did not care for putting those references into fmgr.c. Aside > from the layering violations involved, there was a type cheat, in that > you had functions with entirely different signatures in the same list. > I eventually decided that the least ugly solution was to create two > different pointer arrays and frontend functions, one for bgworkers and > one for parallel workers. That ends up requiring only one extra > export/import, to make ParallelQueryMain accessible in parallel.c. > Maybe we need to rethink the division of labor between parallel.c > and execParallel.c, but that would depend on somebody explaining > what the difference is in the first place. > Sure, I did not really like the placement in fmgr.c either, just could not find good place that would work well for both. I tend to err on the less code duplication side of "less ugly", but the different function signatures is reasonable argument for splitting it as we get additional compiler check. >> As with previous patch, 9.6 will need quite different patch as we need >> to keep compatibility there, > > I don't really understand why 9.6 needs a significantly different > patch. AFAICS, it simply does not work (with any reliability) > for a loadable module to call CreateParallelContext directly in 9.6. > The only thing that actually works is to call the undocumented > CreateParallelContextForExternalFunction. I suggest that we could > back-patch this as-is, if we add CreateParallelContextForExternalFunction > as a trivial wrapper around CreateParallelContext. > I think the problem is that if somebody was using CreateParallelContext it may have worked on unix when just normally forking if the extension was loaded via shared_preload_libraries. And lot of extensions are linux only. So we might break working setup for somebody if we change the function signature. We may need to keep CreateParallelContext as is in back branches and add some CreateParallelContextInternal which would do what CreateParallelContext does in master (and is used by postgres) and then make CreateParallelContextForExternalFunction simple wrapper around that. It's somewhat ugly though. -- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] logical replication launcher crash on buildfarm
Petr Jelinek writes: > So this is what I came up with on plane. Generalized the loading > functionality into load_library_function which that can load either > known postgres functions or call load_external_function. I've had more than enough of seeing buildfarm failures from culicidae, so I whacked this around until I was happy with it and pushed it. Further adjustments welcome of course. > I am not quite sure if fmgr.c is best place to put it, but I didn't want > to include stuff from executor in bgworker.c. I really did not care for putting those references into fmgr.c. Aside from the layering violations involved, there was a type cheat, in that you had functions with entirely different signatures in the same list. I eventually decided that the least ugly solution was to create two different pointer arrays and frontend functions, one for bgworkers and one for parallel workers. That ends up requiring only one extra export/import, to make ParallelQueryMain accessible in parallel.c. Maybe we need to rethink the division of labor between parallel.c and execParallel.c, but that would depend on somebody explaining what the difference is in the first place. > As with previous patch, 9.6 will need quite different patch as we need > to keep compatibility there, I don't really understand why 9.6 needs a significantly different patch. AFAICS, it simply does not work (with any reliability) for a loadable module to call CreateParallelContext directly in 9.6. The only thing that actually works is to call the undocumented CreateParallelContextForExternalFunction. I suggest that we could back-patch this as-is, if we add CreateParallelContextForExternalFunction as a trivial wrapper around CreateParallelContext. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] logical replication launcher crash on buildfarm
On Tue, Apr 4, 2017 at 2:08 PM, Petr Jelinek wrote: > So this is what I came up with on plane. Generalized the loading > functionality into load_library_function which that can load either > known postgres functions or call load_external_function. > > I am not quite sure if fmgr.c is best place to put it, but I didn't want > to include stuff from executor in bgworker.c. I was thinking about > putting it to dfmgr.c (as that's where load_external_function) but again > seemed weird to including executor there. > > As with previous patch, 9.6 will need quite different patch as we need > to keep compatibility there, but since I am traveling I think it's > better to share what I have so far. - * Register a new background worker while processing shared_preload_libraries. + * Register a new static background worker. * - * This can only be called in the _PG_init function of a module library - * that's loaded by shared_preload_libraries; otherwise it has no effect. + * This can only be called directly from postmaster or in the _PG_init + * function of a module library that's loaded by shared_preload_libraries; + * otherwise it will have no effect. This does seem to need updating, but that's a somewhat separate issue from fixing the problem at issue. + * Serialize extension entrypoint information. It's unsafe to pass + * function pointers across parallel processes as the function pointer + * may be different in new process in EXEC_BACKEND builds so we + * always pass library and function name. The word "extension" should be removed. This basically looks OK otherwise, but we'll need something else for 9.6. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] logical replication launcher crash on buildfarm
So this is what I came up with on plane. Generalized the loading functionality into load_library_function which that can load either known postgres functions or call load_external_function. I am not quite sure if fmgr.c is best place to put it, but I didn't want to include stuff from executor in bgworker.c. I was thinking about putting it to dfmgr.c (as that's where load_external_function) but again seemed weird to including executor there. As with previous patch, 9.6 will need quite different patch as we need to keep compatibility there, but since I am traveling I think it's better to share what I have so far. -- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services diff --git a/src/backend/access/transam/README.parallel b/src/backend/access/transam/README.parallel index db9ac3d..b360887 100644 --- a/src/backend/access/transam/README.parallel +++ b/src/backend/access/transam/README.parallel @@ -198,7 +198,7 @@ pattern looks like this: EnterParallelMode(); /* prohibit unsafe state changes */ - pcxt = CreateParallelContext(entrypoint, nworkers); + pcxt = CreateParallelContext("library_name", "function_name", nworkers); /* Allow space for application-specific data here. */ shm_toc_estimate_chunk(&pcxt->estimator, size); diff --git a/src/backend/access/transam/parallel.c b/src/backend/access/transam/parallel.c index b3d3853..326d4f9 100644 --- a/src/backend/access/transam/parallel.c +++ b/src/backend/access/transam/parallel.c @@ -61,7 +61,7 @@ #define PARALLEL_KEY_TRANSACTION_SNAPSHOT UINT64CONST(0x0006) #define PARALLEL_KEY_ACTIVE_SNAPSHOT UINT64CONST(0x0007) #define PARALLEL_KEY_TRANSACTION_STATE UINT64CONST(0x0008) -#define PARALLEL_KEY_EXTENSION_TRAMPOLINE UINT64CONST(0x0009) +#define PARALLEL_KEY_ENTRYPOINTUINT64CONST(0x0009) /* Fixed-size parallel state. */ typedef struct FixedParallelState @@ -77,9 +77,6 @@ typedef struct FixedParallelState pid_t parallel_master_pid; BackendId parallel_master_backend_id; - /* Entrypoint for parallel workers. */ - parallel_worker_main_type entrypoint; - /* Mutex protects remaining fields. */ slock_t mutex; @@ -109,7 +106,6 @@ static dlist_head pcxt_list = DLIST_STATIC_INIT(pcxt_list); /* Private functions. */ static void HandleParallelMessage(ParallelContext *pcxt, int i, StringInfo msg); -static void ParallelExtensionTrampoline(dsm_segment *seg, shm_toc *toc); static void WaitForParallelWorkersToExit(ParallelContext *pcxt); @@ -119,7 +115,8 @@ static void WaitForParallelWorkersToExit(ParallelContext *pcxt); * destroyed before exiting the current subtransaction. */ ParallelContext * -CreateParallelContext(parallel_worker_main_type entrypoint, int nworkers) +CreateParallelContext(const char *library_name, const char *function_name, + int nworkers) { MemoryContext oldcontext; ParallelContext *pcxt; @@ -152,7 +149,8 @@ CreateParallelContext(parallel_worker_main_type entrypoint, int nworkers) pcxt = palloc0(sizeof(ParallelContext)); pcxt->subid = GetCurrentSubTransactionId(); pcxt->nworkers = nworkers; - pcxt->entrypoint = entrypoint; + pcxt->library_name = pstrdup(library_name); + pcxt->function_name = pstrdup(function_name); pcxt->error_context_stack = error_context_stack; shm_toc_initialize_estimator(&pcxt->estimator); dlist_push_head(&pcxt_list, &pcxt->node); @@ -164,33 +162,6 @@ CreateParallelContext(parallel_worker_main_type entrypoint, int nworkers) } /* - * Establish a new parallel context that calls a function provided by an - * extension. This works around the fact that the library might get mapped - * at a different address in each backend. - */ -ParallelContext * -CreateParallelContextForExternalFunction(char *library_name, - char *function_name, - int nworkers) -{ - MemoryContext oldcontext; - ParallelContext *pcxt; - - /* We might be running in a very short-lived memory context. */ - oldcontext = MemoryContextSwitchTo(TopTransactionContext); - - /* Create the context. */ - pcxt = CreateParallelContext(ParallelExtensionTrampoline, nworkers); - pcxt->library_name = pstrdup(library_name); - pcxt->function_name = pstrdup(function_name); - - /* Restore previous memory context. */ - MemoryContextSwitchTo(oldcontext); - - return pcxt; -} - -/* * Establish the dynamic shared memory segment for a parallel context and * copy state and other bookkeeping information that will be needed by * parallel workers into it. @@ -249,15 +220,10 @@ InitializeParallelDSM(ParallelContext *pcxt) pcxt->nworkers)); shm_toc_estimate_keys(&pcxt->estimator, 1); - /* Estimate how much we'll need for extension entrypoint info. */ - if (pcxt->library_name != NULL) - { - Assert(pcxt->entrypoint == ParallelExtensionTrampoline); - Assert(pcxt->function_name != NULL); - shm_toc_estimate_chunk(&pcxt->estimator, strlen(pcxt->library_name) -
Re: [HACKERS] logical replication launcher crash on buildfarm
On 01/04/17 04:19, Andres Freund wrote: > On 2017-03-31 21:30:17 -0400, Robert Haas wrote: >> On Fri, Mar 31, 2017 at 9:26 PM, Petr Jelinek >> wrote: Hmm, I don't know if there's any good reason not to just use strcmp(), but sure, OK. Committed and back-patched. >>> >>> Hmm culicidae still fails, this time only in parallel worker code. This >>> didn't happen on my machine which is strange. Looking at the code, we >>> are passing the fps->entrypoint as function pointer again so of course >>> it fails. We have some code to load libraries again but even that gets >>> initial entrypoint passed as function pointer >>> (ParallelExtensionTrampoline). I wonder if we'll have to generalize the >>> InternalBGWorkers even more to some kind of internal function name to >>> pointer map and add the parallel entry points there as well. >> >> Argh, I forgot about that. I think we need to use something like >> ParallelExtensionTrampoline all the time, not just for libraries. >> Since effectively, we've determined that postgres itself has the same >> problem. > > I wonder if we shouldn't just introduce a KnownFunctionPointer[] map, > that can be used by various facilities (bgworkers themselves, > parallelism, and probably more coming). Otherwise we'll introduce > mechanisms like 7d8f6986b8 in multiple places. > Yes that's what I meant with what I said above. -- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] logical replication launcher crash on buildfarm
On 2017-03-31 21:30:17 -0400, Robert Haas wrote: > On Fri, Mar 31, 2017 at 9:26 PM, Petr Jelinek > wrote: > >> Hmm, I don't know if there's any good reason not to just use strcmp(), > >> but sure, OK. Committed and back-patched. > > > > Hmm culicidae still fails, this time only in parallel worker code. This > > didn't happen on my machine which is strange. Looking at the code, we > > are passing the fps->entrypoint as function pointer again so of course > > it fails. We have some code to load libraries again but even that gets > > initial entrypoint passed as function pointer > > (ParallelExtensionTrampoline). I wonder if we'll have to generalize the > > InternalBGWorkers even more to some kind of internal function name to > > pointer map and add the parallel entry points there as well. > > Argh, I forgot about that. I think we need to use something like > ParallelExtensionTrampoline all the time, not just for libraries. > Since effectively, we've determined that postgres itself has the same > problem. I wonder if we shouldn't just introduce a KnownFunctionPointer[] map, that can be used by various facilities (bgworkers themselves, parallelism, and probably more coming). Otherwise we'll introduce mechanisms like 7d8f6986b8 in multiple places. - Andres -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] logical replication launcher crash on buildfarm
On 2017-04-01 03:57:05 +0200, Petr Jelinek wrote: > On 01/04/17 03:44, Andres Freund wrote: > > On 2017-04-01 03:26:01 +0200, Petr Jelinek wrote: > >> On 01/04/17 02:53, Robert Haas wrote: > >>> On Fri, Mar 31, 2017 at 8:32 PM, Petr Jelinek > >>> wrote: > Right, changed to BGW_MAXLEN. > >>> > >>> Hmm, I don't know if there's any good reason not to just use strcmp(), > >>> but sure, OK. Committed and back-patched. > > > > Cool! > > > > > >> Hmm culicidae still fails, this time only in parallel worker code. > >> This didn't happen on my machine which is strange. > > > > Interesting. Did you reproduce the failure before? I wonder if the > > issue is that you don't build a position independent executable? > > > > I did reproduce the original issue yes, that's how I tested the patch. I > just copy pasted CFLAGS from the buildfarm configuration of culicidae TBH. That's not necessarily sufficient, unless you also use a sufficiently new debian - they changed gcc to default to PIE being enabled by default. Greetings, Andres Freund -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] logical replication launcher crash on buildfarm
On 01/04/17 03:44, Andres Freund wrote: > On 2017-04-01 03:26:01 +0200, Petr Jelinek wrote: >> On 01/04/17 02:53, Robert Haas wrote: >>> On Fri, Mar 31, 2017 at 8:32 PM, Petr Jelinek >>> wrote: Right, changed to BGW_MAXLEN. >>> >>> Hmm, I don't know if there's any good reason not to just use strcmp(), >>> but sure, OK. Committed and back-patched. > > Cool! > > >> Hmm culicidae still fails, this time only in parallel worker code. >> This didn't happen on my machine which is strange. > > Interesting. Did you reproduce the failure before? I wonder if the > issue is that you don't build a position independent executable? > I did reproduce the original issue yes, that's how I tested the patch. I just copy pasted CFLAGS from the buildfarm configuration of culicidae TBH. -- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] logical replication launcher crash on buildfarm
On 2017-04-01 03:26:01 +0200, Petr Jelinek wrote: > On 01/04/17 02:53, Robert Haas wrote: > > On Fri, Mar 31, 2017 at 8:32 PM, Petr Jelinek > > wrote: > >> Right, changed to BGW_MAXLEN. > > > > Hmm, I don't know if there's any good reason not to just use strcmp(), > > but sure, OK. Committed and back-patched. Cool! > Hmm culicidae still fails, this time only in parallel worker code. > This didn't happen on my machine which is strange. Interesting. Did you reproduce the failure before? I wonder if the issue is that you don't build a position independent executable? - Andres -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] logical replication launcher crash on buildfarm
On Fri, Mar 31, 2017 at 9:26 PM, Petr Jelinek wrote: >> Hmm, I don't know if there's any good reason not to just use strcmp(), >> but sure, OK. Committed and back-patched. > > Hmm culicidae still fails, this time only in parallel worker code. This > didn't happen on my machine which is strange. Looking at the code, we > are passing the fps->entrypoint as function pointer again so of course > it fails. We have some code to load libraries again but even that gets > initial entrypoint passed as function pointer > (ParallelExtensionTrampoline). I wonder if we'll have to generalize the > InternalBGWorkers even more to some kind of internal function name to > pointer map and add the parallel entry points there as well. Argh, I forgot about that. I think we need to use something like ParallelExtensionTrampoline all the time, not just for libraries. Since effectively, we've determined that postgres itself has the same problem. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] logical replication launcher crash on buildfarm
On 01/04/17 02:53, Robert Haas wrote: > On Fri, Mar 31, 2017 at 8:32 PM, Petr Jelinek > wrote: >> Right, changed to BGW_MAXLEN. > > Hmm, I don't know if there's any good reason not to just use strcmp(), > but sure, OK. Committed and back-patched. > Hmm culicidae still fails, this time only in parallel worker code. This didn't happen on my machine which is strange. Looking at the code, we are passing the fps->entrypoint as function pointer again so of course it fails. We have some code to load libraries again but even that gets initial entrypoint passed as function pointer (ParallelExtensionTrampoline). I wonder if we'll have to generalize the InternalBGWorkers even more to some kind of internal function name to pointer map and add the parallel entry points there as well. -- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] logical replication launcher crash on buildfarm
On Fri, Mar 31, 2017 at 8:32 PM, Petr Jelinek wrote: > Right, changed to BGW_MAXLEN. Hmm, I don't know if there's any good reason not to just use strcmp(), but sure, OK. Committed and back-patched. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] logical replication launcher crash on buildfarm
On 31/03/17 15:42, Robert Haas wrote: > On Tue, Mar 28, 2017 at 7:39 PM, Petr Jelinek > wrote: >> Sigh, forgot git add for the docs, so one more try... > > +if (strncmp(worker->bgw_library_name, "postgres", 8) != 0) > +return NULL; > > I think that's not right. You don't want to match postgresshdkgjsdglkjs. > Right, changed to BGW_MAXLEN. > Aside from that, these look good to me. > Cool. -- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services From 74b9f0d7063071dd443fa820984fba477f4446cc Mon Sep 17 00:00:00 2001 From: Petr Jelinek Date: Tue, 28 Mar 2017 23:24:34 +0200 Subject: [PATCH] Use library_name/function_name for loading main entry point of internal bgworkers --- src/backend/access/transam/parallel.c| 7 +-- src/backend/postmaster/bgworker.c| 76 ++-- src/include/access/parallel.h| 2 + src/test/modules/worker_spi/worker_spi.c | 6 ++- 4 files changed, 73 insertions(+), 18 deletions(-) diff --git a/src/backend/access/transam/parallel.c b/src/backend/access/transam/parallel.c index cde0ed3..d2bed72 100644 --- a/src/backend/access/transam/parallel.c +++ b/src/backend/access/transam/parallel.c @@ -109,7 +109,6 @@ static dlist_head pcxt_list = DLIST_STATIC_INIT(pcxt_list); /* Private functions. */ static void HandleParallelMessage(ParallelContext *pcxt, int i, StringInfo msg); static void ParallelExtensionTrampoline(dsm_segment *seg, shm_toc *toc); -static void ParallelWorkerMain(Datum main_arg); static void WaitForParallelWorkersToExit(ParallelContext *pcxt); @@ -456,7 +455,9 @@ LaunchParallelWorkers(ParallelContext *pcxt) BGWORKER_SHMEM_ACCESS | BGWORKER_BACKEND_DATABASE_CONNECTION; worker.bgw_start_time = BgWorkerStart_ConsistentState; worker.bgw_restart_time = BGW_NEVER_RESTART; - worker.bgw_main = ParallelWorkerMain; + worker.bgw_main = NULL; + sprintf(worker.bgw_library_name, "postgres"); + sprintf(worker.bgw_function_name, "ParallelWorkerMain"); worker.bgw_main_arg = UInt32GetDatum(dsm_segment_handle(pcxt->seg)); worker.bgw_notify_pid = MyProcPid; memset(&worker.bgw_extra, 0, BGW_EXTRALEN); @@ -928,7 +929,7 @@ AtEOXact_Parallel(bool isCommit) /* * Main entrypoint for parallel workers. */ -static void +void ParallelWorkerMain(Datum main_arg) { dsm_segment *seg; diff --git a/src/backend/postmaster/bgworker.c b/src/backend/postmaster/bgworker.c index 52bc4e9..5ea5abf 100644 --- a/src/backend/postmaster/bgworker.c +++ b/src/backend/postmaster/bgworker.c @@ -16,6 +16,7 @@ #include "miscadmin.h" #include "libpq/pqsignal.h" +#include "access/parallel.h" #include "postmaster/bgworker_internals.h" #include "postmaster/postmaster.h" #include "storage/barrier.h" @@ -94,6 +95,25 @@ struct BackgroundWorkerHandle static BackgroundWorkerArray *BackgroundWorkerData; /* + * List of internal background workers. These are used for mapping the + * function name to actual function when building with EXEC_BACKEND and also + * to allow these to be loaded outside of shared_preload_libraries. + */ +typedef struct InternalBGWorkerMain +{ + char *bgw_function_name; + bgworker_main_type bgw_main; +} InternalBGWorkerMain; + +static const InternalBGWorkerMain InternalBGWorkers[] = { + {"ParallelWorkerMain", ParallelWorkerMain}, + /* Dummy entry marking end of the array. */ + {NULL, NULL} +}; + +static bgworker_main_type GetInternalBgWorkerMain(BackgroundWorker *worker); + +/* * Calculate shared memory needed. */ Size @@ -695,22 +715,27 @@ StartBackgroundWorker(void) #endif } + /* For internal workers set the entry point to known function address. */ + entrypt = GetInternalBgWorkerMain(worker); + /* - * If bgw_main is set, we use that value as the initial entrypoint. - * However, if the library containing the entrypoint wasn't loaded at - * postmaster startup time, passing it as a direct function pointer is not - * possible. To work around that, we allow callers for whom a function - * pointer is not available to pass a library name (which will be loaded, - * if necessary) and a function name (which will be looked up in the named + * Otherwise, if bgw_main is set, we use that value as the initial + * entrypoint. This does not work well EXEC_BACKEND outside Windows but + * we keep the logic for backwards compatibility. In other cases use + * the entry point specified by library name (which will be loaded, if + * necessary) and a function name (which will be looked up in the named * library). */ - if (worker->bgw_main != NULL) - entrypt = worker->bgw_main; - else - entrypt = (bgworker_main_type) - load_external_function(worker->bgw_library_name, - worker->bgw_function_name, - true, NULL); + if (entrypt == NULL) + { + if (worker->bgw_main != NULL) + entrypt = worker->bgw_main; + else + entrypt = (bgworker_main_type) +load_external_function(worker->bgw_library_name, +
Re: [HACKERS] logical replication launcher crash on buildfarm
On Tue, Mar 28, 2017 at 7:39 PM, Petr Jelinek wrote: > Sigh, forgot git add for the docs, so one more try... +if (strncmp(worker->bgw_library_name, "postgres", 8) != 0) +return NULL; I think that's not right. You don't want to match postgresshdkgjsdglkjs. Aside from that, these look good to me. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] logical replication launcher crash on buildfarm
On 29/03/17 01:29, Petr Jelinek wrote: > On 28/03/17 18:05, Petr Jelinek wrote: >> On 28/03/17 17:55, Robert Haas wrote: >>> On Mon, Mar 27, 2017 at 11:20 PM, Petr Jelinek >>> wrote: On 28/03/17 04:46, Robert Haas wrote: > On Mon, Mar 27, 2017 at 10:04 PM, Andres Freund > wrote: >>> Btw now that I look at the code, I guess we'll want to get rid of >>> bgw_main completely in HEAD given that we can't guarantee it will be >>> valid even for shared_preload_library libraries. For older branches I >>> would leave things as they are in this regard as there don't seem to be >>> any immediate issue for standard binaries. >> >> As long as you fix it so culicidae is happy (in 9.6) ;). I think it's >> fine to just introduce bgw_builtin_id or such, and leave the bgw_main >> code in place in < HEAD. > > I wasn't thinking of introducing bgw_builtin_id. My idea was just > along the lines of > > if (bgw_library_name == NULL && bgw_function_name != NULL) > { > if (strcmp(bgw_function_name, "ParallelQueryMain") == 0) >ParallelQueryMain(blah); > else if (strcmp(bgw_function_name, "LogicalReplicationMain") == 0) >LogicalReplicationMain(blah); > } > > I think something like that is certainly better for the back-branches, > because it doesn't cause an ABI break. But I think it would also be > fine for master. > I had something slightly more complex like the attached in mind. >>> >>> Seems broadly reasonable on a quick look, but I think we should leave >>> bgw_main intact in 9.6. It may be working for fine for people who >>> don't care about Windows, and I'd rather not break it gratuitously. >>> Can we have two patches, one of which converts the internal stuff to >>> use the new mechanism and a second of which removes bgw_main? The >>> second one, at least, also needs to update the documentation. (A good >>> practice when removing an identifier is to run 'git grep >>> thing_i_am_removing' after removing it...) >>> >> >> Yes I agree (and I said it in the thread) I plan to submit the 9.6 too, >> I just sent what I had before having it all ready mainly because you >> started talking about code already and I didn't want you to waste time >> with it needlessly, but it was already 5AM for me ;). >> > > Okay finally got to it, the 9.6 adds similar mechanism as HEAD (it could > be simpler there but it seems like same code is better) but does not > remove the bgw_main. > Sigh, forgot git add for the docs, so one more try... -- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services Use-library_name-function_name-for-loading-main-entr-96.patch Description: binary/octet-stream Use-library_name-function_name-for-loading-main-entr-HEAD.patch Description: binary/octet-stream -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] logical replication launcher crash on buildfarm
On 28/03/17 18:05, Petr Jelinek wrote: > On 28/03/17 17:55, Robert Haas wrote: >> On Mon, Mar 27, 2017 at 11:20 PM, Petr Jelinek >> wrote: >>> On 28/03/17 04:46, Robert Haas wrote: On Mon, Mar 27, 2017 at 10:04 PM, Andres Freund wrote: >> Btw now that I look at the code, I guess we'll want to get rid of >> bgw_main completely in HEAD given that we can't guarantee it will be >> valid even for shared_preload_library libraries. For older branches I >> would leave things as they are in this regard as there don't seem to be >> any immediate issue for standard binaries. > > As long as you fix it so culicidae is happy (in 9.6) ;). I think it's > fine to just introduce bgw_builtin_id or such, and leave the bgw_main > code in place in < HEAD. I wasn't thinking of introducing bgw_builtin_id. My idea was just along the lines of if (bgw_library_name == NULL && bgw_function_name != NULL) { if (strcmp(bgw_function_name, "ParallelQueryMain") == 0) ParallelQueryMain(blah); else if (strcmp(bgw_function_name, "LogicalReplicationMain") == 0) LogicalReplicationMain(blah); } I think something like that is certainly better for the back-branches, because it doesn't cause an ABI break. But I think it would also be fine for master. >>> >>> I had something slightly more complex like the attached in mind. >> >> Seems broadly reasonable on a quick look, but I think we should leave >> bgw_main intact in 9.6. It may be working for fine for people who >> don't care about Windows, and I'd rather not break it gratuitously. >> Can we have two patches, one of which converts the internal stuff to >> use the new mechanism and a second of which removes bgw_main? The >> second one, at least, also needs to update the documentation. (A good >> practice when removing an identifier is to run 'git grep >> thing_i_am_removing' after removing it...) >> > > Yes I agree (and I said it in the thread) I plan to submit the 9.6 too, > I just sent what I had before having it all ready mainly because you > started talking about code already and I didn't want you to waste time > with it needlessly, but it was already 5AM for me ;). > Okay finally got to it, the 9.6 adds similar mechanism as HEAD (it could be simpler there but it seems like same code is better) but does not remove the bgw_main. -- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services Use-library_name-function_name-for-loading-main-entr-HEAD.patch Description: binary/octet-stream Use-library_name-function_name-for-loading-main-entr-96.patch Description: binary/octet-stream -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] logical replication launcher crash on buildfarm
On 28/03/17 17:55, Robert Haas wrote: > On Mon, Mar 27, 2017 at 11:20 PM, Petr Jelinek > wrote: >> On 28/03/17 04:46, Robert Haas wrote: >>> On Mon, Mar 27, 2017 at 10:04 PM, Andres Freund wrote: > Btw now that I look at the code, I guess we'll want to get rid of > bgw_main completely in HEAD given that we can't guarantee it will be > valid even for shared_preload_library libraries. For older branches I > would leave things as they are in this regard as there don't seem to be > any immediate issue for standard binaries. As long as you fix it so culicidae is happy (in 9.6) ;). I think it's fine to just introduce bgw_builtin_id or such, and leave the bgw_main code in place in < HEAD. >>> >>> I wasn't thinking of introducing bgw_builtin_id. My idea was just >>> along the lines of >>> >>> if (bgw_library_name == NULL && bgw_function_name != NULL) >>> { >>> if (strcmp(bgw_function_name, "ParallelQueryMain") == 0) >>>ParallelQueryMain(blah); >>> else if (strcmp(bgw_function_name, "LogicalReplicationMain") == 0) >>>LogicalReplicationMain(blah); >>> } >>> >>> I think something like that is certainly better for the back-branches, >>> because it doesn't cause an ABI break. But I think it would also be >>> fine for master. >>> >> >> I had something slightly more complex like the attached in mind. > > Seems broadly reasonable on a quick look, but I think we should leave > bgw_main intact in 9.6. It may be working for fine for people who > don't care about Windows, and I'd rather not break it gratuitously. > Can we have two patches, one of which converts the internal stuff to > use the new mechanism and a second of which removes bgw_main? The > second one, at least, also needs to update the documentation. (A good > practice when removing an identifier is to run 'git grep > thing_i_am_removing' after removing it...) > Yes I agree (and I said it in the thread) I plan to submit the 9.6 too, I just sent what I had before having it all ready mainly because you started talking about code already and I didn't want you to waste time with it needlessly, but it was already 5AM for me ;). -- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] logical replication launcher crash on buildfarm
On Mon, Mar 27, 2017 at 11:20 PM, Petr Jelinek wrote: > On 28/03/17 04:46, Robert Haas wrote: >> On Mon, Mar 27, 2017 at 10:04 PM, Andres Freund wrote: Btw now that I look at the code, I guess we'll want to get rid of bgw_main completely in HEAD given that we can't guarantee it will be valid even for shared_preload_library libraries. For older branches I would leave things as they are in this regard as there don't seem to be any immediate issue for standard binaries. >>> >>> As long as you fix it so culicidae is happy (in 9.6) ;). I think it's >>> fine to just introduce bgw_builtin_id or such, and leave the bgw_main >>> code in place in < HEAD. >> >> I wasn't thinking of introducing bgw_builtin_id. My idea was just >> along the lines of >> >> if (bgw_library_name == NULL && bgw_function_name != NULL) >> { >> if (strcmp(bgw_function_name, "ParallelQueryMain") == 0) >>ParallelQueryMain(blah); >> else if (strcmp(bgw_function_name, "LogicalReplicationMain") == 0) >>LogicalReplicationMain(blah); >> } >> >> I think something like that is certainly better for the back-branches, >> because it doesn't cause an ABI break. But I think it would also be >> fine for master. >> > > I had something slightly more complex like the attached in mind. Seems broadly reasonable on a quick look, but I think we should leave bgw_main intact in 9.6. It may be working for fine for people who don't care about Windows, and I'd rather not break it gratuitously. Can we have two patches, one of which converts the internal stuff to use the new mechanism and a second of which removes bgw_main? The second one, at least, also needs to update the documentation. (A good practice when removing an identifier is to run 'git grep thing_i_am_removing' after removing it...) -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] logical replication launcher crash on buildfarm
On 28/03/17 04:46, Robert Haas wrote: > On Mon, Mar 27, 2017 at 10:04 PM, Andres Freund wrote: >>> Btw now that I look at the code, I guess we'll want to get rid of >>> bgw_main completely in HEAD given that we can't guarantee it will be >>> valid even for shared_preload_library libraries. For older branches I >>> would leave things as they are in this regard as there don't seem to be >>> any immediate issue for standard binaries. >> >> As long as you fix it so culicidae is happy (in 9.6) ;). I think it's >> fine to just introduce bgw_builtin_id or such, and leave the bgw_main >> code in place in < HEAD. > > I wasn't thinking of introducing bgw_builtin_id. My idea was just > along the lines of > > if (bgw_library_name == NULL && bgw_function_name != NULL) > { > if (strcmp(bgw_function_name, "ParallelQueryMain") == 0) >ParallelQueryMain(blah); > else if (strcmp(bgw_function_name, "LogicalReplicationMain") == 0) >LogicalReplicationMain(blah); > } > > I think something like that is certainly better for the back-branches, > because it doesn't cause an ABI break. But I think it would also be > fine for master. > I had something slightly more complex like the attached in mind. -- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services 0001-Remove-bgw_main-from-bgworker-and-always-use-library.patch Description: binary/octet-stream -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] logical replication launcher crash on buildfarm
Robert Haas writes: > I wasn't thinking of introducing bgw_builtin_id. My idea was just > along the lines of > if (bgw_library_name == NULL && bgw_function_name != NULL) > { > if (strcmp(bgw_function_name, "ParallelQueryMain") == 0) >ParallelQueryMain(blah); > else if (strcmp(bgw_function_name, "LogicalReplicationMain") == 0) >LogicalReplicationMain(blah); > } > I think something like that is certainly better for the back-branches, > because it doesn't cause an ABI break. But I think it would also be > fine for master. That seems perfectly reasonable from here: surely the cost of a couple of strcmp's is trivial in comparison to a process launch. We can redesign the API whenever this way starts getting unwieldy, but that's likely to be quite some time away. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] logical replication launcher crash on buildfarm
On Mon, Mar 27, 2017 at 10:04 PM, Andres Freund wrote: >> Btw now that I look at the code, I guess we'll want to get rid of >> bgw_main completely in HEAD given that we can't guarantee it will be >> valid even for shared_preload_library libraries. For older branches I >> would leave things as they are in this regard as there don't seem to be >> any immediate issue for standard binaries. > > As long as you fix it so culicidae is happy (in 9.6) ;). I think it's > fine to just introduce bgw_builtin_id or such, and leave the bgw_main > code in place in < HEAD. I wasn't thinking of introducing bgw_builtin_id. My idea was just along the lines of if (bgw_library_name == NULL && bgw_function_name != NULL) { if (strcmp(bgw_function_name, "ParallelQueryMain") == 0) ParallelQueryMain(blah); else if (strcmp(bgw_function_name, "LogicalReplicationMain") == 0) LogicalReplicationMain(blah); } I think something like that is certainly better for the back-branches, because it doesn't cause an ABI break. But I think it would also be fine for master. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] logical replication launcher crash on buildfarm
On 2017-03-28 03:47:50 +0200, Petr Jelinek wrote: > On 28/03/17 03:31, Petr Jelinek wrote: > > On 27/03/17 19:01, Robert Haas wrote: > >> On Mon, Mar 27, 2017 at 12:50 PM, Andres Freund wrote: > >>> Robert, Petr, either of you planning to fix this (as outlined elsewhere > >>> in the thred)? > >> > >> Oh, I didn't realize anybody was looking to me to fix this. I sort of > >> thought that it was fallout from the logical replication patch and > >> that Petr or Peter would deal with it. If that's not the case, I'm > >> not totally unwilling to take a whack at it, but I don't have much > >> personal enthusiasm for trying to figure out how to make dynamic > >> loading on the postgres binary itself work everywhere, so if it falls > >> to me to fix, it's likely to get a hard-coded check for some > >> hard-coded name. > >> > > > > It affects parallel workers same way, I'll write patch for HEAD soon, > > 9.6 probably with some delay. I'll expand the InternalBgWorkers thing > > that was added with logical replication to handle this in similar > > fashion how we do fmgrtab. > > > > Btw now that I look at the code, I guess we'll want to get rid of > bgw_main completely in HEAD given that we can't guarantee it will be > valid even for shared_preload_library libraries. For older branches I > would leave things as they are in this regard as there don't seem to be > any immediate issue for standard binaries. As long as you fix it so culicidae is happy (in 9.6) ;). I think it's fine to just introduce bgw_builtin_id or such, and leave the bgw_main code in place in < HEAD. Greetings, Andres Freund -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] logical replication launcher crash on buildfarm
On 28/03/17 03:31, Petr Jelinek wrote: > On 27/03/17 19:01, Robert Haas wrote: >> On Mon, Mar 27, 2017 at 12:50 PM, Andres Freund wrote: >>> Robert, Petr, either of you planning to fix this (as outlined elsewhere >>> in the thred)? >> >> Oh, I didn't realize anybody was looking to me to fix this. I sort of >> thought that it was fallout from the logical replication patch and >> that Petr or Peter would deal with it. If that's not the case, I'm >> not totally unwilling to take a whack at it, but I don't have much >> personal enthusiasm for trying to figure out how to make dynamic >> loading on the postgres binary itself work everywhere, so if it falls >> to me to fix, it's likely to get a hard-coded check for some >> hard-coded name. >> > > It affects parallel workers same way, I'll write patch for HEAD soon, > 9.6 probably with some delay. I'll expand the InternalBgWorkers thing > that was added with logical replication to handle this in similar > fashion how we do fmgrtab. > Btw now that I look at the code, I guess we'll want to get rid of bgw_main completely in HEAD given that we can't guarantee it will be valid even for shared_preload_library libraries. For older branches I would leave things as they are in this regard as there don't seem to be any immediate issue for standard binaries. -- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] logical replication launcher crash on buildfarm
On 27/03/17 19:01, Robert Haas wrote: > On Mon, Mar 27, 2017 at 12:50 PM, Andres Freund wrote: >> Robert, Petr, either of you planning to fix this (as outlined elsewhere >> in the thred)? > > Oh, I didn't realize anybody was looking to me to fix this. I sort of > thought that it was fallout from the logical replication patch and > that Petr or Peter would deal with it. If that's not the case, I'm > not totally unwilling to take a whack at it, but I don't have much > personal enthusiasm for trying to figure out how to make dynamic > loading on the postgres binary itself work everywhere, so if it falls > to me to fix, it's likely to get a hard-coded check for some > hard-coded name. > It affects parallel workers same way, I'll write patch for HEAD soon, 9.6 probably with some delay. I'll expand the InternalBgWorkers thing that was added with logical replication to handle this in similar fashion how we do fmgrtab. -- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] logical replication launcher crash on buildfarm
Robert Haas writes: > ... I don't have much > personal enthusiasm for trying to figure out how to make dynamic > loading on the postgres binary itself work everywhere, so if it falls > to me to fix, it's likely to get a hard-coded check for some > hard-coded name. +1. This seems like no time to be buying into brand new portability requirements without a very pressing need to do so; and this patch doesn't appear to create one. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] logical replication launcher crash on buildfarm
On 2017-03-27 13:01:11 -0400, Robert Haas wrote: > On Mon, Mar 27, 2017 at 12:50 PM, Andres Freund wrote: > > Robert, Petr, either of you planning to fix this (as outlined elsewhere > > in the thred)? > > Oh, I didn't realize anybody was looking to me to fix this. Well, it's borked in 9.6. I'm starting to get annoyed by culicidae's failures ;) > I sort of thought that it was fallout from the logical replication > patch and that Petr or Peter would deal with it. If that's not the > case, I'm not totally unwilling to take a whack at it, but I don't > have much personal enthusiasm for trying to figure out how to make > dynamic loading on the postgres binary itself work everywhere, so if > it falls to me to fix, it's likely to get a hard-coded check for some > hard-coded name. I'm all for that approach - there seems very little upside in the dynamic loading approach. Just defining a bgw_entry_points[enum BuiltinBGWorkerType] -> bgworker_main_type array seems to be simple enough - it's not like we're going to add new types of builtin bgworkers at runtime. - Andres -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] logical replication launcher crash on buildfarm
On Mon, Mar 27, 2017 at 12:50 PM, Andres Freund wrote: > Robert, Petr, either of you planning to fix this (as outlined elsewhere > in the thred)? Oh, I didn't realize anybody was looking to me to fix this. I sort of thought that it was fallout from the logical replication patch and that Petr or Peter would deal with it. If that's not the case, I'm not totally unwilling to take a whack at it, but I don't have much personal enthusiasm for trying to figure out how to make dynamic loading on the postgres binary itself work everywhere, so if it falls to me to fix, it's likely to get a hard-coded check for some hard-coded name. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] logical replication launcher crash on buildfarm
On 2017-03-16 10:13:37 +0100, Petr Jelinek wrote: > On 16/03/17 09:53, Andres Freund wrote: > > On 2017-03-16 09:40:48 +0100, Petr Jelinek wrote: > >> On 16/03/17 04:42, Andres Freund wrote: > >>> On 2017-03-15 20:28:33 -0700, Andres Freund wrote: > Hi, > > I just unstuck a bunch of my buildfarm animals. That triggered some > spurious failures (on piculet, calliphoridae, mylodon), but also one > that doesn't really look like that: > https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=culicidae&dt=2017-03-16%2002%3A40%3A03 > > with the pertinent point being: > > == stack trace: > pgsql.build/src/test/regress/tmp_check/data/core == > [New LWP 1894] > [Thread debugging using libthread_db enabled] > Using host libthread_db library > "/lib/x86_64-linux-gnu/libthread_db.so.1". > Core was generated by `postgres: bgworker: logical replication launcher > '. > Program terminated with signal SIGSEGV, Segmentation fault. > #0 0x55e265bff5e3 in ?? () > #0 0x55e265bff5e3 in ?? () > #1 0x55d3ccabed0d in StartBackgroundWorker () at > /home/andres/build/buildfarm-culicidae/HEAD/pgsql.build/../pgsql/src/backend/postmaster/bgworker.c:792 > #2 0x55d3ccacf4fc in SubPostmasterMain (argc=3, > argv=0x55d3cdbb71c0) at > /home/andres/build/buildfarm-culicidae/HEAD/pgsql.build/../pgsql/src/backend/postmaster/postmaster.c:4878 > #3 0x55d3cca443ea in main (argc=3, argv=0x55d3cdbb71c0) at > /home/andres/build/buildfarm-culicidae/HEAD/pgsql.build/../pgsql/src/backend/main/main.c:205 > > it's possible that me killing things and upgrading caused this, but > given this is a backend running EXEC_BACKEND, I'm a bit suspicous that > it's more than that. The machine is a bit backed up at the moment, so > it'll probably be a while till it's at that animal/branch again, > otherwise I'd not have mentioned this. > >>> > >>> For some reason it ran again pretty soon. And I'm afraid it's indeed an > >>> issue: > >>> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=culicidae&dt=2017-03-16%2003%3A30%3A02 > >>> > >> > >> Hmm, I tried with EXEC_BACKEND (and with --disable-spinlocks) and it > >> seems to work fine on my two machines. I don't see anything else > >> different on culicidae though. Sadly the backtrace is not that > >> informative either. I'll try to investigate more but it will take time... > > > > Worthwhile additional failure: > > https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=culicidae&dt=2017-03-16%2002%3A55%3A01 > > > > Same animal, also EXEC_BACKEND, but 9.6. > > > > A quick look at the relevant line: > > /* > > * If bgw_main is set, we use that value as the initial entrypoint. > > * However, if the library containing the entrypoint wasn't loaded at > > * postmaster startup time, passing it as a direct function pointer is > > not > > * possible. To work around that, we allow callers for whom a function > > * pointer is not available to pass a library name (which will be > > loaded, > > * if necessary) and a function name (which will be looked up in the > > named > > * library). > > */ > > if (worker->bgw_main != NULL) > > entrypt = worker->bgw_main; > > > > makes the issue clear - we appear to be assuming that bgw_main is > > meaningful across processes. Which it isn't in the EXEC_BACKEND case > > when ASLR is in use... > > > > This kinda sounds familiar, but a quick google search doesn't find > > anything relevant. Robert, Petr, either of you planning to fix this (as outlined elsewhere in the thred)? > Hmm now that you mention it, I remember discussing something similar > with you last year in Dallas in regards to parallel query. IIRC Windows > should not have this problem but other systems with EXEC_BACKEND do. > Don't remember the details though. Don't think that's reliable, only works as long as the binary is compiled without position independent code. Greetings, Andres Freund -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] logical replication launcher crash on buildfarm
On 3/16/17 14:55, Andres Freund wrote: > I indeed think that's the right consequence. One question is what to > replace it with exactly - are we guaranteed we can dynamically lookup > symbols by name in the main binary on every platform? I think there is probably a way to do this on all platforms. But it seems that at least the Windows port of pg_dlopen would need to be updated to support this. > Alternatively we > can just hardcode a bunch of bgw_function_name values that are matched > to specific functions if bgw_library_name is NULL - I suspect that'd be > the easiest / least worrysome portability-wise. Basically a variant of fmgrtab, which addresses the same sort of problem. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] logical replication launcher crash on buildfarm
On Thu, Mar 16, 2017 at 2:55 PM, Andres Freund wrote: > I indeed think it's not safe, and it's going to get less and less safe > on windows (or EXEC_BACKEND). I don't think we can afford to disable > ASLR in the long run (I indeed supect that'll just be disallowed at some > point), and that's the only thing making it safe-ish in combination with > EXEC_BACKEND. Ugh. >> If it's not even safe there, then I guess we should remove it entirely >> as a useless foot-gun. > > I indeed think that's the right consequence. One question is what to > replace it with exactly - are we guaranteed we can dynamically lookup > symbols by name in the main binary on every platform? I don't know the answer to that question. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] logical replication launcher crash on buildfarm
On 2017-03-16 09:27:59 -0400, Robert Haas wrote: > On Thu, Mar 16, 2017 at 5:13 AM, Petr Jelinek > wrote: > > Hmm now that you mention it, I remember discussing something similar > > with you last year in Dallas in regards to parallel query. IIRC Windows > > should not have this problem but other systems with EXEC_BACKEND do. > > Don't remember the details though. > > Generally, extension code can't use bgw_main safely, and must use > bgw_library_name and bgw_function_name. But bgw_main is supposedly > safe for core code. I indeed think it's not safe, and it's going to get less and less safe on windows (or EXEC_BACKEND). I don't think we can afford to disable ASLR in the long run (I indeed supect that'll just be disallowed at some point), and that's the only thing making it safe-ish in combination with EXEC_BACKEND. > If it's not even safe there, then I guess we should remove it entirely > as a useless foot-gun. I indeed think that's the right consequence. One question is what to replace it with exactly - are we guaranteed we can dynamically lookup symbols by name in the main binary on every platform? Alternatively we can just hardcode a bunch of bgw_function_name values that are matched to specific functions if bgw_library_name is NULL - I suspect that'd be the easiest / least worrysome portability-wise. Greetings, Andres Freund -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] logical replication launcher crash on buildfarm
On Thu, Mar 16, 2017 at 5:13 AM, Petr Jelinek wrote: > Hmm now that you mention it, I remember discussing something similar > with you last year in Dallas in regards to parallel query. IIRC Windows > should not have this problem but other systems with EXEC_BACKEND do. > Don't remember the details though. Generally, extension code can't use bgw_main safely, and must use bgw_library_name and bgw_function_name. But bgw_main is supposedly safe for core code. If it's not even safe there, then I guess we should remove it entirely as a useless foot-gun. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] logical replication launcher crash on buildfarm
On 16/03/17 09:53, Andres Freund wrote: > On 2017-03-16 09:40:48 +0100, Petr Jelinek wrote: >> On 16/03/17 04:42, Andres Freund wrote: >>> On 2017-03-15 20:28:33 -0700, Andres Freund wrote: Hi, I just unstuck a bunch of my buildfarm animals. That triggered some spurious failures (on piculet, calliphoridae, mylodon), but also one that doesn't really look like that: https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=culicidae&dt=2017-03-16%2002%3A40%3A03 with the pertinent point being: == stack trace: pgsql.build/src/test/regress/tmp_check/data/core == [New LWP 1894] [Thread debugging using libthread_db enabled] Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1". Core was generated by `postgres: bgworker: logical replication launcher '. Program terminated with signal SIGSEGV, Segmentation fault. #0 0x55e265bff5e3 in ?? () #0 0x55e265bff5e3 in ?? () #1 0x55d3ccabed0d in StartBackgroundWorker () at /home/andres/build/buildfarm-culicidae/HEAD/pgsql.build/../pgsql/src/backend/postmaster/bgworker.c:792 #2 0x55d3ccacf4fc in SubPostmasterMain (argc=3, argv=0x55d3cdbb71c0) at /home/andres/build/buildfarm-culicidae/HEAD/pgsql.build/../pgsql/src/backend/postmaster/postmaster.c:4878 #3 0x55d3cca443ea in main (argc=3, argv=0x55d3cdbb71c0) at /home/andres/build/buildfarm-culicidae/HEAD/pgsql.build/../pgsql/src/backend/main/main.c:205 it's possible that me killing things and upgrading caused this, but given this is a backend running EXEC_BACKEND, I'm a bit suspicous that it's more than that. The machine is a bit backed up at the moment, so it'll probably be a while till it's at that animal/branch again, otherwise I'd not have mentioned this. >>> >>> For some reason it ran again pretty soon. And I'm afraid it's indeed an >>> issue: >>> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=culicidae&dt=2017-03-16%2003%3A30%3A02 >>> >> >> Hmm, I tried with EXEC_BACKEND (and with --disable-spinlocks) and it >> seems to work fine on my two machines. I don't see anything else >> different on culicidae though. Sadly the backtrace is not that >> informative either. I'll try to investigate more but it will take time... > > Worthwhile additional failure: > https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=culicidae&dt=2017-03-16%2002%3A55%3A01 > > Same animal, also EXEC_BACKEND, but 9.6. > > A quick look at the relevant line: > /* >* If bgw_main is set, we use that value as the initial entrypoint. >* However, if the library containing the entrypoint wasn't loaded at >* postmaster startup time, passing it as a direct function pointer is > not >* possible. To work around that, we allow callers for whom a function >* pointer is not available to pass a library name (which will be > loaded, >* if necessary) and a function name (which will be looked up in the > named >* library). >*/ > if (worker->bgw_main != NULL) > entrypt = worker->bgw_main; > > makes the issue clear - we appear to be assuming that bgw_main is > meaningful across processes. Which it isn't in the EXEC_BACKEND case > when ASLR is in use... > > This kinda sounds familiar, but a quick google search doesn't find > anything relevant. Hmm now that you mention it, I remember discussing something similar with you last year in Dallas in regards to parallel query. IIRC Windows should not have this problem but other systems with EXEC_BACKEND do. Don't remember the details though. -- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] logical replication launcher crash on buildfarm
On 2017-03-16 09:40:48 +0100, Petr Jelinek wrote: > On 16/03/17 04:42, Andres Freund wrote: > > On 2017-03-15 20:28:33 -0700, Andres Freund wrote: > >> Hi, > >> > >> I just unstuck a bunch of my buildfarm animals. That triggered some > >> spurious failures (on piculet, calliphoridae, mylodon), but also one > >> that doesn't really look like that: > >> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=culicidae&dt=2017-03-16%2002%3A40%3A03 > >> > >> with the pertinent point being: > >> > >> == stack trace: > >> pgsql.build/src/test/regress/tmp_check/data/core == > >> [New LWP 1894] > >> [Thread debugging using libthread_db enabled] > >> Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1". > >> Core was generated by `postgres: bgworker: logical replication launcher > >> '. > >> Program terminated with signal SIGSEGV, Segmentation fault. > >> #0 0x55e265bff5e3 in ?? () > >> #0 0x55e265bff5e3 in ?? () > >> #1 0x55d3ccabed0d in StartBackgroundWorker () at > >> /home/andres/build/buildfarm-culicidae/HEAD/pgsql.build/../pgsql/src/backend/postmaster/bgworker.c:792 > >> #2 0x55d3ccacf4fc in SubPostmasterMain (argc=3, argv=0x55d3cdbb71c0) > >> at > >> /home/andres/build/buildfarm-culicidae/HEAD/pgsql.build/../pgsql/src/backend/postmaster/postmaster.c:4878 > >> #3 0x55d3cca443ea in main (argc=3, argv=0x55d3cdbb71c0) at > >> /home/andres/build/buildfarm-culicidae/HEAD/pgsql.build/../pgsql/src/backend/main/main.c:205 > >> > >> it's possible that me killing things and upgrading caused this, but > >> given this is a backend running EXEC_BACKEND, I'm a bit suspicous that > >> it's more than that. The machine is a bit backed up at the moment, so > >> it'll probably be a while till it's at that animal/branch again, > >> otherwise I'd not have mentioned this. > > > > For some reason it ran again pretty soon. And I'm afraid it's indeed an > > issue: > > https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=culicidae&dt=2017-03-16%2003%3A30%3A02 > > > > Hmm, I tried with EXEC_BACKEND (and with --disable-spinlocks) and it > seems to work fine on my two machines. I don't see anything else > different on culicidae though. Sadly the backtrace is not that > informative either. I'll try to investigate more but it will take time... Worthwhile additional failure: https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=culicidae&dt=2017-03-16%2002%3A55%3A01 Same animal, also EXEC_BACKEND, but 9.6. A quick look at the relevant line: /* * If bgw_main is set, we use that value as the initial entrypoint. * However, if the library containing the entrypoint wasn't loaded at * postmaster startup time, passing it as a direct function pointer is not * possible. To work around that, we allow callers for whom a function * pointer is not available to pass a library name (which will be loaded, * if necessary) and a function name (which will be looked up in the named * library). */ if (worker->bgw_main != NULL) entrypt = worker->bgw_main; makes the issue clear - we appear to be assuming that bgw_main is meaningful across processes. Which it isn't in the EXEC_BACKEND case when ASLR is in use... This kinda sounds familiar, but a quick google search doesn't find anything relevant. Greetings, Andres Freund -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] logical replication launcher crash on buildfarm
On 16/03/17 09:44, Andres Freund wrote: > On 2017-03-16 09:40:48 +0100, Petr Jelinek wrote: >> On 16/03/17 04:42, Andres Freund wrote: >>> On 2017-03-15 20:28:33 -0700, Andres Freund wrote: Hi, I just unstuck a bunch of my buildfarm animals. That triggered some spurious failures (on piculet, calliphoridae, mylodon), but also one that doesn't really look like that: https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=culicidae&dt=2017-03-16%2002%3A40%3A03 with the pertinent point being: == stack trace: pgsql.build/src/test/regress/tmp_check/data/core == [New LWP 1894] [Thread debugging using libthread_db enabled] Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1". Core was generated by `postgres: bgworker: logical replication launcher '. Program terminated with signal SIGSEGV, Segmentation fault. #0 0x55e265bff5e3 in ?? () #0 0x55e265bff5e3 in ?? () #1 0x55d3ccabed0d in StartBackgroundWorker () at /home/andres/build/buildfarm-culicidae/HEAD/pgsql.build/../pgsql/src/backend/postmaster/bgworker.c:792 #2 0x55d3ccacf4fc in SubPostmasterMain (argc=3, argv=0x55d3cdbb71c0) at /home/andres/build/buildfarm-culicidae/HEAD/pgsql.build/../pgsql/src/backend/postmaster/postmaster.c:4878 #3 0x55d3cca443ea in main (argc=3, argv=0x55d3cdbb71c0) at /home/andres/build/buildfarm-culicidae/HEAD/pgsql.build/../pgsql/src/backend/main/main.c:205 it's possible that me killing things and upgrading caused this, but given this is a backend running EXEC_BACKEND, I'm a bit suspicous that it's more than that. The machine is a bit backed up at the moment, so it'll probably be a while till it's at that animal/branch again, otherwise I'd not have mentioned this. >>> >>> For some reason it ran again pretty soon. And I'm afraid it's indeed an >>> issue: >>> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=culicidae&dt=2017-03-16%2003%3A30%3A02 >>> >> >> Hmm, I tried with EXEC_BACKEND (and with --disable-spinlocks) and it >> seems to work fine on my two machines. I don't see anything else >> different on culicidae though. Sadly the backtrace is not that >> informative either. I'll try to investigate more but it will take time... > > I can give you a login to that machine, it doesn't do anything but run > buildfarm animals... Will have to be my tomorrow however. > That would be helpful, thanks. -- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] logical replication launcher crash on buildfarm
On 2017-03-16 09:40:48 +0100, Petr Jelinek wrote: > On 16/03/17 04:42, Andres Freund wrote: > > On 2017-03-15 20:28:33 -0700, Andres Freund wrote: > >> Hi, > >> > >> I just unstuck a bunch of my buildfarm animals. That triggered some > >> spurious failures (on piculet, calliphoridae, mylodon), but also one > >> that doesn't really look like that: > >> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=culicidae&dt=2017-03-16%2002%3A40%3A03 > >> > >> with the pertinent point being: > >> > >> == stack trace: > >> pgsql.build/src/test/regress/tmp_check/data/core == > >> [New LWP 1894] > >> [Thread debugging using libthread_db enabled] > >> Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1". > >> Core was generated by `postgres: bgworker: logical replication launcher > >> '. > >> Program terminated with signal SIGSEGV, Segmentation fault. > >> #0 0x55e265bff5e3 in ?? () > >> #0 0x55e265bff5e3 in ?? () > >> #1 0x55d3ccabed0d in StartBackgroundWorker () at > >> /home/andres/build/buildfarm-culicidae/HEAD/pgsql.build/../pgsql/src/backend/postmaster/bgworker.c:792 > >> #2 0x55d3ccacf4fc in SubPostmasterMain (argc=3, argv=0x55d3cdbb71c0) > >> at > >> /home/andres/build/buildfarm-culicidae/HEAD/pgsql.build/../pgsql/src/backend/postmaster/postmaster.c:4878 > >> #3 0x55d3cca443ea in main (argc=3, argv=0x55d3cdbb71c0) at > >> /home/andres/build/buildfarm-culicidae/HEAD/pgsql.build/../pgsql/src/backend/main/main.c:205 > >> > >> it's possible that me killing things and upgrading caused this, but > >> given this is a backend running EXEC_BACKEND, I'm a bit suspicous that > >> it's more than that. The machine is a bit backed up at the moment, so > >> it'll probably be a while till it's at that animal/branch again, > >> otherwise I'd not have mentioned this. > > > > For some reason it ran again pretty soon. And I'm afraid it's indeed an > > issue: > > https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=culicidae&dt=2017-03-16%2003%3A30%3A02 > > > > Hmm, I tried with EXEC_BACKEND (and with --disable-spinlocks) and it > seems to work fine on my two machines. I don't see anything else > different on culicidae though. Sadly the backtrace is not that > informative either. I'll try to investigate more but it will take time... I can give you a login to that machine, it doesn't do anything but run buildfarm animals... Will have to be my tomorrow however. (Also need to fix config for older branches that don't work with the upgraded ssl. This is a really bad situation :() Greetings, Andres Freund -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] logical replication launcher crash on buildfarm
On 16/03/17 04:42, Andres Freund wrote: > On 2017-03-15 20:28:33 -0700, Andres Freund wrote: >> Hi, >> >> I just unstuck a bunch of my buildfarm animals. That triggered some >> spurious failures (on piculet, calliphoridae, mylodon), but also one >> that doesn't really look like that: >> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=culicidae&dt=2017-03-16%2002%3A40%3A03 >> >> with the pertinent point being: >> >> == stack trace: >> pgsql.build/src/test/regress/tmp_check/data/core == >> [New LWP 1894] >> [Thread debugging using libthread_db enabled] >> Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1". >> Core was generated by `postgres: bgworker: logical replication launcher >> '. >> Program terminated with signal SIGSEGV, Segmentation fault. >> #0 0x55e265bff5e3 in ?? () >> #0 0x55e265bff5e3 in ?? () >> #1 0x55d3ccabed0d in StartBackgroundWorker () at >> /home/andres/build/buildfarm-culicidae/HEAD/pgsql.build/../pgsql/src/backend/postmaster/bgworker.c:792 >> #2 0x55d3ccacf4fc in SubPostmasterMain (argc=3, argv=0x55d3cdbb71c0) at >> /home/andres/build/buildfarm-culicidae/HEAD/pgsql.build/../pgsql/src/backend/postmaster/postmaster.c:4878 >> #3 0x55d3cca443ea in main (argc=3, argv=0x55d3cdbb71c0) at >> /home/andres/build/buildfarm-culicidae/HEAD/pgsql.build/../pgsql/src/backend/main/main.c:205 >> >> it's possible that me killing things and upgrading caused this, but >> given this is a backend running EXEC_BACKEND, I'm a bit suspicous that >> it's more than that. The machine is a bit backed up at the moment, so >> it'll probably be a while till it's at that animal/branch again, >> otherwise I'd not have mentioned this. > > For some reason it ran again pretty soon. And I'm afraid it's indeed an > issue: > https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=culicidae&dt=2017-03-16%2003%3A30%3A02 > Hmm, I tried with EXEC_BACKEND (and with --disable-spinlocks) and it seems to work fine on my two machines. I don't see anything else different on culicidae though. Sadly the backtrace is not that informative either. I'll try to investigate more but it will take time... -- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] logical replication launcher crash on buildfarm
On 2017-03-15 20:28:33 -0700, Andres Freund wrote: > Hi, > > I just unstuck a bunch of my buildfarm animals. That triggered some > spurious failures (on piculet, calliphoridae, mylodon), but also one > that doesn't really look like that: > https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=culicidae&dt=2017-03-16%2002%3A40%3A03 > > with the pertinent point being: > > == stack trace: > pgsql.build/src/test/regress/tmp_check/data/core == > [New LWP 1894] > [Thread debugging using libthread_db enabled] > Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1". > Core was generated by `postgres: bgworker: logical replication launcher > '. > Program terminated with signal SIGSEGV, Segmentation fault. > #0 0x55e265bff5e3 in ?? () > #0 0x55e265bff5e3 in ?? () > #1 0x55d3ccabed0d in StartBackgroundWorker () at > /home/andres/build/buildfarm-culicidae/HEAD/pgsql.build/../pgsql/src/backend/postmaster/bgworker.c:792 > #2 0x55d3ccacf4fc in SubPostmasterMain (argc=3, argv=0x55d3cdbb71c0) at > /home/andres/build/buildfarm-culicidae/HEAD/pgsql.build/../pgsql/src/backend/postmaster/postmaster.c:4878 > #3 0x55d3cca443ea in main (argc=3, argv=0x55d3cdbb71c0) at > /home/andres/build/buildfarm-culicidae/HEAD/pgsql.build/../pgsql/src/backend/main/main.c:205 > > it's possible that me killing things and upgrading caused this, but > given this is a backend running EXEC_BACKEND, I'm a bit suspicous that > it's more than that. The machine is a bit backed up at the moment, so > it'll probably be a while till it's at that animal/branch again, > otherwise I'd not have mentioned this. For some reason it ran again pretty soon. And I'm afraid it's indeed an issue: https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=culicidae&dt=2017-03-16%2003%3A30%3A02 - Andres -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers