While testing last night, I discovered a serious case of brain fade in
parallel.c; the same conceptual mistake has also spread to
nodeGather.c.  parallel.c creates an array of ParallelWorkerInfo
structures, which are defined like this:

typedef struct ParallelWorkerInfo
        BackgroundWorkerHandle *bgwhandle;
        shm_mq_handle *error_mqh;
        int32           pid;
} ParallelWorkerInfo;

These structures are typically accessed as pcxt->worker[i].  The
trouble is that pcxt->worker[i].bgwhandle is the bgwhandler for the
i'th worker to be registered, while pcxt->worker[i].error_mqh is the
error_mqh for the i'th worker to attach to the dynamic shared memory
segment.  But those might be totally different processes.

This happens to mostly work, because the postmaster will probably
start the processes in the same order that they are registered, and
the operating system will probably schedule them in the same order the
postmaster starts them.  And if you only have one worker, then you're
fine!  It also seems to work out that if all workers exit cleanly, any
shuffling of the background worker handles relative to the error queue
handles is harmless.  But it's still pretty broken.

There seem to be two ways to fix this.  One is to keep the bgwhandle
objects in a separate array from the error_mqh objects and reconstruct
after the fact what the mapping between those two sets of indexes is.
This solution looks complicated to code and generally pretty annoying
to get right.  The other way to fix this is to pass down the index
that the leader assigns to any given worker, and have the worker use
that index instead of allocating its own separate index after
connecting to the DSM segment.  Unfortunately, there's not really a
good way to pass that additional information down to the worker right
now, but we could fix that pretty easily by adding an additional field
to the BackgroundWorker structure, which the worker would then be able
to access via MyBgworkerEntry.  I suggest something like this:

--- a/src/include/postmaster/bgworker.h
+++ b/src/include/postmaster/bgworker.h
@@ -74,6 +74,7 @@ typedef enum
 #define BGW_NEVER_RESTART                              -1
 #define BGW_MAXLEN                                             64
+#define BGW_EXTRALEN                                   128

 typedef struct BackgroundWorker
@@ -85,6 +86,7 @@ typedef struct BackgroundWorker
        char            bgw_library_name[BGW_MAXLEN];   /* only if
bgw_main is NULL */
        char            bgw_function_name[BGW_MAXLEN];  /* only if
bgw_main is NULL */
        Datum           bgw_main_arg;
+       char            bgw_extra[BGW_EXTRALEN];
        pid_t           bgw_notify_pid; /* SIGUSR1 this backend on start/stop */
 } BackgroundWorker;

The ability to pass down a little more information from the
registering process to the background worker seems like it would be
useful for more than just parallelism, so I imagine this change might
be generally welcomed.  Parallel workers would use the first four
bytes of bgw_extra to store the worker index, and other users of the
background worker facility could use it for whatever they like.


Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:

Reply via email to