On Mon, Oct 22, 2018 at 9:36 AM Jeremy Finzel <finz...@gmail.com> wrote:

> Hello -
>
> I have an extension that uses background workers.  I pass a database oid
> as an argument in order to launch the worker using function
> BackgroundWorkerInitializeConnectionByOid.  In one of my regression tests
> that was written, I intentionally launch the worker with an invalid oid.
> In earlier PG versions the worker would successfully launch but then
> terminate asynchronously, with a message in the server log.  Now, it does
> not even successfully launch but immediately errors (hence failing my
> regression tests).
>
> I have recently installed all later point releases of all versions 9.5-11,
> so I assume this is due to some code change.  The behavior seems reasonable
> but I don't find any obvious release notes indicating a patch that would
> have changed this behavior.  Any thoughts?
>
> Thanks,
> Jeremy
>

I still haven't determined the source of this error, but it cannot be a
difference between point releases in background worker error handling,
because I am seeing different behavior for the identical postgres version
on my machine vs. others.  I would appreciate any ideas as to how this
could happen, because I am no longer sure of the right way to build this
regression test.

The test launches the background worker with an invalid database oid.

Here is what I am seeing running pg 11.1 on my system (same behavior I get
on 9.5-10 as well):

 SELECT _launch(9999999::OID) AS pid;
! ERROR:  could not start background process
! HINT:  More details may be available in the server log.

This is what others are seeing (the worker fails asynchronously and you see
it in the server log):

 SELECT _launch(9999999::OID) AS pid;
!   pid
! -------
!  18022
! (1 row)

I could share the C code but it's not that interesting.  It just calls
BackgroundWorkerInitializeConnectionByOid.  It is essentially a duplicate
of worker_spi.  Here is the relevant section:

sprintf(worker.bgw_function_name, "worker_spi_main");
snprintf(worker.bgw_name, BGW_MAXLEN, "worker_spi worker %d", i);
snprintf(worker.bgw_type, BGW_MAXLEN, "worker_spi");
worker.bgw_main_arg = Int32GetDatum(i);
/* set bgw_notify_pid so that we can use WaitForBackgroundWorkerStartup */
worker.bgw_notify_pid = MyProcPid;

if (!RegisterDynamicBackgroundWorker(&worker, &handle))
    PG_RETURN_NULL();

status = WaitForBackgroundWorkerStartup(handle, &pid);

if (status == BGWH_STOPPED)
    ereport(ERROR,
            (errcode(ERRCODE_INSUFFICIENT_RESOURCES),
             errmsg("could not start background process"),
             errhint("More details may be available in the server log.")));

So on my machine, WaitForBackgroundWorkerStartup returns BGWH_STOPPED,
whereas on others' machines it does not.

Thanks,
Jeremy
