I have a report of a PL/Java crash in 9.6 where the stack trace
suggests it was trying to initialize in a background worker
process (not sure yet why that even happened), and at first
glance it seems to have crashed dereferencing MyProcPort, which
I am guessing a BGW might not always have (?).

So, as I try to get up to speed on this PostgreSQL feature, it
seems to me that I have up to three different cases that I may
need to make PL/Java detect and respond appropriately to. (If
you see me veering into any misconceptions, please let me know.)

1. A worker explicitly created with Register... or RegisterDynamic...
   that has not called ...InitializeConnection... and so isn't
   any particular user or connected to any database.

2. A worker explicitly created that has called ...Initialize...
   and therefore is connected to some database as some user.
   (So, is there a MyProcPort in this case?)

3. A worker implicitly created for a parallel query plan (and therefore
   associated with a database and a user). Does this have a MyProcPort?

Case 1, I think I at most need to detect and ereport. It is hard to
imagine how it could even arise, as without a database connection
there's no pg_extension, pg_language, or pg_proc, but I suppose it
could happen if someone misguidedly puts libpljava in
shared_preload_libraries, or some other bgw code inexplicably loads
it. It's a non-useful case as PL/Java has nothing to do without
a database connection and sqlj schema.

Case 2 might be worth supporting, but I may need to account for
anything that differs in this environment from a normal connected
backend.

Case 3 seems most likely. It should only be possible by invoking
a declared Java function that somebody marked parallel-safe, right?
In the parallel-unsafe or -restricted cases, PL/Java can only find
itself invoked within the leader process?

Such a leader process can only be a normal backend? Or perhaps also
a case-2 explicitly created BGW that is executing a query?

My main question is, what state do I need to examine at startup
in order to distinguish these cases? Do I detect I'm in a BGW by
a non-null MyBgworkerEntry? If it's there, do I detect whether
I have a database and an identity by checking for a MyProcPort,
or some other way?
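
To make the question concrete, here is a rough sketch of the kind
of startup check I have in mind. It is only a model: the struct and
the names ProcSnapshot, ProcKind, and classify_process are hypothetical,
and the fields stand in for the PostgreSQL globals MyBgworkerEntry,
MyProcPort, and MyDatabaseId and the IsParallelWorker() macro so that
the sketch compiles without server headers. Whether these are the right
things to test is exactly what I am asking.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical snapshot of per-process state. In real code each field
 * would be read from the server global named in its comment; they are
 * plain booleans here (an assumption made for illustration only).
 */
typedef struct ProcSnapshot
{
    bool has_bgworker_entry;   /* MyBgworkerEntry != NULL */
    bool has_proc_port;        /* MyProcPort != NULL */
    bool has_database;         /* OidIsValid(MyDatabaseId) */
    bool is_parallel_worker;   /* IsParallelWorker() */
} ProcSnapshot;

typedef enum ProcKind
{
    PROC_NORMAL_BACKEND,       /* ordinary client-connected backend */
    PROC_BGW_UNCONNECTED,      /* case 1: BGW with no database */
    PROC_BGW_CONNECTED,        /* case 2: BGW connected to a database */
    PROC_PARALLEL_WORKER       /* case 3: worker for a parallel query */
} ProcKind;

/* Map the observed state onto the three cases, or a normal backend. */
ProcKind
classify_process(const ProcSnapshot *st)
{
    assert(st != NULL);

    if (!st->has_bgworker_entry)
        return PROC_NORMAL_BACKEND;
    if (st->is_parallel_worker)
        return PROC_PARALLEL_WORKER;
    if (!st->has_database)
        return PROC_BGW_UNCONNECTED;
    return PROC_BGW_CONNECTED;
}

/*
 * Guard to apply before touching anything reached through MyProcPort,
 * whatever kind of process this turns out to be.
 */
bool
may_dereference_port(const ProcSnapshot *st)
{
    assert(st != NULL);
    return st->has_proc_port;
}
```

In this model PL/Java would ereport for PROC_BGW_UNCONNECTED, and would
consult may_dereference_port() rather than assume a port exists in any
of the BGW cases.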

As for declaring functions parallel-unsafe, -restricted, or -safe,
I assume there should be no problems with PL/Java functions with
the default designation of unsafe. There should be no essential
problem if someone declares a function -restricted - provided PL/Java
itself can be audited to make sure it doesn't do any of the things
restricted functions can't do - as it will only be running in the
leader process anyway.

Even if somebody marks a PL/Java function safe, which is hard to
imagine a good case for, that shouldn't really break anything; as the
workers are separate processes, this should be safe. Any imagined
speed advantage of the parallel query is likely to evaporate while
the several processes load their own JVMs, but nothing should
outright break.

That leads me to:

Are BGWs for parallel queries born fresh for each query, or do they
get pooled and reused?

If pooled, can they be reused across backends/database connections/
identities, or only by the backend that created them?

If reusable across contexts, that's a dealbreaker and I'd have to
have PL/Java reject any parallel-safe declaration, but a pool tied
to a connection should be ok (and better yet, allow amortizing the
JVM startup cost).

If pooled, and tied to the backend that started them, do they need
to do anything special to detect when the leader has executed

If all of this is covered to death in some document I obviously
haven't read, please feel free to point me to it.

