Re: [HACKERS] 9.6, background worker processes, and PL/Java

2016-10-26 Thread Craig Ringer
On 27 October 2016 at 09:22, Chapman Flack wrote:

> Hmm, IsUnderPostmaster is PGDLLIMPORTed but IsPostmasterEnvironment isn't,
> so I'm out of luck on Windows. Is there another way I can check?
>
>>> Do I detect I'm in a BGW by a non-null MyBgworkerEntry?
>>
>> Use IsBackgroundWorker, same place as above.
>
> Also not PGDLLIMPORTed. MyBgworkerEntry is, though. It does appear to be
> initialized to NULL. Can I get away with checking that, since I can't see
> IsBackgroundWorker?
>
> I now see what caused the reported crash. It was a parallel query that
> did not make any use of PL/Java functions, but the group leader had used
> them before so the library was loaded, so ParallelWorkerMain loaded it
> in the worker process, so _PG_init got called and was going to refer to
> stuff that wasn't set up yet, because the library loading comes pretty
> early in ParallelWorkerMain.
>
> I think I could easily fix that by having the library init code just bail
> right after defining the custom GUCs, if InitializingParallelWorker
> is true.
>
> Alas, InitializingParallelWorker isn't PGDLLIMPORTed either. This isn't
> my day. Is there a way I can successfully infer that on Windows?

Please submit a patch to make them all PGDLLIMPORT. They clearly
should be, for use in bgworkers.

I'd consider that a bugfix personally and hope it can be backpatched
to the stable branches. It's not going to break anything since nothing
external that runs on Windows can previously have been referring to
these symbols.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 9.6, background worker processes, and PL/Java

2016-10-26 Thread Chapman Flack
On 10/26/16 07:04, Amit Kapila wrote:
> No, parallel workers in a parallel query don't have MyProcPort.

Ok ... it turns out I was using MyProcPort as a quick way to grab
database_name and user_name (very early in startup, for a purpose
analogous to setting a 'ps' process title), and that seemed more
lightweight than other methods of getting the database
and user Oids and mapping those to the corresponding names.

But I guess I can change that easily enough.
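
A minimal sketch of that change, under stated assumptions: the Port struct
below is a stand-in that keeps only the two fields named above, and
session_names() and its slow_lookup callback are hypothetical names, not
PL/Java code. In a real extension the slow path would presumably map the
database and user Oids to names (for example with get_database_name and
GetUserNameFromId), which may not be usable this early in startup.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for PostgreSQL's Port struct: only the two fields the
 * message mentions are modelled here. */
typedef struct Port
{
    const char *database_name;
    const char *user_name;
} Port;

/* Hypothetical result type for this sketch. */
typedef struct SessionNames
{
    const char *database;
    const char *user;
} SessionNames;

/*
 * Use the Port when there is one (a normal client backend); otherwise
 * fall back to a slower lookup instead of dereferencing NULL, which is
 * what a parallel worker would otherwise do.
 */
SessionNames
session_names(const Port *port, SessionNames (*slow_lookup)(void))
{
    assert(slow_lookup != NULL);

    if (port != NULL && port->database_name != NULL && port->user_name != NULL)
    {
        SessionNames fast = { port->database_name, port->user_name };
        return fast;
    }
    return slow_lookup();
}

/* Hypothetical slow path; a real one would consult the catalogs. */
SessionNames
example_slow_lookup(void)
{
    SessionNames names = { "db_from_oid", "user_from_oid" };
    return names;
}
```

Passing the lookup as a callback keeps the sketch compilable without the
server headers; it is not meant to suggest how PL/Java structures this.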

> ...
>> Are BGWs for parallel queries born fresh for each query, or do they
>> get pooled and reused?
>
> born fresh for each query.

Yikes. But ok, if there's ever a reason to try to make a "safe"
Java function, I see there is a parallel_setup_cost GUC that could
be used to inform the planner of the higher cost when BGWs have to
start JVMs, so it probably wouldn't make parallel plans often, but
still could if analysis showed a sufficient advantage.


On 10/26/16 07:15, Amit Kapila wrote:

> All the GUCs are synchronised between leader and worker backends.

Ah, thanks.  I have now found README.parallel, so I much better understand
what is synchronized, and what operations are allowed or not. :)

On 10/26/16 07:42, Craig Ringer wrote:
>
> To detect being loaded via shared_preload_libraries, test
>
> IsPostmasterEnvironment && !IsUnderPostmaster

Hmm, IsUnderPostmaster is PGDLLIMPORTed but IsPostmasterEnvironment isn't,
so I'm out of luck on Windows. Is there another way I can check?

>> Do I detect I'm in a BGW by a non-null MyBgworkerEntry?
>
> Use IsBackgroundWorker, same place as above.

Also not PGDLLIMPORTed. MyBgworkerEntry is, though. It does appear to be
initialized to NULL. Can I get away with checking that, since I can't see
IsBackgroundWorker?

I now see what caused the reported crash. It was a parallel query that
did not make any use of PL/Java functions, but the group leader had used
them before so the library was loaded, so ParallelWorkerMain loaded it
in the worker process, so _PG_init got called and was going to refer to
stuff that wasn't set up yet, because the library loading comes pretty
early in ParallelWorkerMain.

I think I could easily fix that by having the library init code just bail
right after defining the custom GUCs, if InitializingParallelWorker
is true.

Alas, InitializingParallelWorker isn't PGDLLIMPORTed either. This isn't
my day. Is there a way I can successfully infer that on Windows?

I guess I can just bail from initialization early when in *any* kind
of background worker, and just leave the rest to be done when called
through the language handler, if ever.
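
A rough sketch of that plan, compilable on its own: the BackgroundWorker
type and MyBgworkerEntry variable below are stand-ins for the real ones,
and define_custom_gucs, finish_init, sketch_pg_init and
sketch_call_handler are hypothetical names chosen for illustration only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for PostgreSQL's BackgroundWorker struct; only whether the
 * pointer is NULL matters for this check. */
typedef struct BackgroundWorker
{
    int unused;
} BackgroundWorker;

/* Stand-in for the real global, which is NULL in a regular backend. */
BackgroundWorker *MyBgworkerEntry = NULL;

/* Hypothetical bookkeeping for the sketch. */
bool gucs_defined = false;
bool deferred_init_done = false;

static void
define_custom_gucs(void)
{
    gucs_defined = true;
}

static void
finish_init(void)
{
    /* The rest of initialization relies on the GUCs already existing. */
    assert(gucs_defined);
    deferred_init_done = true;
}

/* Hypothetical body of the library's _PG_init. */
void
sketch_pg_init(void)
{
    define_custom_gucs();

    /* In any kind of background worker, stop here and defer the rest. */
    if (MyBgworkerEntry != NULL)
        return;

    finish_init();
}

/* Hypothetical language call handler: completes deferred work on first use. */
void
sketch_call_handler(void)
{
    if (!deferred_init_done)
        finish_init();
    /* ... dispatch to the target function would follow here ... */
}
```

The GUCs are still defined in every process, and the expensive part runs
only if the call handler is actually reached.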

This would be so much easier if Visual Studio were not a thing.

-Chap


Re: [HACKERS] 9.6, background worker processes, and PL/Java

2016-10-26 Thread Craig Ringer
On 26 October 2016 at 06:56, Chapman Flack wrote:

> My main question is, what state do I need to examine at startup
> in order to distinguish these cases?

To detect being loaded via shared_preload_libraries, test

IsPostmasterEnvironment && !IsUnderPostmaster

See src/backend/utils/init/globals.c

> Do I detect I'm in a BGW by
> a non-null MyBgworkerEntry?

Use IsBackgroundWorker, same place as above.
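
The two tests above can be combined into one classification, sketched
here with the flags passed as parameters so the example compiles without
the server headers. The LoadContext enum, classify_load_context() and the
fourth "outside postmaster" case are assumptions made for illustration;
only the first two checks come from this message.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical labels for where the library finds itself loaded. */
typedef enum LoadContext
{
    LOAD_IN_POSTMASTER,         /* via shared_preload_libraries */
    LOAD_IN_BACKGROUND_WORKER,
    LOAD_IN_REGULAR_BACKEND,
    LOAD_OUTSIDE_POSTMASTER     /* assumption: neither flag set */
} LoadContext;

/*
 * The parameters mirror the globals IsPostmasterEnvironment,
 * IsUnderPostmaster and IsBackgroundWorker.
 */
LoadContext
classify_load_context(bool is_postmaster_environment,
                      bool is_under_postmaster,
                      bool is_background_worker)
{
    /* Running in the postmaster itself, not in one of its children. */
    if (is_postmaster_environment && !is_under_postmaster)
        return LOAD_IN_POSTMASTER;

    if (is_background_worker)
        return LOAD_IN_BACKGROUND_WORKER;

    if (is_under_postmaster)
        return LOAD_IN_REGULAR_BACKEND;

    return LOAD_OUTSIDE_POSTMASTER;
}

/* Small usage example for the three cases discussed in the thread. */
void
classify_load_context_demo(void)
{
    assert(classify_load_context(true, false, false) == LOAD_IN_POSTMASTER);
    assert(classify_load_context(true, true, true) == LOAD_IN_BACKGROUND_WORKER);
    assert(classify_load_context(true, true, false) == LOAD_IN_REGULAR_BACKEND);
}
```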


-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: [HACKERS] 9.6, background worker processes, and PL/Java

2016-10-26 Thread Amit Kapila
On Wed, Oct 26, 2016 at 7:39 AM, Chapman Flack wrote:
> On 10/25/16 18:56, Chapman Flack wrote:
>
>> If pooled, and tied to the backend that started them, do they need
>> to do anything special to detect when the leader has executed
>> SET ROLE or SET SESSION AUTHORIZATION?
>
> Let me guess ... such information is *not* synchronized across workers,
> and that'd be why the manual says "functions must be marked PARALLEL
> RESTRICTED if they access ... client connection state ..."?
>

All the GUCs are synchronised between leader and worker backends.

> That's probably a resounding 'no' for declaring any PL/Java function
> SAFE, then.
>
> And if changing "the transaction state even temporarily (e.g. a PL/pgsql
> function which establishes an EXCEPTION block to catch errors)" is enough
> to require UNSAFE, then it may be that RESTRICTED is off limits too, as
> there are places PL/Java does that internally.
>
> I take it that example refers not to just any use of PG_TRY/PG_CATCH,
> but only to those uses where an internal subtransaction is used to
> allow execution to continue?
>
> If a person writes a function in some language (SQL, for example),
> declares it PARALLEL SAFE but is lying because it calls another
> function (in Java, say) that is PARALLEL UNSAFE or RESTRICTED,
> does PostgreSQL detect or prevent that, or is it just considered
> an unfortunate mistake by the goofball who declared the first
> function safe?
>

No, we don't detect that explicitly before initiating parallelism.
However, there are checks in the code that will report an error if you do
something unsafe in a worker, for example performing any write operation
in a worker.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Re: [HACKERS] 9.6, background worker processes, and PL/Java

2016-10-26 Thread Amit Kapila
On Wed, Oct 26, 2016 at 4:26 AM, Chapman Flack wrote:
> Hi,
>
> I have a report of a PL/Java crash in 9.6 where the stack trace
> suggests it was trying to initialize in a background worker
> process (not sure why that even happened, yet), and by my first
> glance, it seems to have crashed dereferencing MyProcPort, which
> I am guessing a BGW might not always have (?).
>
> So, as I try to get up to speed on this PostgreSQL feature, it
> seems to me that I have up to three different cases that I may
> need to make PL/Java detect and respond appropriately to. (If
> you see me veering into any misconceptions, please let me know.)
>
> 1. A worker explicitly created with Register... or RegisterDynamic...
>    that has not called ...InitializeConnection... and so isn't
>    any particular user or connected to any database.
>
> 2. A worker explicitly created that has called ...Initialize...
>    and therefore is connected to some database as some user.
>    (So, is there a MyProcPort in this case?)
>
> 3. A worker implicitly created for a parallel query plan (and therefore
>    associated with a database and a user). Does this have a MyProcPort?
>

No, parallel workers in a parallel query don't have MyProcPort.

>
> Case 1, I think I at most need to detect and ereport. It is hard to
> imagine how it could even arise, as without a database connection
> there's no pg_extension, pg_language, or pg_proc, but I suppose it
> could happen if someone misguidedly puts libpljava in
> shared_preload_libraries, or some other bgw code inexplicably loads
> it. It's a non-useful case as PL/Java has nothing to do without
> a database connection and sqlj schema.
>
> Case 2 might be worth supporting, but I may need to account for
> anything that differs in this environment from a normal connected
> backend.
>
> Case 3 seems most likely. It should only be possible by invoking
> a declared Java function that somebody marked parallel-safe, right?
> In the parallel-unsafe or -restricted cases, PL/Java can only find
> itself invoked within the leader process?
>
> Such a leader process can only be a normal backend? Or perhaps also
> a case-2 explicitly created BGW that is executing a query?
>
> My main question is, what state do I need to examine at startup
> in order to distinguish these cases? Do I detect I'm in a BGW by
> a non-null MyBgworkerEntry? If it's there, do I detect whether
> I have a database and an identity by checking for a MyProcPort,
> or some other way?
>
> As for declaring functions parallel-unsafe, -restricted, or -safe,
> I assume there should be no problems with PL/Java functions with
> the default designation of unsafe. There should be no essential
> problem if someone declares a function -restricted - provided PL/Java
> itself can be audited to make sure it doesn't do any of the things
> restricted functions can't do - as it will only be running in the
> leader process anyway.
>
> Even if somebody marks a PL/Java function safe (hard as it is to
> imagine a good case for that), it shouldn't really break anything; as
> the workers are separate processes, this should be safe. Any imagined
> speed advantage of the parallel query is likely to evaporate while
> the several processes load their own JVMs, but nothing should
> outright break.
>
> That leads me to:
>
> Are BGWs for parallel queries born fresh for each query, or do they
> get pooled and reused?
>

born fresh for each query.



-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Re: [HACKERS] 9.6, background worker processes, and PL/Java

2016-10-25 Thread Chapman Flack
On 10/25/16 18:56, Chapman Flack wrote:

> If pooled, and tied to the backend that started them, do they need
> to do anything special to detect when the leader has executed
> SET ROLE or SET SESSION AUTHORIZATION?

Let me guess ... such information is *not* synchronized across workers,
and that'd be why the manual says "functions must be marked PARALLEL
RESTRICTED if they access ... client connection state ..."?

That's probably a resounding 'no' for declaring any PL/Java function
SAFE, then.

And if changing "the transaction state even temporarily (e.g. a PL/pgsql
function which establishes an EXCEPTION block to catch errors)" is enough
to require UNSAFE, then it may be that RESTRICTED is off limits too, as
there are places PL/Java does that internally.

I take it that example refers not to just any use of PG_TRY/PG_CATCH,
but only to those uses where an internal subtransaction is used to
allow execution to continue?

If a person writes a function in some language (SQL, for example),
declares it PARALLEL SAFE but is lying because it calls another
function (in Java, say) that is PARALLEL UNSAFE or RESTRICTED,
does PostgreSQL detect or prevent that, or is it just considered
an unfortunate mistake by the goofball who declared the first
function safe?

And if that's not already prevented, could it be worth adding
code in the PL/Java call handler to detect such a situation and
make sure it ends in a meaningful ereport and not something worse?
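
One possible shape for such a check, as a sketch only: the 's', 'r' and
'u' codes follow pg_proc.proparallel, but check_parallel_call(), the
CallVerdict enum and the exact rejection rules are assumptions for
illustration, and the two booleans stand in for whatever the handler
would really consult to learn whether it is in parallel mode or in a
parallel worker.

```c
#include <assert.h>
#include <stdbool.h>

/* Codes as stored in pg_proc.proparallel. */
#define SKETCH_PROPARALLEL_SAFE        's'
#define SKETCH_PROPARALLEL_RESTRICTED  'r'
#define SKETCH_PROPARALLEL_UNSAFE      'u'

/* Hypothetical verdict; a real handler would ereport on rejection. */
typedef enum CallVerdict
{
    CALL_OK,
    CALL_REJECT
} CallVerdict;

/*
 * Decide whether a function with the given declared parallel marking
 * may run in the current process.
 */
CallVerdict
check_parallel_call(char proparallel,
                    bool in_parallel_mode,
                    bool in_parallel_worker)
{
    assert(proparallel == SKETCH_PROPARALLEL_SAFE ||
           proparallel == SKETCH_PROPARALLEL_RESTRICTED ||
           proparallel == SKETCH_PROPARALLEL_UNSAFE);

    /* Inside a worker, only functions declared safe should ever be reached. */
    if (in_parallel_worker && proparallel != SKETCH_PROPARALLEL_SAFE)
        return CALL_REJECT;

    /* Assumed rule: an unsafe function must not run while parallel mode is active. */
    if (in_parallel_mode && proparallel == SKETCH_PROPARALLEL_UNSAFE)
        return CALL_REJECT;

    return CALL_OK;
}
```

This only catches a mismatch at call time, after a mislabeled outer
function has already led to a parallel plan; it would turn the situation
into a clean error rather than prevent it.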

-Chap

