Re: [HACKERS] Making plpython 2 and 3 coexist a bit better

2016-01-18 Thread Bruce Momjian
On Mon, Jan 11, 2016 at 10:44:42AM -0500, Tom Lane wrote:
> Commit 803716013dc1350f installed a safeguard against loading plpython2
> and plpython3 at the same time, stating
> 
> +   It is not allowed to use PL/Python based on Python 2 and PL/Python
> +   based on Python 3 in the same session, because the symbols in the
> +   dynamic modules would clash, which could result in crashes of the
> +   PostgreSQL server process.  There is a check that prevents mixing
> +   Python major versions in a session, which will abort the session if
> +   a mismatch is detected.  It is possible, however, to use both
> +   PL/Python variants in the same database, from separate sessions.
> 
> It turns out though that the freedom promised in the last sentence
> is fairly illusory, because if you have functions in both languages
> in one DB and you try a dump-and-restore, it will fail.
> 
> But it gets worse: a report today in pgsql-general points out that
> even if you have the two languages in use *in different databases*,
> pg_upgrade will fail, because pg_upgrade rather overenthusiastically
> loads every .so mentioned anywhere in the source installation into
> one session.

The fix for that would be for pg_upgrade to change
check_loadable_libraries() to start a new session for each LOAD command.
Would you like that done?

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

+ As you are, so once was I. As I am, so you will be. +
+ Roman grave inscription +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Making plpython 2 and 3 coexist a bit better

2016-01-18 Thread Tom Lane
Bruce Momjian  writes:
> On Mon, Jan 11, 2016 at 10:44:42AM -0500, Tom Lane wrote:
>> But it gets worse: a report today in pgsql-general points out that
>> even if you have the two languages in use *in different databases*,
>> pg_upgrade will fail, because pg_upgrade rather overenthusiastically
>> loads every .so mentioned anywhere in the source installation into
>> one session.

> The fix for that would be for pg_upgrade to change
> check_loadable_libraries() to start a new session for each LOAD command.
> Would you like that done?

Not necessary anymore, at least not for PL/Python; and that solution
sounds a tad expensive.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Making plpython 2 and 3 coexist a bit better

2016-01-18 Thread Bruce Momjian
On Mon, Jan 18, 2016 at 01:10:05PM -0500, Tom Lane wrote:
> Bruce Momjian  writes:
> > On Mon, Jan 11, 2016 at 10:44:42AM -0500, Tom Lane wrote:
> >> But it gets worse: a report today in pgsql-general points out that
> >> even if you have the two languages in use *in different databases*,
> >> pg_upgrade will fail, because pg_upgrade rather overenthusiastically
> >> loads every .so mentioned anywhere in the source installation into
> >> one session.
> 
> > The fix for that would be for pg_upgrade to change
> > check_loadable_libraries() to start a new session for each LOAD command.
> > Would you like that done?
> 
> Not necessary anymore, at least not for PL/Python; and that solution
> sounds a tad expensive.

Yes, it would certainly be inefficient.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

+ As you are, so once was I. As I am, so you will be. +
+ Roman grave inscription +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Making plpython 2 and 3 coexist a bit better

2016-01-11 Thread Tom Lane
I wrote:
> In further experimentation, I found out that even with this patch in
> place, the plpython transform extensions still present a dump/reload
> hazard:
> ...
> But I wonder why we couldn't make it do
> LOAD 'plpython3';

I've verified that the attached additional patch makes everything behave
nicely in this corner of the world.  AFAICS this adds no assumptions
that aren't already built into pg_pltemplate.

Last call for objections ...

regards, tom lane

diff --git a/contrib/hstore_plperl/hstore_plperl--1.0.sql b/contrib/hstore_plperl/hstore_plperl--1.0.sql
index ea0ad76..a4fd7c2 100644
*** a/contrib/hstore_plperl/hstore_plperl--1.0.sql
--- b/contrib/hstore_plperl/hstore_plperl--1.0.sql
***
*** 1,5 
  -- make sure the prerequisite libraries are loaded
! DO '' LANGUAGE plperl;
  SELECT NULL::hstore;
  
  
--- 1,5 
  -- make sure the prerequisite libraries are loaded
! LOAD 'plperl';
  SELECT NULL::hstore;
  
  
diff --git a/contrib/hstore_plperl/hstore_plperlu--1.0.sql b/contrib/hstore_plperl/hstore_plperlu--1.0.sql
index 46ad35c..2c2e3e3 100644
*** a/contrib/hstore_plperl/hstore_plperlu--1.0.sql
--- b/contrib/hstore_plperl/hstore_plperlu--1.0.sql
***
*** 1,5 
  -- make sure the prerequisite libraries are loaded
! DO '' LANGUAGE plperlu;
  SELECT NULL::hstore;
  
  
--- 1,5 
  -- make sure the prerequisite libraries are loaded
! LOAD 'plperl';
  SELECT NULL::hstore;
  
  
diff --git a/contrib/hstore_plpython/hstore_plpython2u--1.0.sql b/contrib/hstore_plpython/hstore_plpython2u--1.0.sql
index c998de5..a793dc9 100644
*** a/contrib/hstore_plpython/hstore_plpython2u--1.0.sql
--- b/contrib/hstore_plpython/hstore_plpython2u--1.0.sql
***
*** 1,5 
  -- make sure the prerequisite libraries are loaded
! DO '1' LANGUAGE plpython2u;
  SELECT NULL::hstore;
  
  
--- 1,5 
  -- make sure the prerequisite libraries are loaded
! LOAD 'plpython2';
  SELECT NULL::hstore;
  
  
diff --git a/contrib/hstore_plpython/hstore_plpython3u--1.0.sql b/contrib/hstore_plpython/hstore_plpython3u--1.0.sql
index 61d0e47..a85c85d 100644
*** a/contrib/hstore_plpython/hstore_plpython3u--1.0.sql
--- b/contrib/hstore_plpython/hstore_plpython3u--1.0.sql
***
*** 1,5 
  -- make sure the prerequisite libraries are loaded
! DO '1' LANGUAGE plpython3u;
  SELECT NULL::hstore;
  
  
--- 1,5 
  -- make sure the prerequisite libraries are loaded
! LOAD 'plpython3';
  SELECT NULL::hstore;
  
  
diff --git a/contrib/hstore_plpython/hstore_plpythonu--1.0.sql b/contrib/hstore_plpython/hstore_plpythonu--1.0.sql
index 6acb97a..e2e4721 100644
*** a/contrib/hstore_plpython/hstore_plpythonu--1.0.sql
--- b/contrib/hstore_plpython/hstore_plpythonu--1.0.sql
***
*** 1,5 
  -- make sure the prerequisite libraries are loaded
! DO '1' LANGUAGE plpythonu;
  SELECT NULL::hstore;
  
  
--- 1,5 
  -- make sure the prerequisite libraries are loaded
! LOAD 'plpython2';  -- change to plpython3 if that ever becomes the default
  SELECT NULL::hstore;
  
  
diff --git a/contrib/ltree_plpython/ltree_plpython2u--1.0.sql b/contrib/ltree_plpython/ltree_plpython2u--1.0.sql
index 29a12d4..f040bd3 100644
*** a/contrib/ltree_plpython/ltree_plpython2u--1.0.sql
--- b/contrib/ltree_plpython/ltree_plpython2u--1.0.sql
***
*** 1,5 
  -- make sure the prerequisite libraries are loaded
! DO '1' LANGUAGE plpython2u;
  SELECT NULL::ltree;
  
  
--- 1,5 
  -- make sure the prerequisite libraries are loaded
! LOAD 'plpython2';
  SELECT NULL::ltree;
  
  
diff --git a/contrib/ltree_plpython/ltree_plpython3u--1.0.sql b/contrib/ltree_plpython/ltree_plpython3u--1.0.sql
index 1300a78..7afe51f 100644
*** a/contrib/ltree_plpython/ltree_plpython3u--1.0.sql
--- b/contrib/ltree_plpython/ltree_plpython3u--1.0.sql
***
*** 1,5 
  -- make sure the prerequisite libraries are loaded
! DO '1' LANGUAGE plpython3u;
  SELECT NULL::ltree;
  
  
--- 1,5 
  -- make sure the prerequisite libraries are loaded
! LOAD 'plpython3';
  SELECT NULL::ltree;
  
  
diff --git a/contrib/ltree_plpython/ltree_plpythonu--1.0.sql b/contrib/ltree_plpython/ltree_plpythonu--1.0.sql
index 1d1af28..50f35dd 100644
*** a/contrib/ltree_plpython/ltree_plpythonu--1.0.sql
--- b/contrib/ltree_plpython/ltree_plpythonu--1.0.sql
***
*** 1,5 
  -- make sure the prerequisite libraries are loaded
! DO '1' LANGUAGE plpythonu;
  SELECT NULL::ltree;
  
  
--- 1,5 
  -- make sure the prerequisite libraries are loaded
! LOAD 'plpython2';  -- change to plpython3 if that ever becomes the default
  SELECT NULL::ltree;
  
  

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Making plpython 2 and 3 coexist a bit better

2016-01-11 Thread Jim Nasby

On 1/11/16 11:51 AM, Tom Lane wrote:

We could ameliorate the first of these cases by putting the can't-run-
with-two-pythons error back up to FATAL rather than ERROR, but I'm not
sure that that would be a net improvement --- FATAL errors aren't very
friendly.  In any case errors of the second type seem unpreventable
unless we stick with the immediate-FATAL-error approach.


Something that's always concerned me about functions in other languages 
is that any kind of snafu in the function/language can hose the backend, 
which you may or may not detect. I've used other databases that (by 
default) spin up a separate process for executing functions, maybe we 
could do something like that? If we treated 2 and 3 as different 
languages you could actually use both at the same time in a single 
backend. The only thing that's not clear to me is how you'd be able to 
re-enter the process during recursive/nested calls.


Obviously this is a lot more work than what you're proposing though. :(
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Making plpython 2 and 3 coexist a bit better

2016-01-11 Thread Tom Lane
I wrote:
> I think we might be able to improve this if we don't do the check in
> plpython's _PG_init(), but delay it until we have a need to call into
> libpython; which we could skip doing at _PG_init() time, at the cost
> of perhaps one more flag check per plpython function call to see if
> we've initialized libpython yet.

> The behavior would then be that if you load both language .so's
> into one session, neither one would be willing to do anything
> anymore --- not run a function, and not syntax-check one either,
> because any attempted call into either libpython might crash.
> But restoring a dump has no need to do either of those things;
> it just wants to be able to LOAD the .so.  Also, the error for
> not being able to do something wouldn't have to be FATAL.

Attached is a draft patch along this line.  It seems to work as intended.

I can think of at least a couple of ways in which this test might be
inadequate.  One is that a plpython2 function might call a plpython3
function or vice versa.  The inner plpython_call_handler would correctly
throw an error, but if the outer function trapped the error and continued
execution, it would be at risk.  Another, more hypothetical risk is that
once we've executed anything out of one libpython, it might have changed
process state (eg, installed hooks) in such a way that it can get control
back even without an explicit call to plpython.  In that case, a
subsequent load of another Python version would put things at risk for
such execution in the first libpython.

We could ameliorate the first of these cases by putting the can't-run-
with-two-pythons error back up to FATAL rather than ERROR, but I'm not
sure that that would be a net improvement --- FATAL errors aren't very
friendly.  In any case errors of the second type seem unpreventable
unless we stick with the immediate-FATAL-error approach.

Since plpython only comes in an untrusted (hence superuser-only) flavor,
holes like this wouldn't be security issues but only "well, you shouldn't
do that" problems.  I'm inclined to think it's a good tradeoff for being
able to dump/restore/upgrade mixed installations, which are surely
likely to get more popular.

Thoughts?

regards, tom lane

diff --git a/src/pl/plpython/plpy_main.c b/src/pl/plpython/plpy_main.c
index 3c2ebfa..059adfa 100644
*** a/src/pl/plpython/plpy_main.c
--- b/src/pl/plpython/plpy_main.c
*** static void PLy_init_interp(void);
*** 63,69 
  static PLyExecutionContext *PLy_push_execution_context(void);
  static void PLy_pop_execution_context(void);
  
! static const int plpython_python_version = PY_MAJOR_VERSION;
  
  /* initialize global variables */
  PyObject   *PLy_interp_globals = NULL;
--- 63,70 
  static PLyExecutionContext *PLy_push_execution_context(void);
  static void PLy_pop_execution_context(void);
  
! static int *plpython_version_bitmask_ptr = NULL;
! static int	plpython_version_bitmask = 0;
  
  /* initialize global variables */
  PyObject   *PLy_interp_globals = NULL;
*** static PLyExecutionContext *PLy_executio
*** 75,102 
  void
  _PG_init(void)
  {
! 	/* Be sure we do initialization only once (should be redundant now) */
! 	static bool inited = false;
  	const int **version_ptr;
  
! 	if (inited)
! 		return;
  
! 	/* Be sure we don't run Python 2 and 3 in the same session (might crash) */
  	version_ptr = (const int **) find_rendezvous_variable("plpython_python_version");
! 	if (!(*version_ptr))
! 		*version_ptr = _python_version;
! 	else
! 	{
! 		if (**version_ptr != plpython_python_version)
! 			ereport(FATAL,
! 	(errmsg("Python major version mismatch in session"),
! 	 errdetail("This session has previously used Python major version %d, and it is now attempting to use Python major version %d.",
! 			   **version_ptr, plpython_python_version),
! 	 errhint("Start a new session to use a different Python major version.")));
! 	}
  
! 	pg_bindtextdomain(TEXTDOMAIN);
  
  #if PY_MAJOR_VERSION >= 3
  	PyImport_AppendInittab("plpy", PyInit_plpy);
--- 76,144 
  void
  _PG_init(void)
  {
! 	int		  **bitmask_ptr;
  	const int **version_ptr;
  
! 	/*
! 	 * Set up a shared bitmask variable telling which Python version(s) are
! 	 * loaded into this process's address space.  If there's more than one, we
! 	 * cannot call into libpython for fear of causing crashes.  But postpone
! 	 * the actual failure for later, so that operations like pg_restore can
! 	 * load more than one plpython library so long as they don't try to do
! 	 * anything much with the language.
! 	 */
! 	bitmask_ptr = (int **) find_rendezvous_variable("plpython_version_bitmask");
! 	if (!(*bitmask_ptr))		/* am I the first? */
! 		*bitmask_ptr = _version_bitmask;
! 	/* Retain pointer to the agreed-on shared variable ... */
! 	plpython_version_bitmask_ptr = *bitmask_ptr;
! 	/* ... and announce my presence */
! 	*plpython_version_bitmask_ptr |= (1 << PY_MAJOR_VERSION);
  
! 	/*
! 	 * This should 

Re: [HACKERS] Making plpython 2 and 3 coexist a bit better

2016-01-11 Thread Tom Lane
Jim Nasby  writes:
> Something that's always concerned me about functions in other languages 
> is that any kind of snafu in the function/language can hose the backend, 
> which you may or may not detect. I've used other databases that (by 
> default) spin up a separate process for executing functions, maybe we 
> could do something like that? If we treated 2 and 3 as different 
> languages you could actually use both at the same time in a single 
> backend. The only thing that's not clear to me is how you'd be able to 
> re-enter the process during recursive/nested calls.

There's at least one PL/Java implementation that does that.  The
interprocess communication overhead is pretty awful, IIRC.  Don't know
what they do about nested calls.

> Obviously this is a lot more work than what you're proposing though. :(

Yeah.  I think what I'm suggesting is a back-patchable fix, which that
certainly wouldn't be.

I realized that the patch I put up before would do the Wrong Thing if
an updated library were loaded followed by a not-updated library for
the other python version.  Ideally that couldn't happen, but considering
that the two plpythons are likely to get packaged in separate subpackages
(certainly Red Hat does things that way), it's not too hard to envision
cases where it would.  So the attached update corrects that and fixes the
docs too.  We can lose the whole test in HEAD, but I think it's necessary
in the back branches.

The question of whether to do ERROR or FATAL remains open.  I'm not sure
I have a strong preference either way.

regards, tom lane

diff --git a/doc/src/sgml/plpython.sgml b/doc/src/sgml/plpython.sgml
index 015bbad..03f6bcc 100644
*** a/doc/src/sgml/plpython.sgml
--- b/doc/src/sgml/plpython.sgml
***
*** 176,183 
 based on Python 3 in the same session, because the symbols in the
 dynamic modules would clash, which could result in crashes of the
 PostgreSQL server process.  There is a check that prevents mixing
!Python major versions in a session, which will abort the session if
!a mismatch is detected.  It is possible, however, to use both
 PL/Python variants in the same database, from separate sessions.

   
--- 176,183 
 based on Python 3 in the same session, because the symbols in the
 dynamic modules would clash, which could result in crashes of the
 PostgreSQL server process.  There is a check that prevents mixing
!Python major versions within a session.
!It is possible, however, to use both
 PL/Python variants in the same database, from separate sessions.

   
diff --git a/src/pl/plpython/plpy_main.c b/src/pl/plpython/plpy_main.c
index 3c2ebfa..6805148 100644
*** a/src/pl/plpython/plpy_main.c
--- b/src/pl/plpython/plpy_main.c
*** static void PLy_init_interp(void);
*** 63,68 
--- 63,71 
  static PLyExecutionContext *PLy_push_execution_context(void);
  static void PLy_pop_execution_context(void);
  
+ /* static state for Python library conflict detection */
+ static int *plpython_version_bitmask_ptr = NULL;
+ static int	plpython_version_bitmask = 0;
  static const int plpython_python_version = PY_MAJOR_VERSION;
  
  /* initialize global variables */
*** static PLyExecutionContext *PLy_executio
*** 75,102 
  void
  _PG_init(void)
  {
! 	/* Be sure we do initialization only once (should be redundant now) */
! 	static bool inited = false;
  	const int **version_ptr;
  
! 	if (inited)
! 		return;
  
! 	/* Be sure we don't run Python 2 and 3 in the same session (might crash) */
  	version_ptr = (const int **) find_rendezvous_variable("plpython_python_version");
  	if (!(*version_ptr))
  		*version_ptr = _python_version;
  	else
  	{
! 		if (**version_ptr != plpython_python_version)
  			ereport(FATAL,
  	(errmsg("Python major version mismatch in session"),
  	 errdetail("This session has previously used Python major version %d, and it is now attempting to use Python major version %d.",
  			   **version_ptr, plpython_python_version),
  	 errhint("Start a new session to use a different Python major version.")));
  	}
  
! 	pg_bindtextdomain(TEXTDOMAIN);
  
  #if PY_MAJOR_VERSION >= 3
  	PyImport_AppendInittab("plpy", PyInit_plpy);
--- 78,155 
  void
  _PG_init(void)
  {
! 	int		  **bitmask_ptr;
  	const int **version_ptr;
  
! 	/*
! 	 * Set up a shared bitmask variable telling which Python version(s) are
! 	 * loaded into this process's address space.  If there's more than one, we
! 	 * cannot call into libpython for fear of causing crashes.  But postpone
! 	 * the actual failure for later, so that operations like pg_restore can
! 	 * load more than one plpython library so long as they don't try to do
! 	 * anything much with the language.
! 	 */
! 	bitmask_ptr = (int **) find_rendezvous_variable("plpython_version_bitmask");
! 	if (!(*bitmask_ptr))		/* am I the first? */
! 		*bitmask_ptr = _version_bitmask;

Re: [HACKERS] Making plpython 2 and 3 coexist a bit better

2016-01-11 Thread Jim Nasby

On 1/11/16 1:00 PM, Tom Lane wrote:

There's at least one PL/Java implementation that does that.  The
interprocess communication overhead is pretty awful, IIRC.  Don't know
what they do about nested calls.


You'd think that pipes wouldn't be that much overhead...


>Obviously this is a lot more work than what you're proposing though.:(

Yeah.  I think what I'm suggesting is a back-patchable fix, which that
certainly wouldn't be.


Yeah, and it sounds like we need one.


The question of whether to do ERROR or FATAL remains open.  I'm not sure
I have a strong preference either way.


If they both get loaded is there risk of bad data happening? Personally, 
I'll take a traceable FATAL (or even PANIC) over data corruption every 
time. But I'm guessing that if you tried to use both you'd pretty 
immediately end up crashing the backend.

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Making plpython 2 and 3 coexist a bit better

2016-01-11 Thread Tom Lane
Jim Nasby  writes:
> On 1/11/16 1:00 PM, Tom Lane wrote:
>> The question of whether to do ERROR or FATAL remains open.  I'm not sure
>> I have a strong preference either way.

> If they both get loaded is there risk of bad data happening? Personally, 
> I'll take a traceable FATAL (or even PANIC) over data corruption every 
> time. But I'm guessing that if you tried to use both you'd pretty 
> immediately end up crashing the backend.

Yeah, the conservative solution would definitely be to use FATAL.
It's still a usability improvement over what we have now.

In further experimentation, I found out that even with this patch in
place, the plpython transform extensions still present a dump/reload
hazard:

test=# create extension plpython2u;
CREATE EXTENSION
test=# create extension ltree_plpython3u cascade;
NOTICE:  installing required extension "ltree"
NOTICE:  installing required extension "plpython3u"
ERROR:  multiple Python libraries are present in session
DETAIL:  Only one Python major version can be used in one session.

AFAICS, the reason for that is this code in the transform extension
scripts:

-- make sure the prerequisite libraries are loaded
DO '1' LANGUAGE plpython3u;

It does appear that we need something for this, because if you just
remove it you get failures like

Symbol not found: _PLyUnicode_FromStringAndSize

But I wonder why we couldn't make it do

LOAD 'plpython3';

instead.  If I change the script like that, it seems to go through fine
even with both Python libraries loaded, because CREATE TRANSFORM does not
actually call into the language implementation.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers