On Wed, Jan 27, 2010 at 01:14:16AM -0500, Tom Lane wrote: > Andrew Dunstan <and...@dunslane.net> writes: > > Tim Bunce wrote: > >> - Added plperl.on_perl_init GUC for DBA use (PGC_SIGHUP) > >> SPI functions are not available when the code is run. > >> > >> - Added normal interpreter destruction behaviour > >> END blocks, if any, are run then objects are > >> destroyed, calling their DESTROY methods, if any. > >> SPI functions will die if called at this time. > > > So, are there still objections to applying this patch? > > Yes.
To focus the discussion I've looked back through all the messages from you that relate to this issue so I can summarize and try to address your objections. Some I've split or presented out of order, most relate to earlier (less restricted) versions of the patch before it was split out, and naturally they are lacking some context, so I've included archive URLs. Please forgive and correct me if I misrepresent you or your intent here. Regarding the utility of plperl.on_perl_init and END: http://archives.postgresql.org/message-id/18338.1260033...@sss.pgh.pa.us The question is not about whether we think it's useful; the question is about whether it's safe. I agree. Regarding visibility of changes to plperl.on_perl_init: http://archives.postgresql.org/message-id/28618.1259952...@sss.pgh.pa.us What is to happen if the admin changes the value when the system is already up? If a GUC could be defined as PGC_BACKEND and only settable by superuser, perhaps that would be a good fit. [GucContext seems to conflate some things.] Meanwhile the _init name is meant to convey the fact that it's a before-first-use GUC, like temp_buffers. I'm happy to accept whatever you'd recommend by way of PGC_* GUC selection. Documentation can note any caveats associated with combining plperl.on_perl_init with shared_preload_libraries. http://archives.postgresql.org/message-id/4516.1263168...@sss.pgh.pa.us However, I think PGC_SIGHUP would be enough to address my basic worry, which is that people shouldn't be depending on the ability to set these things within an individual session. The patch uses PGC_SIGHUP for plperl.on_perl_init. http://archives.postgresql.org/message-id/8950.1259994...@sss.pgh.pa.us > Tom, what's your objection to Shlib load time being user-visible? It's not really designed to be user-visible. Let me give you just two examples: * We call a plperl function for the first time in a session, causing plperl.so to be loaded. Later the transaction fails and is rolled back. If loading plperl.so caused some user-visible things to happen, should those be rolled back? If so, how do we get perl to play along? If not, how do we get postgres to play along? I believe that's addressed by spi functions being disabled when init code runs. * We call a plperl function for the first time in a session, causing plperl.so to be loaded. This happens in the context of a superuser calling a non-superuser security definer function, or perhaps vice versa. Whose permissions apply to whatever the on_load code tries to do? (Hint: every answer is wrong.) I think that related to on_*trusted_init not plperl.on_perl_init, and is also addressed by spi functions being disabled when init code runs. That doesn't even begin to cover the problems with allowing any of this to happen inside the postmaster. Recall that the postmaster does not have any database access. I believe that's addressed by spi functions being disabled when init code runs. Furthermore, it is a very long established reliability principle around here that the postmaster process should do as little as possible, because every thing that it does creates another opportunity to have a nonrecoverable failure. The postmaster can recover if a child crashes, but the other way round, not so much. I understand that concern. Ultimately, though, that comes down to the judgement of DBAs and the trust placed in them. They can already load arbitrary code via shared_preload_libraries. http://archives.postgresql.org/message-id/18338.1260033...@sss.pgh.pa.us > I think if we do this the on_perl_init setting should probably be > PGC_POSTMASTER, which would remove any issue about it changing > underneath us. Yes, if the main intended usage is in combination with preloading perl at postmaster start, it would be pointless to imagine that PGC_SIGHUP is useful anyway. http://archives.postgresql.org/message-id/17793.1260031...@sss.pgh.pa.us Yeah, in the shower this morning I was thinking that not loading SPI till after the on_init code runs would alleviate the concerns about transactionality and permissions --- that would ensure that whatever on_init does affects only the Perl world and not the database world. That's included in the current patch (and also applies to END blocks). However, we're not out of the woods yet. In a trusted interpreter (plperl not plperlu), is the on_init code executed before we lock down the interpreter with Safe? I would think it has to be since the main point AFAICS is to let you preload code via "use". But then what is left of the security guarantees of plperl? I can hardly imagine DBAs wanting to vet a few thousand lines of random Perl code to see if it contains anything that could be subverted. plperl.on_perl_init code, set by the DBA, runs before the Safe compartment is created. Without explicitextra steps the Safe compartment has no access to code loaded by plperl.on_perl_init. The Safe compartment (plperl) could get access to loaded code in one of these ways: 1. by using SQL to call a plperlu function that accesses the code. 2. by the DBA 'sharing' a specific subroutine with the compartment. 3. by the DBA loading a module into the compartment. There's no formal interface for 2. and 3. at the moment, so the only official option is 1. (The final patch in the series includes some building blocks towards an interface for 2 & 3, but doesn't add one.) If you're willing to also confine the feature to plperlu, then maybe the risk level could be decreased from insane to merely unreasonable. I think you could reasonably describe plperl.on_perl_init as effectively confined to plperlu (because plperl has no access to any new code). http://archives.postgresql.org/message-id/26766.1263149...@sss.pgh.pa.us For the record, I think it's a bad idea to run arbitrary user-defined code in the postmaster, and I think it's a worse idea to run arbitrary user-defined code at backend shutdown (the END-blocks bit). I do not care in the least what applications you think this might enable --- the negative consequences for overall system stability seem to me to outweigh any possible arguments on that side. - What happens when the supplied code has errors, For on_perl_init it throws an exception that propagates to the user statement that triggered the initialization of perl. It also ensures that perl is left in a non-initialized state, so any further uses also fail. For END blocks an error triggers an exception that's caught by perl. (As noted above, there's no access to postgres from init or END code.) - takes an unreasonable amount of time to run, Unreasonable is in the eye of the DBA, of course, and they have the discretion to set on_perl_init to fit their needs. For END blocks, I don't see how this issue is any different from "users might do something dumb", like DO 'while(1){}' language plperl; (or plpython , pltcl, or plpgsql for that matter). - does something unsafe, Such as? The code can't do anything more unsafe than is already possible. - depends on the backend not being in an error state already, The code has no access to postgress, whatever the state. - etc. etc? I'd welcome more concrete examples of potential issues. Tim. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers