Re: [HACKERS] [patch] plproxy v2

2008-07-24 Thread Hannu Krosing
On Tue, 2008-07-22 at 11:25 -0400, Tom Lane wrote:
 Marko Kreen [EMAIL PROTECTED] writes:
  And user can execute only pre-determines queries/functions on system2.
 
 If that were actually the case then the security issue wouldn't loom
 quite so large, but the dynamic_query example in the plproxy regression
 tests provides a perfect example of how to ruin your security.

The idea is to allow the pl/proxy user only access to the needed
functions and nothing else on the remote db side.

dynamic_query ruins your security, if your pl/proxy remote user has too
much privileges.

  Do you still see a big hole?
 
 Truck-sized, at least.
 
 The complaint here is not that it's impossible to use plproxy securely;
 the complaint is that it's so very easy to use it insecurely.

You mean easy like it is very easy to always use your OS as root ?

On Unix this is fixed by stating it as a bad idea in docs (and numerous
books), on windows you have a privileged checkbox when creating new
users.

---
Hannu



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] plproxy v2

2008-07-23 Thread Marko Kreen
On 7/22/08, Marko Kreen [EMAIL PROTECTED] wrote:
 On 7/22/08, Tom Lane [EMAIL PROTECTED] wrote:
   Marko Kreen [EMAIL PROTECTED] writes:
 Also, plroxy does
 _nothing_ with passwords.  That means the password for remote
 connection must be in postgres user's .pgpass,
  
  
   That seems *exactly* backwards, because putting the password in postgres
user's .pgpass is as good as disabling password auth altogether.
Consider that it would also hand all the keys to the kingdom over to
someone who had access to dblink on the same machine (not even the same
cluster, so long as it was run by the same postgres user!).

 Good point.

I happened to take a look at dblink and I'm not convinced anymore.

Any untrusted PL can be used for remote calls and in all cases,
including PL/Proxy, the superuser must create the function that
specifies both connect string and query.  To allow regular user
specify either connect string or query, the superuser must take
explicit action by coding the function that way.

But for dblink, the *normal* mode of action is that regular user
can (heh, *must*) specify both connect string and query...
And the require-password-hack for dblink makes things only worse
as it obfuscates the security implications of using it.

So indeed, there is a hole a truck can drive through and
it's called 'dblink'.  And I don't I should make life miserable
for PL/Proxy users just because core has merged insecure module.

Currently the PL/Proxy acts the same as any other libpq-using PL,
using .pgpass is quite natural for them and I don't see any reason
to lock it down further for no good reason.

I know you had a fiasco with dblink security already, but PL/Proxy
is not in dblink camp.  It's in untrusted-PL camp.

So, now we have clear technical argument for immediate merge -
as a replacement for dblink, as it's more secure and better designed.


And only thing PL/Proxy really needs is more documentation, there needs
to be list of various ways to set it up and security implications of each.

-- 
marko

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] plproxy v2

2008-07-22 Thread Andrew Sullivan
On Mon, Jul 21, 2008 at 09:32:57PM -0400, Tom Lane wrote:
 Marko Kreen [EMAIL PROTECTED] writes:

  2.  If cluster connection strings do not have 'user=' key,
  ' user=' || current_username() is appended to it.
 
 Cool, I missed that.  At minimum the documentation has to explain this
 point and emphasize the security implications.  Is it a good idea
 to allow user= in the cluster strings at all?

I wondered about this myself.  Is there anything at all preventing me
from doing 'user=' for some other user?  If not. . .

  Also, plroxy does
  _nothing_ with passwords.  That means the password for remote
  connection must be in postgres user's .pgpass,
 
 That seems *exactly* backwards, because putting the password in postgres
 user's .pgpass is as good as disabling password auth altogether.

. . .this means that any user on system1 for which there is at least
one user on system2 with plproxy access automatically also has that
access on system2.  (Plus what Tom noted).

 We regularly get beat up about any aspect of our security apparatus
 that isn't secure by default.  This definitely isn't, and from
 a PR point of view (if nothing else) that doesn't seem a good idea.

I'm less worried about the PR, and more worried about the truck-sized
hole this opens in any authentication controls.  It seems to me that
it's a fairly serious problem.

A

-- 
Andrew Sullivan
[EMAIL PROTECTED]
+1 503 667 4564 x104
http://www.commandprompt.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] plproxy v2

2008-07-22 Thread Marko Kreen
On 7/22/08, Andrew Sullivan [EMAIL PROTECTED] wrote:
 On Mon, Jul 21, 2008 at 09:32:57PM -0400, Tom Lane wrote:
   Marko Kreen [EMAIL PROTECTED] writes:
   2.  If cluster connection strings do not have 'user=' key,
' user=' || current_username() is appended to it.
  
   Cool, I missed that.  At minimum the documentation has to explain this
   point and emphasize the security implications.  Is it a good idea
   to allow user= in the cluster strings at all?


 I wondered about this myself.  Is there anything at all preventing me
  from doing 'user=' for some other user?  If not. . .

For that you need to overwrite the plproxy.get_cluster_partitions()
function or the data it operates on.

I don't see any hole in this, unless explicitly created.

Also, plroxy does
_nothing_ with passwords.  That means the password for remote
connection must be in postgres user's .pgpass,
  
   That seems *exactly* backwards, because putting the password in postgres
   user's .pgpass is as good as disabling password auth altogether.


 . . .this means that any user on system1 for which there is at least
  one user on system2 with plproxy access automatically also has that
  access on system2.  (Plus what Tom noted).

For that the system2 needs to be added as partion to a cluster.
Or specified explicitly in CONNECT statement.

And user can execute only pre-determines queries/functions on system2.

   We regularly get beat up about any aspect of our security apparatus
   that isn't secure by default.  This definitely isn't, and from
   a PR point of view (if nothing else) that doesn't seem a good idea.

 I'm less worried about the PR, and more worried about the truck-sized
  hole this opens in any authentication controls.  It seems to me that
  it's a fairly serious problem.

Do you still see a big hole?

-- 
marko

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] plproxy v2

2008-07-22 Thread Tom Lane
Andrew Sullivan [EMAIL PROTECTED] writes:
 On Mon, Jul 21, 2008 at 09:32:57PM -0400, Tom Lane wrote:
 Marko Kreen [EMAIL PROTECTED] writes:
 2.  If cluster connection strings do not have 'user=' key,
 ' user=' || current_username() is appended to it.
 
 Cool, I missed that.  At minimum the documentation has to explain this
 point and emphasize the security implications.  Is it a good idea
 to allow user= in the cluster strings at all?

 I wondered about this myself.  Is there anything at all preventing me
 from doing 'user=' for some other user?  If not. . .

I think the assumption is that the cluster connection info would be set
up by a superuser.  However, if there's any way for a non-superuser to
subvert the info returned by the plproxy configuration functions, you
got trouble.  So a lot would depend on how carefully those are coded.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] plproxy v2

2008-07-22 Thread Marko Kreen
On 7/22/08, Tom Lane [EMAIL PROTECTED] wrote:
 Marko Kreen [EMAIL PROTECTED] writes:
   On 7/21/08, Tom Lane [EMAIL PROTECTED] wrote:
  I looked through this a bit, and my principal reaction was what are
   the security implications?
  There are 2 aspects to it:
   1.  Function can be created only by superuser.

 What I'm concerned about is who they can be *called* by.  I'd be happier
  if the default behavior was that there was no public execute privilege
  for plproxy functions.

  I think right now that could be enforced by having plproxy's validator
  procedure replace any null proacl entry with something that explicitly
  refuses public execute.  That's a bit of a hack though.  Maybe it'd be
  worth inventing per-PL default ACLs, instead of having a
  one-size-fits-all policy?

Note1 that if user (admin) wants he can also do user filtering/mapping
in plproxy.* functions.

Note2 - instead of restricting privileges on actual functions, we could
instead restrict privilege on the 2 functions under 'plproxy' schema,
or directly on schema.  Seems simpler.  Eg. create simple default
installation with REVOKE ALL FROM PUBLIC.

   2.  If cluster connection strings do not have 'user=' key,
   ' user=' || current_username() is appended to it.


 Cool, I missed that.  At minimum the documentation has to explain this
  point and emphasize the security implications.

Ok.

  Is it a good idea
  to allow user= in the cluster strings at all?

I think so - if the plproxy database itself already is main point of
authentication, both can-connect and can-execute sense, then it's
good to avoid complicating setup and send queries away under single
user that has minimal rights on partition dbs and can only execute
requested functions.

   Also, plroxy does
   _nothing_ with passwords.  That means the password for remote
   connection must be in postgres user's .pgpass,


 That seems *exactly* backwards, because putting the password in postgres
  user's .pgpass is as good as disabling password auth altogether.
  Consider that it would also hand all the keys to the kingdom over to
  someone who had access to dblink on the same machine (not even the same
  cluster, so long as it was run by the same postgres user!).

Good point.  Some ideas for password handling:

1. Require that user always provider both username and password in
   plproxy.get_cluster_partitions().  We could add separate function
   for that or add fields to plproxy.get_partitions(), although
   this is not necessary - user can add them simply to connect string.

   Main problems with this is that maybe you don't want to show the
   passwords to anyone who can execute plproxy.* functions?

2. Let PL/Proxy fetch user password hash from pg_shadow, add API to libpq
   to use the hash on authentication instead plaintext password.
   This ofcourse expects that remote server uses same auth method as
   current one.

   Despite appearance it does not have security problems - the hashes
   are already equivalent to plaintext password.

   But I don't think plproxy can and should protect dumb admins who
   create remote_exec(sql) function and allow untrusted users to
   execute it.

 We regularly get beat up about any aspect of our security apparatus
  that isn't secure by default.  This definitely isn't, and from
  a PR point of view (if nothing else) that doesn't seem a good idea.

  I repeat that I don't feel comfortable in the least with plproxy's
  security model.

Btw, I'm very thankful for your review.  I would really like improve the
security of plproxy whatever the merge decision will be, so hopefully
we can discuss it further.



To make discussion easier, here are list of possible problems/fixes
discussed thus far (as I see):

Problems:

- restrict users who can access remote dbs by default.
- avoid spreading passwords too far.
  - .pgpass gives them to any libpq client
  - inside connect string they are visible to calling user
(although only his own?)

Documentation/default setup fixes:

1. Restrict access to 'plproxy' schema or functions under that schema.
   Only users that have grants can use plproxy functions.

2. Create default setup that does user filtering/mapping by default.
   Have the permissions on the functions and tables carefully tuned
   to allow minimal access.

3. Make the default setup also handle passwords from tables.
   So instead user adding password handling insecurely, he can use it
   or remove it from already secure setup.

Code fixes:

4. Create plproxy functions without execute permissions by default.
   (Seems unnecessary as 1, 2 already give that?)

5. Let plproxy use user password hash directly from pg_shadow.
   (Unless user= or password= is given on connection string?)

Seems like restricting access is easy, but only 5) gives secure
password handling.

-- 
marko

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:

Re: [HACKERS] [patch] plproxy v2

2008-07-22 Thread Tom Lane
Marko Kreen [EMAIL PROTECTED] writes:
 And user can execute only pre-determines queries/functions on system2.

If that were actually the case then the security issue wouldn't loom
quite so large, but the dynamic_query example in the plproxy regression
tests provides a perfect example of how to ruin your security.

 Do you still see a big hole?

Truck-sized, at least.

The complaint here is not that it's impossible to use plproxy securely;
the complaint is that it's so very easy to use it insecurely.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] plproxy v2

2008-07-21 Thread Tom Lane
Marko Kreen [EMAIL PROTECTED] writes:
 [ plproxy ]

I looked through this a bit, and my principal reaction was what are
the security implications?  It seems like it'd be very easy to create
functions that allow untrusted users to execute arbitrary SQL on
other databases in the plproxy cluster.  As far as I saw there was
no privilege-checking within plproxy itself, you were just relying
on SQL-level permissions checking --- so even though plproxy functions
can only be *created* by superusers, by default they can be *used* by
anybody.  So it'd take a great deal of care to avoid making
unintentional security holes.

I'm not sure about a good solution to this problem, but I think it needs
to be solved before plproxy can be recommended for widespread use.
The first thought that comes to mind is to somehow propagate the userid
on the calling server to the execution on the remote, so that a user
can't get more privilege on the remote than if he were executing there
directly.  I'm imagining that the definition of a cluster would include
a map from local to remote userids (and thereby imply that anyone not
explicitly listed can't do remote access at all).

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] plproxy v2

2008-07-21 Thread Marko Kreen
On 7/21/08, Tom Lane [EMAIL PROTECTED] wrote:
 Marko Kreen [EMAIL PROTECTED] writes:
   [ plproxy ]

  I looked through this a bit, and my principal reaction was what are
  the security implications?  It seems like it'd be very easy to create
  functions that allow untrusted users to execute arbitrary SQL on
  other databases in the plproxy cluster.  As far as I saw there was
  no privilege-checking within plproxy itself, you were just relying
  on SQL-level permissions checking --- so even though plproxy functions
  can only be *created* by superusers, by default they can be *used* by
  anybody.  So it'd take a great deal of care to avoid making
  unintentional security holes.

  I'm not sure about a good solution to this problem, but I think it needs
  to be solved before plproxy can be recommended for widespread use.
  The first thought that comes to mind is to somehow propagate the userid
  on the calling server to the execution on the remote, so that a user
  can't get more privilege on the remote than if he were executing there
  directly.  I'm imagining that the definition of a cluster would include
  a map from local to remote userids (and thereby imply that anyone not
  explicitly listed can't do remote access at all).

There are 2 aspects to it:

1.  Function can be created only by superuser.
2.  If cluster connection strings do not have 'user=' key,
' user=' || current_username() is appended to it.  Note that
connections are per-backend, not shared.  Also, plroxy does
_nothing_ with passwords.  That means the password for remote
connection must be in postgres user's .pgpass, or there
is pooler between plproxy and remote database who handles
passwords.

What else do you see is needed?  I'm not sure a map is a good idea,
is seems to create unnecessary coplexity.  Ofcourse, it can be done.

But I don't think plproxy can and should protect dumb admins who
create remote_exec(sql) function and allow untrusted users to
execute it.

-- 
marko

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] plproxy v2

2008-07-21 Thread Tom Lane
Marko Kreen [EMAIL PROTECTED] writes:
 On 7/21/08, Tom Lane [EMAIL PROTECTED] wrote:
 I looked through this a bit, and my principal reaction was what are
 the security implications?

 There are 2 aspects to it:

 1.  Function can be created only by superuser.

What I'm concerned about is who they can be *called* by.  I'd be happier
if the default behavior was that there was no public execute privilege
for plproxy functions.

I think right now that could be enforced by having plproxy's validator
procedure replace any null proacl entry with something that explicitly
refuses public execute.  That's a bit of a hack though.  Maybe it'd be
worth inventing per-PL default ACLs, instead of having a
one-size-fits-all policy?

 2.  If cluster connection strings do not have 'user=' key,
 ' user=' || current_username() is appended to it.

Cool, I missed that.  At minimum the documentation has to explain this
point and emphasize the security implications.  Is it a good idea
to allow user= in the cluster strings at all?

 Also, plroxy does
 _nothing_ with passwords.  That means the password for remote
 connection must be in postgres user's .pgpass,

That seems *exactly* backwards, because putting the password in postgres
user's .pgpass is as good as disabling password auth altogether.
Consider that it would also hand all the keys to the kingdom over to
someone who had access to dblink on the same machine (not even the same
cluster, so long as it was run by the same postgres user!).

 But I don't think plproxy can and should protect dumb admins who
 create remote_exec(sql) function and allow untrusted users to
 execute it.

We regularly get beat up about any aspect of our security apparatus
that isn't secure by default.  This definitely isn't, and from
a PR point of view (if nothing else) that doesn't seem a good idea.

I repeat that I don't feel comfortable in the least with plproxy's
security model.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] plproxy v2

2008-07-08 Thread Simon Riggs

On Sat, 2008-06-28 at 16:36 +0300, Marko Kreen wrote:

 I mentioned that I planned to remove SELECT/CONNECT too.
 Now I've thought about it more and it seems to me that its better
 to keep them.  As they give additional flexibility.

I very much like PL/Proxy and support your vision. Including the
features of PL/Proxy in core seems like a great idea to me.

If we have just a couple of commands, would it be easier to include
those features by some additional attributes on pg_proc? That way we
could include the features in a more native way, similar to the way we
have integrated text search, without needing a plugin language at all. 

  CREATE CLUSTER foo ...

  CREATE FUNCTION bar() CLUSTER foo RUN ON ANY ...

If we did that, we might also include a similar proxy feature for
tables, making the feature exciting for more users than just those who
can specify implementing all logic through functions. It would also
remove the need for a specific SELECT command in PL/Proxy.

  CREATE TABLE bar CLUSTER foo RUN ON ANY ...

If we're running a SELECT and all tables accessed run on the same
cluster we ship the whole SQL statement according to the RUN ON clause. 
It would effectively bring some parts of dblink into core.

If all tables not on same cluster we throw an error in this release, but
in later releases we might introduce distributed join features and full
distributed DML support.

Having the PL/Proxy features available via the catalog will allow a
clear picture of what runs where without parsing the function text. It
will also allow things like a pg_dump of all objects relating to a
cluster.

Adding this feature for tables would be interesting with Hot Standby,
since it would allow you to offload SELECT statements onto the standby
automatically.

This would be considerably easier to integrate than text search was.

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] plproxy v2

2008-07-08 Thread Joshua D. Drake


On Tue, 2008-07-08 at 10:21 +0100, Simon Riggs wrote:
 On Sat, 2008-06-28 at 16:36 +0300, Marko Kreen wrote:
 
 I very much like PL/Proxy and support your vision. Including the
 features of PL/Proxy in core seems like a great idea to me.

 Adding this feature for tables would be interesting with Hot Standby,
 since it would allow you to offload SELECT statements onto the standby
 automatically.
 
 This would be considerably easier to integrate than text search was.

First let me say that I too enjoy PL/Proxy quite a bit. However, I don't
think it needs to be in core. I wouldn't mind seeing it in contrib (or
better yet modules/ should we ever get around to that).

Sincerely,

Joshua D. Drake



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] plproxy v2

2008-07-08 Thread Marko Kreen
On 7/8/08, Simon Riggs [EMAIL PROTECTED] wrote:
  On Sat, 2008-06-28 at 16:36 +0300, Marko Kreen wrote:
   I mentioned that I planned to remove SELECT/CONNECT too.
   Now I've thought about it more and it seems to me that its better
   to keep them.  As they give additional flexibility.

 I very much like PL/Proxy and support your vision. Including the
  features of PL/Proxy in core seems like a great idea to me.

  If we have just a couple of commands, would it be easier to include
  those features by some additional attributes on pg_proc? That way we
  could include the features in a more native way, similar to the way we
  have integrated text search, without needing a plugin language at all.

   CREATE CLUSTER foo ...

   CREATE FUNCTION bar() CLUSTER foo RUN ON ANY ...

  If we did that, we might also include a similar proxy feature for
  tables, making the feature exciting for more users than just those who
  can specify implementing all logic through functions. It would also
  remove the need for a specific SELECT command in PL/Proxy.

   CREATE TABLE bar CLUSTER foo RUN ON ANY ...

  If we're running a SELECT and all tables accessed run on the same
  cluster we ship the whole SQL statement according to the RUN ON clause.
  It would effectively bring some parts of dblink into core.

  If all tables not on same cluster we throw an error in this release, but
  in later releases we might introduce distributed join features and full
  distributed DML support.

  Having the PL/Proxy features available via the catalog will allow a
  clear picture of what runs where without parsing the function text. It
  will also allow things like a pg_dump of all objects relating to a
  cluster.

  Adding this feature for tables would be interesting with Hot Standby,
  since it would allow you to offload SELECT statements onto the standby
  automatically.

  This would be considerably easier to integrate than text search was.

Interesting proposal.

First I want to say - we can forget the SELECT/CONNECT statements
when discussing this approach.  They are in because they were easy
to add and gave some additional flexibility.  But they are not important.
If they don't fit some new approach, there is no problem dropping them.

So that leaves functions in form:

CLUSTER expr;
RUN ON expr;

and potentially SPREAD BY as discussed in:

http://lists.pgfoundry.org/pipermail/plproxy-users/2008-June/93.html

which sends different arguments to different partitions.  I'm not yet
sure it's worthwhile addition, but I work mostly on OLTP databases
and that feature would target OLAP ones.  So I let others decide.

Now few technical points about your proposal:

- One feature that current function-based configuration approach gives
  is that we can manage cluster configuration centrally and replicate
  to actual proxy databases.  And this is something I would like to keep.

  This can be solved by using also plain table or functions behind
  the scenes.

- How about CREATE REMOTE FUNCTION / TABLE .. ; for syntax?

- Currently both hash and cluster selection expressions can be
  quite free-form.  So parsing them out to some pg_proc field
  would not be much help actually.

And some philosophical points:

- PL/Proxy main use-case is complex read-write transactions
  in OLTP setting.  But remote table/views target simple
  read-only transactions with free-form queries.

- PL/Proxy has concrete argument list and free-form cluster
  and partition selection.  Remote tables have free-form
  arguments, maybe they want more rigid cluster / partition
  selection?

If the syntax and backend implementation can be merged, its good,
but it should not be forced.  So before we start adding syntax
to core, maybe it would be good to have concrete idea how the remote
tables will look like and what representation they want for a cluster?

Especially if you want to do stuff like distributed joins.

OTOH, if you say that current PL/Proxy approach fits remote tables
as well, I'm not against doing it SQL level.

-- 
marko

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] plproxy v2

2008-07-08 Thread Marko Kreen
On 7/8/08, Joshua D. Drake [EMAIL PROTECTED] wrote:
  On Tue, 2008-07-08 at 10:21 +0100, Simon Riggs wrote:
   On Sat, 2008-06-28 at 16:36 +0300, Marko Kreen wrote:

  I very much like PL/Proxy and support your vision. Including the
   features of PL/Proxy in core seems like a great idea to me.


  Adding this feature for tables would be interesting with Hot Standby,
   since it would allow you to offload SELECT statements onto the standby
   automatically.
  
   This would be considerably easier to integrate than text search was.


 First let me say that I too enjoy PL/Proxy quite a bit. However, I don't
  think it needs to be in core. I wouldn't mind seeing it in contrib (or
  better yet modules/ should we ever get around to that).

I'm not against contrib/ considering that the docs are now nicely
integrated, but then, whats the difference between src/pl/ and contrib/?

OTOH, if you argue LANGUAGE plproxy vs. CREATE REMOTE FUNCTION,
then thats different matter.  It seems to be we should do REMOTE FUNCTION
after, not before REMOTE TABLE as the table/view implementation
needs to dictate the actual details.

-- 
marko

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] plproxy v2

2008-07-08 Thread Simon Riggs

On Tue, 2008-07-08 at 18:29 +0300, Marko Kreen wrote:
 and potentially SPREAD BY as discussed in:
 
 
 http://lists.pgfoundry.org/pipermail/plproxy-users/2008-June/93.html
 

That *sounds* cool, but its just the first part of the implementation of
a massively parallel executor. You'll quickly end up wanting to do
something else as well. Redistributing data is the hard part of a hard
problem. I'd steer clear of that. Skytools are good cause they do simple
things well.

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] plproxy v2

2008-07-08 Thread Joshua D. Drake


On Tue, 2008-07-08 at 18:43 +0300, Marko Kreen wrote:
 On 7/8/08, Joshua D. Drake [EMAIL PROTECTED] wrote:
  First let me say that I too enjoy PL/Proxy quite a bit. However, I don't
   think it needs to be in core. I wouldn't mind seeing it in contrib (or
   better yet modules/ should we ever get around to that).
 
 I'm not against contrib/ considering that the docs are now nicely
 integrated, but then, whats the difference between src/pl/ and contrib/?
 

I am actually against adding to the grammar which is what Simon was
suggesting. If it wants to go into src/pl I wouldn't have a problem with
that. Sorry for not being more clear.

Joshua D. Drake



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] plproxy v2

2008-07-08 Thread Simon Riggs

On Tue, 2008-07-08 at 08:58 -0700, Joshua D. Drake wrote:
 
 On Tue, 2008-07-08 at 18:43 +0300, Marko Kreen wrote:
  On 7/8/08, Joshua D. Drake [EMAIL PROTECTED] wrote:
   First let me say that I too enjoy PL/Proxy quite a bit. However, I don't
think it needs to be in core. I wouldn't mind seeing it in contrib (or
better yet modules/ should we ever get around to that).
  
  I'm not against contrib/ considering that the docs are now nicely
  integrated, but then, whats the difference between src/pl/ and contrib/?
  
 
 I am actually against adding to the grammar 

Why? 

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] plproxy v2

2008-07-08 Thread Marko Kreen
On 7/8/08, Simon Riggs [EMAIL PROTECTED] wrote:
  On Tue, 2008-07-08 at 18:29 +0300, Marko Kreen wrote:
   and potentially SPREAD BY as discussed in:
  
   http://lists.pgfoundry.org/pipermail/plproxy-users/2008-June/93.html

 That *sounds* cool, but its just the first part of the implementation of
  a massively parallel executor. You'll quickly end up wanting to do
  something else as well. Redistributing data is the hard part of a hard
  problem. I'd steer clear of that. Skytools are good cause they do simple
  things well.

Well, for PL/Proxy it would be the _last_ part.  Yes, now the user
can build parallel OLAP executor, but all of this will be up to user.
PL/Proxy itself will stay dumb and simple.  It would not need do to
any guesswork, all the data will be provided by user.

The amount of code needed to make the SPREAD work would be minimal,
mostly reactoring of existing code is needed.  So it fits the
current design.

The point is - PL/Proxy already executes single query with same
arguments in parallel.  With the SPREAD feature it could execute
single query with different arguments in parallel.  And the
next step of executing different queries in parallel does not make
sense for PL/Proxy as it's main concept is function-calls not queries.

But ofcourse we can decide we don't want do go that way,
and that's ok also.

-- 
marko

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] plproxy v2

2008-07-08 Thread Marko Kreen
On 7/8/08, Joshua D. Drake [EMAIL PROTECTED] wrote:
  On Tue, 2008-07-08 at 18:43 +0300, Marko Kreen wrote:
   On 7/8/08, Joshua D. Drake [EMAIL PROTECTED] wrote:

   First let me say that I too enjoy PL/Proxy quite a bit. However, I don't
 think it needs to be in core. I wouldn't mind seeing it in contrib (or
 better yet modules/ should we ever get around to that).
  
   I'm not against contrib/ considering that the docs are now nicely
   integrated, but then, whats the difference between src/pl/ and contrib/?

 I am actually against adding to the grammar which is what Simon was
  suggesting. If it wants to go into src/pl I wouldn't have a problem with
  that. Sorry for not being more clear.

Current patch is for src/pl/.  Diffstat for patch v2:

 doc/src/sgml/filelist.sgml |1
 doc/src/sgml/plproxy.sgml  |  221 ++
 doc/src/sgml/postgres.sgml |1
 src/include/catalog/pg_pltemplate.h|1
 src/pl/Makefile|2
 src/pl/plproxy/Makefile|   89 ++
 src/pl/plproxy/cluster.c   |  469 +
 src/pl/plproxy/execute.c   |  724 +
 src/pl/plproxy/expected/plproxy_clustermap.out |   71 ++
 src/pl/plproxy/expected/plproxy_dynamic_record.out |   51 +
 src/pl/plproxy/expected/plproxy_errors.out |   66 +
 src/pl/plproxy/expected/plproxy_init.out   |2
 src/pl/plproxy/expected/plproxy_many.out   |  116 +++
 src/pl/plproxy/expected/plproxy_select.out |   37 +
 src/pl/plproxy/expected/plproxy_test.out   |  312 +
 src/pl/plproxy/function.c  |  479 +
 src/pl/plproxy/main.c  |  214 ++
 src/pl/plproxy/parser.y|  203 +
 src/pl/plproxy/plproxy.h   |  301 
 src/pl/plproxy/poll_compat.c   |  140 
 src/pl/plproxy/poll_compat.h   |   58 +
 src/pl/plproxy/query.c |  316 +
 src/pl/plproxy/result.c|  222 ++
 src/pl/plproxy/rowstamp.h  |   27
 src/pl/plproxy/scanner.l   |  320 +
 src/pl/plproxy/sql/plproxy_clustermap.sql  |   56 +
 src/pl/plproxy/sql/plproxy_dynamic_record.sql  |   43 +
 src/pl/plproxy/sql/plproxy_errors.sql  |   63 +
 src/pl/plproxy/sql/plproxy_init.sql|   57 +
 src/pl/plproxy/sql/plproxy_many.sql|   66 +
 src/pl/plproxy/sql/plproxy_select.sql  |   37 +
 src/pl/plproxy/sql/plproxy_test.sql|  200 +
 src/pl/plproxy/type.c  |  336 +
 33 files changed, 5299 insertions(+), 2 modifications(!)


-- 
marko

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] plproxy v2

2008-07-02 Thread Marko Kreen
I took the libery and changed the status from WIP to Pending Review,
because the pending cleanups do not affect reviewing nor committing.

And anyway, because of the patch size I expect reviewers requesting
furthers changes so I'd like to push the tiny stuff to final version
of the patch.

-- 
marko

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] [patch] plproxy v2

2008-06-28 Thread Marko Kreen
We have written a FAQ, where are common issues collected, there
are also discussion about design decisions and why concentrating
only to functions is good idea.

 http://plproxy.projects.postgresql.org/faq/faq.html

The patch itself is attached, and also made available here:

 http://plproxy.projects.postgresql.org/plproxy-v2.diff.gz

Changes since v1:
- dynamic record patch from Lei Yonghua.  That makes functions
  that require result types specified with AS clause work.
- accepts int2/4/8 from hash function
- removed get_cluster_config()
- draft sgml documentation
- disabled binary I/O by default.

- parser.y: remove use of alloca, define malloc/free to palloc/pfree
- scanner.l: fix crash if palloc() throws exception by reinitializing
  flex properly
- make installcheck work
- removed backwards compat code
- use bool instead bitfields to conform to Postgres coding style.

Todo:
- the docs should be more fleshed out
- should regtest moved to src/test/regress?
- make GUC vars for binary i/o and connection_lifetime?

I mentioned that I planned to remove SELECT/CONNECT too.
Now I've thought about it more and it seems to me that its better
to keep them.  As they give additional flexibility.

CONNECT - allows to test plproxy without creating plproxy schema.
Good for quick hacks, RPC-s.  Main problem is that as soon
user starts to have several environments - dev/test/live -
he will get hurt, but that could be solved by giving suggestions
in docs how to avoid that. Eg. suggest using service= or fake hostnames.

SELECT - main reason to keep it would be to allow calling
different functions, having different argument/result names
or types.

--
marko


plproxy-v2.diff.gz
Description: GNU Zip compressed data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers