Re: [HACKERS] Re: Should we have an optional limit on the recursion depth of recursive CTEs?

2011-08-20 Thread Bruce Momjian
Greg Stark wrote:
 On Mon, Aug 15, 2011 at 9:31 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 ... and that would be a seriously bad API.  There are not SUSET
 restrictions on other resources such as work_mem.  Why do we need
 one for this?
 
 I think a better analogy would be imposing a maximum number of rows a
 query can output. That might be a sane thing to have for some
 circumstances but it's not useful in general.
 
 Consider for instance my favourite recursive query application,
 displaying the lock dependency graph for pg_locks. What arbitrary
 maximum number of locks would you like to impose at which the query
 should error out?
 
 There is a situation, though, that I think is motivating this
 where it would be nice to detect a problem: when the query is such
 that it *can't* produce a record because there's an infinite loop
 before the first record. Ideally you want some way to detect that
 you've recursed and haven't changed anything that could lead to a
 change in the recursion condition. But that seems like a pretty hard
 thing to detect, probably impossible.

Actually, using UNION instead of UNION ALL does prevent some infinite
loops:

WITH RECURSIVE source AS (
    SELECT 'Hello'
    UNION
    SELECT 'Hello' FROM source
)
SELECT * FROM source;

Change that to UNION ALL and you have an infinite loop; plain UNION
terminates because duplicate rows are discarded, leaving the working
table empty after the first iteration.

-- 
  Bruce Momjian  br...@momjian.us        http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +



Re: [HACKERS] Re: Should we have an optional limit on the recursion depth of recursive CTEs?

2011-08-20 Thread Peter Geoghegan
On 20 August 2011 15:34, Bruce Momjian br...@momjian.us wrote:
 Actually, using UNION instead of UNION ALL does prevent some infinite
 loops:

While that is worth pointing out, it cannot be recommended as a way of
preventing infinite recursion; after all, all 5 WITH RECURSIVE
examples in the docs use UNION ALL. It's just a different way of
specifying a terminating condition that isn't likely to be applicable
to more complicated rCTEs.
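
For a more complicated rCTE, that usually means making the
terminating condition explicit in the recursive term. A minimal
sketch (the depth column and its cap are illustrative only):

WITH RECURSIVE t(n, depth) AS (
    SELECT 1, 1
    UNION ALL
    SELECT n + 1, depth + 1
    FROM t
    WHERE depth < 100   -- explicit cap: recursion stops here
)
SELECT n FROM t;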

-- 
Peter Geoghegan       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services



Re: [HACKERS] Re: Should we have an optional limit on the recursion depth of recursive CTEs?

2011-08-16 Thread Magnus Hagander
On Mon, Aug 15, 2011 at 23:49, Greg Stark st...@mit.edu wrote:
 On Mon, Aug 15, 2011 at 9:31 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 ... and that would be a seriously bad API.  There are not SUSET
 restrictions on other resources such as work_mem.  Why do we need
 one for this?

 I think a better analogy would be imposing a maximum number of rows a
 query can output. That might be a sane thing to have for some
 circumstances but it's not useful in general.

Uh. You mean like LIMIT, which we already have?
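
For instance, a LIMIT on top of an otherwise non-terminating rCTE
does return, since the executor stops pulling rows from the recursive
CTE once the limit is satisfied. A sketch:

WITH RECURSIVE source AS (
    SELECT 'Hello'::text AS greeting
    UNION ALL
    SELECT 'Hello' FROM source
)
SELECT * FROM source LIMIT 5;   -- returns five rows and stops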

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/



Re: [HACKERS] Re: Should we have an optional limit on the recursion depth of recursive CTEs?

2011-08-16 Thread Andrew Dunstan
On 08/16/2011 04:56 AM, Magnus Hagander wrote:

On Mon, Aug 15, 2011 at 23:49, Greg Stark st...@mit.edu wrote:

On Mon, Aug 15, 2011 at 9:31 PM, Tom Lane t...@sss.pgh.pa.us wrote:

... and that would be a seriously bad API.  There are not SUSET
restrictions on other resources such as work_mem.  Why do we need
one for this?

I think a better analogy would be imposing a maximum number of rows a
query can output. That might be a sane thing to have for some
circumstances but it's not useful in general.

Uh. You mean like LIMIT, which we already have?


There is no LIMIT imposed on a query by a server setting, which would be 
the right analogy here.


cheers

andrew



Re: [HACKERS] Re: Should we have an optional limit on the recursion depth of recursive CTEs?

2011-08-16 Thread Robert Haas
On Tue, Aug 16, 2011 at 7:23 AM, Andrew Dunstan and...@dunslane.net wrote:
 There is no LIMIT imposed on a query by a server setting, which would be the
 right analogy here.

I am not sure I understand any of these analogies.  I think Peter's
point is that it's not very difficult to write (perhaps accidentally)
a CTE that goes into infinite recursion.  In general, we can't detect
that situation, because it's equivalent to the halting problem.  But
there's an old joke about a Turing test (where a computer program must
try to fool a human into believing that it is also human) where the
person asks the computer:

What would the following program do?
10 PRINT "HELLO"
20 GOTO 10

And gets back an infinite stream of HELLO HELLO HELLO HELLO HELLO

I don't think it's going to be feasible to implement a security
restriction that keeps untrusted users from hosing the machine with a
long running CTE; there are nearly infinitely many ways for an
untrusted user who can run queries to hose the machine, and plugging
one of them imperfectly is going to get us pretty much nowhere.  On
the other hand, there is perhaps a reasonable argument to be made that
we should cut off CTE processing at some point to prevent
*inadvertent* exhaustion of system resources.  Or even query
processing more generally.

In fact, we already have some things sort of like this: you can use
statement_timeout to kill queries that run for too long, and we just
recently added temp_file_limit to kill those that eat too much temp
file space.   I can see a good case for memory_limit and
query_cpu_limit and maybe some others.  cte_recursion_depth_limit
wouldn't be all that high on my personal list, I guess, but the
concept doesn't seem completely insane.
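
To illustrate with the existing knob (a sketch; the rCTE deliberately
lacks a terminating condition, and the aggregate forces the server to
keep iterating until the timeout fires):

SET statement_timeout = '5s';

WITH RECURSIVE r(n) AS (
    SELECT 1
    UNION ALL
    SELECT n + 1 FROM r   -- never terminates on its own
)
SELECT max(n) FROM r;
-- ERROR:  canceling statement due to statement timeout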

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Re: Should we have an optional limit on the recursion depth of recursive CTEs?

2011-08-16 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 I don't think it's going to be feasible to implement a security
 restriction that keeps untrusted users from hosing the machine with a
 long running CTE; there are nearly infinitely many ways for an
 untrusted user who can run queries to hose the machine, and plugging
 one of them imperfectly is going to get us pretty much nowhere.  On
 the other hand, there is perhaps a reasonable argument to be made that
 we should cut off CTE processing at some point to prevent
 *inadvertent* exhaustion of system resources.  Or even query
 processing more generally.

Indeed: the real question here is why a recursive CTE is any worse
than, say, an accidentally unconstrained join (or three or four...).

However, we already have a perfectly suitable general mechanism for
that; it's called statement_timeout.

I think we've already had the discussion about whether there should be
a system-wide SUSET maximum statement_timeout, and rejected it on the
grounds that there was not a very clear need for it.

 In fact, we already have some things sort of like this: you can use
 statement_timeout to kill queries that run for too long, and we just
 recently added temp_file_limit to kill those that eat too much temp
 file space.   I can see a good case for memory_limit and
 query_cpu_limit and maybe some others.

temp_file_limit got accepted because it was constraining a resource not
closely related to run time.  I don't think that it provides a precedent
in support of any of these other ideas.

regards, tom lane



Re: [HACKERS] Re: Should we have an optional limit on the recursion depth of recursive CTEs?

2011-08-16 Thread Robert Haas
On Tue, Aug 16, 2011 at 10:26 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 In fact, we already have some things sort of like this: you can use
 statement_timeout to kill queries that run for too long, and we just
 recently added temp_file_limit to kill those that eat too much temp
 file space.   I can see a good case for memory_limit and
 query_cpu_limit and maybe some others.

 temp_file_limit got accepted because it was constraining a resource not
 closely related to run time.  I don't think that it provides a precedent
 in support of any of these other ideas.

Well, CPU usage might be somewhat closely related to query runtime,
but memory usage sure isn't.

But we digress from $SUBJECT...

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Re: Should we have an optional limit on the recursion depth of recursive CTEs?

2011-08-16 Thread Peter Geoghegan
On 16 August 2011 14:43, Robert Haas robertmh...@gmail.com wrote:
 What would the following program do?
 10 PRINT "HELLO"
 20 GOTO 10

 And gets back an infinite stream of HELLO HELLO HELLO HELLO HELLO

heh, that's pretty funny. It also complements my view, because the
Turing test is only failed because the human eventually thinks hmm,
he's taking way too long to get to the '...and so on ad infinitum' bit.

 I don't think it's going to be feasible to implement a security
 restriction that keeps untrusted users from hosing the machine with a
 long running CTE; there are nearly infinitely many ways for an
 untrusted user who can run queries to hose the machine, and plugging
 one of them imperfectly is going to get us pretty much nowhere.

Unless that happens to be the exact area that is a problem for you,
due perhaps to a poorly written application. We're protecting against
Murphy, not Machiavelli - if your users are malicious, or are
motivated by seeing if they can somehow hose the machine for kicks,
clearly all bets are off. This mindset happens to pretty well meet the
needs of industry, IMHO. That said, I admit the case for making a
separate SUSET GUC is the least compelling one I've made on this
thread, if only because of the glaring inconsistency with other areas.

 On the other hand, there is perhaps a reasonable argument to be made that
 we should cut off CTE processing at some point to prevent
 *inadvertent* exhaustion of system resources.  Or even query
 processing more generally.

 In fact, we already have some things sort of like this: you can use
 statement_timeout to kill queries that run for too long, and we just
 recently added temp_file_limit to kill those that eat too much temp
 file space.

statement_timeout is far too blunt an instrument to deal with this
problem. For one thing, run time may vary based on many external
factors, whereas the number of iterations is a consistent, useful
metric for the WITH query in isolation. For another, it prevents the
DBA from managing known problems with deployed apps per database -
maybe they have a reporting query that is expected to take a really
long time. Sure, they can increase statement_timeout when that is
run, but that's another thing to remember.
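
The workaround looks something like this (sketched with a
hypothetical reporting view):

BEGIN;
SET LOCAL statement_timeout = 0;    -- lift the limit for this transaction only
SELECT * FROM long_running_report;  -- hypothetical
COMMIT;

It works, but it's exactly the kind of extra step that gets
forgotten.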

 I can see a good case for memory_limit and
 query_cpu_limit and maybe some others.  cte_recursion_depth_limit
 wouldn't be all that high on my personal list, I guess, but the
 concept doesn't seem completely insane.

I agree that those things would be much better than this. This is
still a useful, easy-to-implement feature though.

On 16 August 2011 15:26, Tom Lane t...@sss.pgh.pa.us wrote:
 Indeed: the real question here is why a recursive CTE is any worse
 than, say, an accidentally unconstrained join (or three or four...).

It's much worse because an unconstrained join query will not
all of a sudden fail to have a terminating condition. It will, for
the most part, take forever or practically forever, predictably and
consistently, even as the contents of tables change over time.

-- 
Peter Geoghegan       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services



[HACKERS] Re: Should we have an optional limit on the recursion depth of recursive CTEs?

2011-08-15 Thread Greg Stark
On Mon, Aug 15, 2011 at 9:31 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 ... and that would be a seriously bad API.  There are not SUSET
 restrictions on other resources such as work_mem.  Why do we need
 one for this?

I think a better analogy would be imposing a maximum number of rows a
query can output. That might be a sane thing to have for some
circumstances but it's not useful in general.

Consider for instance my favourite recursive query application,
displaying the lock dependency graph for pg_locks. What arbitrary
maximum number of locks would you like to impose at which the query
should error out?
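
For reference, a sketch of that kind of query - the join conditions
here are simplified and illustrative, not a complete treatment of
pg_locks:

WITH RECURSIVE lock_edges AS (
    -- pair each waiting backend with a backend holding the same lock
    SELECT w.pid AS waiter, h.pid AS holder
    FROM pg_locks w
    JOIN pg_locks h
      ON h.granted AND NOT w.granted AND h.pid <> w.pid
     AND h.locktype = w.locktype
     AND h.relation IS NOT DISTINCT FROM w.relation
     AND h.transactionid IS NOT DISTINCT FROM w.transactionid
), chains(waiter, holder, path) AS (
    SELECT waiter, holder, ARRAY[waiter, holder]
    FROM lock_edges
    UNION ALL
    SELECT c.waiter, e.holder, c.path || e.holder
    FROM chains c
    JOIN lock_edges e ON e.waiter = c.holder
    WHERE e.holder <> ALL (c.path)   -- stop on cycles (i.e. deadlocks)
)
SELECT * FROM chains;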

There is a situation, though, that I think is motivating this
where it would be nice to detect a problem: when the query is such
that it *can't* produce a record because there's an infinite loop
before the first record. Ideally you want some way to detect that
you've recursed and haven't changed anything that could lead to a
change in the recursion condition. But that seems like a pretty hard
thing to detect, probably impossible.


-- 
greg
