Re: CYCLE and SEARCH [was Re: [HACKERS] Common Table Expressions (WITH RECURSIVE) patch]

2008-10-05 Thread Tom Lane
David Fetter [EMAIL PROTECTED] writes:
 How hard would it be to add the infrastructure for CYCLE?

Personally I'm more interested in getting the recursive UNION (without
ALL) case to work.

I think that this might just be a matter of drawing on support that's
already there: have RecursiveUnion maintain a hash table of rows it's
already seen, and discard anything coming from either the non-recursive
term or the recursive term that is found to be already present in the
hash table.

Two objections to this are (a) it doesn't work for datatypes without
hash equality support, and (b) it would fail for recursion output
exceeding available memory.  However, the alternative of using
sort-and-uniq to detect duplicates seems pretty horrid in this situation
--- you'd need a fresh sort and mergejoin-like scan for every iteration
of the recursive term.  And it would become impractical to return rows
immediately after they're emitted by the subqueries.  So I'd be willing
to live with only supporting the hash implementation.

If no objections, I'll look at that next week ...

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


CYCLE and SEARCH [was Re: [HACKERS] Common Table Expressions (WITH RECURSIVE) patch]

2008-10-04 Thread David Fetter
On Fri, Oct 03, 2008 at 03:55:56PM -0400, Tom Lane wrote:
 I've successfully taught the WITH patch to do single evaluation of
 WITH queries.  I've also been through all of the planner and
 executor code for it and am now feeling pretty happy with the whole
 thing.  There are still a small number of loose ends (see XXX in the
 patch), but I don't believe any of them represent significant work
 --- I just left them undone because they weren't in the way of
 testing anything else.

Great!

How hard would it be to add the infrastructure for CYCLE?  Here's the
kind of awful hack I've been doing lately in order to detect and
prevent cycles:

CREATE TABLE adjacency(
id INTEGER NOT NULL,
parent_id INTEGER
);

INSERT INTO adjacency VALUES(1,NULL),
(2,1),(3,1),(4,1),
(5,2),(6,2),(7,2),(8,3),(9,3),(10,4),
(11,5),(12,5),(13,6),(14,7),(15,8),
(9,1); /* Cycle! */

WITH RECURSIVE t(node, path) AS (
SELECT id, ARRAY[id]
FROM adjacency
WHERE parent_id IS NULL
UNION ALL
SELECT a1.id, t.path || a1.id
FROM adjacency a1 JOIN t ON (a1.parent_id = t.node)
WHERE a1.id  ANY(t.path) /* Remove cycle using awful hack :P */
)
SELECT
CASE WHEN array_upper(path,1)1 THEN '+-' ELSE '' END ||
REPEAT('--', array_upper(path,1)-2) ||
node AS Branch
FROM t
ORDER BY path;

I suspect that some kind of hash structure, instantiated only when a
CYCLE clause is specified, could help out with a much, much more
efficient implementation of cycle prevention.

Adding SEARCH will be a lot more complicated, as DEPTH FIRST is
completely different from how the implementation works now.  Any ideas
on how this might be approached?

Cheers,
David.
-- 
David Fetter [EMAIL PROTECTED] http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: [EMAIL PROTECTED]

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Common Table Expressions (WITH RECURSIVE) patch

2008-10-02 Thread Greg Stark



On 2 Oct 2008, at 05:44 AM, Tom Lane [EMAIL PROTECTED] wrote:


Hitoshi Harada [EMAIL PROTECTED] writes:

Hmm, I've looked over the patch. Logically window functions can  
access

arbitrary rows that have been stored in a frame. Thus I had thought
tuplestore should hold all the positions and allow arbitrary random
access indicated by integer. Maybe those functionalities can be
abstracted by the window function API itself. For this matter it  
seems

that you'd better to look at my future patch.


Well, the problem with defining it as arbitrary random access is  
that

there's no way for the tuplestore to throw away old data.


And that there's no way to make it work if the tuplestore has spilled  
to disk.




--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Common Table Expressions (WITH RECURSIVE) patch

2008-10-01 Thread Tom Lane
Hitoshi Harada [EMAIL PROTECTED] writes:
 It seems to me to share some ideas with the MemoryContext concept: what
 about a TupstoreContext associated with tuplestore, you get a common default
 one if you don't register your own, and use
 tuplestore_gettuple(MyTupstoreContext, ...);

 I'm just working on tuplestore recording multiple positions for my
 window function project. Attached patch is still in progress but seems
 it works in a situation.

 From my work, the setting/getting read position and delegate savig
 positions to the caller will probably have problems, because of memory
 control for saving positions and tuplestore status changing (memory -
 BufFile). Instead, I decided it'd better that we can indicate the row
 number by integer.

That seems unreasonably inefficient --- it takes lots more file I/O,
and there's really no need for random access, at least not in what I'm
doing.

I looked at the tuplestore code some more, and my idea of allowing
callers to store their current position externally to tuplestore indeed
doesn't work.  The problem is that if dumptuples() happens there's no
way to convert an index-based position to a file-based one.  The current
code can handle it because it has access to all the positions (the read,
write, and mark pointers) but any positions stored externally wouldn't
get converted.

So it seems like the appropriate generalization is to have an array of
read positions inside the tuplestore and allow callers to say read
using position N, plus some API to allow positions to be allocated to
different requestors.  We could get rid of the separate mark pointer by
implementing an API that allows position X to be copied to position Y.
But the actual value of a position (a tuple number or file position
info) would never be exposed to callers.

Comments?

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Common Table Expressions (WITH RECURSIVE) patch

2008-10-01 Thread Tom Lane
Greg Stark [EMAIL PROTECTED] writes:
 On Wed, Oct 1, 2008 at 2:54 PM, Tom Lane [EMAIL PROTECTED] wrote:
 So it seems like the appropriate generalization is to have an array of
 read positions inside the tuplestore and allow callers to say read
 using position N, plus some API to allow positions to be allocated to
 different requestors.

 One other reason the tuplestore should know the position of all the
 readers is that ideally it would want to be able to discard any tuples
 older than the oldest read position. That also means it needs to know
 when all the call sites have allocated their position and don't need
 to reset it.

Good point.  So we'd need per-position capability flags, not
per-tuplestore.

I hadn't realized that this would be relevant to window functions.
Now that I know that, I propose fixing tuplestore for multiple
positions and committing it separately, before I go back to the CTE
patch.  Then Hitoshi-san will have something he can work with too.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Common Table Expressions (WITH RECURSIVE) patch

2008-10-01 Thread Greg Stark
On Wed, Oct 1, 2008 at 2:54 PM, Tom Lane [EMAIL PROTECTED] wrote:
 So it seems like the appropriate generalization is to have an array of
 read positions inside the tuplestore and allow callers to say read
 using position N, plus some API to allow positions to be allocated to
 different requestors.  We could get rid of the separate mark pointer by
 implementing an API that allows position X to be copied to position Y.
 But the actual value of a position (a tuple number or file position
 info) would never be exposed to callers.


That's basicaly what had done (though i had n readers which
encapsulated the current pointer and mark).

One other reason the tuplestore should know the position of all the
readers is that ideally it would want to be able to discard any tuples
older than the oldest read position. That also means it needs to know
when all the call sites have allocated their position and don't need
to reset it.



-- 
greg

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Common Table Expressions (WITH RECURSIVE) patch

2008-10-01 Thread Hitoshi Harada
 One other reason the tuplestore should know the position of all the
 readers is that ideally it would want to be able to discard any tuples
 older than the oldest read position. That also means it needs to know
 when all the call sites have allocated their position and don't need
 to reset it.

 Good point.  So we'd need per-position capability flags, not
 per-tuplestore.

 I hadn't realized that this would be relevant to window functions.
 Now that I know that, I propose fixing tuplestore for multiple
 positions and committing it separately, before I go back to the CTE
 patch.  Then Hitoshi-san will have something he can work with too.


Yes, tuplestore multiple positioning will give the greate help to the
window function. Ideally, it is better that tuplestore'd have all the
positions and have some kind of capability to discard old rows so that
it can stay in TSS_MEM, which helps window function's sliding frame.
But it isn't really critical, just performance matter.

Regards,



-- 
Hitoshi Harada

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Common Table Expressions (WITH RECURSIVE) patch

2008-10-01 Thread Tom Lane
Hitoshi Harada [EMAIL PROTECTED] writes:
 I hadn't realized that this would be relevant to window functions.
 Now that I know that, I propose fixing tuplestore for multiple
 positions and committing it separately, before I go back to the CTE
 patch.  Then Hitoshi-san will have something he can work with too.

 Yes, tuplestore multiple positioning will give the greate help to the
 window function. Ideally, it is better that tuplestore'd have all the
 positions and have some kind of capability to discard old rows so that
 it can stay in TSS_MEM, which helps window function's sliding frame.

Okay, there's a patch in CVS HEAD that works this way.  Let me know if
it needs further tweaking for your purposes.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Common Table Expressions (WITH RECURSIVE) patch

2008-10-01 Thread Hitoshi Harada
2008/10/2 Tom Lane [EMAIL PROTECTED]:
 Hitoshi Harada [EMAIL PROTECTED] writes:
 I hadn't realized that this would be relevant to window functions.
 Now that I know that, I propose fixing tuplestore for multiple
 positions and committing it separately, before I go back to the CTE
 patch.  Then Hitoshi-san will have something he can work with too.

 Yes, tuplestore multiple positioning will give the greate help to the
 window function. Ideally, it is better that tuplestore'd have all the
 positions and have some kind of capability to discard old rows so that
 it can stay in TSS_MEM, which helps window function's sliding frame.

 Okay, there's a patch in CVS HEAD that works this way.  Let me know if
 it needs further tweaking for your purposes.

regards, tom lane


Hmm, I've looked over the patch. Logically window functions can access
arbitrary rows that have been stored in a frame. Thus I had thought
tuplestore should hold all the positions and allow arbitrary random
access indicated by integer. Maybe those functionalities can be
abstracted by the window function API itself. For this matter it seems
that you'd better to look at my future patch.
Everything else is great deal. By improving tuplestore_trim(), sliding
frame will be performed better than I'd thought.

Regards,


-- 
Hitoshi Harada

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Common Table Expressions (WITH RECURSIVE) patch

2008-10-01 Thread Tom Lane
Hitoshi Harada [EMAIL PROTECTED] writes:
 2008/10/2 Tom Lane [EMAIL PROTECTED]:
 Okay, there's a patch in CVS HEAD that works this way.  Let me know if
 it needs further tweaking for your purposes.

 Hmm, I've looked over the patch. Logically window functions can access
 arbitrary rows that have been stored in a frame. Thus I had thought
 tuplestore should hold all the positions and allow arbitrary random
 access indicated by integer. Maybe those functionalities can be
 abstracted by the window function API itself. For this matter it seems
 that you'd better to look at my future patch.

Well, the problem with defining it as arbitrary random access is that
there's no way for the tuplestore to throw away old data.

The scheme that I have in mind here is that you keep (at least) two read
pointers, one that is where you're actually reading the data and one
that is nailing down the oldest point you might need to return to.
This is a generalization of the previous mark/restore logic to allow any
number of pairs of mark and restore points.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Common Table Expressions (WITH RECURSIVE) patch

2008-09-30 Thread Tom Lane
Greg Stark [EMAIL PROTECTED] writes:
 On 24 Sep 2008, at 02:45, Tom Lane [EMAIL PROTECTED] wrote:
 The next big
 thing seems to be to figure out exactly how to do multiple references
 to CTE outputs, so that we can de-bogotify the planner.

 I've looked and don't seem to still have the source tree where I  
 worked on this. I remember how I made the changes to tuplestore which  
 was mostly mechanical. The trick I think will be in adding a special  
 purpose executor method which passes the call site to the node below.  
 This depends on the observation that if we always memoize results then  
 each call site can only have one active call. That is, we don't need  
 to maintain a full stack tree. 

I looked at the tuplestore code a bit and decided that this actually
doesn't need to be hard at all.  Tuplestore already has a notion of a
write position and an independent read position, and we don't need more
than one write position.  So what I think we should do is add get read
position and set read position functions to tuplestore.c, and have
each of the reader nodes remember its own read position.  That is,
each reader has to do

set_read_position(tupstore, local_read_position);
tuple = tuplestore_gettuple(tupstore, ...);
get_read_position(tupstore, local_read_position);

rather than just tuplestore_gettuple.  The set/get functions will be
cheap enough that this is no big deal.  (Or maybe we should just
provide a wrapper function that does this sequence?)

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Common Table Expressions (WITH RECURSIVE) patch

2008-09-30 Thread Dimitri Fontaine

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

Hi,

Le 30 sept. 08 à 20:03, Tom Lane a écrit :

set_read_position(tupstore, local_read_position);
tuple = tuplestore_gettuple(tupstore, ...);
get_read_position(tupstore, local_read_position);

rather than just tuplestore_gettuple.  The set/get functions will be
cheap enough that this is no big deal.  (Or maybe we should just
provide a wrapper function that does this sequence?)


It seems to me to share some ideas with the MemoryContext concept:  
what about a TupstoreContext associated with tuplestore, you get a  
common default one if you don't register your own, and use

tuplestore_gettuple(MyTupstoreContext, ...);

Maybe some other API would benefit from the idea?

Regards,
- --
dim


-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.9 (Darwin)

iEYEARECAAYFAkjigF4ACgkQlBXRlnbh1bkycQCgqs/+JBOd0SiN4xvKwLgEgi9F
BOYAoLm0Se6zs8cEAnoTlH6de7pLLh/l
=kzm1
-END PGP SIGNATURE-

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Common Table Expressions (WITH RECURSIVE) patch

2008-09-30 Thread Hitoshi Harada
Hi,

2008/10/1 Dimitri Fontaine [EMAIL PROTECTED]:
 -BEGIN PGP SIGNED MESSAGE-
 Hash: SHA1

 Hi,

 Le 30 sept. 08 à 20:03, Tom Lane a écrit :

set_read_position(tupstore, local_read_position);
tuple = tuplestore_gettuple(tupstore, ...);
get_read_position(tupstore, local_read_position);

 rather than just tuplestore_gettuple.  The set/get functions will be
 cheap enough that this is no big deal.  (Or maybe we should just
 provide a wrapper function that does this sequence?)

 It seems to me to share some ideas with the MemoryContext concept: what
 about a TupstoreContext associated with tuplestore, you get a common default
 one if you don't register your own, and use
tuplestore_gettuple(MyTupstoreContext, ...);

 Maybe some other API would benefit from the idea?


I'm just working on tuplestore recording multiple positions for my
window function project. Attached patch is still in progress but seems
it works in a situation.

From my work, the setting/getting read position and delegate savig
positions to the caller will probably have problems, because of memory
control for saving positions and tuplestore status changing (memory -
BufFile). Instead, I decided it'd better that we can indicate the row
number by integer.

Regards,

-- 
Hitoshi Harada
*** a/src/backend/utils/sort/tuplestore.c
--- b/src/backend/utils/sort/tuplestore.c
***
*** 141,148  struct Tuplestorestate
--- 141,159 
  	int			markpos_current;	/* saved current */
  	int			markpos_file;	/* saved readpos_file */
  	off_t		markpos_offset; /* saved readpos_offset */
+ 
+ 	/*
+ 	 * When asked EXEC_FLAG_SETPOS, we need record multiple state-myfile pointer.
+ 	 * This pointer is used to optimize offset calculation closer to O(1).
+ 	 * npos_file is fixed layout file since only int + off_t will be needed to restore
+ 	 * tuple positions. int + off_t may get better by struct{int, off_t} to reduce call
+ 	 * BufFileWrite and BufFileRead?
+ 	 * This feature is off unless eflags is set.
+ 	 */
+ 	BufFile	   *npos_file;
  };
  
+ 
  #define COPYTUP(state,tup)	((*(state)-copytup) (state, tup))
  #define WRITETUP(state,tup) ((*(state)-writetup) (state, tup))
  #define READTUP(state,len)	((*(state)-readtup) (state, len))
***
*** 150,155  struct Tuplestorestate
--- 161,175 
  #define USEMEM(state,amt)	((state)-availMem -= (amt))
  #define FREEMEM(state,amt)	((state)-availMem += (amt))
  
+ #define RecordFilePointer(state,file,offset) do{\
+ 	BufFileTell((state)-myfile, \
+ (file), (offset)); \
+ 	BufFileWrite((state)-npos_file, \
+ (file), sizeof(int)); \
+ 	BufFileWrite((state)-npos_file, \
+ (offset), sizeof(off_t)); \
+ }while(0)
+ 
  /*
   *
   * NOTES about on-tape representation of tuples:
***
*** 200,205  static Tuplestorestate *tuplestore_begin_common(int eflags,
--- 220,226 
  		int maxKBytes);
  static void tuplestore_puttuple_common(Tuplestorestate *state, void *tuple);
  static void dumptuples(Tuplestorestate *state);
+ static void tuplestore_recordpos(Tuplestorestate *state);
  static void tuplestore_trim(Tuplestorestate *state, int ntuples);
  static unsigned int getlen(Tuplestorestate *state, bool eofOK);
  static void *copytup_heap(Tuplestorestate *state, void *tup);
***
*** 234,239  tuplestore_begin_common(int eflags, bool interXact, int maxKBytes)
--- 255,262 
  	state-eof_reached = false;
  	state-current = 0;
  
+ 	state-npos_file = NULL;
+ 
  	return state;
  }
  
***
*** 291,296  tuplestore_begin_heap(bool randomAccess, bool interXact, int maxKBytes)
--- 314,320 
   *		EXEC_FLAG_REWIND		need rewind to start
   *		EXEC_FLAG_BACKWARD		need backward fetch
   *		EXEC_FLAG_MARK			need mark/restore
+  *		EXEC_FLAG_SETPOS		need setpos
   * If tuplestore_set_eflags is not called, REWIND and MARK are allowed,
   * and BACKWARD is set per randomAccess in the tuplestore_begin_xxx call.
   */
***
*** 315,320  tuplestore_end(Tuplestorestate *state)
--- 339,346 
  
  	if (state-myfile)
  		BufFileClose(state-myfile);
+ 	if (state-npos_file)
+ 		BufFileClose(state-npos_file);
  	if (state-memtuples)
  	{
  		for (i = 0; i  state-memtupcount; i++)
***
*** 443,452  tuplestore_puttuple_common(Tuplestorestate *state, void *tuple)
--- 469,481 
  			 */
  			PrepareTempTablespaces();
  			state-myfile = BufFileCreateTemp(state-interXact);
+ 			if (state-eflags  EXEC_FLAG_SETPOS)
+ state-npos_file = BufFileCreateTemp(state-interXact);
  			state-status = TSS_WRITEFILE;
  			dumptuples(state);
  			break;
  		case TSS_WRITEFILE:
+ 			tuplestore_recordpos(state);
  			WRITETUP(state, tuple);
  			break;
  		case TSS_READFILE:
***
*** 461,466  tuplestore_puttuple_common(Tuplestorestate *state, void *tuple)
--- 490,496 
  			state-writepos_file, state-writepos_offset,
  			SEEK_SET) != 0)
  elog(ERROR, seek to EOF 

Re: [HACKERS] Common Table Expressions (WITH RECURSIVE) patch

2008-09-24 Thread Greg Stark



greg

On 24 Sep 2008, at 02:45, Tom Lane [EMAIL PROTECTED] wrote:

The next big
thing seems to be to figure out exactly how to do multiple references
to CTE outputs, so that we can de-bogotify the planner.


I've looked and don't seem to still have the source tree where I  
worked on this. I remember how I made the changes to tuplestore which  
was mostly mechanical. The trick I think will be in adding a special  
purpose executor method which passes the call site to the node below.  
This depends on the observation that if we always memoize results then  
each call site can only have one active call. That is, we don't need  
to maintain a full stack tree. 
  


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Common Table Expressions (WITH RECURSIVE) patch

2008-09-24 Thread Tatsuo Ishii
Tom,

  WithClause node may need a location field, and almost certainly has to
  be handled somehow in exprLocation().
  
  The error reports in parse_cte.c *desperately* need error locations.

Included is a patch for this against your cte-0923.patch.gz. Most
errors now have error locations, but some do not. I'm going to think
more to enhance this.
--
Tatsuo Ishii
SRA OSS, Inc. Japan
*** pgsql/src/backend/parser/parse_cte.c2008-09-25 11:06:12.0 
+0900
--- pgsql.patched/src/backend/parser/parse_cte.c2008-09-25 
10:46:41.0 +0900
***
*** 239,245 
if (query-intoClause)
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
!errmsg(subquery in WITH cannot have SELECT 
INTO)));
  
/* Compute the derived fields if not done yet */
if (!cte-cterecursive)
--- 239,247 
if (query-intoClause)
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
!errmsg(subquery in WITH cannot have SELECT 
INTO),
!parser_errposition(pstate,
!   
exprLocation((Node *) query-intoClause;
  
/* Compute the derived fields if not done yet */
if (!cte-cterecursive)
***
*** 561,625 
 lresult != NON_RECURSIVE)
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
!errmsg(Left hand side of 
UNION ALL must be a non-recursive term in a recursive query)));
  
else if (stmt-op == SETOP_INTERSECT)
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
!errmsg(non-recursive term and 
recursive term must not be combined with INTERSECT)));
  
else if (stmt-op == SETOP_EXCEPT)
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
!errmsg(non-recursive term and 
recursive term must not be combined with EXCEPT)));
  
else if (stmt-op == SETOP_UNION  stmt-all != true)
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
!errmsg(non-recursive term and 
recursive term must be combined with UNION ALL)));
  
else if (stmt-op == SETOP_UNION  stmt-all == true 
 rarg-op == SETOP_UNION)
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
!errmsg(Right hand side of 
UNION ALL must not contain UNION operation)));
  
else if (stmt-sortClause)
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
!errmsg(ORDER BY in a 
recursive query not allowed)));
  
else if (stmt-limitOffset || stmt-limitCount)
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
!errmsg(LIMIT OFFSET in a 
recursive query not allowed)));
  
else if (stmt-lockingClause)
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
!errmsg(FOR UPDATE in a 
recursive query not allowed)));
  
else if (lresult == NON_RECURSIVE  rresult == 
RECURSIVE_SELF)
{
if (larg-distinctClause)
ereport(ERROR,

(errcode(ERRCODE_SYNTAX_ERROR),
!errmsg(DISTINCT in a 
non recursive term not allowed)));
  
if (rarg-distinctClause)
ereport(ERROR,

(errcode(ERRCODE_SYNTAX_ERROR),
!errmsg(DISTINCT in a 
recursive term not allowed)));
  
if (rarg-groupClause)
ereport(ERROR,

(errcode(ERRCODE_SYNTAX_ERROR),
!   

Re: [HACKERS] Common Table Expressions (WITH RECURSIVE) patch

2008-09-23 Thread Tom Lane
Jeff Davis [EMAIL PROTECTED] writes:
 Here is a patch that is an initial attempt to reorganize the parse tree
 representation.

Oh dear, we seem to have spent yesterday doing the same work :-(

I'll go over this and try to merge it with my own WIP.

 * There are a couple of other rough points in places where it's hard to
 traverse up the parse tree or query tree.

Yeah, I'd been running into that issue too.  Adding an explicit pointer
to the CTE into the RTE doesn't work because it renders the parse tree
un-copiable (at least without something a lot more sophisticated than
copyObject; and saving/loading rule parsetrees would be tough too).

What I've got at the moment is that creation of an RTE_CTE RTE copies
the CTE's lists of output column types/typmods into the RTE.  This
eliminates the need for expandRTE and a couple of other places to be
able to find the CTE; everything they need is in the RTE.  So far as I
can see, everyplace else that might need to find the CTE from the RTE
is in places that either have a ParseState available, or have some
comparable structure that could provide a way to search upwards for
CTEs (eg, in ruleutils the deparse context will need to track
uplevel CTE lists as well as rtables).

It is a bit tedious though.  Can anyone think of another way that would
still preserve the notion of multiple RTEs being links to the same CTE
rather than independent subqueries?

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Common Table Expressions (WITH RECURSIVE) patch

2008-09-23 Thread Gregory Stark
Tom Lane [EMAIL PROTECTED] writes:

 Yeah, I'd been running into that issue too.  Adding an explicit pointer
 to the CTE into the RTE doesn't work because it renders the parse tree
 un-copiable (at least without something a lot more sophisticated than
 copyObject; and saving/loading rule parsetrees would be tough too).

Well the alternative to direct pointers is as you did with subqueries, turning
the set into a flat array and storing indexes into the array. I'm not sure if
that applies here or not.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's PostGIS support!

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Common Table Expressions (WITH RECURSIVE) patch

2008-09-23 Thread Tom Lane
Gregory Stark [EMAIL PROTECTED] writes:
 Tom Lane [EMAIL PROTECTED] writes:
 Yeah, I'd been running into that issue too.  Adding an explicit pointer
 to the CTE into the RTE doesn't work because it renders the parse tree
 un-copiable (at least without something a lot more sophisticated than
 copyObject; and saving/loading rule parsetrees would be tough too).

 Well the alternative to direct pointers is as you did with subqueries, turning
 the set into a flat array and storing indexes into the array. I'm not sure if
 that applies here or not.

I think that just changes the problem into where can I find the array? ...

The real issue here is that simplicity of copying etc requires that
child nodes in a parse tree never have back-links leading up to their
parent.  If we were willing to drop that requirement for Query then
we'd not need any auxiliary data structures to chase up to upper-level
rtables or CTEs.  I'm not sure this cure is better than the disease
though.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Common Table Expressions (WITH RECURSIVE) patch

2008-09-23 Thread Tom Lane
Attached is the result of a couple of days hacking on the WITH RECURSIVE
patch.  This moves us a lot closer to having sanity in the parsing
phase, though I'm still quite unhappy with the second half of the
processing in parse_cte.c.  I added some infrastructure to make the
first half's search for CTE references reasonably bulletproof, but the
second half needs to be converted to use the same infrastructure, and
I didn't do that yet because I didn't understand what it was doing.
In particular, what the heck is the exception in findCteName that allows
some other CTE's non_recursive_term to be stored into the
non_recursive_term for the current one?  That seems mighty broken.

There are a number of unfinished corner cases (look for XXX in the
patch) but they aren't in the way of further progress.  The next big
thing seems to be to figure out exactly how to do multiple references
to CTE outputs, so that we can de-bogotify the planner.

regards, tom lane



bintORC4R6tx5.bin
Description: cte-0923.patch.gz

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Common Table Expressions (WITH RECURSIVE) patch

2008-09-23 Thread Jeff Davis
I am re-sending this message to -hackers from yesterday, because the
first time it didn't appear to make it through. This time I gzipped the
patch. This is just for the archives (and to give context to the
replies), and this message is superseded by Tom's patch here:

http://archives.postgresql.org/pgsql-hackers/2008-09/msg01521.php


On Thu, 2008-09-18 at 12:55 +0900, Tatsuo Ishii wrote:
 Tom, thanks for the review.
 
 Here is an in-progress report. Patches against CVS HEAD attached.
 (uncommented items are work-in-progress).

Here is a patch that is an initial attempt to reorganize the parse tree
representation.

The goal of this patch is to separate the RTEs from the CTEs, so that we
can, for example, have multiple RTEs refer to the same CTE. This will
hopefully allow us to materialize a volatile query once, and have
several RTEs refer to that same value, which will meet the SQL standard.

Notes:

* It makes a p_cte_table in the ParseState, which is a list of
CteTblEntries. This replaces p_ctenamespace, which was a list of
RangeSubselects.

* It copies the p_cte_table into Query.cte_table

* I introduced a new type of RTE, RTE_CTE, which holds a cte_index and
cte_levelsup. This is used to find the CTE that the RTE references.

Weak points:

* It does not change the behavior of recursive queries. That is a little
more complicated, so I wanted to wait for feedback on my patch so far.

* I don't understand set_subquery_pathlist, or that general area of the
code. I made a new set_cte_pathlist, that is basically the same thing,
except I used a hack dummy_subquery variable in the RTE to pass along
a pointer to the subquery of the CTE. I think this dummy variable can be
removed, but I just don't understand that part of the code well enough
to know what should happen. And if it does require a subquery at that
point, I'll need to find a way of locating the right cte_table from
inside that function. Any advice here would be appreciated.

* There are a couple of other rough points in places where it's hard to
traverse up the parse tree or query tree.

I can probably work around these weak points, but I wanted to send the
patch to avoid a lot of conflicts or problems later. Tell me whether you
think this is moving in the right direction.

Regards,
Jeff Davis


cte_fixparse.gz
Description: GNU Zip compressed data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Common Table Expressions (WITH RECURSIVE) patch

2008-09-20 Thread Tatsuo Ishii
 PlanState.has_recursivescan seems like a complete kluge.  Can't it just be
 removed?  It looks to me like it is working around bugs that hopefully aren't
 there anymore.  There is certainly no reason why a recursive CTE should be
 more in need of rescanning than any other kind of plan.

I don't think so. Recursion plan needs the hash table used by sublan
be re-created at each recursion loop stage. Remember that in each
evaluation of recursive plan, the recursive name is replaced by a
working table which is holding previous evalution result of recursion
stage. Thus the hash table corresponding to the work table needs to
be re-created.

 If it is needed then
 the current implementation is completely broken anyway, since it would only
 detect a RecursiveScan node that is directly underneath an agg or hash node.

Yeah, that's right. What I have in my mind is to implement something
similar to UpdateChangedParamSet family like mechanism which will
inherit working table change event to child node.
--
Tatsuo Ishii
SRA OSS, Inc. Japan

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Common Table Expressions (WITH RECURSIVE) patch

2008-09-20 Thread Tom Lane
Tatsuo Ishii [EMAIL PROTECTED] writes:
 PlanState.has_recursivescan seems like a complete kluge.  Can't it just be
 removed?  It looks to me like it is working around bugs that hopefully aren't
 there anymore.  There is certainly no reason why a recursive CTE should be
 more in need of rescanning than any other kind of plan.

 I don't think so. Recursion plan needs the hash table used by sublan
 be re-created at each recursion loop stage. Remember that in each
 evaluation of recursive plan, the recursive name is replaced by a
 working table which is holding previous evalution result of recursion
 stage. Thus the hash table corresponding to the work table needs to
 be re-created.

Oh, I see.  I keep getting confused about whether RecursiveScan is at the
top or the bottom of the recursion plan tree :-(.  Maybe it would help
to use a different name for it?  RecursionInjector or something like
that?

 If it is needed then
 the current implementation is completely broken anyway, since it would only
 detect a RecursiveScan node that is directly underneath an agg or hash node.

 Yeah, that's right. What I have in my mind is to implement something
 similar to UpdateChangedParamSet family like mechanism which will
 inherit working table change event to child node.

I think it could actually *be* UpdateChangedParamSet, if you just
associate some otherwise-unused Param with each RecursiveScan node,
and have the Recursion node signal a change of that Param when it
revises the work table.

In fact, why not combine that with getting rid of the klugy addition to
ExecutorState?  Make the param be actually useful: it could contain
some internal datastructure that passes the work table down to the
RecursiveScan node.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Common Table Expressions (WITH RECURSIVE) patch

2008-09-18 Thread Tatsuo Ishii
 Why does set_recursion_pathlist think that the subquery might have
 useful pathkeys?  We know it must always be a UNION ALL, no?

Right. But someday we might implement UNION (without ALL) then we
have useful pathkeys...

Or shall I completely remove the step to generate patheys and do not
pass pathkeys to create_recursion_path?

pathkeys = convert_subquery_pathkeys(root, rel, 
subroot-query_pathkeys);
--
Tatsuo Ishii
SRA OSS, Inc. Japan

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Common Table Expressions (WITH RECURSIVE) patch

2008-09-18 Thread Tom Lane
Tatsuo Ishii [EMAIL PROTECTED] writes:
 Why does set_recursion_pathlist think that the subquery might have
 useful pathkeys?  We know it must always be a UNION ALL, no?

 Right. But someday we might implement UNION (without ALL) then we
 have useful pathkeys...

Good point.  Might as well leave it in.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Common Table Expressions (WITH RECURSIVE) patch

2008-09-17 Thread Pavel Stehule
2008/9/17 Tom Lane [EMAIL PROTECTED]:
 Tatsuo Ishii [EMAIL PROTECTED] writes:
 Do we really have to make RECURSIVE a fully reserved keyword?

 According to the standard, RECURSIVE is a reserved keyword, I believe.

 Sure, but our general rule is to make keywords no more reserved than
 is absolutely necessary to make the bison grammar unambiguous.  I
 haven't tested, but I'm thinking that if WITH is fully reserved then
 RECURSIVE shouldn't have to be.

I am not sure, if these rule is good. Somebody who develop on
postgresql should have a problems when they will be port to other
databases in future. Reserved words in standards should be respected.

regards
Pavel Stehule


regards, tom lane

 --
 Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-hackers


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Common Table Expressions (WITH RECURSIVE) patch

2008-09-17 Thread Robert Haas
 I am not sure, if these rule is good. Somebody who develop on
 postgresql should have a problems when they will be port to other
 databases in future. Reserved words in standards should be respected.

I disagree.  I have never ported an app written for PostgreSQL to
another database system, and have no plans to start.  The fact that
some other database system might barf on a particular bit of SQL is
insufficient reason for PostgreSQL to do the same thing.

If people want to write code that will work on multiple databases,
they should of course avoid using any SQL reserved words for anything
other than their reserved purposes.  But there is no reason for the
database system to unilaterally shove that down everyone's throat.  It
is very easy to overdo the idea of protecting users from themselves.

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Common Table Expressions (WITH RECURSIVE) patch

2008-09-17 Thread Tom Lane
Robert Haas [EMAIL PROTECTED] writes:
 I am not sure, if these rule is good. Somebody who develop on
 postgresql should have a problems when they will be port to other
 databases in future. Reserved words in standards should be respected.

 I disagree.  I have never ported an app written for PostgreSQL to
 another database system, and have no plans to start.  The fact that
 some other database system might barf on a particular bit of SQL is
 insufficient reason for PostgreSQL to do the same thing.

 If people want to write code that will work on multiple databases,
 they should of course avoid using any SQL reserved words for anything
 other than their reserved purposes.

More than that, they have to actually test their SQL on each target DB.
Every DB (including us) is going to have some reserved words that are
not in the standard; so imagining that Postgres can all by itself
protect you from this type of problem is doomed to failure anyway.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Common Table Expressions (WITH RECURSIVE) patch

2008-09-17 Thread Kevin Grittner
 Tom Lane [EMAIL PROTECTED] wrote: 
 Robert Haas [EMAIL PROTECTED] writes:
 I am not sure, if these rule is good. Somebody who develop on
 postgresql should have a problems when they will be port to other
 databases in future. Reserved words in standards should be
respected.
 
 If people want to write code that will work on multiple databases,
 they should of course avoid using any SQL reserved words for
anything
 other than their reserved purposes.
 
 More than that, they have to actually test their SQL on each target
DB.
 Every DB (including us) is going to have some reserved words that
are
 not in the standard; so imagining that Postgres can all by itself
 protect you from this type of problem is doomed to failure anyway.
 
If someone wants portable code, they can use a development tool which
wraps ALL identifiers in quotes, every time.  That's what we do.  The
important thing is that, to the extent practicable, standard SQL code
is accepted and behaves in compliance with the standard.  I don't see
that it does anything to compromise that if you support additional,
non-standard syntax for extensions.
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Common Table Expressions (WITH RECURSIVE) patch

2008-09-17 Thread Tatsuo Ishii
 Is physical_tlist optimization sensible for RecursiveScan?  We seem
 to use it for every other Scan node type.

To enable physical_tlist optimization, it seems build_physical_tlist,
use_physical_tlist and disuse_physical_tlist need to be
changed. build_physical_tlist and use_physical_tlist have been already
patched and only disuse_physical_tlist needs to be patched. Any other
place I miss to enable the optimization?
--
Tatsuo Ishii
SRA OSS, Inc. Japan

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Common Table Expressions (WITH RECURSIVE) patch

2008-09-17 Thread Tom Lane
Tatsuo Ishii [EMAIL PROTECTED] writes:
 Is physical_tlist optimization sensible for RecursiveScan?  We seem
 to use it for every other Scan node type.

 To enable physical_tlist optimization, it seems build_physical_tlist,
 use_physical_tlist and disuse_physical_tlist need to be
 changed. build_physical_tlist and use_physical_tlist have been already
 patched and only disuse_physical_tlist needs to be patched. Any other
 place I miss to enable the optimization?

IIRC, the comment for build_physical_tlist hadn't been patched, but
yeah that seems like about it.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Common Table Expressions (WITH RECURSIVE) patch

2008-09-17 Thread Tatsuo Ishii
  To enable physical_tlist optimization, it seems build_physical_tlist,
  use_physical_tlist and disuse_physical_tlist need to be
  changed. build_physical_tlist and use_physical_tlist have been already
  patched and only disuse_physical_tlist needs to be patched. Any other
  place I miss to enable the optimization?
 
 IIRC, the comment for build_physical_tlist hadn't been patched, but
 yeah that seems like about it.

Yeah, I need to fix sloppy comments in the existing patches all over
the places:-)
--
Tatsuo Ishii
SRA OSS, Inc. Japan

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Common Table Expressions (WITH RECURSIVE) patch

2008-09-17 Thread Tatsuo Ishii
Tom, thanks for the review.

Here is an in-progress report. Patches against CVS HEAD attached.
(uncommented items are work-in-progress).
--
Tatsuo Ishii
SRA OSS, Inc. Japan

 Tatsuo Ishii [EMAIL PROTECTED] writes:
  Included is the latest patches against CVS HEAD.
 
 I spent some time reading this patch.  Here are a few comments in
 no particular order:
 
 RangeRecursive node lacks copyfuncs/equalfuncs support.

Functions added.
 
 Query.recursive is missed in equalfuncs.c.  But rather than fix that,
 get rid of it entirely.  AFAICS the only use is in qual_is_pushdown_safe,
 and what is the value of that test?  The callers know perfectly well
 whether they are operating on a recursive RTE or not.  You might as well
 just delete all the useless qual-pushdown-attempt code from
 set_recursion_pathlist, and not need to touch qual_is_pushdown_safe
 at all.

Query.recursive removed and qual_is_pushdown_safe is untouched.

 Is physical_tlist optimization sensible for RecursiveScan?  We seem
 to use it for every other Scan node type.

Fixed and physical_tlist optimization is enabled for RecursiveScan I
believe.

 I dislike putting state into ExecutorState; that makes it impossible
 to have multiple recursion nodes in one plan tree.  It would probably
 be better for the Recursion and RecursiveScan nodes to talk to each
 other directly (compare HashJoin/Hash); although since they are not
 adjacent in the plan tree I admit I'm not sure how to do that.
 
 es_disallow_tuplestore doesn't seem to need to be in ExecutorState
 at all, it could as well be in RecursionState.

It was for a workaround to avoid an infinit recursion in some
cases. Discussion came to the conclusion that it's user's
responsibilty to avoid such that case (otherwise the semantics of our
recursive query becomes to be different from the one defined in the
standard) I believe.

es_disallow_tuplestore removed.

 I don't really like the way that Append nodes are being abused here.
 It would be better to allow nodeRecursion.c to duplicate a little code
 from nodeAppend.c, and have the child plans be direct children of
 the Recursion node.  BTW, is it actually possible to have more than
 two children?  I didn't spend enough time analyzing the restrictions
 in parse_cte to be sure.  If there are always just two then you could
 simplify the representation by treating it like a join node instead
 of an append.  (The RTE_RECURSIVE representation sure makes it look
 like there can be only two...)
 
 Mark/restore support seems useless ... note the comment on
 ExecSupportsMarkRestore (which should be updated if this code
 isn't removed).
 
 RecursiveScan claims to support backwards fetch, but does not in fact
 contain code to do so.  (Given that it will always be underneath
 Recursion, which can't do backwards fetch, I see little point in adding
 such code; fix execAmi.c instead.)
 
 ExecInitRecursion doesn't seem to be on the same page about whether
 it supports backward scan as execAmi.c, either.
 
 This comment in nodeRecursivescan.c seems bogus:
   /*
* Do not initialize scan tuple type, result tuple type and
* projection info in ExecInitRecursivescan. These types are
* initialized after initializing Recursion node.
*/
 because the code seems to be doing exactly what the comment says it
 doesn't.
 
 Numerous comments appear to have been copied-and-pasted and not modified
 from the original.  Somebody will have to go over all that text.
 
 ruleutils.c fails completely for non-recursive WITH.  It *must* regenerate
 such a query with a WITH clause, not as a flattened subquery which is what
 you seem to be doing.  This isn't negotiable because the semantics are
 different.  This will mean at least some change in the parsetree
 representation.  Perhaps we could add a bool to subquery RTEs to mark them
 as coming from a nonrecursive WITH?  The tests added for RTE_RECURSIVE
 seem a bit ugly too.  If it thinks that can't happen it should Assert so,
 not just fall through silently.
 
 commentary for ParseState.p_ctenamespace is gratuitously unlike the
 comment style for the other fields, and p_recursive_namespace isn't
 documented at all.
 
 ParseState.p_in_with_clause is unused, should be removed.

Done.

 The WithClause struct definition is poorly commented.  It should be
 stated that it is used only pre-parse-analysis (assuming that continues
 to be true after you get done fixing ruleutils.c...), and it doesn't
 say what the elements of the subquery list are (specifically, what
 node type).  A lot of the other added structs and fields could use
 better commenting too.
 
 For that matter subquery is a poor name for WithClause's list of CTEs,
 especially so since it's hard to search for.  It should be a plural name
 and I'd be inclined to use something like ctes not subqueries.
 The term subquery is too overloaded already, so any place you can
 refer to a WITH-list member as a CTE you should do so.
 
 WithClause node may need a 

Re: [HACKERS] Common Table Expressions (WITH RECURSIVE) patch

2008-09-16 Thread David Fetter
On Mon, Sep 15, 2008 at 06:46:16PM +0900, Tatsuo Ishii wrote:
   * Single Evaluation:
   
 with
   foo(i) as (select random() as i)
 select * from foo union all select * from foo;
  i
 ---
  0.233165248762816
   0.62126633618027
 (2 rows)
   
 The standard specifies that non-recursive WITH should be
 evaluated once.
  
  What shall we do? I don't think there's an easy way to fix this as
  Tom suggested. Maybe we should not allow WITH clause without
  RECURISVE for 8.4?
 
 This is a still remaing issue...

I don't think that the spec explicitly forbids this.

Cheers,
David.
-- 
David Fetter [EMAIL PROTECTED] http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: [EMAIL PROTECTED]

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Common Table Expressions (WITH RECURSIVE) patch

2008-09-16 Thread Jeff Davis
On Mon, 2008-09-15 at 22:41 -0700, David Fetter wrote:
 On Mon, Sep 15, 2008 at 06:46:16PM +0900, Tatsuo Ishii wrote:
* Single Evaluation:

  with
foo(i) as (select random() as i)
  select * from foo union all select * from foo;
   i
  ---
   0.233165248762816
0.62126633618027
  (2 rows)

  The standard specifies that non-recursive WITH should be
  evaluated once.
   
   What shall we do? I don't think there's an easy way to fix this as
   Tom suggested. Maybe we should not allow WITH clause without
   RECURISVE for 8.4?
  
  This is a still remaing issue...
 
 I don't think that the spec explicitly forbids this.
 

This has been discussed before.

Regardless of the nuances of the spec (and whether it says so explicitly
or implicitly), there are people in the community that see single
evaluation as important, and important enough to be a showstopper.

Tom Lane has suggested that this is a reasonable amount of work to
complete, and minor in comparison to the rest of the patch.

I think the right approach is to try to complete it so that everyone is
happy. I will work on this, but unfortunately I don't have a lot of time
right now, so I can't make any promises.

The rest of the patch looks good so far (there are still a few things I
want to look at), so I think this is the biggest open issue.

Regards,
Jeff Davis




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Common Table Expressions (WITH RECURSIVE) patch

2008-09-16 Thread Tom Lane
Tatsuo Ishii [EMAIL PROTECTED] writes:
 Included is the latest patches against CVS HEAD.

I spent some time reading this patch.  Here are a few comments in
no particular order:

RangeRecursive node lacks copyfuncs/equalfuncs support.

Query.recursive is missed in equalfuncs.c.  But rather than fix that,
get rid of it entirely.  AFAICS the only use is in qual_is_pushdown_safe,
and what is the value of that test?  The callers know perfectly well
whether they are operating on a recursive RTE or not.  You might as well
just delete all the useless qual-pushdown-attempt code from
set_recursion_pathlist, and not need to touch qual_is_pushdown_safe
at all.

Is physical_tlist optimization sensible for RecursiveScan?  We seem
to use it for every other Scan node type.

I dislike putting state into ExecutorState; that makes it impossible
to have multiple recursion nodes in one plan tree.  It would probably
be better for the Recursion and RecursiveScan nodes to talk to each
other directly (compare HashJoin/Hash); although since they are not
adjacent in the plan tree I admit I'm not sure how to do that.

es_disallow_tuplestore doesn't seem to need to be in ExecutorState
at all, it could as well be in RecursionState.

I don't really like the way that Append nodes are being abused here.
It would be better to allow nodeRecursion.c to duplicate a little code
from nodeAppend.c, and have the child plans be direct children of
the Recursion node.  BTW, is it actually possible to have more than
two children?  I didn't spend enough time analyzing the restrictions
in parse_cte to be sure.  If there are always just two then you could
simplify the representation by treating it like a join node instead
of an append.  (The RTE_RECURSIVE representation sure makes it look
like there can be only two...)

Mark/restore support seems useless ... note the comment on
ExecSupportsMarkRestore (which should be updated if this code
isn't removed).

RecursiveScan claims to support backwards fetch, but does not in fact
contain code to do so.  (Given that it will always be underneath
Recursion, which can't do backwards fetch, I see little point in adding
such code; fix execAmi.c instead.)

ExecInitRecursion doesn't seem to be on the same page about whether
it supports backward scan as execAmi.c, either.

This comment in nodeRecursivescan.c seems bogus:
/*
 * Do not initialize scan tuple type, result tuple type and
 * projection info in ExecInitRecursivescan. These types are
 * initialized after initializing Recursion node.
 */
because the code seems to be doing exactly what the comment says it
doesn't.

Numerous comments appear to have been copied-and-pasted and not modified
from the original.  Somebody will have to go over all that text.

ruleutils.c fails completely for non-recursive WITH.  It *must* regenerate
such a query with a WITH clause, not as a flattened subquery which is what
you seem to be doing.  This isn't negotiable because the semantics are
different.  This will mean at least some change in the parsetree
representation.  Perhaps we could add a bool to subquery RTEs to mark them
as coming from a nonrecursive WITH?  The tests added for RTE_RECURSIVE
seem a bit ugly too.  If it thinks that can't happen it should Assert so,
not just fall through silently.

commentary for ParseState.p_ctenamespace is gratuitously unlike the
comment style for the other fields, and p_recursive_namespace isn't
documented at all.

ParseState.p_in_with_clause is unused, should be removed.

The WithClause struct definition is poorly commented.  It should be
stated that it is used only pre-parse-analysis (assuming that continues
to be true after you get done fixing ruleutils.c...), and it doesn't
say what the elements of the subquery list are (specifically, what
node type).  A lot of the other added structs and fields could use
better commenting too.

For that matter subquery is a poor name for WithClause's list of CTEs,
especially so since it's hard to search for.  It should be a plural name
and I'd be inclined to use something like ctes not subqueries.
The term subquery is too overloaded already, so any place you can
refer to a WITH-list member as a CTE you should do so.

WithClause node may need a location field, and almost certainly has to
be handled somehow in exprLocation().

The error reports in parse_cte.c *desperately* need error locations.

Why does transformWithClause do parse_sub_analyze twice?
I'm not sure that's even safe, and it's surely unnecessary.
Also, what happens if a subquery isn't a SelectStmt?  Silently
doing nothing doesn't seem like a good plan there.

Why are we putting essentially the same information into both
p_recursive_namespace and p_ctenamespace?  Is there really a need
for both lists?  The code added to transformFromClauseItem
seems quite wrong since it searches both lists even if it found a
match in the first one.  This whole area looks like it needs
refactoring.

Costing is all bogus, 

Re: [HACKERS] Common Table Expressions (WITH RECURSIVE) patch

2008-09-16 Thread Tom Lane
Jeff Davis [EMAIL PROTECTED] writes:
 I think the right approach is to try to complete it so that everyone is
 happy. I will work on this, but unfortunately I don't have a lot of time
 right now, so I can't make any promises.

I think there are two significant bits there:

* Fixing the parsetree representation so that the distinction between
a CTE and an RTE that references the CTE is preserved.

* Implementing some kind of multiple readout tuplestore to permit
multiple RTEs to scan a CTE plan without causing any row to be
evaluated more than once.

The first of these seems relatively straightforward: the WITH clause
has to be preserved explicitly in the Query representation, and RTEs
for CTEs should just carry indexes into the WITH list, not copies of
the subqueries.  (Hm, we might need both an index and a levelsup
counter, to deal with nested queries...)  This would solve ruleutils'
problem, too.

I haven't thought much about the multiple readout tuplestore though.
Anyone have a clear idea how to do that?  In particular, can the
existing tuplestore code be enhanced to do it (without sacrificing
performance in the simple case), or do we need something new?

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Common Table Expressions (WITH RECURSIVE) patch

2008-09-16 Thread Greg Stark
I had either on that a while back. I think the existing tuplestore can  
be made to work without bad performance penaltiesbin the usual case.  
The trick was to do it without making the code unreadable.


I can send the code I had but I doubt you want to use it as-is.

--
greg


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Common Table Expressions (WITH RECURSIVE) patch

2008-09-16 Thread Tom Lane
Greg Stark [EMAIL PROTECTED] writes:
 I had either on that a while back. I think the existing tuplestore can  
 be made to work without bad performance penaltiesbin the usual case.  
 The trick was to do it without making the code unreadable.

 I can send the code I had but I doubt you want to use it as-is.

Sure.  Maybe someone else will see a way to make it more readable.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Common Table Expressions (WITH RECURSIVE) patch

2008-09-16 Thread Tatsuo Ishii
 Do we really have to make RECURSIVE a fully reserved keyword?

According to the standard, RECURSIVE is a reserved keyword, I believe.

 (Actually, the patch makes it worse than reserved, by failing to
 add it to the reserved_keywords list.)

Yes, it's a bug. Will fix...
--
Tatsuo Ishii
SRA OSS, Inc. Japan

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Common Table Expressions (WITH RECURSIVE) patch

2008-09-16 Thread Tom Lane
Tatsuo Ishii [EMAIL PROTECTED] writes:
 Do we really have to make RECURSIVE a fully reserved keyword?

 According to the standard, RECURSIVE is a reserved keyword, I believe.

Sure, but our general rule is to make keywords no more reserved than
is absolutely necessary to make the bison grammar unambiguous.  I
haven't tested, but I'm thinking that if WITH is fully reserved then
RECURSIVE shouldn't have to be.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Common Table Expressions (WITH RECURSIVE) patch

2008-09-15 Thread Tatsuo Ishii
  * Single Evaluation:
  
with
  foo(i) as (select random() as i)
select * from foo union all select * from foo;
 i
---
 0.233165248762816
  0.62126633618027
(2 rows)
  
The standard specifies that non-recursive WITH should be evaluated
once.
 
 What shall we do? I don't think there's an easy way to fix this as Tom
 suggested. Maybe we should not allow WITH clause without RECURISVE for
 8.4?

This is a still remaing issue...

  * Binary recursion and subselect strangeness:
  
with recursive foo(i) as
  (values (1)
  union all
  select * from
(select i+1 from foo where i  10
union all
select i+1 from foo where i  X) t)
select * from foo;
  
Produces 10 rows of output regardless of what X is. This should be
  fixed for 8.4.
Also, this is non-linear recursion, which the standard seems to
  disallow.
 
 I will try to fix this. However detecting the query being not a
 non-linear one is not so easy.

I have implemented rejection of non-linear recursion and now this type
of query will not be executed anyway.

  * Multiple recursive references:
  
with recursive foo(i) as
  (values (1)
  union all
  select i+1 from foo where i  10
  union all
  select i+1 from foo where i  20)
select * from foo;
ERROR:  Left hand side of UNION ALL must be a non-recursive term in a
  recursive query
  
If we're going to allow non-linear recursion (which the standard
does not), this seems like it should be a valid case.
 
 I will try to disallow this.

Non-linear recursion is not allowed now.

  * Strange result with except:
  
with recursive foo(i) as
  (values (1)
  union all
  select * from
  (select i+1 from foo where i  10
  except
  select i+1 from foo where i  5) t)
select * from foo;
ERROR:  table foo has 0 columns available but 1 columns specified
  
This query works if you replace except with union. This should be
  fixed for 8.4.
 
 I will try to fix this.

This is a non-linear recursion too and will not be executed anyway.

  * Aggregates allowed:
  
with recursive foo(i) as
  (values(1)
  union all
  select max(i)+1 from foo where i  10)
select * from foo;
  
Aggregates should be blocked according to the standard.
Also, causes an infinite loop. This should be fixed for 8.4.
 
 I will try to fix this.

Fixed.

  * DISTINCT should supress duplicates:
  
with recursive foo(i) as
  (select distinct * from (values(1),(2)) t
  union all
  select distinct i+1 from foo where i  10)
select * from foo;
  
This outputs a lot of duplicates, but they should be supressed
  according to the standard. This query is essentially the same as
  supporting UNION for recursive queries, so we should either fix both for
  8.4 or block both for consistency.
 
 I'm not sure if it's possible to fix this. Will look into.

Ok, now this type of DISTINCT is not allowed.

Included is the latest patches against CVS HEAD.
--
Tatsuo Ishii
SRA OSS, Inc. Japan


recursive_query.patch.gz
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Common Table Expressions (WITH RECURSIVE) patch

2008-09-10 Thread Jeff Davis
On Tue, 2008-09-09 at 09:47 -0400, Robert Haas wrote:
  3. I think this is a must fix because of the point about volatile
  functions --- changing it later will result in user-visible semantics
  changes, so we have to get it right the first time.
 
  I don't entirely agree with #3. It is user-visible, but only in the
  sense that someone is depending on undocumented multiple-evaluation
  behavior.
 
 What makes you think it's going to be undocumented?  Single versus
 multiple evaluation is a keep aspect of this feature and certainly
 needs to be documented one way or the other.  I can't understand why
 we would introduce a standard syntax with non-standard behavior, but
 if we do, it certainly had better be mentioned in the documentation.
 

I meant that -- hypothetically if we did accept the feature as-is -- the
number of evaluations would be documented to be undefined, not N. That
would avoid the backwards-compatibility problem.

This one point is probably not worth discussing now, because argument
#1 and #2 stand on their own.

Regards,
Jeff Davis


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Common Table Expressions (WITH RECURSIVE) patch

2008-09-10 Thread Robert Haas
 I meant that -- hypothetically if we did accept the feature as-is -- the
 number of evaluations would be documented to be undefined, not N. That
 would avoid the backwards-compatibility problem.

 This one point is probably not worth discussing now, because argument
 #1 and #2 stand on their own.

Agreed.  Plus, both Tom and Pavel seem to think this is a relatively
solvable problem.

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Common Table Expressions (WITH RECURSIVE) patch

2008-09-09 Thread Jeff Davis
On Tue, 2008-09-09 at 13:45 +0900, Tatsuo Ishii wrote:
 Thanks for the review.
 
The standard specifies that non-recursive WITH should be evaluated
once.
 
 What shall we do? I don't think there's a easy way to fix this. Maybe
 we should not allow WITH clause without RECURISVE?

My interpretation of 7.13: General Rules: 2.b is that it should be
single evaluation, even if RECURSIVE is present.

The previous discussion was here:

http://archives.postgresql.org/pgsql-hackers/2008-07/msg01292.php

The important arguments in the thread seemed to be:

1. People will generally expect single evaluation, so might be
disappointed if they can't use this feature for that purpose.

2. It's a spec violation in the case of volatile functions.

3. I think this is a must fix because of the point about volatile
functions --- changing it later will result in user-visible semantics
changes, so we have to get it right the first time.

I don't entirely agree with #3. It is user-visible, but only in the
sense that someone is depending on undocumented multiple-evaluation
behavior.

Tom Lane said that multiple evaluation is grounds for rejection:
http://archives.postgresql.org/pgsql-hackers/2008-07/msg01318.php

Is there hope of correcting this before November?

 I will try to fix this. However detecting the query being not a
 non-linear one is not so easy.

If we don't allow mutual recursion, the only kind of non-linear
recursion that might exist would be multiple references to the same
recursive query name in a recursive query, is that correct?

  * DISTINCT should supress duplicates:
  
with recursive foo(i) as
  (select distinct * from (values(1),(2)) t
  union all
  select distinct i+1 from foo where i  10)
select * from foo;
  
This outputs a lot of duplicates, but they should be supressed
  according to the standard. This query is essentially the same as
  supporting UNION for recursive queries, so we should either fix both for
  8.4 or block both for consistency.
 
 I'm not sure if it's possible to fix this. Will look into.
 

Can't we just reject queries with top-level DISTINCT, similar to how
UNION is rejected?

  * outer joins on a recursive reference should be blocked:
  
with recursive foo(i) as
  (values(1)
  union all
  select i+1 from foo left join (values(1)) t on (i=column1))
select * from foo;
  
Causes an infinite loop, but the standard says using an outer join
in this situation should be prohibited. This should be fixed for 8.4.
 
 Not an issue, I think.

Agreed, Andrew Gierth corrected me here.

Regards,
Jeff Davis


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Common Table Expressions (WITH RECURSIVE) patch

2008-09-09 Thread Robert Haas
 3. I think this is a must fix because of the point about volatile
 functions --- changing it later will result in user-visible semantics
 changes, so we have to get it right the first time.

 I don't entirely agree with #3. It is user-visible, but only in the
 sense that someone is depending on undocumented multiple-evaluation
 behavior.

What makes you think it's going to be undocumented?  Single versus
multiple evaluation is a keep aspect of this feature and certainly
needs to be documented one way or the other.  I can't understand why
we would introduce a standard syntax with non-standard behavior, but
if we do, it certainly had better be mentioned in the documentation.

I think that the most likely result of a CTE implementation that
doesn't guarantee single evaluation is that people simply won't use
it.  But anyone who does will expect that their queries will return
the same results in release N and release N+1, for all values of N.
The only way that an incompatible change of this type won't break
people's applications is if they're not using the feature in the first
place, in which case there is no point in committing it anyway.

I wonder if the whole approach to this patch is backward.  Instead of
worrying about how to implement WITH RECURSIVE, maybe it would be
better to implement a really solid, spec-compliant version of WITH,
and add the RECURSIVE functionality in a later patch/release.

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Common Table Expressions (WITH RECURSIVE) patch

2008-09-09 Thread Tatsuo Ishii
 On Tue, 2008-09-09 at 13:45 +0900, Tatsuo Ishii wrote:
  Thanks for the review.
  
 The standard specifies that non-recursive WITH should be evaluated
 once.
  
  What shall we do? I don't think there's a easy way to fix this. Maybe
  we should not allow WITH clause without RECURISVE?
 
 My interpretation of 7.13: General Rules: 2.b is that it should be
 single evaluation, even if RECURSIVE is present.
 
 The previous discussion was here:
 
 http://archives.postgresql.org/pgsql-hackers/2008-07/msg01292.php
 
 The important arguments in the thread seemed to be:
 
 1. People will generally expect single evaluation, so might be
 disappointed if they can't use this feature for that purpose.
 
 2. It's a spec violation in the case of volatile functions.
 
 3. I think this is a must fix because of the point about volatile
 functions --- changing it later will result in user-visible semantics
 changes, so we have to get it right the first time.
 
 I don't entirely agree with #3. It is user-visible, but only in the
 sense that someone is depending on undocumented multiple-evaluation
 behavior.
 
 Tom Lane said that multiple evaluation is grounds for rejection:
 http://archives.postgresql.org/pgsql-hackers/2008-07/msg01318.php
 
 Is there hope of correcting this before November?

According to Tom, to implement single evaluation we need to make big
infrastructure enhancement which is likely slip the schedule for 8.4
release which Tom does not want.

So as long as Tom and other people think that is a must fix, there
seems no hope probably.

Anyway I will continue to work on existing patches...
--
Tatsuo Ishii
SRA OSS, Inc. Japan

  I will try to fix this. However detecting the query being not a
  non-linear one is not so easy.
 
 If we don't allow mutual recursion, the only kind of non-linear
 recursion that might exist would be multiple references to the same
 recursive query name in a recursive query, is that correct?
 
   * DISTINCT should supress duplicates:
   
 with recursive foo(i) as
   (select distinct * from (values(1),(2)) t
   union all
   select distinct i+1 from foo where i  10)
 select * from foo;
   
 This outputs a lot of duplicates, but they should be supressed
   according to the standard. This query is essentially the same as
   supporting UNION for recursive queries, so we should either fix both for
   8.4 or block both for consistency.
  
  I'm not sure if it's possible to fix this. Will look into.
  
 
 Can't we just reject queries with top-level DISTINCT, similar to how
 UNION is rejected?
 
   * outer joins on a recursive reference should be blocked:
   
 with recursive foo(i) as
   (values(1)
   union all
   select i+1 from foo left join (values(1)) t on (i=column1))
 select * from foo;
   
 Causes an infinite loop, but the standard says using an outer join
 in this situation should be prohibited. This should be fixed for 8.4.
  
  Not an issue, I think.
 
 Agreed, Andrew Gierth corrected me here.
 
 Regards,
   Jeff Davis
 
 
 -- 
 Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-hackers

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Common Table Expressions (WITH RECURSIVE) patch

2008-09-09 Thread Pavel Stehule
Hello

2008/9/9 Tatsuo Ishii [EMAIL PROTECTED]:
 On Tue, 2008-09-09 at 13:45 +0900, Tatsuo Ishii wrote:
  Thanks for the review.
 
 The standard specifies that non-recursive WITH should be evaluated
 once.
 
  What shall we do? I don't think there's a easy way to fix this. Maybe
  we should not allow WITH clause without RECURISVE?

 My interpretation of 7.13: General Rules: 2.b is that it should be
 single evaluation, even if RECURSIVE is present.

 The previous discussion was here:

 http://archives.postgresql.org/pgsql-hackers/2008-07/msg01292.php

 The important arguments in the thread seemed to be:

 1. People will generally expect single evaluation, so might be
 disappointed if they can't use this feature for that purpose.

 2. It's a spec violation in the case of volatile functions.

 3. I think this is a must fix because of the point about volatile
 functions --- changing it later will result in user-visible semantics
 changes, so we have to get it right the first time.

 I don't entirely agree with #3. It is user-visible, but only in the
 sense that someone is depending on undocumented multiple-evaluation
 behavior.

 Tom Lane said that multiple evaluation is grounds for rejection:
 http://archives.postgresql.org/pgsql-hackers/2008-07/msg01318.php

 Is there hope of correcting this before November?

 According to Tom, to implement single evaluation we need to make big
 infrastructure enhancement which is likely slip the schedule for 8.4
 release which Tom does not want.

why? why don't use a materialisation?


 So as long as Tom and other people think that is a must fix, there
 seems no hope probably.

 Anyway I will continue to work on existing patches...
 --

I would to see your patch in core early. I am working on grouping sets
and I cannot finish my patch before your patch will be commited.

Regards
Pavel Stehule

 Tatsuo Ishii
 SRA OSS, Inc. Japan

  I will try to fix this. However detecting the query being not a
  non-linear one is not so easy.

 If we don't allow mutual recursion, the only kind of non-linear
 recursion that might exist would be multiple references to the same
 recursive query name in a recursive query, is that correct?

   * DISTINCT should supress duplicates:
  
 with recursive foo(i) as
   (select distinct * from (values(1),(2)) t
   union all
   select distinct i+1 from foo where i  10)
 select * from foo;
  
 This outputs a lot of duplicates, but they should be supressed
   according to the standard. This query is essentially the same as
   supporting UNION for recursive queries, so we should either fix both for
   8.4 or block both for consistency.
 
  I'm not sure if it's possible to fix this. Will look into.
 

 Can't we just reject queries with top-level DISTINCT, similar to how
 UNION is rejected?

   * outer joins on a recursive reference should be blocked:
  
 with recursive foo(i) as
   (values(1)
   union all
   select i+1 from foo left join (values(1)) t on (i=column1))
 select * from foo;
  
 Causes an infinite loop, but the standard says using an outer join
 in this situation should be prohibited. This should be fixed for 8.4.
 
  Not an issue, I think.

 Agreed, Andrew Gierth corrected me here.

 Regards,
   Jeff Davis


 --
 Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-hackers

 --
 Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-hackers


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Common Table Expressions (WITH RECURSIVE) patch

2008-09-09 Thread Tatsuo Ishii
 Hello
 
 2008/9/9 Tatsuo Ishii [EMAIL PROTECTED]:
  On Tue, 2008-09-09 at 13:45 +0900, Tatsuo Ishii wrote:
   Thanks for the review.
  
  The standard specifies that non-recursive WITH should be evaluated
  once.
  
   What shall we do? I don't think there's a easy way to fix this. Maybe
   we should not allow WITH clause without RECURISVE?
 
  My interpretation of 7.13: General Rules: 2.b is that it should be
  single evaluation, even if RECURSIVE is present.
 
  The previous discussion was here:
 
  http://archives.postgresql.org/pgsql-hackers/2008-07/msg01292.php
 
  The important arguments in the thread seemed to be:
 
  1. People will generally expect single evaluation, so might be
  disappointed if they can't use this feature for that purpose.
 
  2. It's a spec violation in the case of volatile functions.
 
  3. I think this is a must fix because of the point about volatile
  functions --- changing it later will result in user-visible semantics
  changes, so we have to get it right the first time.
 
  I don't entirely agree with #3. It is user-visible, but only in the
  sense that someone is depending on undocumented multiple-evaluation
  behavior.
 
  Tom Lane said that multiple evaluation is grounds for rejection:
  http://archives.postgresql.org/pgsql-hackers/2008-07/msg01318.php
 
  Is there hope of correcting this before November?
 
  According to Tom, to implement single evaluation we need to make big
  infrastructure enhancement which is likely slip the schedule for 8.4
  release which Tom does not want.
 
 why? why don't use a materialisation?

See:
http://archives.postgresql.org/pgsql-hackers/2008-07/msg01292.php

 
  So as long as Tom and other people think that is a must fix, there
  seems no hope probably.
 
  Anyway I will continue to work on existing patches...
  --
 
 I would to see your patch in core early. I am working on grouping sets
 and I cannot finish my patch before your patch will be commited.
 
 Regards
 Pavel Stehule
 
  Tatsuo Ishii
  SRA OSS, Inc. Japan
 
   I will try to fix this. However detecting the query being not a
   non-linear one is not so easy.
 
  If we don't allow mutual recursion, the only kind of non-linear
  recursion that might exist would be multiple references to the same
  recursive query name in a recursive query, is that correct?
 
* DISTINCT should supress duplicates:
   
  with recursive foo(i) as
(select distinct * from (values(1),(2)) t
union all
select distinct i+1 from foo where i  10)
  select * from foo;
   
  This outputs a lot of duplicates, but they should be supressed
according to the standard. This query is essentially the same as
supporting UNION for recursive queries, so we should either fix both 
for
8.4 or block both for consistency.
  
   I'm not sure if it's possible to fix this. Will look into.
  
 
  Can't we just reject queries with top-level DISTINCT, similar to how
  UNION is rejected?
 
* outer joins on a recursive reference should be blocked:
   
  with recursive foo(i) as
(values(1)
union all
select i+1 from foo left join (values(1)) t on (i=column1))
  select * from foo;
   
  Causes an infinite loop, but the standard says using an outer join
  in this situation should be prohibited. This should be fixed for 8.4.
  
   Not an issue, I think.
 
  Agreed, Andrew Gierth corrected me here.
 
  Regards,
Jeff Davis
 
 
  --
  Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
  To make changes to your subscription:
  http://www.postgresql.org/mailpref/pgsql-hackers
 
  --
  Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
  To make changes to your subscription:
  http://www.postgresql.org/mailpref/pgsql-hackers
 

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Common Table Expressions (WITH RECURSIVE) patch

2008-09-09 Thread Tatsuo Ishii
 Hello
 
 2008/9/9 Tatsuo Ishii [EMAIL PROTECTED]:
  On Tue, 2008-09-09 at 13:45 +0900, Tatsuo Ishii wrote:
   Thanks for the review.
  
  The standard specifies that non-recursive WITH should be evaluated
  once.
  
   What shall we do? I don't think there's a easy way to fix this. Maybe
   we should not allow WITH clause without RECURISVE?
 
  My interpretation of 7.13: General Rules: 2.b is that it should be
  single evaluation, even if RECURSIVE is present.
 
  The previous discussion was here:
 
  http://archives.postgresql.org/pgsql-hackers/2008-07/msg01292.php
 
  The important arguments in the thread seemed to be:
 
  1. People will generally expect single evaluation, so might be
  disappointed if they can't use this feature for that purpose.
 
  2. It's a spec violation in the case of volatile functions.
 
  3. I think this is a must fix because of the point about volatile
  functions --- changing it later will result in user-visible semantics
  changes, so we have to get it right the first time.
 
  I don't entirely agree with #3. It is user-visible, but only in the
  sense that someone is depending on undocumented multiple-evaluation
  behavior.
 
  Tom Lane said that multiple evaluation is grounds for rejection:
  http://archives.postgresql.org/pgsql-hackers/2008-07/msg01318.php
 
  Is there hope of correcting this before November?
 
  According to Tom, to implement single evaluation we need to make big
  infrastructure enhancement which is likely slip the schedule for 8.4
  release which Tom does not want.
 
 why? why don't use a materialisation?

See:
http://archives.postgresql.org/pgsql-hackers/2008-07/msg01292.php

  So as long as Tom and other people think that is a must fix, there
  seems no hope probably.
 
  Anyway I will continue to work on existing patches...
  --
 
 I would to see your patch in core early. I am working on grouping sets
 and I cannot finish my patch before your patch will be commited.
 
 Regards
 Pavel Stehule
 
  Tatsuo Ishii
  SRA OSS, Inc. Japan
 
   I will try to fix this. However detecting the query being not a
   non-linear one is not so easy.
 
  If we don't allow mutual recursion, the only kind of non-linear
  recursion that might exist would be multiple references to the same
  recursive query name in a recursive query, is that correct?
 
* DISTINCT should supress duplicates:
   
  with recursive foo(i) as
(select distinct * from (values(1),(2)) t
union all
select distinct i+1 from foo where i  10)
  select * from foo;
   
  This outputs a lot of duplicates, but they should be supressed
according to the standard. This query is essentially the same as
supporting UNION for recursive queries, so we should either fix both 
for
8.4 or block both for consistency.
  
   I'm not sure if it's possible to fix this. Will look into.
  
 
  Can't we just reject queries with top-level DISTINCT, similar to how
  UNION is rejected?
 
* outer joins on a recursive reference should be blocked:
   
  with recursive foo(i) as
(values(1)
union all
select i+1 from foo left join (values(1)) t on (i=column1))
  select * from foo;
   
  Causes an infinite loop, but the standard says using an outer join
  in this situation should be prohibited. This should be fixed for 8.4.
  
   Not an issue, I think.
 
  Agreed, Andrew Gierth corrected me here.
 
  Regards,
Jeff Davis
 
 
  --
  Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
  To make changes to your subscription:
  http://www.postgresql.org/mailpref/pgsql-hackers
 
  --
  Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
  To make changes to your subscription:
  http://www.postgresql.org/mailpref/pgsql-hackers
 

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Common Table Expressions (WITH RECURSIVE) patch

2008-09-09 Thread Tatsuo Ishii
  * Aggregates allowed:
  
with recursive foo(i) as
  (values(1)
  union all
  select max(i)+1 from foo where i  10)
select * from foo;
  
Aggregates should be blocked according to the standard.
Also, causes an infinite loop. This should be fixed for 8.4.
 
 I will try to fix this.

We already reject:

select max(i) from foo where i  10)

But max(i)+1 seems to slip the check. I looked into this I found the
patch tried to detect the case before analyzing(see
parser/parse_cte.c) which is not a right thing I think.

I think we could detect the case by adding more checking in
parseCheckAggregates():

/*
 * Check if there's aggregate function in a recursive term.
 */
foreach(l, qry-rtable)
{
RangeTblEntry *rte = (RangeTblEntry *) lfirst(l);

if (qry-hasAggs  rte-rtekind == RTE_RECURSIVE 
rte-self_reference)
{
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
 errmsg(aggregate functions in a 
recursive term not allowed)));
}
}

What do you think?
--
Tatsuo Ishii
SRA OSS, Inc. Japan

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Common Table Expressions (WITH RECURSIVE) patch

2008-09-09 Thread Pavel Stehule
2008/9/9 Tatsuo Ishii [EMAIL PROTECTED]:
 Hello

 2008/9/9 Tatsuo Ishii [EMAIL PROTECTED]:
  On Tue, 2008-09-09 at 13:45 +0900, Tatsuo Ishii wrote:
   Thanks for the review.
  
  The standard specifies that non-recursive WITH should be evaluated
  once.
  
   What shall we do? I don't think there's a easy way to fix this. Maybe
   we should not allow WITH clause without RECURISVE?
 
  My interpretation of 7.13: General Rules: 2.b is that it should be
  single evaluation, even if RECURSIVE is present.
 
  The previous discussion was here:
 
  http://archives.postgresql.org/pgsql-hackers/2008-07/msg01292.php
 

I am  blind, I didn't find any reason, why materialisation isn't useable.

Regards
Pavel

  The important arguments in the thread seemed to be:
 
  1. People will generally expect single evaluation, so might be
  disappointed if they can't use this feature for that purpose.
 
  2. It's a spec violation in the case of volatile functions.
 
  3. I think this is a must fix because of the point about volatile
  functions --- changing it later will result in user-visible semantics
  changes, so we have to get it right the first time.
 
  I don't entirely agree with #3. It is user-visible, but only in the
  sense that someone is depending on undocumented multiple-evaluation
  behavior.
 
  Tom Lane said that multiple evaluation is grounds for rejection:
  http://archives.postgresql.org/pgsql-hackers/2008-07/msg01318.php
 
  Is there hope of correcting this before November?
 
  According to Tom, to implement single evaluation we need to make big
  infrastructure enhancement which is likely slip the schedule for 8.4
  release which Tom does not want.

 why? why don't use a materialisation?

 See:
 http://archives.postgresql.org/pgsql-hackers/2008-07/msg01292.php

 
  So as long as Tom and other people think that is a must fix, there
  seems no hope probably.
 
  Anyway I will continue to work on existing patches...
  --

 I would to see your patch in core early. I am working on grouping sets
 and I cannot finish my patch before your patch will be commited.

 Regards
 Pavel Stehule

  Tatsuo Ishii
  SRA OSS, Inc. Japan
 
   I will try to fix this. However detecting the query being not a
   non-linear one is not so easy.
 
  If we don't allow mutual recursion, the only kind of non-linear
  recursion that might exist would be multiple references to the same
  recursive query name in a recursive query, is that correct?
 
* DISTINCT should supress duplicates:
   
  with recursive foo(i) as
(select distinct * from (values(1),(2)) t
union all
select distinct i+1 from foo where i  10)
  select * from foo;
   
  This outputs a lot of duplicates, but they should be supressed
according to the standard. This query is essentially the same as
supporting UNION for recursive queries, so we should either fix both 
for
8.4 or block both for consistency.
  
   I'm not sure if it's possible to fix this. Will look into.
  
 
  Can't we just reject queries with top-level DISTINCT, similar to how
  UNION is rejected?
 
* outer joins on a recursive reference should be blocked:
   
  with recursive foo(i) as
(values(1)
union all
select i+1 from foo left join (values(1)) t on (i=column1))
  select * from foo;
   
  Causes an infinite loop, but the standard says using an outer join
  in this situation should be prohibited. This should be fixed for 
8.4.
  
   Not an issue, I think.
 
  Agreed, Andrew Gierth corrected me here.
 
  Regards,
Jeff Davis
 
 
  --
  Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
  To make changes to your subscription:
  http://www.postgresql.org/mailpref/pgsql-hackers
 
  --
  Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
  To make changes to your subscription:
  http://www.postgresql.org/mailpref/pgsql-hackers
 


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Common Table Expressions (WITH RECURSIVE) patch

2008-09-09 Thread Robert Haas
  My interpretation of 7.13: General Rules: 2.b is that it should be
  single evaluation, even if RECURSIVE is present.
 
  The previous discussion was here:
 
  http://archives.postgresql.org/pgsql-hackers/2008-07/msg01292.php

 I am  blind, I didn't find any reason, why materialisation isn't useable.

I believe it's because of these two (closely related) problems:

# The basic
# implementation clearly ought to be to dump the result of the subquery
# into a tuplestore and then have the upper level read out from that.
# However, we don't have any infrastructure for having multiple
# upper-level RTEs reference the same tuplestore.  (Perhaps the InitPlan
# infrastructure could be enhanced in that direction, but it's not ready
# for non-scalar outputs today.)  Also, I think we'd have to teach
# tuplestore how to support multiple readout cursors.  For example,
# consider
#   WITH foo AS (SELECT ...) SELECT ... FROM foo a, foo b WHERE ...
# If the planner chooses to do the join as a nested loop then each
# Scan node needs to keep track of its own place in the tuplestore,
# concurrently with the other node having a different place.

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Common Table Expressions (WITH RECURSIVE) patch

2008-09-09 Thread Pavel Stehule
2008/9/9 Robert Haas [EMAIL PROTECTED]:
  My interpretation of 7.13: General Rules: 2.b is that it should be
  single evaluation, even if RECURSIVE is present.
 
  The previous discussion was here:
 
  http://archives.postgresql.org/pgsql-hackers/2008-07/msg01292.php

 I am  blind, I didn't find any reason, why materialisation isn't useable.

 I believe it's because of these two (closely related) problems:

 # The basic
 # implementation clearly ought to be to dump the result of the subquery
 # into a tuplestore and then have the upper level read out from that.
 # However, we don't have any infrastructure for having multiple
 # upper-level RTEs reference the same tuplestore.  (Perhaps the InitPlan
 # infrastructure could be enhanced in that direction, but it's not ready
 # for non-scalar outputs today.)  Also, I think we'd have to teach
 # tuplestore how to support multiple readout cursors.  For example,
 # consider
 #   WITH foo AS (SELECT ...) SELECT ... FROM foo a, foo b WHERE ...
 # If the planner chooses to do the join as a nested loop then each
 # Scan node needs to keep track of its own place in the tuplestore,
 # concurrently with the other node having a different place.


hmm. I solve similar problem in grouping sets :( etc

SELECT ... FROM ... GROUP BY GROUPING SETS (a,b)

is almost same as

With foo AS (SELECT ... FROM) SELECT ... FROM foo GROUP BY a UNION ALL
SELECT ... FROM foo GROUP BY b;

Regards
Pavel





 ...Robert


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Common Table Expressions (WITH RECURSIVE) patch

2008-09-09 Thread Jeff Davis
On Tue, 2008-09-09 at 18:51 +0200, Pavel Stehule wrote:
 hmm. I solve similar problem in grouping sets :( etc
 

How did you solve it?

Regards,
Jeff Davis


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Common Table Expressions (WITH RECURSIVE) patch

2008-09-09 Thread Pavel Stehule
2008/9/9 Jeff Davis [EMAIL PROTECTED]:
 On Tue, 2008-09-09 at 18:51 +0200, Pavel Stehule wrote:
 hmm. I solve similar problem in grouping sets :( etc


I have special executor node - feeder, it hold one tuple and others
nodes read from this node. It's usable for hash aggregates.

Pavel

plan is like:

grouping sets
-- seq scan ...
-- hash aggg
-- feeder
-- hash agg
   -- feeder




 How did you solve it?

 Regards,
Jeff Davis



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Common Table Expressions (WITH RECURSIVE) patch

2008-09-09 Thread Tom Lane
Robert Haas [EMAIL PROTECTED] writes:
 I am  blind, I didn't find any reason, why materialisation isn't useable.

 I believe it's because of these two (closely related) problems:

 # The basic
 # implementation clearly ought to be to dump the result of the subquery
 # into a tuplestore and then have the upper level read out from that.
 # However, we don't have any infrastructure for having multiple
 # upper-level RTEs reference the same tuplestore.  (Perhaps the InitPlan
 # infrastructure could be enhanced in that direction, but it's not ready
 # for non-scalar outputs today.)  Also, I think we'd have to teach
 # tuplestore how to support multiple readout cursors.  For example,
 # consider
 # WITH foo AS (SELECT ...) SELECT ... FROM foo a, foo b WHERE ...
 # If the planner chooses to do the join as a nested loop then each
 # Scan node needs to keep track of its own place in the tuplestore,
 # concurrently with the other node having a different place.

The amount of new code needed for that seems a pittance compared to the
size of the patch already, so I'm not seeing why Tatsuo-san considers
it infeasible.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Common Table Expressions (WITH RECURSIVE) patch

2008-09-08 Thread Jeff Davis

These are my initial comments about the Common Table Expressions (CTE)
patch, also known as WITH [RECURSIVE]. These comments are based on the
patch here:

http://archives.postgresql.org/pgsql-patches/2008-08/msg00021.php

This is a semantically complex feature, and the standard is fairly
complex as well. So I'm approaching this by making my own
interpretations from the standard first (I included my interpretations
and section references at the end of this email) and comparing to the
behavior of the patch.

The following examples may be inconsistent with the standard. Some
have already been mentioned, and I don't think they all need to be
fixed for 8.4, but I mention them here for completeness.

* Mutual Recursion:

  with recursive
foo(i) as (values(1) union all select i+1 from bar where i  10),
bar(i) as (values(1) union all select i+1 from foo where i  10)
  select * from foo;
  ERROR:  mutual recursive call is not supported

  The standard allows mutual recursion.

* Single Evaluation:

  with
foo(i) as (select random() as i)
  select * from foo union all select * from foo;
   i
  ---
   0.233165248762816
0.62126633618027
  (2 rows)

  The standard specifies that non-recursive WITH should be evaluated
  once.

* RHS only:

  with recursive
foo(i) as (select i+1 from foo where i  10 union all values(1))
  select * from foo;
  ERROR:  Left hand side of UNION ALL must be a non-recursive term in a
recursive query

  The standard does not require that the recursive term be on the RHS.

* UNION ALL only:

  with recursive
foo(i) as (values(1) union select i+1 from foo where i  10)
  select * from foo;
  ERROR:  non-recursive term and recursive term must be combined with
UNION ALL

  The standard seems to allow UNION ALL, UNION, INTERSECT, and EXCEPT
  (when the recursive term is not on the RHS of the EXCEPT).

* Binary recursion and subselect strangeness:

  with recursive foo(i) as
(values (1)
union all
select * from
  (select i+1 from foo where i  10
  union all
  select i+1 from foo where i  X) t)
  select * from foo;

  Produces 10 rows of output regardless of what X is. This should be
fixed for 8.4.
  Also, this is non-linear recursion, which the standard seems to
disallow.

* Multiple recursive references:

  with recursive foo(i) as
(values (1)
union all
select i+1 from foo where i  10
union all
select i+1 from foo where i  20)
  select * from foo;
  ERROR:  Left hand side of UNION ALL must be a non-recursive term in a
recursive query

  If we're going to allow non-linear recursion (which the standard
  does not), this seems like it should be a valid case.

* Strange result with except:

  with recursive foo(i) as
(values (1)
union all
select * from
(select i+1 from foo where i  10
except
select i+1 from foo where i  5) t)
  select * from foo;
  ERROR:  table foo has 0 columns available but 1 columns specified

  This query works if you replace except with union. This should be
fixed for 8.4.

* Aggregates allowed:

  with recursive foo(i) as
(values(1)
union all
select max(i)+1 from foo where i  10)
  select * from foo;

  Aggregates should be blocked according to the standard.
  Also, causes an infinite loop. This should be fixed for 8.4.

* DISTINCT should supress duplicates:

  with recursive foo(i) as
(select distinct * from (values(1),(2)) t
union all
select distinct i+1 from foo where i  10)
  select * from foo;

  This outputs a lot of duplicates, but they should be supressed
according to the standard. This query is essentially the same as
supporting UNION for recursive queries, so we should either fix both for
8.4 or block both for consistency.

* outer joins on a recursive reference should be blocked:

  with recursive foo(i) as
(values(1)
union all
select i+1 from foo left join (values(1)) t on (i=column1))
  select * from foo;

  Causes an infinite loop, but the standard says using an outer join
  in this situation should be prohibited. This should be fixed for 8.4.

* ORDER BY, LIMIT, and OFFSET are rejected for recursive queries. The
standard does not seem to say that these should be rejected.


The following are my interpretations of relevant parts of the SQL
standard (200n), and the associated sections. These are only my
interpretation, so let me know if you interpret the standard
differently.

Non-linear recursion forbidden:
  7.13: Syntax Rules: 2.g.ii
My interpretation of 2.g.ii.2 is that WQN[k] and WQN[l] may be the
same query name.
  7.13: Syntax Rules: 2.g.iv

EXCEPT can't be used for recursive queries if a recursive reference
appears on the RHS:
  7.13: Syntax Rules: 2.g.iii.1

INTERSECT ALL/EXCEPT ALL can't be used for recursive queries:
  7.13: Syntax Rules: 2.g.iii.5

recursive references must appear in FROM clause:
  7.13: Syntax Rules: 2.g.iii.3
My interpretation of this rule is that it does not allow a

Re: [HACKERS] Common Table Expressions (WITH RECURSIVE) patch

2008-09-08 Thread Andrew Gierth
 Jeff == Jeff Davis [EMAIL PROTECTED] writes:

 Jeff * Mutual Recursion:

This limitation isn't at all uncommon in other implementations; DB2
docs for example say:

If more than one common table expression is defined in the same
statement, cyclic references between the common table expressions are
not permitted. A cyclic reference occurs when two common table
expressions dt1 and dt2 are created such that dt1 refers to dt2 and
dt2 refers to dt1.

http://publib.boulder.ibm.com/infocenter/iadthelp/v7r0/index.jsp?topic=/com.ibm.etools.iseries.langref2.doc/rbafzmst295.htm

MSSQL's documentation is less clear, but it also appears not to allow
mutual recursion (not allowing forward references to CTEs).

A CTE can reference itself and previously defined CTEs in the same
WITH clause. Forward referencing is not allowed.

http://msdn.microsoft.com/en-us/library/ms175972.aspx

 Jeff * RHS only:

 Jeff   with recursive
 Jeff foo(i) as (select i+1 from foo where i  10 union all values(1))
 Jeff   select * from foo;
 Jeff   ERROR:  Left hand side of UNION ALL must be a non-recursive term in a
 Jeff recursive query

 Jeff   The standard does not require that the recursive term be on
 Jeff   the RHS.

Again, the standard may not, but existing implementations do:

MSSQL:

The recursive CTE definition must contain at least two CTE query
definitions, an anchor member and a recursive member. Multiple anchor
members and recursive members can be defined; however, all anchor
member query definitions must be put before the first recursive member
definition. All CTE query definitions are anchor members unless they
reference the CTE itself. 

DB2:

The following restrictions apply to a recursive common-table-expression:
* A list of column-names must be specified following the
  table-identifier.
* The UNION ALL set operator must be specified.
* The first fullselect of the first union (the initialization
  fullselect) must not include a reference to the
  common-table-expression itself in any FROM clause.


 Jeff * UNION ALL only:

 Jeff   with recursive
 Jeff foo(i) as (values(1) union select i+1 from foo where i  10)
 Jeff   select * from foo;
 Jeff   ERROR:  non-recursive term and recursive term must be combined with
 Jeff UNION ALL

 Jeff   The standard seems to allow UNION ALL, UNION, INTERSECT, and
 Jeff   EXCEPT (when the recursive term is not on the RHS of the
 Jeff   EXCEPT).

Again, existing implementations disagree. See above for DB2, and for
MSSQL:

Anchor members must be combined by one of these set operators: UNION
ALL, UNION, INTERSECT, or EXCEPT. UNION ALL is the only set operator
allowed between the last anchor member and first recursive member, and
when combining multiple recursive members.

 Jeff * Binary recursion and subselect strangeness:

 Jeff   with recursive foo(i) as
 Jeff (values (1)
 Jeff union all
 Jeff select * from
 Jeff   (select i+1 from foo where i  10
 Jeff   union all
 Jeff   select i+1 from foo where i  X) t)
 Jeff   select * from foo;

 Jeff   Produces 10 rows of output regardless of what X is. This
 Jeff should be fixed for 8.4.  Also, this is non-linear recursion,
 Jeff which the standard seems to disallow.

That looks like it should be disallowed somehow.

 Jeff * Multiple recursive references:
 [snip]

 Jeff   If we're going to allow non-linear recursion (which the
 Jeff   standard does not), this seems like it should be a valid
 Jeff   case.

We probably shouldn't allow it (as above).

[snip * Strange result with except: which looks like a bug]

 Jeff * Aggregates allowed: which

 Jeff   with recursive foo(i) as
 Jeff (values(1)
 Jeff union all
 Jeff select max(i)+1 from foo where i  10)
 Jeff   select * from foo;

 Jeff   Aggregates should be blocked according to the standard.
 Jeff   Also, causes an infinite loop. This should be fixed for 8.4.

Does the standard require anywhere that non-conforming statements must
be diagnosed? (seems impractical, since it would forbid extensions)

 Jeff * DISTINCT should supress duplicates:

 Jeff   with recursive foo(i) as
 Jeff (select distinct * from (values(1),(2)) t
 Jeff union all
 Jeff select distinct i+1 from foo where i  10)
 Jeff   select * from foo;

 Jeff   This outputs a lot of duplicates, but they should be
 Jeff supressed according to the standard. This query is essentially
 Jeff the same as supporting UNION for recursive queries, so we
 Jeff should either fix both for 8.4 or block both for consistency.

Yeah, though the standard's use of DISTINCT in this way is something
of a violation of the POLA.

 Jeff * outer joins on a recursive reference should be blocked:

 Jeff   with recursive foo(i) as
 Jeff (values(1)
 Jeff union all
 Jeff select i+1 from foo left join (values(1)) t on (i=column1))
 Jeff   select * from foo;

 Jeff   Causes an infinite loop, but the standard says using an outer
 Jeff   join in this situation should be prohibited. This 

Re: [HACKERS] Common Table Expressions (WITH RECURSIVE) patch

2008-09-08 Thread Gregory Stark

Jeff Davis [EMAIL PROTECTED] writes:

 * Mutual Recursion:

   with recursive
 foo(i) as (values(1) union all select i+1 from bar where i  10),
 bar(i) as (values(1) union all select i+1 from foo where i  10)
   select * from foo;
   ERROR:  mutual recursive call is not supported

   The standard allows mutual recursion.

This seems to be a point of confusion. I originally read the standard and
concluded that mutual recursion was an optional feature. Itagaki-san showed me
a copy of the spec where it seemed there was a clear blanket prohibition on
mutually recursive queries and in fact anything but simple linearly expandable
queries. I wonder if there are different versions of the spec floating around
on this point.

Take a second look at your spec and read on to where it defines linear and
expandable. If it doesn't define those terms then it's definitely different
from what I read. If it does, read on to see what it does with them. The main
reason to define them appeared to be to use them to say that supporting mutual
recursion is not required.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's PostGIS support!

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Common Table Expressions (WITH RECURSIVE) patch

2008-09-08 Thread Jeff Davis
On Mon, 2008-09-08 at 21:13 +0100, Gregory Stark wrote:
 Jeff Davis [EMAIL PROTECTED] writes:
 
  * Mutual Recursion:
 
with recursive
  foo(i) as (values(1) union all select i+1 from bar where i  10),
  bar(i) as (values(1) union all select i+1 from foo where i  10)
select * from foo;
ERROR:  mutual recursive call is not supported
 
The standard allows mutual recursion.
 
 This seems to be a point of confusion. I originally read the standard and
 concluded that mutual recursion was an optional feature. Itagaki-san showed me
 a copy of the spec where it seemed there was a clear blanket prohibition on
 mutually recursive queries and in fact anything but simple linearly expandable
 queries. I wonder if there are different versions of the spec floating around
 on this point.
 
 Take a second look at your spec and read on to where it defines linear and
 expandable. If it doesn't define those terms then it's definitely different
 from what I read. If it does, read on to see what it does with them. The main
 reason to define them appeared to be to use them to say that supporting mutual
 recursion is not required.

I think we're reading the same version of the spec. I'm reading 200n.

My interpretation (Syntax Rule 2.h) is that expandable is only used to
determine whether it can contain a search or cycle.

That being said, I don't think it should be a requirement for 8.4. The
CTE patch is an important feature, and we shouldn't hold it up over
something comparatively obscure like mutual recursion. As Andrew Gierth
cited, other systems don't support it anyway. It should be kept in mind
for future work though.

Regards,
Jeff Davis


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Common Table Expressions (WITH RECURSIVE) patch

2008-09-08 Thread Jeff Davis
On Mon, 2008-09-08 at 18:08 +0100, Andrew Gierth wrote:
  Jeff * Mutual Recursion:
 
 This limitation isn't at all uncommon in other implementations; DB2
 docs for example say:

As with some other things in my list, this doesn't need to be supported
in 8.4. I just wanted to lay out my interpretation of the standard, and
places that we might (currently) fall short of it.

The fact that other DBMSs don't support mutual recursion is a good
indication that it's not important immediately.

  Jeff   The standard does not require that the recursive term be on
  Jeff   the RHS.
 
 Again, the standard may not, but existing implementations do:
 

Again, I don't think we need this for 8.4.

However, I think it's probably more important than mutual recursion.

  Jeff * UNION ALL only:
 
  Jeff   with recursive
  Jeff foo(i) as (values(1) union select i+1 from foo where i  10)
  Jeff   select * from foo;
  Jeff   ERROR:  non-recursive term and recursive term must be combined with
  Jeff UNION ALL
 
  Jeff   The standard seems to allow UNION ALL, UNION, INTERSECT, and
  Jeff   EXCEPT (when the recursive term is not on the RHS of the
  Jeff   EXCEPT).
 
 Again, existing implementations disagree. See above for DB2, and for
 MSSQL:
 

And again, I agree that it's not important for 8.4.

At some point we need to determine what the goalposts are though. Are we
copying existing implementations, or are we implementing the standard?

  Jeff   Produces 10 rows of output regardless of what X is. This
  Jeff should be fixed for 8.4.  Also, this is non-linear recursion,
  Jeff which the standard seems to disallow.
 
 That looks like it should be disallowed somehow.

Agreed. I think it should just throw an error, probably.

 [snip * Strange result with except: which looks like a bug]
 
  Jeff * Aggregates allowed: which
 
  Jeff   with recursive foo(i) as
  Jeff (values(1)
  Jeff union all
  Jeff select max(i)+1 from foo where i  10)
  Jeff   select * from foo;
 
  Jeff   Aggregates should be blocked according to the standard.
  Jeff   Also, causes an infinite loop. This should be fixed for 8.4.
 
 Does the standard require anywhere that non-conforming statements must
 be diagnosed? (seems impractical, since it would forbid extensions)
 

2.g.iii.4.B explicitly says aggregates should be rejected, unless I have
misinterpreted.

  
 Yeah, though the standard's use of DISTINCT in this way is something
 of a violation of the POLA.
 

I agree that's kind of a funny requirement. But that's pretty typical of
the SQL standard. If DB2 or SQL Server follow the standard here, we
should, too. If not, it's open for discussion.

 No. This has already been discussed; it's neither possible nor desirable
 to diagnose all cases which can result in infinite loops, and there are
 important types of queries which would be unnecessarily forbidden.

I didn't say we should forbid all infinite loops. But we should forbid
ones that the standard tells us to forbid.

 Besides, you've misread the spec here: it prohibits the recursive
 reference ONLY on the nullable side of the join. You cite:
 

Thank you for the correction. It does properly reject the outer joins
that the standard says should be rejected.

  Jeff * ORDER BY, LIMIT, and OFFSET are rejected for recursive
  Jeff queries. The standard does not seem to say that these should be
  Jeff rejected.
 
 Note that supporting those in subqueries (including CTEs) is a separate
 optional feature of the standard.
 

I don't feel strongly about this either way, but I prefer that we are
consistent when possible. We do support these things in a subquery, so
shouldn't we support them in all subqueries?

Regards,
Jeff Davis



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Common Table Expressions (WITH RECURSIVE) patch

2008-09-08 Thread Andrew Gierth
 Jeff == Jeff Davis [EMAIL PROTECTED] writes:

 Jeff Aggregates should be blocked according to the standard.
 Jeff Also, causes an infinite loop. This should be fixed for 8.4.

  Does the standard require anywhere that non-conforming statements
  must be diagnosed? (seems impractical, since it would forbid
  extensions)

 Jeff 2.g.iii.4.B explicitly says aggregates should be rejected,
 Jeff unless I have misinterpreted.

Yes, you've misinterpreted. When the spec says that a query shall
not do such-and-such, it means that a conforming client isn't allowed
to do that, it does NOT mean that the server is required to produce an
error when it sees it.

Chapter and verse on this is given in the Framework doc, at 6.3.3.2:

  In the Syntax Rules, the term shall defines conditions that are
  required to be true of syntactically conforming SQL language. When such
  conditions depend on the contents of one or more schemas, they are
  required to be true just before the actions specified by the General
  Rules are performed. The treatment of language that does not conform to
  the SQL Formats and Syntax Rules is implementation-dependent.  If any
  condition required by Syntax Rules is not satisfied when the evaluation
  of Access or General Rules is attempted and the implementation is
  neither processing non-conforming SQL language nor processing
  conforming SQL language in a non-conforming manner, then an exception
  condition is raised: syntax error or access rule violation.

Including an aggregate violates a shall in a syntax rule, therefore the
query is non-conforming, therefore the server can either process it in an
implementation-dependent manner or reject it with an exception.

  Yeah, though the standard's use of DISTINCT in this way is something
  of a violation of the POLA.

 Jeff I agree that's kind of a funny requirement. But that's pretty
 Jeff typical of the SQL standard. If DB2 or SQL Server follow the
 Jeff standard here, we should, too. If not, it's open for discussion.

MSSQL does not:

The following items are not allowed in the CTE_query_definition of a
 recursive member:

* SELECT DISTINCT
* GROUP BY
* HAVING
* Scalar aggregation
* TOP
* LEFT, RIGHT, OUTER JOIN (INNER JOIN is allowed)
* Subqueries
* A hint applied to a recursive reference to a CTE inside a
  CTE_query_definition.


For DB2 the docs do not appear to specify either way; they don't seem to
forbid the use of SELECT DISTINCT inside a recursive CTE, but neither do
they seem to mention any unusual effect of including it.

 Jeff * ORDER BY, LIMIT, and OFFSET are rejected for recursive
 Jeff queries. The standard does not seem to say that these should be
 Jeff rejected.
 
  Note that supporting those in subqueries (including CTEs) is a
  separate optional feature of the standard.

 Jeff I don't feel strongly about this either way, but I prefer that we
 Jeff are consistent when possible. We do support these things in a
 Jeff subquery, so shouldn't we support them in all subqueries?

Ideally we should. DB2 appears to (other than OFFSET which it doesn't
seem to have at all). But it's not at all clear that it would be either
useful or easy to do so.

-- 
Andrew.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Common Table Expressions (WITH RECURSIVE) patch

2008-09-08 Thread Jeff Davis
On Mon, 2008-09-08 at 22:53 +0100, Andrew Gierth wrote:
 Yes, you've misinterpreted. When the spec says that a query shall
 not do such-and-such, it means that a conforming client isn't allowed
 to do that, it does NOT mean that the server is required to produce an
 error when it sees it.
 

Interesting. Thanks for the clear reference.

So, we either need to reject it or define some implementation-dependent
behavior, right?

The patch currently rejects HAVING, which means that it's a little
difficult to use an aggregate effectively. I lean toward rejecting
aggregates if we reject HAVING, for consistency. If we allow it, we
should allow HAVING as well.

  Jeff I agree that's kind of a funny requirement. But that's pretty
  Jeff typical of the SQL standard. If DB2 or SQL Server follow the
  Jeff standard here, we should, too. If not, it's open for discussion.
 
 MSSQL does not:

Thanks again for a reference.

If MSSQL rejects SELECT DISTINCT for a recursive query expression,
then I think we should reject it. Right now the patch allows it but
provides a result that is inconsistent with the standard.

If we reject SELECT DISTINCT for recursive queries now, we can always
meet the standard later, or decide that the standard behavior is too
likely to cause confusion and just continue to block it.

 Ideally we should. DB2 appears to (other than OFFSET which it doesn't
 seem to have at all). But it's not at all clear that it would be either
 useful or easy to do so.

Ok. If it's easy to support it should probably be done.

As an aside, you seem to be an expert with the standard, have you had a
chance to look at the question I asked earlier?:

http://archives.postgresql.org/pgsql-hackers/2008-09/msg00487.php

Regards,
Jeff Davis


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Common Table Expressions (WITH RECURSIVE) patch

2008-09-08 Thread Tatsuo Ishii
Thanks for the review.

[I dropped [EMAIL PROTECTED] from the Cc list since he has left our
company and the email address is being deleted.]

I'm going to look into issues which are seem to be bug (of course if
you know what to fix, patches are always welcome:-).

 These are my initial comments about the Common Table Expressions (CTE)
 patch, also known as WITH [RECURSIVE]. These comments are based on the
 patch here:
 
 http://archives.postgresql.org/pgsql-patches/2008-08/msg00021.php
 
 This is a semantically complex feature, and the standard is fairly
 complex as well. So I'm approaching this by making my own
 interpretations from the standard first (I included my interpretations
 and section references at the end of this email) and comparing to the
 behavior of the patch.
 
 The following examples may be inconsistent with the standard. Some
 have already been mentioned, and I don't think they all need to be
 fixed for 8.4, but I mention them here for completeness.
 
 * Mutual Recursion:
 
   with recursive
 foo(i) as (values(1) union all select i+1 from bar where i  10),
 bar(i) as (values(1) union all select i+1 from foo where i  10)
   select * from foo;
   ERROR:  mutual recursive call is not supported
 
   The standard allows mutual recursion.

The discussion seems to agree that let it leave for post 8.4.

 * Single Evaluation:
 
   with
 foo(i) as (select random() as i)
   select * from foo union all select * from foo;
i
   ---
0.233165248762816
 0.62126633618027
   (2 rows)
 
   The standard specifies that non-recursive WITH should be evaluated
   once.

What shall we do? I don't think there's a easy way to fix this. Maybe
we should not allow WITH clause without RECURISVE?

 * RHS only:
 
   with recursive
 foo(i) as (select i+1 from foo where i  10 union all values(1))
   select * from foo;
   ERROR:  Left hand side of UNION ALL must be a non-recursive term in a
 recursive query
 
   The standard does not require that the recursive term be on the RHS.

The discussion seems to agree that let it leave for post 8.4.

 * UNION ALL only:
 
   with recursive
 foo(i) as (values(1) union select i+1 from foo where i  10)
   select * from foo;
   ERROR:  non-recursive term and recursive term must be combined with
 UNION ALL
 
   The standard seems to allow UNION ALL, UNION, INTERSECT, and EXCEPT
   (when the recursive term is not on the RHS of the EXCEPT).

The discussion seems to agree that let it leave for post 8.4.

 * Binary recursion and subselect strangeness:
 
   with recursive foo(i) as
 (values (1)
 union all
 select * from
   (select i+1 from foo where i  10
   union all
   select i+1 from foo where i  X) t)
   select * from foo;
 
   Produces 10 rows of output regardless of what X is. This should be
 fixed for 8.4.
   Also, this is non-linear recursion, which the standard seems to
 disallow.

I will try to fix this. However detecting the query being not a
non-linear one is not so easy.

 * Multiple recursive references:
 
   with recursive foo(i) as
 (values (1)
 union all
 select i+1 from foo where i  10
 union all
 select i+1 from foo where i  20)
   select * from foo;
   ERROR:  Left hand side of UNION ALL must be a non-recursive term in a
 recursive query
 
   If we're going to allow non-linear recursion (which the standard
   does not), this seems like it should be a valid case.

I will try to disallow this.

 * Strange result with except:
 
   with recursive foo(i) as
 (values (1)
 union all
 select * from
 (select i+1 from foo where i  10
 except
 select i+1 from foo where i  5) t)
   select * from foo;
   ERROR:  table foo has 0 columns available but 1 columns specified
 
   This query works if you replace except with union. This should be
 fixed for 8.4.

I will try to fix this.

 * Aggregates allowed:
 
   with recursive foo(i) as
 (values(1)
 union all
 select max(i)+1 from foo where i  10)
   select * from foo;
 
   Aggregates should be blocked according to the standard.
   Also, causes an infinite loop. This should be fixed for 8.4.

I will try to fix this.

 * DISTINCT should supress duplicates:
 
   with recursive foo(i) as
 (select distinct * from (values(1),(2)) t
 union all
 select distinct i+1 from foo where i  10)
   select * from foo;
 
   This outputs a lot of duplicates, but they should be supressed
 according to the standard. This query is essentially the same as
 supporting UNION for recursive queries, so we should either fix both for
 8.4 or block both for consistency.

I'm not sure if it's possible to fix this. Will look into.

 * outer joins on a recursive reference should be blocked:
 
   with recursive foo(i) as
 (values(1)
 union all
 select i+1 from foo left join (values(1)) t on (i=column1))
   select * from foo;
 
   Causes an infinite loop, but the standard says using an outer join
   in this situation 

Re: [HACKERS] Common Table Expressions (WITH RECURSIVE) patch

2008-09-08 Thread Tatsuo Ishii
Thanks for the review.

[I dropped [EMAIL PROTECTED] from the Cc list since he has left our
company and the email address is being deleted.]

I'm going to look into issues which are seem to be bug (of course if
you know what to fix, patches are always welcome:-).

 These are my initial comments about the Common Table Expressions (CTE)
 patch, also known as WITH [RECURSIVE]. These comments are based on the
 patch here:
 
 http://archives.postgresql.org/pgsql-patches/2008-08/msg00021.php
 
 This is a semantically complex feature, and the standard is fairly
 complex as well. So I'm approaching this by making my own
 interpretations from the standard first (I included my interpretations
 and section references at the end of this email) and comparing to the
 behavior of the patch.
 
 The following examples may be inconsistent with the standard. Some
 have already been mentioned, and I don't think they all need to be
 fixed for 8.4, but I mention them here for completeness.
 
 * Mutual Recursion:
 
   with recursive
 foo(i) as (values(1) union all select i+1 from bar where i  10),
 bar(i) as (values(1) union all select i+1 from foo where i  10)
   select * from foo;
   ERROR:  mutual recursive call is not supported
 
   The standard allows mutual recursion.

The discussion seems to agree that let it leave for post 8.4.

 * Single Evaluation:
 
   with
 foo(i) as (select random() as i)
   select * from foo union all select * from foo;
i
   ---
0.233165248762816
 0.62126633618027
   (2 rows)
 
   The standard specifies that non-recursive WITH should be evaluated
   once.

What shall we do? I don't think there's an easy way to fix this as Tom
suggested. Maybe we should not allow WITH clause without RECURISVE for
8.4?

 * RHS only:
 
   with recursive
 foo(i) as (select i+1 from foo where i  10 union all values(1))
   select * from foo;
   ERROR:  Left hand side of UNION ALL must be a non-recursive term in a
 recursive query
 
   The standard does not require that the recursive term be on the RHS.

The discussion seems to agree that let it leave for post 8.4.

 * UNION ALL only:
 
   with recursive
 foo(i) as (values(1) union select i+1 from foo where i  10)
   select * from foo;
   ERROR:  non-recursive term and recursive term must be combined with
 UNION ALL
 
   The standard seems to allow UNION ALL, UNION, INTERSECT, and EXCEPT
   (when the recursive term is not on the RHS of the EXCEPT).

The discussion seems to agree that let it leave for post 8.4.

 * Binary recursion and subselect strangeness:
 
   with recursive foo(i) as
 (values (1)
 union all
 select * from
   (select i+1 from foo where i  10
   union all
   select i+1 from foo where i  X) t)
   select * from foo;
 
   Produces 10 rows of output regardless of what X is. This should be
 fixed for 8.4.
   Also, this is non-linear recursion, which the standard seems to
 disallow.

I will try to fix this. However detecting the query being not a
non-linear one is not so easy.

 * Multiple recursive references:
 
   with recursive foo(i) as
 (values (1)
 union all
 select i+1 from foo where i  10
 union all
 select i+1 from foo where i  20)
   select * from foo;
   ERROR:  Left hand side of UNION ALL must be a non-recursive term in a
 recursive query
 
   If we're going to allow non-linear recursion (which the standard
   does not), this seems like it should be a valid case.

I will try to disallow this.

 * Strange result with except:
 
   with recursive foo(i) as
 (values (1)
 union all
 select * from
 (select i+1 from foo where i  10
 except
 select i+1 from foo where i  5) t)
   select * from foo;
   ERROR:  table foo has 0 columns available but 1 columns specified
 
   This query works if you replace except with union. This should be
 fixed for 8.4.

I will try to fix this.

 * Aggregates allowed:
 
   with recursive foo(i) as
 (values(1)
 union all
 select max(i)+1 from foo where i  10)
   select * from foo;
 
   Aggregates should be blocked according to the standard.
   Also, causes an infinite loop. This should be fixed for 8.4.

I will try to fix this.

 * DISTINCT should supress duplicates:
 
   with recursive foo(i) as
 (select distinct * from (values(1),(2)) t
 union all
 select distinct i+1 from foo where i  10)
   select * from foo;
 
   This outputs a lot of duplicates, but they should be supressed
 according to the standard. This query is essentially the same as
 supporting UNION for recursive queries, so we should either fix both for
 8.4 or block both for consistency.

I'm not sure if it's possible to fix this. Will look into.

 * outer joins on a recursive reference should be blocked:
 
   with recursive foo(i) as
 (values(1)
 union all
 select i+1 from foo left join (values(1)) t on (i=column1))
   select * from foo;
 
   Causes an infinite loop, but the standard says using an outer 

Re: [HACKERS] Common Table Expressions (WITH RECURSIVE) patch

2008-09-08 Thread Robert Haas
 * Single Evaluation:

   with
 foo(i) as (select random() as i)
   select * from foo union all select * from foo;
i
   ---
0.233165248762816
 0.62126633618027
   (2 rows)

   The standard specifies that non-recursive WITH should be evaluated
   once.

 What shall we do? I don't think there's a easy way to fix this. Maybe
 we should not allow WITH clause without RECURISVE?

ISTM that kind of misses the point.  Even if it's WITH RECURSIVE
rather than simply WITH, one wouldn't expect multiple evaluations of
any non-recursive portion of the query.

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers