Re: [HACKERS] SQL standard question about Common Table Expressions

Andrew Gierth Mon, 08 Sep 2008 18:03:06 -0700

>>>>> "Jeff" == Jeff Davis <[EMAIL PROTECTED]> writes:


 Jeff> I am looking into the SQL standard to try to determine
 Jeff> precisely how the CTE feature should behave.

 Jeff> Taking a simple case like:

 Jeff>   with recursive
 Jeff>     foo(i) as
 Jeff>       (values(1)
 Jeff>       union all
 Jeff>       select i+1 from foo where i < 5)
 Jeff>   select * from foo;

 Jeff> And looking at the SQL standard 200n 7.13: General Rules: 2.c,
 Jeff> it provides an algorithm for evaluating the recursive query.

 Jeff> In this algorithm, AQEk is a <query expression>. Syntactically,
 Jeff> I only see two <query expression>s, and one is the entire
 Jeff> query. The other is: "(values(1) union all select i+1 from foo
 Jeff> where i < 5)", so I'll assume that AQEk must be equal to that*.

The "contains" language in the spec is tricky. And I think there's some
issue here with the spec confusing <query expression> and <query
expression body>; some of the language refers to "If a <query expression>
AQEk not marked as recursive is immediately contained in a <query
expression body>" which is not possible as the syntax production for
<query expression body> does not contain <query expression>.

Working through the spec, we have here only one WLE and therefore only
one partition. So WLE1 is the only element, and WQN1 is "foo" and
WQE1 is "(values(1) union all select i+1 from foo where i < 5)". The
way I suspect it's _intended_ to work is like this (following the order
of steps in the spec):

AQE1 = "values(1) union all select i+1 from foo where i < 5"
AQE2 = "values (1)"
AQE3 = "select i+1 from foo where i < 5"

AQE1 and AQE3 are recursive but AQE2 is not.

AQE1 is associated with WQE1 and therefore RT1 and WT1 are the result
and working tables for WQN1.

AQE2 is not marked as recursive and is contained in a recursive union,
therefore it is marked iteration-ignorable.

None of the AQEn suppress duplicates.

Let RT2 and WT2 be the result of AQE2, so both now contain (1)

AQE2 now becomes "TABLE RTN2" (and hence AQE1/WQE1 are now "TABLE RTN2
union all select i+1 from foo where i < 5"

Evaluate WQE1:

  RT1 and WT1 become (1)
  RT2/WT2 unchanged
  RT3 and WT3 remain empty

AQE2 is iteration ignorable, so RT2 is emptied.

WT1 and WT2 are nonempty, so:

associate "foo" with WT1

evaluate WQE1:

  WT1 becomes (2)
  WT2 is empty
  WT3 becomes (2)

  RT1 becomes (RT1 UNION ALL WT1) i.e. (1),(2)
  RT3 becomes (RT3 UNION ALL WT3) i.e. (1),(2)

WT1 and WT2 are nonempty, so:

associate "foo" with WT1

evaluate WQE1:

  WT1 becomes (3)
  WT2 is empty
  WT3 becomes (3)

  RT1 becomes (RT1 UNION ALL WT1) i.e. (1),(2),(3)
  RT3 becomes (RT3 UNION ALL WT3) i.e. (1),(2),(3)

etc.

[The main differences between this and the logic used by the existing
patch are the fact that the spec's version is keeping around many more
working and result tables in order to do the duplicate-suppression logic,
which requires knowing not just whether you've already produced a given
row, but also _which branch of the query_ produced it. Within the
constraints of the patch, it's not possible to trigger the duplicate
elimination, and therefore RT2/WT2 and RT3/WT3 become redundant; also all
non-recursive clauses are iteration-ignorable.]

-- 
Andrew (irc:RhodiumToad)

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] SQL standard question about Common Table Expressions

Reply via email to