Re: SEARCH and CYCLE clauses

2021-02-26 Thread Peter Eisentraut

On 22.02.21 14:45, Vik Fearing wrote:

On 2/22/21 1:28 PM, Peter Eisentraut wrote:

On 22.02.21 11:05, Vik Fearing wrote:

This looks good to me, except that you forgot to add the feature stamp.
   Attached is a small diff to apply on top of your patch to fix that.


The feature code is from SQL:202x, whereas the table is relative to
SQL:2016.  We could add it, but probably with a comment.


OK.



done




Re: SEARCH and CYCLE clauses

2021-02-22 Thread Vik Fearing
On 2/22/21 1:28 PM, Peter Eisentraut wrote:
> On 22.02.21 11:05, Vik Fearing wrote:
>> This looks good to me, except that you forgot to add the feature stamp.
>>   Attached is a small diff to apply on top of your patch to fix that.
> 
> The feature code is from SQL:202x, whereas the table is relative to
> SQL:2016.  We could add it, but probably with a comment.

OK.
-- 
Vik Fearing




Re: SEARCH and CYCLE clauses

2021-02-22 Thread Peter Eisentraut

On 22.02.21 11:05, Vik Fearing wrote:

This looks good to me, except that you forgot to add the feature stamp.
  Attached is a small diff to apply on top of your patch to fix that.


The feature code is from SQL:202x, whereas the table is relative to 
SQL:2016.  We could add it, but probably with a comment.





Re: SEARCH and CYCLE clauses

2021-02-22 Thread Vik Fearing
On 2/22/21 9:44 AM, Peter Eisentraut wrote:
> On 22.05.20 14:32, Peter Eisentraut wrote:
>>> As an improvement over the spec, I think the vast majority of people
>>> will be using simple true/false values.  Can we make that optional?
>>>
>>>  CYCLE f, t SET is_cycle USING path
>>>
>>> would be the same as
>>>
>>>  CYCLE f, t SET is_cycle TO true DEFAULT false USING path
>>
>> I was also considering that.  It would be an easy change to make.
> 
> This change has been accepted into the SQL:202x draft.

Yay!

> Here is a patch for it.

This looks good to me, except that you forgot to add the feature stamp.
 Attached is a small diff to apply on top of your patch to fix that.
-- 
Vik Fearing
diff --git a/src/backend/catalog/sql_features.txt b/src/backend/catalog/sql_features.txt
index a24387c1e7..1ab5f685b9 100644
--- a/src/backend/catalog/sql_features.txt
+++ b/src/backend/catalog/sql_features.txt
@@ -425,6 +425,7 @@ T121	WITH (excluding RECURSIVE) in query expression			YES
 T122	WITH (excluding RECURSIVE) in subquery			YES	
 T131	Recursive query			YES	
 T132	Recursive query in subquery			YES	
+T133	Enhanced cycle mark values			YES	
 T141	SIMILAR predicate			YES	
 T151	DISTINCT predicate			YES	
 T152	DISTINCT predicate with negation			YES	


Re: SEARCH and CYCLE clauses

2021-02-22 Thread Peter Eisentraut

On 22.05.20 14:32, Peter Eisentraut wrote:

As an improvement over the spec, I think the vast majority of people
will be using simple true/false values.  Can we make that optional?

 CYCLE f, t SET is_cycle USING path

would be the same as

 CYCLE f, t SET is_cycle TO true DEFAULT false USING path


I was also considering that.  It would be an easy change to make.


This change has been accepted into the SQL:202x draft.  Here is a patch 
for it.


From 55c9050b4ec413e9b1b9eb9296196fc9c1957c8c Mon Sep 17 00:00:00 2001
From: Peter Eisentraut 
Date: Mon, 22 Feb 2021 09:36:49 +0100
Subject: [PATCH] Enhanced cycle mark values

Per SQL:202x draft, in the CYCLE clause of a recursive query, the
cycle mark values can be of type boolean and can be omitted, in which
case they default to TRUE and FALSE.
---
 doc/src/sgml/queries.sgml  |   5 +-
 doc/src/sgml/ref/select.sgml   |   8 +-
 src/backend/parser/gram.y  |  11 +++
 src/backend/utils/adt/ruleutils.c  |  19 -
 src/test/regress/expected/with.out | 117 ++---
 src/test/regress/sql/with.sql  |  30 +---
 6 files changed, 143 insertions(+), 47 deletions(-)

diff --git a/doc/src/sgml/queries.sgml b/doc/src/sgml/queries.sgml
index 4741506eb5..bc0b3cc9fe 100644
--- a/doc/src/sgml/queries.sgml
+++ b/doc/src/sgml/queries.sgml
@@ -2349,14 +2349,13 @@ Cycle Detection
 SELECT g.id, g.link, g.data, sg.depth + 1
 FROM graph g, search_graph sg
 WHERE g.id = sg.link
-) CYCLE id SET is_cycle TO true DEFAULT false USING path
+) CYCLE id SET is_cycle USING path
 SELECT * FROM search_graph;
 
and it will be internally rewritten to the above form.  The
CYCLE clause specifies first the list of columns to
track for cycle detection, then a column name that will show whether a
-   cycle has been detected, then two values to use in that column for the yes
-   and no cases, and finally the name of another column that will track the
+   cycle has been detected, and finally the name of another column that will 
track the
path.  The cycle and path columns will implicitly be added to the output
rows of the CTE.
   
diff --git a/doc/src/sgml/ref/select.sgml b/doc/src/sgml/ref/select.sgml
index eb8b524951..ab91105599 100644
--- a/doc/src/sgml/ref/select.sgml
+++ b/doc/src/sgml/ref/select.sgml
@@ -74,7 +74,7 @@
 
 with_query_name [ ( 
column_name [, ...] ) ] AS [ [ NOT 
] MATERIALIZED ] ( select | 
values | insert | update | delete )
 [ SEARCH { BREADTH | DEPTH } FIRST BY 
column_name [, ...] SET 
search_seq_col_name ]
-[ CYCLE column_name [, ...] SET 
cycle_mark_col_name TO 
cycle_mark_value DEFAULT 
cycle_mark_default USING 
cycle_path_col_name ]
+[ CYCLE column_name [, ...] SET 
cycle_mark_col_name [ TO 
cycle_mark_value DEFAULT 
cycle_mark_default ] USING 
cycle_path_col_name ]
 
 TABLE [ ONLY ] table_name [ * ]
 
@@ -302,8 +302,10 @@ WITH Clause
 been detected.  cycle_mark_value and
 cycle_mark_default must be constants and they
 must be coercible to a common data type, and the data type must have an
-inequality operator.  (The SQL standard requires that they be character
-strings, but PostgreSQL does not require that.)  Furthermore, a column
+inequality operator.  (The SQL standard requires that they be Boolean
+constants or character strings, but PostgreSQL does not require that.)  By
+default, TRUE and FALSE (of type
+boolean) are used.  Furthermore, a column
 named cycle_path_col_name will be added to the
 result column list of the WITH query.  This column is
 used internally for tracking visited rows.  See location = @1;
$$ = (Node *) n;
}
+   | CYCLE columnList SET ColId USING ColId
+   {
+   CTECycleClause *n = makeNode(CTECycleClause);
+   n->cycle_col_list = $2;
+   n->cycle_mark_column = $4;
+   n->cycle_mark_value = makeBoolAConst(true, -1);
+   n->cycle_mark_default = makeBoolAConst(false, 
-1);
+   n->cycle_path_column = $6;
+   n->location = @1;
+   $$ = (Node *) n;
+   }
| /*EMPTY*/
{
$$ = NULL;
diff --git a/src/backend/utils/adt/ruleutils.c 
b/src/backend/utils/adt/ruleutils.c
index 4a9244f4f6..879288c139 100644
--- a/src/backend/utils/adt/ruleutils.c
+++ b/src/backend/utils/adt/ruleutils.c
@@ -5208,10 +5208,21 @@ get_with_clause(Query *query, deparse_context *context)
}
 
appendStringInfo(buf, " SET %s", 
quote_identifier(cte->cycle_clause->cycle_mark_column));
-   appendStringInfoString(buf, " TO ");
-   

Re: SEARCH and CYCLE clauses

2021-02-01 Thread Pavel Stehule
po 1. 2. 2021 v 19:02 odesílatel Peter Eisentraut <
peter.eisentr...@2ndquadrant.com> napsal:

> On 2020-11-25 20:35, Pavel Stehule wrote:
> > I checked this patch, and I didn't find any issue.
> >
> > make check-world passed
> > make doc passed
> >
> > I'll mark it as ready for committer
>
> This has been committed.  Thanks.
>

great!

Pavel


> --
> Peter Eisentraut
> 2ndQuadrant, an EDB company
> https://www.2ndquadrant.com/
>


Re: SEARCH and CYCLE clauses

2021-02-01 Thread Peter Eisentraut

On 2020-11-25 20:35, Pavel Stehule wrote:

I checked this patch, and I didn't find any issue.

make check-world passed
make doc passed

I'll mark it as ready for committer


This has been committed.  Thanks.

--
Peter Eisentraut
2ndQuadrant, an EDB company
https://www.2ndquadrant.com/




Re: SEARCH and CYCLE clauses

2020-12-08 Thread Vik Fearing
On 6/15/20 11:49 AM, Peter Eisentraut wrote:
> To fix the star expansion I had to add a little bit of infrastructure
> that could also be used as a more general facility "don't include this
> column in star expansion", so this could perhaps use some consideration
> from a more general angle as well.

Could this work be salvaged to add the ability to ALTER a column to hide
it from star expansion?  That's a feature I've often seen requested,
especially from people working with PostGIS's geometry.

Totally off-topic for this thread, though.
-- 
Vik Fearing




Re: SEARCH and CYCLE clauses

2020-12-08 Thread Vik Fearing
On 5/22/20 12:40 PM, Vik Fearing wrote:
>>> 2)
>>> This query is an infinite loop, as expected:
>>>
>>>    with recursive a as (select 1 as b union all select b from a)
>>>    table a;
>>>
>>> But it becomes an error when you add a cycle clause to it:
>>>
>>>    with recursive a as (select 1 as b union all table a)
>>>  cycle b set c to true default false using p
>>>    table a;
>>>
>>>    ERROR:  each UNION query must have the same number of columns
>> table a expands to select * from a, and if you have a cycle clause, then
>> a has three columns, but the other branch of the union only has one, so
>> that won't work anymore, will it?
> It seems there was a copy/paste error here.  The first query should have
> been the same as the second but without the cycle clause.
> 
> It seems strange to me that adding a  would
> break a previously working query.  I would rather see the * expanded
> before adding the new columns.  This is a user's opinion, I don't know
> how hard that would be to implement.


After thinking about it quite a bit more, I have changed my mind on
this.  The transformation does add columns to the 
and so TABLE or SELECT * should see them.  Especially since they see
them from outside of the wle.
-- 
Vik Fearing




Re: SEARCH and CYCLE clauses

2020-11-25 Thread Pavel Stehule
st 25. 11. 2020 v 14:06 odesílatel Peter Eisentraut <
peter.eisentr...@2ndquadrant.com> napsal:

> On 2020-10-10 07:25, Pavel Stehule wrote:
> > This patch is based on transformation CYCLE and SEARCH clauses to
> > specific expressions - it is in agreement with ANSI SQL
> >
> > There is not a problem with compilation
> > Nobody had objections in discussion
> > There are enough regress tests and documentation
> > check-world passed
> > doc build passed
> >
> > I'll mark this patch as ready for committer
> >
> > Possible enhancing for this feature (can be done in next steps)
> >
> > 1. support UNION DISTINCT
> > 2. better compatibility with Oracle and DB2 (USING clause can be
> optional)
>
> Here is an updated patch.  New since last time:
>
> - UNION DISTINCT is now supported (since hash_record() was added)
>
> - Some code has been cleaned up.
>
> - Some code has been moved from the rewriter to the parser so that
> certain errors are properly detected at parse time.
>
> - Added more syntax checks and more tests.
>
> - Support for dependency tracking was added (the type and operator for
> the cycle mark need to be added as dependencies).
>
> I found a bug that nested UNIONs (foo UNION bar UNION baz) were not
> handled (would crash) in the rewriter code.  For now, I have just
> changed that to error out.  This could be fixed, it would be a localized
> change in the rewriter code in any case.  Doesn't seem important for the
> first pass, though.
>

I checked this patch, and I didn't find any issue.

make check-world passed
make doc passed

I'll mark it as ready for committer

Regards

Pavel



> --
> Peter Eisentraut
> 2ndQuadrant, an EDB company
> https://www.2ndquadrant.com/
>


Re: SEARCH and CYCLE clauses

2020-11-25 Thread Peter Eisentraut

On 2020-10-10 07:25, Pavel Stehule wrote:
This patch is based on transformation CYCLE and SEARCH clauses to 
specific expressions - it is in agreement with ANSI SQL


There is not a problem with compilation
Nobody had objections in discussion
There are enough regress tests and documentation
check-world passed
doc build passed

I'll mark this patch as ready for committer

Possible enhancing for this feature (can be done in next steps)

1. support UNION DISTINCT
2. better compatibility with Oracle and DB2 (USING clause can be optional)


Here is an updated patch.  New since last time:

- UNION DISTINCT is now supported (since hash_record() was added)

- Some code has been cleaned up.

- Some code has been moved from the rewriter to the parser so that 
certain errors are properly detected at parse time.


- Added more syntax checks and more tests.

- Support for dependency tracking was added (the type and operator for 
the cycle mark need to be added as dependencies).


I found a bug that nested UNIONs (foo UNION bar UNION baz) were not 
handled (would crash) in the rewriter code.  For now, I have just 
changed that to error out.  This could be fixed, it would be a localized 
change in the rewriter code in any case.  Doesn't seem important for the 
first pass, though.


--
Peter Eisentraut
2ndQuadrant, an EDB company
https://www.2ndquadrant.com/
From e69e5cc77a226d646ad0fcd7bc2dd73be9b1abda Mon Sep 17 00:00:00 2001
From: Peter Eisentraut 
Date: Wed, 25 Nov 2020 14:03:45 +0100
Subject: [PATCH v4] SEARCH and CYCLE clauses

Discussion: 
https://www.postgresql.org/message-id/flat/db80ceee-6f97-9b4a-8ee8-3ba0c58e5...@2ndquadrant.com
---
 doc/src/sgml/queries.sgml|  64 ++-
 doc/src/sgml/ref/select.sgml |  44 ++
 src/backend/catalog/dependency.c |  15 +
 src/backend/nodes/copyfuncs.c|  40 ++
 src/backend/nodes/equalfuncs.c   |  36 ++
 src/backend/nodes/nodeFuncs.c|  42 +-
 src/backend/nodes/outfuncs.c |  36 ++
 src/backend/nodes/readfuncs.c|  44 ++
 src/backend/parser/analyze.c |  47 +-
 src/backend/parser/gram.y|  58 +-
 src/backend/parser/parse_agg.c   |   7 +
 src/backend/parser/parse_cte.c   | 193 +++
 src/backend/parser/parse_expr.c  |   4 +
 src/backend/parser/parse_func.c  |   3 +
 src/backend/parser/parse_relation.c  |  54 +-
 src/backend/parser/parse_target.c|  17 +-
 src/backend/rewrite/Makefile |   1 +
 src/backend/rewrite/rewriteHandler.c |  18 +
 src/backend/rewrite/rewriteSearchCycle.c | 668 +++
 src/backend/utils/adt/ruleutils.c|  47 ++
 src/include/nodes/nodes.h|   2 +
 src/include/nodes/parsenodes.h   |  30 +-
 src/include/parser/analyze.h |   2 +
 src/include/parser/kwlist.h  |   2 +
 src/include/parser/parse_node.h  |   2 +
 src/include/rewrite/rewriteSearchCycle.h |  21 +
 src/test/regress/expected/with.out   | 558 +++
 src/test/regress/sql/with.sql| 279 ++
 28 files changed, 2301 insertions(+), 33 deletions(-)
 create mode 100644 src/backend/rewrite/rewriteSearchCycle.c
 create mode 100644 src/include/rewrite/rewriteSearchCycle.h

diff --git a/doc/src/sgml/queries.sgml b/doc/src/sgml/queries.sgml
index ca51204875..4741506eb5 100644
--- a/doc/src/sgml/queries.sgml
+++ b/doc/src/sgml/queries.sgml
@@ -2218,6 +2218,39 @@ Search Order
  in any case.
 

+
+   
+There is built-in syntax to compute a depth- or breadth-first sort column.
+For example:
+
+
+WITH RECURSIVE search_tree(id, link, data) AS (
+SELECT t.id, t.link, t.data
+FROM tree t
+  UNION ALL
+SELECT t.id, t.link, t.data
+FROM tree t, search_tree st
+WHERE t.id = st.link
+) SEARCH DEPTH FIRST BY id SET ordercol
+SELECT * FROM search_tree ORDER BY ordercol;
+
+WITH RECURSIVE search_tree(id, link, data) AS (
+SELECT t.id, t.link, t.data
+FROM tree t
+  UNION ALL
+SELECT t.id, t.link, t.data
+FROM tree t, search_tree st
+WHERE t.id = st.link
+) SEARCH BREADTH FIRST BY id SET ordercol
+SELECT * FROM search_tree ORDER BY ordercol;
+
+This syntax is internally expanded to something similar to the above
+hand-written forms.  The SEARCH clause specifies whether
+depth- or breadth first search is wanted, the list of columns to track for
+sorting, and a column name that will contain the result data that can be
+used for sorting.  That column will implicitly be added to the output rows
+of the CTE.
+   
   
 
   
@@ -2305,10 +2338,39 @@ Cycle Detection

   
 
+  
+   There is built-in syntax to simplify cycle detection.  The above query can
+   also be written like this:
+
+WITH RECURSIVE search_graph(id, link, data, depth) AS (
+SELECT g.id, g.link, g.data, 1
+FROM graph g
+  UNION ALL
+SELECT g.id, g.link, g.data, sg.depth + 1
+FROM graph g

Re: SEARCH and CYCLE clauses

2020-10-27 Thread Anastasia Lubennikova

On 10.10.2020 08:25, Pavel Stehule wrote:

Hi

pá 9. 10. 2020 v 12:17 odesílatel Pavel Stehule 
mailto:pavel.steh...@gmail.com>> napsal:




pá 9. 10. 2020 v 11:40 odesílatel Peter Eisentraut
mailto:peter.eisentr...@2ndquadrant.com>> napsal:

On 2020-09-22 20:29, Pavel Stehule wrote:
> The result is correct. When I tried to use UNION instead
UNION ALL, the
> pg crash

I fixed the crash, but UNION [DISTINCT] won't actually work
here because
row/record types are not hashable.  I'm leaving the partial
support in,
but I'm documenting it as currently not supported.

 I think so UNION is a common solution against the cycles. So
missing support for this specific case is not a nice thing. How
much work is needed for hashing rows. It should not be too much code.


> looks so clause USING in cycle detection is unsupported for
DB2 and
> Oracle - the examples from these databases doesn't work on
PG without
> modifications

Yeah, the path clause is actually not necessary from a user's
perspective, but it's required for internal bookkeeping.  We
could
perhaps come up with a mechanism to make it invisible coming
out of the
CTE (maybe give the CTE a target list internally), but that
seems like a
separate project.

The attached patch fixes the issues you have reported (also
the view
issue from the other email).  I have also moved the whole rewrite
support to a new file to not blow up rewriteHandler.c so much.

-- 
Peter Eisentraut http://www.2ndQuadrant.com/

PostgreSQL Development, 24x7 Support, Remote DBA, Training &
Services


This patch is based on transformation CYCLE and SEARCH clauses to 
specific expressions - it is in agreement with ANSI SQL


There is not a problem with compilation
Nobody had objections in discussion
There are enough regress tests and documentation
check-world passed
doc build passed

I'll mark this patch as ready for committer

Possible enhancing for this feature (can be done in next steps)

1. support UNION DISTINCT
2. better compatibility with Oracle and DB2 (USING clause can be optional)

Regards

Pavel





Status update for a commitfest entry.
According to cfbot patch no longer applies. So I moved it to waiting on 
author.


--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



Re: SEARCH and CYCLE clauses

2020-10-12 Thread Robert Haas
On Fri, May 22, 2020 at 5:25 AM Peter Eisentraut
 wrote:
> This is something we need to think about.  If we want to check at parse
> time whether the two values are not the same (and perhaps not null),
> then we either need to restrict the generality of what we can specify,
> or we need to be prepared to do full expression evaluation in the
> parser.  A simple and practical way might be to only allow string and
> boolean literal.  I don't have a strong opinion here.

I don't have an opinion on this feature, but I think doing expression
evaluation in the raw parser would be a pretty bad idea. I think we're
not supposed to do anything in the parser that involves catalog access
or even references to GUC values. It might be OK if it happens in
parse analysis rather than gram.y, but even that sounds a bit sketchy
to me. We'd need to think carefully about what effects such things
woul have on the plan cache, and whether they introduce any security
holes, and maybe some other things.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: SEARCH and CYCLE clauses

2020-10-09 Thread Pavel Stehule
Hi

pá 9. 10. 2020 v 12:17 odesílatel Pavel Stehule 
napsal:

>
>
> pá 9. 10. 2020 v 11:40 odesílatel Peter Eisentraut <
> peter.eisentr...@2ndquadrant.com> napsal:
>
>> On 2020-09-22 20:29, Pavel Stehule wrote:
>> > The result is correct. When I tried to use UNION instead UNION ALL, the
>> > pg crash
>>
>> I fixed the crash, but UNION [DISTINCT] won't actually work here because
>> row/record types are not hashable.  I'm leaving the partial support in,
>> but I'm documenting it as currently not supported.
>>
>
>  I think so UNION is a common solution against the cycles. So missing
> support for this specific case is not a nice thing. How much work is needed
> for hashing rows. It should not be too much code.
>
>
>> > looks so clause USING in cycle detection is unsupported for DB2 and
>> > Oracle - the examples from these databases doesn't work on PG without
>> > modifications
>>
>> Yeah, the path clause is actually not necessary from a user's
>> perspective, but it's required for internal bookkeeping.  We could
>> perhaps come up with a mechanism to make it invisible coming out of the
>> CTE (maybe give the CTE a target list internally), but that seems like a
>> separate project.
>>
>> The attached patch fixes the issues you have reported (also the view
>> issue from the other email).  I have also moved the whole rewrite
>> support to a new file to not blow up rewriteHandler.c so much.
>>
>> --
>> Peter Eisentraut  http://www.2ndQuadrant.com/
>> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>>
>
This patch is based on transformation CYCLE and SEARCH clauses to specific
expressions - it is in agreement with ANSI SQL

There is not a problem with compilation
Nobody had objections in discussion
There are enough regress tests and documentation
check-world passed
doc build passed

I'll mark this patch as ready for committer

Possible enhancing for this feature (can be done in next steps)

1. support UNION DISTINCT
2. better compatibility with Oracle and DB2 (USING clause can be optional)

Regards

Pavel


Re: SEARCH and CYCLE clauses

2020-10-09 Thread Pavel Stehule
pá 9. 10. 2020 v 11:40 odesílatel Peter Eisentraut <
peter.eisentr...@2ndquadrant.com> napsal:

> On 2020-09-22 20:29, Pavel Stehule wrote:
> > The result is correct. When I tried to use UNION instead UNION ALL, the
> > pg crash
>
> I fixed the crash, but UNION [DISTINCT] won't actually work here because
> row/record types are not hashable.  I'm leaving the partial support in,
> but I'm documenting it as currently not supported.
>

 I think so UNION is a common solution against the cycles. So missing
support for this specific case is not a nice thing. How much work is needed
for hashing rows. It should not be too much code.


> > looks so clause USING in cycle detection is unsupported for DB2 and
> > Oracle - the examples from these databases doesn't work on PG without
> > modifications
>
> Yeah, the path clause is actually not necessary from a user's
> perspective, but it's required for internal bookkeeping.  We could
> perhaps come up with a mechanism to make it invisible coming out of the
> CTE (maybe give the CTE a target list internally), but that seems like a
> separate project.
>
> The attached patch fixes the issues you have reported (also the view
> issue from the other email).  I have also moved the whole rewrite
> support to a new file to not blow up rewriteHandler.c so much.
>
> --
> Peter Eisentraut  http://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>


Re: SEARCH and CYCLE clauses

2020-10-09 Thread Peter Eisentraut

On 2020-09-22 20:29, Pavel Stehule wrote:
The result is correct. When I tried to use UNION instead UNION ALL, the 
pg crash


I fixed the crash, but UNION [DISTINCT] won't actually work here because 
row/record types are not hashable.  I'm leaving the partial support in, 
but I'm documenting it as currently not supported.


looks so clause USING in cycle detection is unsupported for DB2 and 
Oracle - the examples from these databases doesn't work on PG without 
modifications


Yeah, the path clause is actually not necessary from a user's 
perspective, but it's required for internal bookkeeping.  We could 
perhaps come up with a mechanism to make it invisible coming out of the 
CTE (maybe give the CTE a target list internally), but that seems like a 
separate project.


The attached patch fixes the issues you have reported (also the view 
issue from the other email).  I have also moved the whole rewrite 
support to a new file to not blow up rewriteHandler.c so much.


--
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
From 4b36e81def50be44bd1f68247e33a22343520b5e Mon Sep 17 00:00:00 2001
From: Peter Eisentraut 
Date: Fri, 9 Oct 2020 11:32:10 +0200
Subject: [PATCH v3] SEARCH and CYCLE clauses

Discussion: 
https://www.postgresql.org/message-id/flat/db80ceee-6f97-9b4a-8ee8-3ba0c58e5...@2ndquadrant.com
---
 doc/src/sgml/queries.sgml| 213 ++-
 doc/src/sgml/ref/select.sgml |  44 ++
 src/backend/nodes/copyfuncs.c|  39 ++
 src/backend/nodes/equalfuncs.c   |  35 ++
 src/backend/nodes/nodeFuncs.c|   6 +
 src/backend/nodes/outfuncs.c |  35 ++
 src/backend/nodes/readfuncs.c|  43 ++
 src/backend/parser/analyze.c |  47 +-
 src/backend/parser/gram.y|  58 +-
 src/backend/parser/parse_agg.c   |   7 +
 src/backend/parser/parse_cte.c   | 117 
 src/backend/parser/parse_expr.c  |   4 +
 src/backend/parser/parse_func.c  |   3 +
 src/backend/parser/parse_relation.c  |  54 +-
 src/backend/parser/parse_target.c|  21 +-
 src/backend/rewrite/Makefile |   1 +
 src/backend/rewrite/rewriteHandler.c |  18 +
 src/backend/rewrite/rewriteSearchCycle.c | 709 +++
 src/backend/utils/adt/ruleutils.c|  47 ++
 src/include/nodes/nodes.h|   2 +
 src/include/nodes/parsenodes.h   |  28 +-
 src/include/parser/analyze.h |   2 +
 src/include/parser/kwlist.h  |   2 +
 src/include/parser/parse_node.h  |   2 +
 src/include/rewrite/rewriteSearchCycle.h |  21 +
 src/test/regress/expected/with.out   | 433 ++
 src/test/regress/sql/with.sql| 210 +++
 27 files changed, 2153 insertions(+), 48 deletions(-)
 create mode 100644 src/backend/rewrite/rewriteSearchCycle.c
 create mode 100644 src/include/rewrite/rewriteSearchCycle.h

diff --git a/doc/src/sgml/queries.sgml b/doc/src/sgml/queries.sgml
index 77fb1991ae..34b3e48986 100644
--- a/doc/src/sgml/queries.sgml
+++ b/doc/src/sgml/queries.sgml
@@ -2011,6 +2011,10 @@ SELECT in 
WITH
but we'd have needed two levels of nested sub-SELECTs.  
It's a bit
easier to follow this way.
   
+ 
+
+ 
+  Recursive Queries
 
   

@@ -2114,6 +2118,153 @@ Recursive Query Evaluation
 
   
 
+  
+   Search Order
+
+   
+When computing a tree traversal using a recursive query, you might want to
+order the results in either depth-first or breadth-first order.  This can
+be done by computing an ordering column alongside the other data columns
+and using that to sort the results at the end.  Note that this does not
+actually control in which order the query evaluation visits the rows; that
+is as always in SQL implementation-dependent.  This approach merely
+provides a convenient way to order the results afterwards.
+   
+
+   
+To create a depth-first order, we compute for each result row an array of
+rows that we have visited so far.  For example, consider the following
+query that searches a table tree using a
+link field:
+
+
+WITH RECURSIVE search_tree(id, link, data) AS (
+SELECT t.id, t.link, t.data
+FROM tree t
+  UNION ALL
+SELECT t.id, t.link, t.data
+FROM tree t, search_tree st
+WHERE t.id = st.link
+)
+SELECT * FROM search_tree;
+
+
+To add depth-first ordering information, you can write this:
+
+
+WITH RECURSIVE search_tree(id, link, data, path) AS (
+SELECT t.id, t.link, t.data, ARRAY[t.id]
+FROM tree t
+  UNION ALL
+SELECT t.id, t.link, t.data, path || t.id
+FROM tree t, search_tree st
+WHERE t.id = st.link
+)
+SELECT * FROM search_tree ORDER BY path;
+
+   
+
+  
+   In the general case where more than one field needs to be used to identify
+   a row, use an array of rows.  For example, if we needed to track fields
+   f1 and f2:
+
+
+WITH RECUR

Re: SEARCH and CYCLE clauses

2020-09-22 Thread Pavel Stehule
Hi

I found another bug

create view xx as  WITH recursive destinations (departure, arrival,
connections, cost, itinerary) AS
(SELECT f.departure, f.arrival, 1, price,
CAST(f.departure || f.arrival AS VARCHAR(2000))
FROM flights f
WHERE f.departure = 'New York'
 UNION ALL
 SELECT r.departure, b.arrival, r.connections + 1 ,
r.cost + b.price, CAST(r.itinerary || b.arrival AS
VARCHAR(2000))
FROM destinations r, flights b
WHERE r.arrival = b.departure)
CYCLE arrival SET cyclic_data TO '1' DEFAULT '0' using path
SELECT departure, arrival, itinerary, cyclic_data
  FROM destinations  ;

postgres=# select * from xx;
ERROR:  attribute number 6 exceeds number of columns 5

Regards

Pavel


Re: SEARCH and CYCLE clauses

2020-09-22 Thread Pavel Stehule
Hi

út 22. 9. 2020 v 20:01 odesílatel Peter Eisentraut <
peter.eisentr...@2ndquadrant.com> napsal:

> I have implemented the SEARCH and CYCLE clauses.
>
> This is standard SQL syntax attached to a recursive CTE to compute a
> depth- or breadth-first ordering and cycle detection, respectively.
> This is just convenience syntax for what you can already do manually.
> The original discussion about recursive CTEs briefly mentioned these as
> something to do later but then it was never mentioned again.
>
> SQL specifies these in terms of syntactic transformations, and so that's
> how I have implemented them also, mainly in the rewriter.
>
> I have successfully tested this against examples I found online that
> were aimed at DB2.
>
> The contained documentation and the code comment in rewriteHandler.c
> explain the details.
>

I am playing with this patch. It looks well, but I found some issues
(example is from attached data.sql)

WITH recursive destinations (departure, arrival, connections, cost) AS
(SELECT f.departure, f.arrival, 0, price
FROM flights f
WHERE f.departure = 'New York'
 UNION ALL
 SELECT r.departure, b.arrival, r.connections + 1,
r.cost + b.price
FROM destinations r, flights b
WHERE r.arrival = b.departure) cycle departure, arrival set
is_cycle to true default false using path

SELECT *
  FROM destinations ;
;

The result is correct. When I tried to use UNION instead UNION ALL, the pg
crash

Program received signal SIGABRT, Aborted.
0x7f761338ebc5 in raise () from /lib64/libc.so.6
(gdb) bt
#0  0x7f761338ebc5 in raise () from /lib64/libc.so.6
#1  0x7f76133778a4 in abort () from /lib64/libc.so.6
#2  0x0090e7eb in ExceptionalCondition (conditionName=, errorType=, fileName=,
lineNumber=) at assert.c:67
#3  0x007205e7 in generate_setop_grouplist
(targetlist=targetlist@entry=0x7f75fce5d018, op=,
op=)
at prepunion.c:1412
#4  0x007219d0 in generate_recursion_path
(pTargetList=0x7fff073ee728, refnames_tlist=, root=0xf90bd8,
setOp=0xf90840)
at prepunion.c:502
#5  plan_set_operations (root=0xf90bd8) at prepunion.c:156
#6  0x0070f79b in grouping_planner (root=0xf90bd8,
inheritance_update=false, tuple_fraction=) at planner.c:1886
#7  0x00712ea7 in subquery_planner (glob=,
parse=, parent_root=, hasRecursion=,
tuple_fraction=0) at planner.c:1015
#8  0x0071a614 in SS_process_ctes (root=0xf7abd8) at subselect.c:952
#9  0x007125d4 in subquery_planner (glob=glob@entry=0xf8a010,
parse=parse@entry=0xf6cf20, parent_root=parent_root@entry=0x0,
hasRecursion=hasRecursion@entry=false,
tuple_fraction=tuple_fraction@entry=0) at planner.c:645
#10 0x0071425b in standard_planner (parse=0xf6cf20,
query_string=, cursorOptions=256, boundParams=)
at planner.c:405
#11 0x007e5f68 in pg_plan_query (querytree=0xf6cf20,
query_string=query_string@entry=0xea6370 "WITH recursive destinations
(departure, arrival, connections, cost) AS \n(SELECT f.departure,
f.arrival, 0, price\n", ' ' , "FROM flights f \n", ' '
, "WHERE f.departure = 'New York' \n UNION "...,
cursorOptions=cursorOptions@entry=256, boundParams=boundParams@entry=0x0)
at postgres.c:875
#12 0x007e6061 in pg_plan_queries (querytrees=0xf8b690,
query_string=query_string@entry=0xea6370 "WITH recursive destinations
(departure, arrival, connections, cost) AS \n(SELECT f.departure,
f.arrival, 0, price\n", ' ' , "FROM flights f \n", ' '
, "WHERE f.departure = 'New York' \n UNION "...,
cursorOptions=cursorOptions@entry=256, boundParams=boundParams@entry=0x0)
at postgres.c:966
#13 0x007e63b8 in exec_simple_query (
query_string=0xea6370 "WITH recursive destinations (departure, arrival,
connections, cost) AS \n(SELECT f.departure, f.arrival, 0, price\n", '
' , "FROM flights f \n", ' ' , "WHERE
f.departure = 'New York' \n UNION "...) at postgres.c:1158
#14 0x007e81e4 in PostgresMain (argc=,
argv=, dbname=, username=) at
postgres.c:4309
#15 0x007592b9 in BackendRun (port=0xecaf20) at postmaster.c:4541
#16 BackendStartup (port=0xecaf20) at postmaster.c:4225
#17 ServerLoop () at postmaster.c:1742
#18 0x0075a0ed in PostmasterMain (argc=,
argv=0xea0c90) at postmaster.c:1415
#19 0x004832ec in main (argc=3, argv=0xea0c90) at main.c:209



looks so clause USING in cycle detection is unsupported for DB2 and Oracle
- the examples from these databases doesn't work on PG without modifications

Regards

Pavel





>
> --
> Peter Eisentraut  http://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>


data.sql
Description: application/sql


Re: SEARCH and CYCLE clauses

2020-07-03 Thread Peter Eisentraut
While the larger patch is being considered, I think some simpler and 
separable pieces could be addressed.


Here is a patch that adjusts the existing cycle detection example and 
test queries to put the cycle column before the path column.  The CYCLE 
clause puts them in that order, and so if we added that feature that 
would make the sequence of examples more consistent and easier to follow.


(And while the order of columns has no semantic meaning, for a human 
left-to-right reader it also makes a bit more sense because the cycle 
flag is computed against the previous path value, so it happens "before" 
the path column.)


--
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
From 1042cad84631406eeec8715ab93c0451702b5c0d Mon Sep 17 00:00:00 2001
From: Peter Eisentraut 
Date: Fri, 3 Jul 2020 08:51:12 +0200
Subject: [PATCH] Adjust cycle detection examples and tests

Adjust the existing cycle detection example and test queries to put
the cycle column before the path column.  This is mainly because the
SQL-standard CYCLE clause puts them in that order, and so if we added
that feature that would make the sequence of examples more consistent
and easier to follow.
---
 doc/src/sgml/queries.sgml  |  26 +++---
 src/test/regress/expected/with.out | 124 ++---
 src/test/regress/sql/with.sql  |  16 ++--
 3 files changed, 83 insertions(+), 83 deletions(-)

diff --git a/doc/src/sgml/queries.sgml b/doc/src/sgml/queries.sgml
index 572e968273..ef0ca56c66 100644
--- a/doc/src/sgml/queries.sgml
+++ b/doc/src/sgml/queries.sgml
@@ -2112,20 +2112,20 @@ Recursive Query Evaluation
UNION ALL to UNION would not 
eliminate the looping.
Instead we need to recognize whether we have reached the same row again
while following a particular path of links.  We add two columns
-   path and cycle to the 
loop-prone query:
+   is_cycle and path to 
the loop-prone query:
 
 
-WITH RECURSIVE search_graph(id, link, data, depth, path, cycle) AS (
+WITH RECURSIVE search_graph(id, link, data, depth, is_cycle, path) AS (
 SELECT g.id, g.link, g.data, 1,
-  ARRAY[g.id],
-  false
+  false,
+  ARRAY[g.id]
 FROM graph g
   UNION ALL
 SELECT g.id, g.link, g.data, sg.depth + 1,
-  path || g.id,
-  g.id = ANY(path)
+  g.id = ANY(path),
+  path || g.id
 FROM graph g, search_graph sg
-WHERE g.id = sg.link AND NOT cycle
+WHERE g.id = sg.link AND NOT is_cycle
 )
 SELECT * FROM search_graph;
 
@@ -2140,17 +2140,17 @@ Recursive Query Evaluation
compare fields f1 and 
f2:
 
 
-WITH RECURSIVE search_graph(id, link, data, depth, path, cycle) AS (
+WITH RECURSIVE search_graph(id, link, data, depth, is_cycle, path) AS (
 SELECT g.id, g.link, g.data, 1,
-  ARRAY[ROW(g.f1, g.f2)],
-  false
+  false,
+  ARRAY[ROW(g.f1, g.f2)]
 FROM graph g
   UNION ALL
 SELECT g.id, g.link, g.data, sg.depth + 1,
-  path || ROW(g.f1, g.f2),
-  ROW(g.f1, g.f2) = ANY(path)
+  ROW(g.f1, g.f2) = ANY(path),
+  path || ROW(g.f1, g.f2)
 FROM graph g, search_graph sg
-WHERE g.id = sg.link AND NOT cycle
+WHERE g.id = sg.link AND NOT is_cycle
 )
 SELECT * FROM search_graph;
 
diff --git a/src/test/regress/expected/with.out 
b/src/test/regress/expected/with.out
index 67eaeb4f3e..457f3bf04f 100644
--- a/src/test/regress/expected/with.out
+++ b/src/test/regress/expected/with.out
@@ -579,79 +579,79 @@ insert into graph values
(1, 4, 'arc 1 -> 4'),
(4, 5, 'arc 4 -> 5'),
(5, 1, 'arc 5 -> 1');
-with recursive search_graph(f, t, label, path, cycle) as (
-   select *, array[row(g.f, g.t)], false from graph g
+with recursive search_graph(f, t, label, is_cycle, path) as (
+   select *, false, array[row(g.f, g.t)] from graph g
union all
-   select g.*, path || row(g.f, g.t), row(g.f, g.t) = any(path)
+   select g.*, row(g.f, g.t) = any(path), path || row(g.f, g.t)
from graph g, search_graph sg
-   where g.f = sg.t and not cycle
+   where g.f = sg.t and not is_cycle
 )
 select * from search_graph;
- f | t |   label|   path| cycle 
+---++---+---
- 1 | 2 | arc 1 -> 2 | {"(1,2)"} | f
- 1 | 3 | arc 1 -> 3 | {"(1,3)"} | f
- 2 | 3 | arc 2 -> 3 | {"(2,3)"} | f
- 1 | 4 | arc 1 -> 4 | {"(1,4)"} | f
- 4 | 5 | arc 4 -> 5 | {"(4,5)"} | f
- 5 | 1 | arc 5 -> 1 | {"(5,1)"} | f
- 1 | 2 | arc 1 -> 2 | {"(5,1)","(1,2)"} | f
- 1 | 3 | arc 1 -> 3 | {"(5,1)","(1,3)"} | f
- 1 | 4 | arc 1 -> 4 | {"(5,1)","(1,4)"} | f
- 2 | 3 | arc 2 -> 3 | {"(1,2)","(2,3)"}  

Re: SEARCH and CYCLE clauses

2020-06-15 Thread Peter Eisentraut
Here is an updated patch that I think fixes all the cases that you 
identified.  (The issue of what kinds of constants or expressions to 
accept for cycle marks has not been touched.)  To fix the star expansion 
I had to add a little bit of infrastructure that could also be used as a 
more general facility "don't include this column in star expansion", so 
this could perhaps use some consideration from a more general angle as well.


More tests and breakage reports welcome.

--
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
From f4fc5f63c53ba954066d38f968f54fb8fbf76c59 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut 
Date: Mon, 15 Jun 2020 11:43:52 +0200
Subject: [PATCH v2] SEARCH and CYCLE clauses

Discussion: 
https://www.postgresql.org/message-id/flat/db80ceee-6f97-9b4a-8ee8-3ba0c58e5...@2ndquadrant.com
---
 doc/src/sgml/queries.sgml| 204 +++-
 doc/src/sgml/ref/select.sgml |  41 ++
 src/backend/nodes/copyfuncs.c|  39 ++
 src/backend/nodes/equalfuncs.c   |  35 ++
 src/backend/nodes/nodeFuncs.c|   6 +
 src/backend/nodes/outfuncs.c |  35 ++
 src/backend/nodes/readfuncs.c|  43 ++
 src/backend/parser/gram.y|  56 ++-
 src/backend/parser/parse_cte.c   | 117 +
 src/backend/parser/parse_relation.c  |  54 ++-
 src/backend/parser/parse_target.c|  21 +-
 src/backend/rewrite/rewriteHandler.c | 673 +++
 src/backend/utils/adt/ruleutils.c|  47 ++
 src/include/nodes/nodes.h|   2 +
 src/include/nodes/parsenodes.h   |  28 +-
 src/include/parser/kwlist.h  |   2 +
 src/include/parser/parse_node.h  |   1 +
 src/test/regress/expected/with.out   | 364 +++
 src/test/regress/sql/with.sql| 179 +++
 19 files changed, 1917 insertions(+), 30 deletions(-)

diff --git a/doc/src/sgml/queries.sgml b/doc/src/sgml/queries.sgml
index 572e968273..f98d8d3dc1 100644
--- a/doc/src/sgml/queries.sgml
+++ b/doc/src/sgml/queries.sgml
@@ -1979,6 +1979,10 @@ SELECT in 
WITH
but we'd have needed two levels of nested sub-SELECTs.  
It's a bit
easier to follow this way.
   
+ 
+
+ 
+  Recursive Queries
 
   

@@ -2082,6 +2086,144 @@ Recursive Query Evaluation
 
   
 
+  
+   Search Order
+
+   
+When computing a tree traversal using a recursive query, you might want to
+order the results in either depth-first or breadth-first order.  This can
+be done by computing an ordering column alongside the other data columns
+and using that to sort the results at the end.  Note that this does not
+actually control in which order the query evaluation visits the rows; that
+is as always in SQL implementation-dependent.  This approach merely
+provides a convenient way to order the results afterwards.
+   
+
+   
+To create a depth-first order, we compute for each result row an array of
+rows that we have visited so far.  For example, consider the following
+query that searches a table tree using a
+link field:
+
+
+WITH RECURSIVE search_tree(id, link, data) AS (
+SELECT t.id, t.link, t.data
+FROM tree t
+  UNION ALL
+SELECT t.id, t.link, t.data
+FROM tree t, search_tree st
+WHERE t.id = st.link
+)
+SELECT * FROM search_tree;
+
+
+To add depth-first ordering information, you can write this:
+
+
+WITH RECURSIVE search_tree(id, link, data, path) AS (
+SELECT t.id, t.link, t.data, ARRAY[t.id]
+FROM tree t
+  UNION ALL
+SELECT t.id, t.link, t.data, path || t.id
+FROM tree t, search_tree st
+WHERE t.id = st.link
+)
+SELECT * FROM search_tree ORDER BY path;
+
+   
+
+  
+   In the general case where more than one field needs to be used to identify
+   a row, use an array of rows.  For example, if we needed to track fields
+   f1 and f2:
+
+
+WITH RECURSIVE search_tree(id, link, data, path) AS (
+SELECT t.id, t.link, t.data, ARRAY[ROW(g.f1, g.f2)]
+FROM tree t
+  UNION ALL
+SELECT t.id, t.link, t.data, path || ROW(g.f1, g.f2)
+FROM tree t, search_tree st
+WHERE t.id = st.link
+)
+SELECT * FROM search_tree ORDER BY path;
+
+  
+
+  
+   
+Omit the ROW() syntax in the common case where only one
+field needs to be tracked.  This allows a simple array rather than a
+composite-type array to be used, gaining efficiency.
+   
+  
+
+  
+   To create a breadth-first order, you can add a column that tracks the depth
+   of the search, for example:
+
+
+WITH RECURSIVE search_tree(id, link, data, level) AS (
+SELECT t.id, t.link, t.data, 0
+FROM tree t
+  UNION ALL
+SELECT t.id, t.link, t.data, level + 1
+FROM tree t, search_tree st
+WHERE t.id = st.link
+)
+SELECT * FROM search_tree ORDER BY level;
+
+
+   To get a stable sort, add data columns as secondary sorting columns.
+  
+
+  
+   
+The recursive query evaluation algorithm produces its output in
+breadth-firs

Re: SEARCH and CYCLE clauses

2020-05-22 Thread Peter Eisentraut

On 2020-05-22 12:40, Vik Fearing wrote:

If you specify null, then the cycle check expression will always fail,
so it will abort after the first row.  (We should perhaps prohibit
specifying null, but see below.)


I would rather make the cycle check expression work with null.


It works correctly AFAICT.  What is your expectation?


This is something we need to think about.  If we want to check at parse
time whether the two values are not the same (and perhaps not null),
then we either need to restrict the generality of what we can specify,
or we need to be prepared to do full expression evaluation in the
parser.  A simple and practical way might be to only allow string and
boolean literal.  I don't have a strong opinion here.



I'm with Pavel on this one.  Why does it have to be a parse-time error?
  Also, I regularly see people write functions as sort of pauper's
variables, but a function call isn't allowed here.


If not parse-time error, at what time do you want to check it?


As an improvement over the spec, I think the vast majority of people
will be using simple true/false values.  Can we make that optional?

 CYCLE f, t SET is_cycle USING path

would be the same as

 CYCLE f, t SET is_cycle TO true DEFAULT false USING path


I was also considering that.  It would be an easy change to make.

(Apparently, in DB2 you can omit the USING path clause.  Not sure how to 
make that work, however.)


--
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: SEARCH and CYCLE clauses

2020-05-22 Thread Vik Fearing
On 5/22/20 11:24 AM, Peter Eisentraut wrote:
> On 2020-05-20 21:28, Vik Fearing wrote:
>> 1)
>> There are some smart quotes in the comments that should be converted to
>> single quotes.
> 
> ok, fixing that
> 
>> 2)
>> This query is an infinite loop, as expected:
>>
>>    with recursive a as (select 1 as b union all select b from a)
>>    table a;
>>
>> But it becomes an error when you add a cycle clause to it:
>>
>>    with recursive a as (select 1 as b union all table a)
>>  cycle b set c to true default false using p
>>    table a;
>>
>>    ERROR:  each UNION query must have the same number of columns
> 
> table a expands to select * from a, and if you have a cycle clause, then
> a has three columns, but the other branch of the union only has one, so
> that won't work anymore, will it?

It seems there was a copy/paste error here.  The first query should have
been the same as the second but without the cycle clause.

It seems strange to me that adding a  would
break a previously working query.  I would rather see the * expanded
before adding the new columns.  This is a user's opinion, I don't know
how hard that would be to implement.

>> 3)
>> If I take the same infinite loop query but replace the TABLE syntax with
>> a SELECT and add a cycle clause, it's not an infinite loop anymore.
>>
>>    with recursive a as (select 1 as b union all select b from a)
>>  cycle b set c to true default false using p
>>    table a;
>>
>>   b | c | p
>> ---+---+---
>>   1 | f | {(1)}
>>   1 | t | {(1),(1)}
>> (2 rows)
>>
>> Why does it stop?  It should still be an infinite loop.
> 
> If you specify the cycle clause, then the processing will stop if it
> sees the same row more than once, which it did here.

Yes, this was a misplaced expectation on my part.

>> 4)
>> If I use NULL instead of false, I only get one row back.
>>
>>    with recursive a as (select 1 as b union all select b from a)
>>  cycle b set c to true default false using p
>>    table a;
>>
>>   b | c |   p
>> ---+---+---
>>   1 |   | {(1)}
>> (1 row)
> 
> If you specify null, then the cycle check expression will always fail,
> so it will abort after the first row.  (We should perhaps prohibit
> specifying null, but see below.)

I would rather make the cycle check expression work with null.

>> 5)
>> I can set both states to the same value.
>>
>>    with recursive a as (select 1 as b union all select b from a)
>>  cycle b set c to true default true using p
>>    table a;
> 
>> This is a direct violation of 7.18 SR 2.b.ii.3 as well as common sense.
>>   BTW, I applaud your decision to violate the other part of that rule and
>> allowing any data type here.
>>
>>
>> 5)
>> The same rule as above says that the value and the default value must be
>> literals but not everything that a human might consider a literal is
>> accepted.  In particular:
>>
>>    with recursive a as (select 1 as b union all select b from a)
>>  cycle b set c to 1 default -1 using p
>>    table a;
>>
>>    ERROR:  syntax error at or near "-"
>>
>> Can we just accept a full a_expr here instead of AexprConst?  Both
>> DEFAULT and USING are fully reserved keywords.
> 
> This is something we need to think about.  If we want to check at parse
> time whether the two values are not the same (and perhaps not null),
> then we either need to restrict the generality of what we can specify,
> or we need to be prepared to do full expression evaluation in the
> parser.  A simple and practical way might be to only allow string and
> boolean literal.  I don't have a strong opinion here.


I'm with Pavel on this one.  Why does it have to be a parse-time error?
 Also, I regularly see people write functions as sort of pauper's
variables, but a function call isn't allowed here.



Another bug I found.  If I try to do your regression test as an
autonomous query, I get this:

with recursive

graph (f, t, label) as (
values (1, 2, 'arc 1 -> 2'),
   (1, 3, 'arc 1 -> 3'),
   (2, 3, 'arc 2 -> 3'),
   (1, 4, 'arc 1 -> 4'),
   (4, 5, 'arc 4 -> 5'),
   (5, 1, 'arc 5 -> 1')
),

search_graph (f, t, label) as (
select * from graph g
union all
select g.*
from graph g, search_graph sg
where g.f = sg.t
)
cycle f, t set is_cycle to true default false using path

select * from search_graph;

ERROR:  could not find CTE "graph"



As an improvement over the spec, I think the vast majority of people
will be using simple true/false values.  Can we make that optional?

CYCLE f, t SET is_cycle USING path

would be the same as

CYCLE f, t SET is_cycle TO true DEFAULT false USING path
-- 
Vik Fearing




Re: SEARCH and CYCLE clauses

2020-05-22 Thread Pavel Stehule
pá 22. 5. 2020 v 11:25 odesílatel Peter Eisentraut <
peter.eisentr...@2ndquadrant.com> napsal:

> On 2020-05-20 21:28, Vik Fearing wrote:
> > 1)
> > There are some smart quotes in the comments that should be converted to
> > single quotes.
>
> ok, fixing that
>
> > 2)
> > This query is an infinite loop, as expected:
> >
> >with recursive a as (select 1 as b union all select b from a)
> >table a;
> >
> > But it becomes an error when you add a cycle clause to it:
> >
> >with recursive a as (select 1 as b union all table a)
> >  cycle b set c to true default false using p
> >table a;
> >
> >ERROR:  each UNION query must have the same number of columns
>
> table a expands to select * from a, and if you have a cycle clause, then
> a has three columns, but the other branch of the union only has one, so
> that won't work anymore, will it?
>
> > 3)
> > If I take the same infinite loop query but replace the TABLE syntax with
> > a SELECT and add a cycle clause, it's not an infinite loop anymore.
> >
> >with recursive a as (select 1 as b union all select b from a)
> >  cycle b set c to true default false using p
> >table a;
> >
> >   b | c | p
> > ---+---+---
> >   1 | f | {(1)}
> >   1 | t | {(1),(1)}
> > (2 rows)
> >
> > Why does it stop?  It should still be an infinite loop.
>
> If you specify the cycle clause, then the processing will stop if it
> sees the same row more than once, which it did here.
>
> > 4)
> > If I use NULL instead of false, I only get one row back.
> >
> >with recursive a as (select 1 as b union all select b from a)
> >  cycle b set c to true default false using p
> >table a;
> >
> >   b | c |   p
> > ---+---+---
> >   1 |   | {(1)}
> > (1 row)
>
> If you specify null, then the cycle check expression will always fail,
> so it will abort after the first row.  (We should perhaps prohibit
> specifying null, but see below.)
>
> > 5)
> > I can set both states to the same value.
> >
> >with recursive a as (select 1 as b union all select b from a)
> >  cycle b set c to true default true using p
> >table a;
>
> > This is a direct violation of 7.18 SR 2.b.ii.3 as well as common sense.
> >   BTW, I applaud your decision to violate the other part of that rule and
> > allowing any data type here.
> >
> >
> > 5)
> > The same rule as above says that the value and the default value must be
> > literals but not everything that a human might consider a literal is
> > accepted.  In particular:
> >
> >with recursive a as (select 1 as b union all select b from a)
> >  cycle b set c to 1 default -1 using p
> >table a;
> >
> >ERROR:  syntax error at or near "-"
> >
> > Can we just accept a full a_expr here instead of AexprConst?  Both
> > DEFAULT and USING are fully reserved keywords.
>
> This is something we need to think about.  If we want to check at parse
> time whether the two values are not the same (and perhaps not null),
> then we either need to restrict the generality of what we can specify,
> or we need to be prepared to do full expression evaluation in the
> parser.  A simple and practical way might be to only allow string and
> boolean literal.  I don't have a strong opinion here.
>

if you check it in parse time, then you disallow parametrization there.

Is any reason to do this check in parse time?


> --
> Peter Eisentraut  http://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>
>
>


Re: SEARCH and CYCLE clauses

2020-05-22 Thread Peter Eisentraut

On 2020-05-20 21:28, Vik Fearing wrote:

1)
There are some smart quotes in the comments that should be converted to
single quotes.


ok, fixing that


2)
This query is an infinite loop, as expected:

   with recursive a as (select 1 as b union all select b from a)
   table a;

But it becomes an error when you add a cycle clause to it:

   with recursive a as (select 1 as b union all table a)
 cycle b set c to true default false using p
   table a;

   ERROR:  each UNION query must have the same number of columns


table a expands to select * from a, and if you have a cycle clause, then 
a has three columns, but the other branch of the union only has one, so 
that won't work anymore, will it?



3)
If I take the same infinite loop query but replace the TABLE syntax with
a SELECT and add a cycle clause, it's not an infinite loop anymore.

   with recursive a as (select 1 as b union all select b from a)
 cycle b set c to true default false using p
   table a;

  b | c | p
---+---+---
  1 | f | {(1)}
  1 | t | {(1),(1)}
(2 rows)

Why does it stop?  It should still be an infinite loop.


If you specify the cycle clause, then the processing will stop if it 
sees the same row more than once, which it did here.



4)
If I use NULL instead of false, I only get one row back.

   with recursive a as (select 1 as b union all select b from a)
 cycle b set c to true default false using p
   table a;

  b | c |   p
---+---+---
  1 |   | {(1)}
(1 row)


If you specify null, then the cycle check expression will always fail, 
so it will abort after the first row.  (We should perhaps prohibit 
specifying null, but see below.)



5)
I can set both states to the same value.

   with recursive a as (select 1 as b union all select b from a)
 cycle b set c to true default true using p
   table a;



This is a direct violation of 7.18 SR 2.b.ii.3 as well as common sense.
  BTW, I applaud your decision to violate the other part of that rule and
allowing any data type here.


5)
The same rule as above says that the value and the default value must be
literals but not everything that a human might consider a literal is
accepted.  In particular:

   with recursive a as (select 1 as b union all select b from a)
 cycle b set c to 1 default -1 using p
   table a;

   ERROR:  syntax error at or near "-"

Can we just accept a full a_expr here instead of AexprConst?  Both
DEFAULT and USING are fully reserved keywords.


This is something we need to think about.  If we want to check at parse 
time whether the two values are not the same (and perhaps not null), 
then we either need to restrict the generality of what we can specify, 
or we need to be prepared to do full expression evaluation in the 
parser.  A simple and practical way might be to only allow string and 
boolean literal.  I don't have a strong opinion here.


--
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: SEARCH and CYCLE clauses

2020-05-20 Thread Vik Fearing
On 5/20/20 3:04 PM, Vik Fearing wrote:

> I'm looking forward to reviewing this.
A few quick things I've noticed so far:

1)
There are some smart quotes in the comments that should be converted to
single quotes.


2)
This query is an infinite loop, as expected:

  with recursive a as (select 1 as b union all select b from a)
  table a;

But it becomes an error when you add a cycle clause to it:

  with recursive a as (select 1 as b union all table a)
cycle b set c to true default false using p
  table a;

  ERROR:  each UNION query must have the same number of columns

The same error occurs with a search clause.


3)
If I take the same infinite loop query but replace the TABLE syntax with
a SELECT and add a cycle clause, it's not an infinite loop anymore.

  with recursive a as (select 1 as b union all select b from a)
cycle b set c to true default false using p
  table a;

 b | c | p
---+---+---
 1 | f | {(1)}
 1 | t | {(1),(1)}
(2 rows)

Why does it stop?  It should still be an infinite loop.


4)
If I use NULL instead of false, I only get one row back.

  with recursive a as (select 1 as b union all select b from a)
cycle b set c to true default false using p
  table a;

 b | c |   p
---+---+---
 1 |   | {(1)}
(1 row)


5)
I can set both states to the same value.

  with recursive a as (select 1 as b union all select b from a)
cycle b set c to true default true using p
  table a;

 b | c |   p
---+---+---
 1 | t | {(1)}
(1 row)

This is a direct violation of 7.18 SR 2.b.ii.3 as well as common sense.
 BTW, I applaud your decision to violate the other part of that rule and
allowing any data type here.


5)
The same rule as above says that the value and the default value must be
literals but not everything that a human might consider a literal is
accepted.  In particular:

  with recursive a as (select 1 as b union all select b from a)
cycle b set c to 1 default -1 using p
  table a;

  ERROR:  syntax error at or near "-"

Can we just accept a full a_expr here instead of AexprConst?  Both
DEFAULT and USING are fully reserved keywords.



That's all for now; will test more later.
Thanks for working on this!
-- 
Vik Fearing




Re: SEARCH and CYCLE clauses

2020-05-20 Thread Vik Fearing
On 5/20/20 1:46 PM, Peter Eisentraut wrote:
> I have implemented the SEARCH and CYCLE clauses.

YES!

> This is standard SQL syntax attached to a recursive CTE to compute a
> depth- or breadth-first ordering and cycle detection, respectively. This
> is just convenience syntax for what you can already do manually. The
> original discussion about recursive CTEs briefly mentioned these as
> something to do later but then it was never mentioned again.
> 
> SQL specifies these in terms of syntactic transformations, and so that's
> how I have implemented them also, mainly in the rewriter.

I've attempted to do this several times but didn't get anywhere with it.
 I'm looking forward to reviewing this.

(And maybe it will put me on the right path for implementing .)
-- 
Vik Fearing




SEARCH and CYCLE clauses

2020-05-20 Thread Peter Eisentraut

I have implemented the SEARCH and CYCLE clauses.

This is standard SQL syntax attached to a recursive CTE to compute a 
depth- or breadth-first ordering and cycle detection, respectively. 
This is just convenience syntax for what you can already do manually. 
The original discussion about recursive CTEs briefly mentioned these as 
something to do later but then it was never mentioned again.


SQL specifies these in terms of syntactic transformations, and so that's 
how I have implemented them also, mainly in the rewriter.


I have successfully tested this against examples I found online that 
were aimed at DB2.


The contained documentation and the code comment in rewriteHandler.c 
explain the details.


--
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
From 787a7e9400e9d6ea3ab3e3f344422af56184575c Mon Sep 17 00:00:00 2001
From: Peter Eisentraut 
Date: Wed, 20 May 2020 13:08:42 +0200
Subject: [PATCH v1] SEARCH and CYCLE clauses

---
 doc/src/sgml/queries.sgml| 204 +++-
 doc/src/sgml/ref/select.sgml |  41 ++
 src/backend/nodes/copyfuncs.c|  39 ++
 src/backend/nodes/equalfuncs.c   |  35 ++
 src/backend/nodes/nodeFuncs.c|   6 +
 src/backend/nodes/outfuncs.c |  35 ++
 src/backend/nodes/readfuncs.c|  43 ++
 src/backend/parser/gram.y|  56 ++-
 src/backend/parser/parse_cte.c   | 112 +
 src/backend/parser/parse_relation.c  |  30 +-
 src/backend/parser/parse_target.c|  21 +-
 src/backend/rewrite/rewriteHandler.c | 668 +++
 src/backend/utils/adt/ruleutils.c|  47 ++
 src/include/nodes/nodes.h|   2 +
 src/include/nodes/parsenodes.h   |  28 +-
 src/include/parser/kwlist.h  |   2 +
 src/test/regress/expected/with.out   | 304 
 src/test/regress/sql/with.sql| 152 ++
 18 files changed, 1797 insertions(+), 28 deletions(-)

diff --git a/doc/src/sgml/queries.sgml b/doc/src/sgml/queries.sgml
index 572e968273..bcb2161c42 100644
--- a/doc/src/sgml/queries.sgml
+++ b/doc/src/sgml/queries.sgml
@@ -1979,6 +1979,10 @@ SELECT in 
WITH
but we'd have needed two levels of nested sub-SELECTs.  
It's a bit
easier to follow this way.
   
+ 
+
+ 
+  Recursive Queries
 
   

@@ -2082,6 +2086,144 @@ Recursive Query Evaluation
 
   
 
+  
+   Search Order
+
+   
+When computing a tree traversal using a recursive query, you might want to
+order the results in either depth-first or breadth-first order.  This can
+be done by computing an ordering column alongside the other data columns
+and using that to sort the results at the end.  Note that this does not
+actually control in which order the query evaluation visits the rows; that
+is as always in SQL implementation-dependent.  This approach merely
+provides a convenient way to order the results afterwards.
+   
+
+   
+To create a depth-first order, we compute for each result row an array of
+rows that we have visited so far.  For example, consider the following
+query that searches a table tree using a
+link field:
+
+
+WITH RECURSIVE search_tree(id, link, data) AS (
+SELECT t.id, t.link, t.data
+FROM tree t
+  UNION ALL
+SELECT t.id, t.link, t.data
+FROM tree t, search_tree st
+WHERE t.id = st.link
+)
+SELECT * FROM search_tree;
+
+
+To add depth-first ordering information, you can write this:
+
+
+WITH RECURSIVE search_tree(id, link, data, path) AS (
+SELECT t.id, t.link, t.data, ARRAY[t.id]
+FROM tree t
+  UNION ALL
+SELECT t.id, t.link, t.data, path || t.id
+FROM tree t, search_tree st
+WHERE t.id = st.link
+)
+SELECT * FROM search_tree ORDER BY path;
+
+   
+
+  
+   In the general case where more than one field needs to be checked to
+   recognize a cycle, use an array of rows.  For example, if we needed to
+   track fields f1 and 
f2:
+
+
+WITH RECURSIVE search_tree(id, link, data, path) AS (
+SELECT t.id, t.link, t.data, ARRAY[ROW(g.f1, g.f2)]
+FROM tree t
+  UNION ALL
+SELECT t.id, t.link, t.data, path || ROW(g.f1, g.f2)
+FROM tree t, search_tree st
+WHERE t.id = st.link
+)
+SELECT * FROM search_tree ORDER BY path;
+
+  
+
+  
+   
+Omit the ROW() syntax in the common case where only one
+field needs to be tracked.  This allows a simple array rather than a
+composite-type array to be used, gaining efficiency.
+   
+  
+
+  
+   To create a breadth-first order, you can add a column that tracks the depth
+   of the search, for example:
+
+
+WITH RECURSIVE search_tree(id, link, data, level) AS (
+SELECT t.id, t.link, t.data, 0
+FROM tree t
+  UNION ALL
+SELECT t.id, t.link, t.data, level + 1
+FROM tree t, search_tree st
+WHERE t.id = st.link
+)
+SELECT * FROM search_tree ORDER BY level;
+
+
+   To get a stable sort, add data columns as secondary sorting columns.
+  
+
+  
+   
+The recur