Re: PL/pgSQL cursors should get generated portal names by default

2023-01-09 Thread Pavel Stehule
Hi

I wrote a new check in plpgsql_check, that tries to identify explicit work
with the name of the referenced portal.

create or replace function foo01()
returns refcursor as $$#option dump
declare
  c cursor for select 1;
  r refcursor;
begin
  open c;
  r := 'c';
  return r;
end;
$$ language plpgsql;
CREATE FUNCTION
(2023-01-09 16:49:10) postgres=# select * from
plpgsql_check_function('foo01', compatibility_warnings => true);
┌───┐
│  plpgsql_check_function
│
╞═══╡
│ compatibility:0:7:assignment:obsolete setting of refcursor or cursor
variable │
│ Detail: Internal name of cursor should not be specified by users.
│
│ Context: at assignment to variable "r" declared on line 4
│
│ warning extra:0:3:DECLARE:never read variable "c"
│
└───┘
(4 rows)

Regards

Pavel


Re: PL/pgSQL cursors should get generated portal names by default

2022-11-08 Thread Kirk Wolak
On Mon, Nov 7, 2022 at 11:10 AM Jan Wieck  wrote:

> On 11/4/22 19:46, Tom Lane wrote:
> > Jan Wieck  writes:
> >> I need to do some testing on this. I seem to recall that the naming was
> >> originally done because a reference cursor is basically a named cursor
> >> that can be handed around between functions and even the top SQL level
> >> of the application. For the latter to work the application needs to
> know
> >> the name of the portal.
> >
> > Right.  With this patch, it'd be necessary to hand back the actual
> > portal name (by returning the refcursor value), or else manually
> > set the refcursor value before OPEN to preserve the previous behavior.
> > But as far as I saw, all our documentation examples show handing back
> > the portal name, so I'm hoping most people do it like that already.
>
> I was mostly concerned that we may unintentionally break underdocumented
> behavior that was originally implemented on purpose. As long as everyone
> is aware that this is breaking backwards compatibility in the way it
> does, that's fine.
>

I respect the concern, and applied some deeper thinking to it...

Here is the logic I am applying to this compatibility issue and what may
break.
[FWIW, my motto is to be wrong out  loud, as you learn faster]

At first pass, I thought "Well, since this does not break a refcursor,
which is the obvious use case for RETURNING/PASSING, we are fine!"

But in trying to DEFEND this case, I have come up with example of code
(that makes some SENSE, but would break):

CREATE FUNCTION test() RETURNS refcursor() LANGUAGE plpgsql AS  $$
DECLARE
cur_this cursor FOR SELECT 1;
ref_cur refcursor;
BEGIN
OPEN cur_this;
ref_cur := 'cur_this';  -- Using the NAME of the cursor as the portal
name: Should do:  ref_cur := cur_this; -- Only works after OPEN
RETURN ref_cur;
END;
$$;

As noted in the comments.  If the code were:
  ref_cur := 'cur_this';  -- Now you can't just use ref_cur := cur_this;
  OPEN cur_this;
  RETURN ref_cur;
Then it would break now...  And even the CORRECT syntax would break, since
the cursor was not opened, so "cur_this" is null.

Now, I have NO IDEA if someone would actually do this.  It is almost
pathological.  The use case would be a complex cursor with parameters,
and they changed the code to return a refcursor!
This was the ONLY use case I could think of that wasn't HACKY!

HACKY use cases involve a child routine setting:  local_ref_cursor :=
'cur_this'; in order to access a cursor that was NOT passed to the child.
FWIW, I tested this, and it works, and I can FETCH in the child routine,
and it affects the parents' LOOP as it should... WOW.  I would be HAPPY
to break such horrible code, it has to be a security concern at some level.

Personally (and my 2 cents really shouldn't matter much), I think this
should still be fixed.
Because I believe this small use case is rare, it will break immediately,
and the fix is trivial (just initialize cur_this := 'cur_this'  in this
example),
and the fix removes the Orthogonal Behavior Tom pointed out, which led me
to reporting this.

I think I have exhausted examples of how this impacts a VALID
refcursor implementation.  I believe any other such versions are variations
of this!
And maybe we document that if a refcursor of a cursor is to be returned,
that the refcursor is ASSIGNED after the OPEN of the cursor, and it is done
without the quotes, as:
  ref_cursor := cur_this;  -- assign the name after opening.

Thanks!


Re: PL/pgSQL cursors should get generated portal names by default

2022-11-07 Thread Jan Wieck
My comments were in no way meant as an argument for or against the 
change itself. Only to clearly document the side effect it will have.



Regards, Jan


On 11/7/22 11:57, Kirk Wolak wrote:



On Mon, Nov 7, 2022 at 11:10 AM Jan Wieck > wrote:


On 11/4/22 19:46, Tom Lane wrote:
 > Jan Wieck mailto:j...@wi3ck.info>> writes:
 >> I need to do some testing on this. I seem to recall that the
naming was
 >> originally done because a reference cursor is basically a named
cursor
 >> that can be handed around between functions and even the top SQL
level
 >> of the application. For the latter to work the application needs
to know
 >> the name of the portal.
 >
 > Right.  With this patch, it'd be necessary to hand back the actual
 > portal name (by returning the refcursor value), or else manually
 > set the refcursor value before OPEN to preserve the previous
behavior.
 > But as far as I saw, all our documentation examples show handing back
 > the portal name, so I'm hoping most people do it like that already.

I was mostly concerned that we may unintentionally break
underdocumented
behavior that was originally implemented on purpose. As long as
everyone
is aware that this is breaking backwards compatibility in the way it
does, that's fine.


I respect the concern, and applied some deeper thinking to it...

Here is the logic I am applying to this compatibility issue and what may 
break.

[FWIW, my motto is to be wrong out  loud, as you learn faster]

At first pass, I thought "Well, since this does not break a refcursor, 
which is the obvious use case for RETURNING/PASSING, we are fine!"


But in trying to DEFEND this case, I have come up with example of code 
(that makes some SENSE, but would break):


CREATE FUNCTION test() RETURNS refcursor() LANGUAGE plpgsql AS  $$
DECLARE
     cur_this cursor FOR SELECT 1;
     ref_cur refcursor;
BEGIN
     OPEN cur_this;
     ref_cur := 'cur_this';  -- Using the NAME of the cursor as the 
portal name: Should do:  ref_cur := cur_this; -- Only works after OPEN

     RETURN ref_cur;
END;
$$;

As noted in the comments.  If the code were:
   ref_cur := 'cur_this';  -- Now you can't just use ref_cur := cur_this;
   OPEN cur_this;
   RETURN ref_cur;
Then it would break now...  And even the CORRECT syntax would break, 
since the cursor was not opened, so "cur_this" is null.


Now, I have NO IDEA if someone would actually do this.  It is almost 
pathological.  The use case would be a complex cursor with parameters,

and they changed the code to return a refcursor!
This was the ONLY use case I could think of that wasn't HACKY!

HACKY use cases involve a child routine setting:  local_ref_cursor := 
'cur_this'; in order to access a cursor that was NOT passed to the child.
FWIW, I tested this, and it works, and I can FETCH in the child routine, 
and it affects the parents' LOOP as it should... WOW.  I would be HAPPY

to break such horrible code, it has to be a security concern at some level.

Personally (and my 2 cents really shouldn't matter much), I think this 
should still be fixed.
Because I believe this small use case is rare, it will break 
immediately, and the fix is trivial (just initialize cur_this := 
'cur_this'  in this example),
and the fix removes the Orthogonal Behavior Tom pointed out, which led 
me to reporting this.


I think I have exhausted examples of how this impacts a VALID 
refcursor implementation.  I believe any other such versions are 
variations of this!
And maybe we document that if a refcursor of a cursor is to be returned, 
that the refcursor is ASSIGNED after the OPEN of the cursor, and it is 
done without the quotes, as:

   ref_cursor := cur_this;  -- assign the name after opening.

Thanks!








Re: PL/pgSQL cursors should get generated portal names by default

2022-11-07 Thread Pavel Stehule
Dne po 7. 11. 2022 17:10 uživatel Jan Wieck  napsal:

> On 11/4/22 19:46, Tom Lane wrote:
> > Jan Wieck  writes:
> >> I need to do some testing on this. I seem to recall that the naming was
> >> originally done because a reference cursor is basically a named cursor
> >> that can be handed around between functions and even the top SQL level
> >> of the application. For the latter to work the application needs to
> know
> >> the name of the portal.
> >
> > Right.  With this patch, it'd be necessary to hand back the actual
> > portal name (by returning the refcursor value), or else manually
> > set the refcursor value before OPEN to preserve the previous behavior.
> > But as far as I saw, all our documentation examples show handing back
> > the portal name, so I'm hoping most people do it like that already.
>
> I was mostly concerned that we may unintentionally break underdocumented
> behavior that was originally implemented on purpose. As long as everyone
> is aware that this is breaking backwards compatibility in the way it
> does, that's fine.
>

In this case I see current behaviors little bit unhappy. It breaks any
recursive call, it can break variable shadowing, so I prefer change. The
possibility of compatibility break is clean, but there is an possibility of
easy fix, and I think I can detect some possibly not compatible usage in
plpgsql_check.

The dependency on current behavior can be probably just for pretty old
application that doesn't use refcursors.

Regards

Pavel


> >
> >> I am currently down with Covid and have trouble focusing. But I hope to
> >> get to it some time next week.
> >
> > Get well soon!
>
> Thanks, Jan
>
>


Re: PL/pgSQL cursors should get generated portal names by default

2022-11-07 Thread Jan Wieck

On 11/4/22 19:46, Tom Lane wrote:

Jan Wieck  writes:
I need to do some testing on this. I seem to recall that the naming was 
originally done because a reference cursor is basically a named cursor 
that can be handed around between functions and even the top SQL level 
of the application. For the latter to work the application needs to know 
the name of the portal.


Right.  With this patch, it'd be necessary to hand back the actual
portal name (by returning the refcursor value), or else manually
set the refcursor value before OPEN to preserve the previous behavior.
But as far as I saw, all our documentation examples show handing back
the portal name, so I'm hoping most people do it like that already.


I was mostly concerned that we may unintentionally break underdocumented 
behavior that was originally implemented on purpose. As long as everyone 
is aware that this is breaking backwards compatibility in the way it 
does, that's fine.




I am currently down with Covid and have trouble focusing. But I hope to 
get to it some time next week.


Get well soon!


Thanks, Jan





Re: PL/pgSQL cursors should get generated portal names by default

2022-11-04 Thread Tom Lane
Jan Wieck  writes:
> I need to do some testing on this. I seem to recall that the naming was 
> originally done because a reference cursor is basically a named cursor 
> that can be handed around between functions and even the top SQL level 
> of the application. For the latter to work the application needs to know 
> the name of the portal.

Right.  With this patch, it'd be necessary to hand back the actual
portal name (by returning the refcursor value), or else manually
set the refcursor value before OPEN to preserve the previous behavior.
But as far as I saw, all our documentation examples show handing back
the portal name, so I'm hoping most people do it like that already.

> I am currently down with Covid and have trouble focusing. But I hope to 
> get to it some time next week.

Get well soon!

regards, tom lane




Re: PL/pgSQL cursors should get generated portal names by default

2022-11-04 Thread Jan Wieck

On 11/4/22 03:22, Pavel Stehule wrote:

Hi


st 2. 11. 2022 v 0:39 odesílatel Tom Lane > napsal:


There's a complaint at [1] about how you can't re-use the same
cursor variable name within a routine called from another routine
that's already using that name.  The complaint is itself a bit
under-documented, but I believe it is referring to this ancient
bit of behavior:

          A bound cursor variable is initialized to the string value
          representing its name, so that the portal name is the same as
          the cursor variable name, unless the programmer overrides it
          by assignment before opening the cursor.

So if you try to nest usage of two bound cursor variables of the
same name, it blows up on the portal-name conflict.  But it'll work
fine if you use unbound cursors (i.e., plain "refcursor" variables):

          But an unbound cursor
          variable defaults to the null value initially, so it will
receive
          an automatically-generated unique name, unless overridden.

I wonder why we did it like that; maybe it's to be bug-compatible with
some Oracle PL/SQL behavior or other?  Anyway, this seems non-orthogonal
and contrary to all principles of structured programming.  We don't even
offer an example of the sort of usage that would benefit from it, ie
that calling code could "just know" what the portal name is.

I propose that we should drop this auto initialization and let all
refcursor variables start out null, so that they'll get unique
portal names unless you take explicit steps to do something else.
As attached.

(Obviously this would be a HEAD-only fix, but maybe there's scope for
improving the back-branch docs along lines similar to these changes.)


I am sending an review of this patch

1. The patching, compilation without any problems
2. All tests passed
3. The implemented change is documented well
4. Although this is potencial compatibility break, we want this feature. 
It allows to use cursors variables in recursive calls by default, it 
allows shadowing of cursor variables

5. This patch is short and almost trivial, just remove code.

I'll mark this patch as ready for commit


I need to do some testing on this. I seem to recall that the naming was 
originally done because a reference cursor is basically a named cursor 
that can be handed around between functions and even the top SQL level 
of the application. For the latter to work the application needs to know 
the name of the portal.


I am currently down with Covid and have trouble focusing. But I hope to 
get to it some time next week.



Regards, Jan





Re: PL/pgSQL cursors should get generated portal names by default

2022-11-04 Thread Pavel Stehule
Hi


st 2. 11. 2022 v 0:39 odesílatel Tom Lane  napsal:

> There's a complaint at [1] about how you can't re-use the same
> cursor variable name within a routine called from another routine
> that's already using that name.  The complaint is itself a bit
> under-documented, but I believe it is referring to this ancient
> bit of behavior:
>
>  A bound cursor variable is initialized to the string value
>  representing its name, so that the portal name is the same as
>  the cursor variable name, unless the programmer overrides it
>  by assignment before opening the cursor.
>
> So if you try to nest usage of two bound cursor variables of the
> same name, it blows up on the portal-name conflict.  But it'll work
> fine if you use unbound cursors (i.e., plain "refcursor" variables):
>
>  But an unbound cursor
>  variable defaults to the null value initially, so it will receive
>  an automatically-generated unique name, unless overridden.
>
> I wonder why we did it like that; maybe it's to be bug-compatible with
> some Oracle PL/SQL behavior or other?  Anyway, this seems non-orthogonal
> and contrary to all principles of structured programming.  We don't even
> offer an example of the sort of usage that would benefit from it, ie
> that calling code could "just know" what the portal name is.
>
> I propose that we should drop this auto initialization and let all
> refcursor variables start out null, so that they'll get unique
> portal names unless you take explicit steps to do something else.
> As attached.
>
> (Obviously this would be a HEAD-only fix, but maybe there's scope for
> improving the back-branch docs along lines similar to these changes.)
>
>
I am sending an review of this patch

1. The patching, compilation without any problems
2. All tests passed
3. The implemented change is documented well
4. Although this is potencial compatibility break, we want this feature. It
allows to use cursors variables in recursive calls by default, it allows
shadowing of cursor variables
5. This patch is short and almost trivial, just remove code.

I'll mark this patch as ready for commit

Regards

Pavel



> regards, tom lane
>
> [1]
> https://www.postgresql.org/message-id/166689990972.627.16269382598283029015%40wrigleys.postgresql.org
>
>


Re: PL/pgSQL cursors should get generated portal names by default

2022-11-01 Thread Pavel Stehule
st 2. 11. 2022 v 0:39 odesílatel Tom Lane  napsal:

> There's a complaint at [1] about how you can't re-use the same
> cursor variable name within a routine called from another routine
> that's already using that name.  The complaint is itself a bit
> under-documented, but I believe it is referring to this ancient
> bit of behavior:
>
>  A bound cursor variable is initialized to the string value
>  representing its name, so that the portal name is the same as
>  the cursor variable name, unless the programmer overrides it
>  by assignment before opening the cursor.
>
> So if you try to nest usage of two bound cursor variables of the
> same name, it blows up on the portal-name conflict.  But it'll work
> fine if you use unbound cursors (i.e., plain "refcursor" variables):
>
>  But an unbound cursor
>  variable defaults to the null value initially, so it will receive
>  an automatically-generated unique name, unless overridden.
>
> I wonder why we did it like that; maybe it's to be bug-compatible with
> some Oracle PL/SQL behavior or other?  Anyway, this seems non-orthogonal
> and contrary to all principles of structured programming.  We don't even
> offer an example of the sort of usage that would benefit from it, ie
> that calling code could "just know" what the portal name is.
>
> I propose that we should drop this auto initialization and let all
> refcursor variables start out null, so that they'll get unique
> portal names unless you take explicit steps to do something else.
> As attached.
>

+1


> (Obviously this would be a HEAD-only fix, but maybe there's scope for
> improving the back-branch docs along lines similar to these changes.)
>

+1

I agree with this proposal. The current behavior breaks the nesting
concept.

Unfortunately, it can breaks back compatibility, but I think so I am
possible to detect phony usage of cursor's variables in plpgsql_check

Regards

Pavel



> regards, tom lane
>
> [1]
> https://www.postgresql.org/message-id/166689990972.627.16269382598283029015%40wrigleys.postgresql.org
>
>


PL/pgSQL cursors should get generated portal names by default

2022-11-01 Thread Tom Lane
There's a complaint at [1] about how you can't re-use the same
cursor variable name within a routine called from another routine
that's already using that name.  The complaint is itself a bit
under-documented, but I believe it is referring to this ancient
bit of behavior:

 A bound cursor variable is initialized to the string value
 representing its name, so that the portal name is the same as
 the cursor variable name, unless the programmer overrides it
 by assignment before opening the cursor.

So if you try to nest usage of two bound cursor variables of the
same name, it blows up on the portal-name conflict.  But it'll work
fine if you use unbound cursors (i.e., plain "refcursor" variables):

 But an unbound cursor
 variable defaults to the null value initially, so it will receive
 an automatically-generated unique name, unless overridden.

I wonder why we did it like that; maybe it's to be bug-compatible with
some Oracle PL/SQL behavior or other?  Anyway, this seems non-orthogonal
and contrary to all principles of structured programming.  We don't even
offer an example of the sort of usage that would benefit from it, ie
that calling code could "just know" what the portal name is.

I propose that we should drop this auto initialization and let all
refcursor variables start out null, so that they'll get unique
portal names unless you take explicit steps to do something else.
As attached.

(Obviously this would be a HEAD-only fix, but maybe there's scope for
improving the back-branch docs along lines similar to these changes.)

regards, tom lane

[1] 
https://www.postgresql.org/message-id/166689990972.627.16269382598283029015%40wrigleys.postgresql.org

diff --git a/doc/src/sgml/plpgsql.sgml b/doc/src/sgml/plpgsql.sgml
index d85f89bf30..89871f4f25 100644
--- a/doc/src/sgml/plpgsql.sgml
+++ b/doc/src/sgml/plpgsql.sgml
@@ -3177,7 +3177,9 @@ DECLARE
 
  Before a cursor can be used to retrieve rows, it must be
  opened. (This is the equivalent action to the SQL
- command DECLARE CURSOR.) PL/pgSQL has
+ command DECLARE
+ CURSOR.)
+ PL/pgSQL has
  three forms of the OPEN statement, two of which use unbound
  cursor variables while the third uses a bound cursor variable.
 
@@ -3187,9 +3189,28 @@ DECLARE
   Bound cursor variables can also be used without explicitly opening the cursor,
   via the FOR statement described in
   .
+  A FOR loop will open the cursor and then
+  close it again when the loop completes.
  
 
 
+
+ portal
+ in PL/pgSQL
+
+
+
+ Opening a cursor involves creating a server-internal data structure
+ called a portal, which holds the execution
+ state for the cursor's query.  A portal has a name, which must be
+ unique within the session for the duration of the portal's existence.
+ By default, PL/pgSQL will assign a unique
+ name to each portal it creates.  However, if you assign a non-null
+ string value to a cursor variable, that string will be used as its
+ portal name.  This feature can be used as described in
+ .
+
+
 
  OPEN FOR query
 
@@ -3338,7 +3359,7 @@ BEGIN
  opened the cursor to begin with.  You can return a refcursor
  value out of a function and let the caller operate on the cursor.
  (Internally, a refcursor value is simply the string name
- of a so-called portal containing the active query for the cursor.  This name
+ of the portal containing the active query for the cursor.  This name
  can be passed around, assigned to other refcursor variables,
  and so on, without disturbing the portal.)
 
@@ -3480,7 +3501,7 @@ CLOSE curs1;

  
 
-
+
  Returning Cursors
 

@@ -3500,7 +3521,8 @@ CLOSE curs1;
 simply assign a string to the refcursor variable before
 opening it.  The string value of the refcursor variable
 will be used by OPEN as the name of the underlying portal.
-However, if the refcursor variable is null,
+However, if the refcursor variable is null (as it
+will be by default), then
 OPEN automatically generates a name that does not
 conflict with any existing portal, and assigns it to the
 refcursor variable.
@@ -3508,12 +3530,12 @@ CLOSE curs1;
 

 
- A bound cursor variable is initialized to the string value
- representing its name, so that the portal name is the same as
- the cursor variable name, unless the programmer overrides it
- by assignment before opening the cursor.  But an unbound cursor
- variable defaults to the null value initially, so it will receive
- an automatically-generated unique name, unless overridden.
+ Prior to PostgreSQL 16, bound cursor
+ variables were initialized to contain their own names, rather
+ than being l