Re: [PATCHES] WIP: pl/pgsql cleanup

2005-02-22 Thread Neil Conway
Neil Conway wrote:
Attached is a revised patch.
Applied to HEAD.
-Neil
---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [PATCHES] WIP: pl/pgsql cleanup

2005-02-20 Thread Neil Conway
Tom Lane wrote:
Still a few bricks shy of a load I fear [...]
My apologies; thanks for the code review.
regression=# create or replace function foo() returns int language plpgsql as $$
regression$# begin
regression$#   return ;
regression$# end$$;
CREATE FUNCTION
regression=# select foo();
server closed the connection unexpectedly
If you're going to stick a NULL into the return's expression then you'd
better check for same when it's used.
Right. Better I think is to only allow a missing RETURN expression for 
functions declared to return void anyway; that catches the mistake at 
compile-time. I've implemented this behavior in the revised patch. The 
error message in this scenario is still a little poor:

create function missing_return_expr() returns int as $$
begin
return ;
end;$$ language plpgsql;
ERROR:  syntax error at end of input at character 8
QUERY:  SELECT
CONTEXT:  SQL statement in PL/PgSQL function missing_return_expr near 
line 2
LINE 1: SELECT
   ^

Is it worth putting in a special-case to look for an unexpected ';' in 
either the RETURN production or read_sql_construct() ?

What I was actually intending to check when I ran into that is why some
of the error paths in your FOR-loop rewrite use
plpgsql_error_lineno = $1;
while others have
plpgsql_error_lineno = plpgsql_scanner_lineno();
I suspect the former is more appropriate, at least for errors that
reference the FOR variable and not some other part of the statement.
Also why the inconsistent use of yyerror() vs ereport()?
Sorry, both of these result from sloppiness on my part: I think the code 
was originally correct, but I tried to refactor some of the more complex 
parts of the FOR statement production into a separate function, and 
eventually wasn't able to (because of the `forvariable' union member). 
When I gave up and reverted the code, I obviously wasn't careful enough.

Attached is a revised patch.
-Neil


plpgsql_cleanup-34.patch.gz
Description: application/gzip

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [PATCHES] WIP: pl/pgsql cleanup

2005-02-10 Thread Tom Lane
Neil Conway [EMAIL PROTECTED] writes:
 On Wed, 2005-02-09 at 23:57 -0500, Tom Lane wrote:
 That seems like a step backwards from the current behavior [...]

 Hmm, fair enough. Is this better?

Yeah, looks better, though I question the use of embedded in the
message --- seems a bit jargony.  Are you going to post a revised patch?

regards, tom lane

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [PATCHES] WIP: pl/pgsql cleanup

2005-02-10 Thread Neil Conway
On Thu, 2005-02-10 at 10:15 -0500, Tom Lane wrote:
 Yeah, looks better, though I question the use of embedded in the
 message --- seems a bit jargony.  Are you going to post a revised patch?

Actually the code to present error messages as in the previous message
was in the previous patch, just #if 0'd out :)

Attached is a revised patch that removes embedded and updates the
regression tests. I'll apply this to HEAD later today barring any
further suggestions for improvement.

-Neil



plpgsql_cleanup-27.patch.gz
Description: GNU Zip compressed data

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [PATCHES] WIP: pl/pgsql cleanup

2005-02-10 Thread Tom Lane
Neil Conway [EMAIL PROTECTED] writes:
 Attached is a revised patch that removes embedded and updates the
 regression tests. I'll apply this to HEAD later today barring any
 further suggestions for improvement.

You've broken the FOR syntax.  You may not assume that the first token
of a FOR-over-SELECT is literally SELECT; it could for example be a left
parenthesis starting some kind of UNION construct.  This is why it's so
hard to do it right, and why the old way was so messy.  (As of CVS tip
it also works to do something like
for x in explain analyze select ...
I will grant that that didn't work before today, but it wasn't plpgsql's
fault that it didn't.)

I suggest you go back to the old parsing code for FOR.

I think it's a bad idea to entirely override the error context stack as
you do in check_sql_expr().  I can see the case for removing the
immediately previous entry, if you're sure it is
plpgsql_compile_error_callback(), but that doesn't translate to it being
a good idea to knock out any surrounding levels of context.

(Also I thought you were going to reword the embedded message?)

The head comment added to do_compile could stand some copy-editing :-(

Otherwise it looks pretty good...

regards, tom lane

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [PATCHES] WIP: pl/pgsql cleanup

2005-02-10 Thread Neil Conway
On Thu, 2005-02-10 at 20:48 -0500, Tom Lane wrote:
 You've broken the FOR syntax.  You may not assume that the first token
 of a FOR-over-SELECT is literally SELECT; it could for example be a left
 parenthesis starting some kind of UNION construct.

Yeah, I was wondering if this would break anything, I just forgot to ask
about it :( Looking for two periods is pretty ugly. I was thinking we
might be able to look at the for loop variable: if it was previously
undeclared, it must be an integer for loop. If it was declared but is
not of a row or record type, it must also be an integer for loop. But
this is still ambiguous in the case of a declared row or record type --
it could either be a SELECT loop or an integer loop, and in the latter
case the loop variable would shadow the variable in the enclosing
block :(

Unless we can figure out a better way to do this, I'll just revert to
the old kludge.

 I think it's a bad idea to entirely override the error context stack as
 you do in check_sql_expr().  I can see the case for removing the
 immediately previous entry, if you're sure it is
 plpgsql_compile_error_callback(), but that doesn't translate to it being
 a good idea to knock out any surrounding levels of context.

Yes, that's a good point. I'll change the patch to just elide the
previous entry from the stack of callbacks, which is going to be
plpgsql_compile_error_callback (unfortunately we can't actually verify
that, AFAICS, since that callback is private to pl_comp.c)

-Neil



---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PATCHES] WIP: pl/pgsql cleanup

2005-02-10 Thread Tom Lane
Neil Conway [EMAIL PROTECTED] writes:
 ... Looking for two periods is pretty ugly. I was thinking we
 might be able to look at the for loop variable: if it was previously
 undeclared, it must be an integer for loop. If it was declared but is
 not of a row or record type, it must also be an integer for loop.

Congratulations, you just reinvented the scheme we used before 8.0.
It's *not* an improvement.  The dot-dot business is better.  At least,
I'm not going to hold still for reverting this logic when there have
so far been zero field complaints about it, and there were plenty of
complaints about the test based on variable datatype.

 Yes, that's a good point. I'll change the patch to just elide the
 previous entry from the stack of callbacks, which is going to be
 plpgsql_compile_error_callback (unfortunately we can't actually verify
 that, AFAICS, since that callback is private to pl_comp.c)

IMHO verifying that is well worth an extern.

regards, tom lane

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [PATCHES] WIP: pl/pgsql cleanup

2005-02-09 Thread Neil Conway
Another version of the patch is attached. Changes:

- clean up grammar so that read_sql_construct() is always called to read
a well-formed SQL expression. That way we can check for errors in
embedded SQL by just adding a single function call to
read_sql_construct(), rather than adding calls at most of its call
sites.

- changed location of array overflow checks per Tom

- mostly fixed error message formatting for syntax errors in PL/PgSQL. I
found this part of the ereport() framework rather confusing. The patch
currently emits errors like:

create function bad_sql1() returns int as $$
declare a int;
begin
a := 5;
Johnny Yuma;
a := 10;
return a;
end$$ language 'plpgsql';
ERROR:  syntax error at or near Johnny
CONTEXT:  SQL statement embedded in PL/PgSQL function bad_sql1 near
line 4

Any suggestions for improvement would be welcome.

Barring any objections, I'd like to apply this patch to HEAD tomorrow.

-Neil



plpgsql_cleanup-26.patch.gz
Description: GNU Zip compressed data

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [PATCHES] WIP: pl/pgsql cleanup

2005-02-09 Thread Tom Lane
Neil Conway [EMAIL PROTECTED] writes:
 create function bad_sql1() returns int as $$
 declare a int;
 begin
 a := 5;
 Johnny Yuma;
 a := 10;
 return a;
 end$$ language 'plpgsql';
 ERROR:  syntax error at or near Johnny
 CONTEXT:  SQL statement embedded in PL/PgSQL function bad_sql1 near
 line 4

That seems like a step backwards from the current behavior:

regression=# create function bad_sql1() returns int as $$
regression$# declare a int;
regression$# begin
regression$# a := 5;
regression$# Johnny Yuma;
regression$# a := 10;
regression$# return a;
regression$# end$$ language 'plpgsql';
CREATE FUNCTION
regression=# select bad_sql1();
ERROR:  syntax error at or near Johnny at character 1
QUERY:  Johnny Yuma
CONTEXT:  PL/pgSQL function bad_sql1 line 4 at SQL statement
LINE 1: Johnny Yuma
^
regression=# 

While the difference in information content may not seem great, it is a
big deal when you are talking about a small syntax error in a large SQL
command spanning many lines.

regards, tom lane

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [PATCHES] WIP: pl/pgsql cleanup

2005-02-09 Thread Neil Conway
On Wed, 2005-02-09 at 23:57 -0500, Tom Lane wrote:
 That seems like a step backwards from the current behavior [...]

Hmm, fair enough. Is this better?

create function bad_sql1() returns int as $$
declare a int;
begin
a := 5;
Johnny Yuma;
a := 10;
return a;
end$$ language 'plpgsql';
ERROR:  syntax error at or near Johnny at character 1
QUERY:  Johnny Yuma
CONTEXT:  SQL statement embedded in PL/PgSQL function bad_sql1 near
line 4
LINE 1: Johnny Yuma
^

-Neil



---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [PATCHES] WIP: pl/pgsql cleanup

2005-02-08 Thread Tom Lane
Neil Conway [EMAIL PROTECTED] writes:
 BTW, both of our fixes suffer from the deficiency that they will
 actually reject input one variable too early: we disallow a SQL
 statement with 1023 variables that we strictly speaking could store.

Right.  I thought about putting the overflow checks inside the switches
so that they wouldn't trigger on the case where we don't need another
variable ... but it doesn't seem worth the extra code to me either.

regards, tom lane

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [PATCHES] WIP: pl/pgsql cleanup

2005-02-08 Thread Neil Conway
On Tue, 2005-02-08 at 13:25 -0500, Tom Lane wrote:
 When you apply this, please put the remaining array-overflow checks
 before the overflow occurs, not after.

Actually, my original fix _does_ check for the overflow before it
occurs. ISTM both fixes are essentially identical, although yours may be
preferable as it is slightly less confusing.

BTW, both of our fixes suffer from the deficiency that they will
actually reject input one variable too early: we disallow a SQL
statement with 1023 variables that we strictly speaking could store. But
I don't see that as a major problem.

-Neil



---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [PATCHES] WIP: pl/pgsql cleanup

2005-02-07 Thread Tom Lane
Neil Conway [EMAIL PROTECTED] writes:
 - abandonded the falloc() idea. There really aren't that many
 short-lived allocations in the PL/PgSQL compiler, and using falloc()
 made it difficult to use List. Instead, make the CurrentMemoryContext
 the long-lived function context, and explicitly pfree short-term
 allocations. Not _all_ short-lived allocations are explicitly released;
 if this turns out to be a problem, it can be cleaned up later.

My recollection is that I was not nearly as worried about short-term
pallocs in the plpgsql code itself, as about leakage in various main-
backend routines that get called incidentally during parsing.
backend/parser/ is quite cavalier about this as a whole, because it
expects to be called in contexts that are not long-lived.  Have you
done any checking to ensure that the per-function context doesn't get
unreasonably bloated this way?

regards, tom lane

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [PATCHES] WIP: pl/pgsql cleanup

2005-02-07 Thread Neil Conway
On Mon, 2005-02-07 at 10:41 -0500, Tom Lane wrote:
 My recollection is that I was not nearly as worried about short-term
 pallocs in the plpgsql code itself, as about leakage in various main-
 backend routines that get called incidentally during parsing.
 backend/parser/ is quite cavalier about this as a whole, because it
 expects to be called in contexts that are not long-lived.

Hmmm. What about switching the CurrentMemoryContext to the function's
long-lived context when we invoke plpgsql_yyparse(), but keeping it as
the short-lived context during the rest of the compilation process? This
unfortunately complicates the memory management model still further, but
it should significantly reduce the chance of any memory leaks.

-Neil



---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [PATCHES] WIP: pl/pgsql cleanup

2005-02-07 Thread Tom Lane
Neil Conway [EMAIL PROTECTED] writes:
 On Mon, 2005-02-07 at 10:41 -0500, Tom Lane wrote:
 My recollection is that I was not nearly as worried about short-term
 pallocs in the plpgsql code itself, as about leakage in various main-
 backend routines that get called incidentally during parsing.

 Hmmm. What about switching the CurrentMemoryContext to the function's
 long-lived context when we invoke plpgsql_yyparse(), but keeping it as
 the short-lived context during the rest of the compilation process?

Doesn't plpgsql_yyparse cover pretty much all of it?  IIRC, a lot of
code in pl_comp.c is called from either the lexing or parsing code.
It's not nearly as well compartmentalized as in the main SQL parser :-(

What might work is to run in the function context as the basic state,
but switch to a short-term context when calling any main-backend code
from pl_comp.c.  I'm not sure how many such calls there are, but if it's
not more than a dozen or two then this wouldn't be horridly painful.
(he says, knowing *he* doesn't have to do it...)

regards, tom lane

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [PATCHES] WIP: pl/pgsql cleanup

2005-02-07 Thread Tom Lane
Neil Conway [EMAIL PROTECTED] writes:
 WRT calls to backend/parser, I can see LookupTypeName (pl_comp.c:1060),
 and parseTypeString (pl_comp.c:1627). Are these the only calls you had
 in mind, or am I missing some?

I haven't looked lately, but my recollection is that there are just a
few calls into the main backend from pl_comp.c.  I'm not sure they are
all to backend/parser though; check /catalog, etc as well.

regards, tom lane

---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [PATCHES] WIP: pl/pgsql cleanup

2005-01-20 Thread Tom Lane
Neil Conway [EMAIL PROTECTED] writes:
 On Tue, 2005-01-18 at 01:02 -0500, Tom Lane wrote:
 It might be better to keep CurrentMemoryContext pointing at a temp
 context, and translate malloc() to MemoryContextAlloc(function_context)
 rather than just palloc().  (Of course you could hide this in a macro,
 maybe falloc()?)

 Are there really enough short-lived pallocs that this is worth the
 trouble?

Not sure, but it seems like at least as straightforward a translation
as the other way.  More to the point, it makes clear the difference
between what is meant to be a long-lived data structure and what isn't.

 One potential issue is that there are plenty of places where
 we'd want to falloc(), but don't have the function easily available
 (e.g. in the parser).

Why not?  You'd need to keep the context-to-use in a static variable,
but that's no great difficulty considering that plpgsql function
parsing needn't be re-entrant.

regards, tom lane

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])


Re: [PATCHES] WIP: pl/pgsql cleanup

2005-01-20 Thread Neil Conway
On Thu, 2005-01-20 at 15:48 +1100, Neil Conway wrote:
 Attached is a revised patch (no major changes, just grammar cleanup).
 While rewriting some cruft in PL/PgSQL's gram.y, I noticed that we will
 overflow a heap-allocated array if the user specifies more than 1024
 parameters to a refcursor (a demonstration .sql file is also attached).
 I think this is probably worth backpatching to 8.0 (and earlier?) --
 I've attached a quick hack of a patch that fixes the issue, although of
 course the fix for 8.1 is nicer.

I've applied the minimal fix for the buffer overrun to REL7_4_STABLE and
REL8_0_STABLE (let me know if you think it's worth backpatching
further).

I'll post a revised patch for HEAD that does the falloc() stuff soon.

-Neil



---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [PATCHES] WIP: pl/pgsql cleanup

2005-01-17 Thread Tom Lane
Neil Conway [EMAIL PROTECTED] writes:
 This patch makes a number of cleanups to PL/PgSQL:

 - replaced all uses of malloc/strdup with palloc/pstrdup.
 ...
 (This was surprisingly easy, btw, so I am suspect that I've missed
 something fundamental -- hence the patch is marked WIP. Guidance would
 be welcome.)

I think that the existing parsing code feels free to palloc junk data
that needn't be kept around as part of the finished function definition.
It might be better to keep CurrentMemoryContext pointing at a temp
context, and translate malloc() to MemoryContextAlloc(function_context)
rather than just palloc().  (Of course you could hide this in a macro,
maybe falloc()?)

Other than that consideration, I never thought it would be complex to do
this, just tedious.  As long as you're sure you caught everyplace that
needs to change ...

I don't have time to study the diff now, but your summary sounds good
on the other points.

regards, tom lane

---(end of broadcast)---
TIP 8: explain analyze is your friend