[HACKERS] A couple of issues with psql variable substitution

2011-08-25 Thread Tom Lane
On my way to do something else entirely, I came across a couple of
things that are not very nice about psql's lexer:

1. Somebody broke the no-backtracking property back in 9.0 while adding
quoted variable substitution.  According to the flex manual, use of
backtracking creates a performance penalty.  We once measured the
backend's lexer as being about a third faster with backtrack avoidance,
and presumably it's about the same for psql's.  This is not hard to fix,
but should I consider it a bug fix and back-patch?  We've not had
complaints about psql getting slower as of 9.0.

2. The lexer rules associated with variable substitution think that
variable names can consist only of ASCII letters and digits (and
underscores).  The psql manual is noncommittal about whether non-ASCII
characters are allowed, but a reasonable person would think that the
rules ought to be the same as the backend's idea of what an identifier
is.  Does anybody have a problem with improving that?  (I'm not
proposing this part as a bug fix, because it does look a little bit
more invasive to fix, because of the way psql deals with unsafe
multibyte encodings.)

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] A couple of issues with psql variable substitution

2011-08-25 Thread Robert Haas
On Thu, Aug 25, 2011 at 12:47 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 On my way to do something else entirely, I came across a couple of
 things that are not very nice about psql's lexer:

 1. Somebody broke the no-backtracking property back in 9.0 while adding
 quoted variable substitution.  According to the flex manual, use of
 backtracking creates a performance penalty.  We once measured the
 backend's lexer as being about a third faster with backtrack avoidance,
 and presumably it's about the same for psql's.  This is not hard to fix,
 but should I consider it a bug fix and back-patch?  We've not had
 complaints about psql getting slower as of 9.0.

That may well have been me.  How would I have known that I broke it?

Also, how invasive is the fix?

 2. The lexer rules associated with variable substitution think that
 variable names can consist only of ASCII letters and digits (and
 underscores).  The psql manual is noncommittal about whether non-ASCII
 characters are allowed, but a reasonable person would think that the
 rules ought to be the same as the backend's idea of what an identifier
 is.  Does anybody have a problem with improving that?

Nope.  Or at least, I don't.

 (I'm not
 proposing this part as a bug fix, because it does look a little bit
 more invasive to fix, because of the way psql deals with unsafe
 multibyte encodings.)

+1 for not back-patching this part.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] A couple of issues with psql variable substitution

2011-08-25 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Thu, Aug 25, 2011 at 12:47 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 1. Somebody broke the no-backtracking property back in 9.0 while adding
 quoted variable substitution.  According to the flex manual, use of
 backtracking creates a performance penalty.  We once measured the
 backend's lexer as being about a third faster with backtrack avoidance,
 and presumably it's about the same for psql's.  This is not hard to fix,
 but should I consider it a bug fix and back-patch?  We've not had
 complaints about psql getting slower as of 9.0.

 That may well have been me.

[ checks git blame ]  Well, you commmitted the patch anyway: d0cfc018.

 How would I have known that I broke it?

Per the header comments in the backend lexer, you should run flex with
-b switch and verify that the resulting lex.backup file says no
backing up.  I've occasionally thought about automating that, but I'm
not sure if the output is entirely locale- and flex-version-independent.

 Also, how invasive is the fix?

We need to add a couple more rules that will match an unterminated
quoted variable and do something reasonable (probably just throw back
everything but the colon with yyless).  I've not coded it but I think
it can't be more than a dozen lines or so.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] A couple of issues with psql variable substitution

2011-08-25 Thread Alvaro Herrera
Excerpts from Tom Lane's message of jue ago 25 14:00:57 -0300 2011:
 Robert Haas robertmh...@gmail.com writes:
  On Thu, Aug 25, 2011 at 12:47 PM, Tom Lane t...@sss.pgh.pa.us wrote:
  1. Somebody broke the no-backtracking property back in 9.0 while adding
  quoted variable substitution.  According to the flex manual, use of
  backtracking creates a performance penalty.  We once measured the
  backend's lexer as being about a third faster with backtrack avoidance,
  and presumably it's about the same for psql's.  This is not hard to fix,
  but should I consider it a bug fix and back-patch?  We've not had
  complaints about psql getting slower as of 9.0.
 
  That may well have been me.
 
 [ checks git blame ]  Well, you commmitted the patch anyway: d0cfc018.
 
  How would I have known that I broke it?
 
 Per the header comments in the backend lexer, you should run flex with
 -b switch and verify that the resulting lex.backup file says no
 backing up.  I've occasionally thought about automating that, but I'm
 not sure if the output is entirely locale- and flex-version-independent.

It is locale dependent, though of course for the automated check you
could just run flex under LC_ALL=C.

$ /usr/bin/flex -Cfe -b /pgsql/source/REL8_4_STABLE/src/bin/psql/psqlscan.l
$ cat lex.backup 
Sin retroceso.
$ LC_ALL=C /usr/bin/flex -Cfe -b 
/pgsql/source/REL8_4_STABLE/src/bin/psql/psqlscan.l
$ cat lex.backup 
No backing up.



-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] A couple of issues with psql variable substitution

2011-08-25 Thread Andrew Dunstan



On 08/25/2011 01:16 PM, Alvaro Herrera wrote:

Excerpts from Tom Lane's message of jue ago 25 14:00:57 -0300 2011:

Robert Haasrobertmh...@gmail.com  writes:

On Thu, Aug 25, 2011 at 12:47 PM, Tom Lanet...@sss.pgh.pa.us  wrote:

1. Somebody broke the no-backtracking property back in 9.0 while adding
quoted variable substitution.  According to the flex manual, use of
backtracking creates a performance penalty.  We once measured the
backend's lexer as being about a third faster with backtrack avoidance,
and presumably it's about the same for psql's.  This is not hard to fix,
but should I consider it a bug fix and back-patch?  We've not had
complaints about psql getting slower as of 9.0.

That may well have been me.

[ checks git blame ]  Well, you commmitted the patch anyway: d0cfc018.


How would I have known that I broke it?

Per the header comments in the backend lexer, you should run flex with
-b switch and verify that the resulting lex.backup file says no
backing up.  I've occasionally thought about automating that, but I'm
not sure if the output is entirely locale- and flex-version-independent.

It is locale dependent, though of course for the automated check you
could just run flex under LC_ALL=C.

$ /usr/bin/flex -Cfe -b /pgsql/source/REL8_4_STABLE/src/bin/psql/psqlscan.l
$ cat lex.backup
Sin retroceso.
$ LC_ALL=C /usr/bin/flex -Cfe -b 
/pgsql/source/REL8_4_STABLE/src/bin/psql/psqlscan.l
$ cat lex.backup
No backing up.



We could just add -b unconditionally to the flex flags and then count 
the number of lines in lex.backup. If it's greater that 1 whine loudly, 
or even fail, otherwise remove lex.backup. Would that avoid locale 
dependencies?


cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] A couple of issues with psql variable substitution

2011-08-25 Thread Tom Lane
Andrew Dunstan and...@dunslane.net writes:
 We could just add -b unconditionally to the flex flags and then count 
 the number of lines in lex.backup. If it's greater that 1 whine loudly, 
 or even fail, otherwise remove lex.backup. Would that avoid locale 
 dependencies?

Hm, yeah, seems like that ought to work.

I'm tempted to add -p -p also, even though that only results in some
whinging on stderr.  It would still probably get noticed by anyone who
was changing the lexer.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] A couple of issues with psql variable substitution

2011-08-25 Thread Tom Lane
I wrote:
 Andrew Dunstan and...@dunslane.net writes:
 We could just add -b unconditionally to the flex flags and then count 
 the number of lines in lex.backup. If it's greater that 1 whine loudly, 
 or even fail, otherwise remove lex.backup. Would that avoid locale 
 dependencies?

 Hm, yeah, seems like that ought to work.

Done in HEAD, but only for the Makefile-based build mechanism.  Anybody
want to add the comparable logic to the MSVC scripts?

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] A couple of issues with psql variable substitution

2011-08-25 Thread Tom Lane
While I'm looking at this ... the current implementation has got a
number of very inconsistent behaviors with respect to when it will
expand a variable reference within a psql meta-command argument.
Observe:

regression=# \set foo 'value of foo'
regression=# \set bar 'value of bar'
regression=# \echo :foo
value of foo
regression=# \echo :foo@bar
value of foo @bar

(there shouldn't be a space before the @, IMO --- there is because this
gets treated as two separate arguments, which seems bizarre)

regression=# \echo :foo:bar
value of foo value of bar

(again, why is this two arguments not one?)

regression=# \echo :foo@:bar
value of foo @:bar

(why isn't :bar expanded here, when it is in the previous case?)

regression=# \echo foo:foo@:bar
foo:foo@:bar

(and now neither one gets expanded)

ISTM the general rule ought to be that we attempt to substitute for a
colon-construct regardless of where it appears within an argument, as
long as it's not within quotes.

Thoughts?

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] A couple of issues with psql variable substitution

2011-08-25 Thread Robert Haas
On Thu, Aug 25, 2011 at 5:12 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 While I'm looking at this ... the current implementation has got a
 number of very inconsistent behaviors with respect to when it will
 expand a variable reference within a psql meta-command argument.
 Observe:

 regression=# \set foo 'value of foo'
 regression=# \set bar 'value of bar'
 regression=# \echo :foo
 value of foo
 regression=# \echo :foo@bar
 value of foo @bar

 (there shouldn't be a space before the @, IMO --- there is because this
 gets treated as two separate arguments, which seems bizarre)

 regression=# \echo :foo:bar
 value of foo value of bar

 (again, why is this two arguments not one?)

 regression=# \echo :foo@:bar
 value of foo @:bar

 (why isn't :bar expanded here, when it is in the previous case?)

 regression=# \echo foo:foo@:bar
 foo:foo@:bar

 (and now neither one gets expanded)

 ISTM the general rule ought to be that we attempt to substitute for a
 colon-construct regardless of where it appears within an argument, as
 long as it's not within quotes.

 Thoughts?

My main thought is that I remember this code being pretty awful -
especially with respect to error handling - when I looked at it.  A
lot of dubious behaviors were more or less compelled by the
impossibility of bailing out at an arbitrary point a la ereport().  At
least, barring a drastic refactoring of the code, which might not be a
bad idea either.

No objection if you want to clean some of it up, but you may find it's
a larger sinkhole than you anticipate.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] A couple of issues with psql variable substitution

2011-08-25 Thread Andrew Dunstan



On 08/25/2011 02:45 PM, Tom Lane wrote:

I wrote:

Andrew Dunstanand...@dunslane.net  writes:

We could just add -b unconditionally to the flex flags and then count
the number of lines in lex.backup. If it's greater that 1 whine loudly,
or even fail, otherwise remove lex.backup. Would that avoid locale
dependencies?

Hm, yeah, seems like that ought to work.

Done in HEAD, but only for the Makefile-based build mechanism.  Anybody
want to add the comparable logic to the MSVC scripts?




Done.

cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] A couple of issues with psql variable substitution

2011-08-25 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Thu, Aug 25, 2011 at 5:12 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 ISTM the general rule ought to be that we attempt to substitute for a
 colon-construct regardless of where it appears within an argument, as
 long as it's not within quotes.

 My main thought is that I remember this code being pretty awful -
 especially with respect to error handling - when I looked at it.  A
 lot of dubious behaviors were more or less compelled by the
 impossibility of bailing out at an arbitrary point a la ereport().  At
 least, barring a drastic refactoring of the code, which might not be a
 bad idea either.

What I had in mind to do was just to rearrange the flex rules --- the
issues that I called out have to do with dubious choices about when to
transition between different lexer states.

I agree that the error handling isn't terribly friendly in unexpected
cases like there not being a connection available to determine the
literal-quoting rules, but that's not what I'm on about here.  I'm
just after consistent variable-expansion behavior in normal operation.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers