Re: [PATCHES] [HACKERS] dollar quoting

Andrew Dunstan Sat, 14 Feb 2004 12:07:23 -0800

Tom Lane wrote:

> Andrew Dunstan <[EMAIL PROTECTED]> writes:
>
>> I ended up not using a regex, which seemed to be a little heavy-handed,
>> but just writing a small custom recognition function, that should (and
>> I think does) mimic the pattern recognition for these tokens used by
>> the backend lexer.
>
> I looked at this and realized that it still doesn't do very well at
> distinguishing $foo$ from other random uses of $.  The problem is that
> looking back at just the immediately preceding character isn't enough
> context to tell whether a $ is part of an identifier.  Consider the
> input
>         a42$foo$
> This is a legal identifier according to PG 7.4.  But how about
>         42$foo$
> This is a syntax error in 7.4, and we propose to redefine it as an
> integer literal '42' followed by a dollar-quote start symbol.

The test in the patch I sent is this:


           else if (!dol_quote && valid_dolquote(line+i) &&
                    (i == 0 ||
                     ! ((line[i-prevlen] & 0x80) != 0 ||
                        isalnum(line[i-prevlen]) ||
                        line[i-prevlen] == '_' ||
                        line[i-prevlen] == '$' )))

The test should not succeed anywhere in the string '42$foo$'.
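
To make that concrete, here is a rough standalone sketch of the same lookback test. It is not the patch itself: valid_dolquote() below is only my approximation of the backend's delimiter pattern ($, an optional tag starting with a letter, underscore or high-bit byte and continuing with those or digits, then $), first_dolquote_start() is a hypothetical wrapper invented for the illustration, and prevlen is fixed at 1 on the assumption of a single-byte encoding.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Assumed tag rules: first char is a letter, '_' or a high-bit byte;
 * later chars may also be digits. */
static bool is_dolq_start(unsigned char c)
{
    return isalpha(c) || c == '_' || (c & 0x80) != 0;
}

static bool is_dolq_cont(unsigned char c)
{
    return is_dolq_start(c) || isdigit(c);
}

/* Hypothetical stand-in for the patch's valid_dolquote():
 * true if s begins with "$$" or "$tag$". */
static bool valid_dolquote(const char *s)
{
    if (*s != '$')
        return false;
    s++;
    if (*s == '$')
        return true;            /* empty tag */
    if (!is_dolq_start((unsigned char) *s))
        return false;
    s++;
    while (is_dolq_cont((unsigned char) *s))
        s++;
    return *s == '$';
}

/* Index of the first position where the lookback test succeeds,
 * or -1 if it succeeds nowhere in the line. */
static int first_dolquote_start(const char *line)
{
    const size_t prevlen = 1;   /* assumption: single-byte encoding */
    size_t len;
    size_t i;

    assert(line != NULL);
    len = strlen(line);
    for (i = 0; i < len; i++)
    {
        if (valid_dolquote(line + i) &&
            (i == 0 ||
             !((line[i - prevlen] & 0x80) != 0 ||
               isalnum((unsigned char) line[i - prevlen]) ||
               line[i - prevlen] == '_' ||
               line[i - prevlen] == '$')))
            return (int) i;
    }
    return -1;
}
```

With this sketch, first_dolquote_start("42$foo$") and first_dolquote_start("a42$foo$") both return -1, because the character before the first $ is alphanumeric and the trailing lone $ is not a valid delimiter on its own, while first_dolquote_start("select $foo$") returns 7.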

Note that psql does not change any '$foo$' at all - it just passes it to the backend. The reason we need this at all in psql is that it has to detect the end of a statement, and it has to prompt correctly, and to do that it needs to know if we are in a quote (single, double, dollar) or a comment.

psql does not detect many syntax errors, or even lexical errors - that is the job of the backend - rightly so, I believe.
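
As an illustration of why psql needs to know whether it is inside a quote, here is a minimal sketch of line-at-a-time state tracking. It is a simplification and not psql's actual code: ScanState, dolquote_len() and scan_line() are names made up for this example, it handles only dollar quotes (no single or double quotes, no comments), it treats only ASCII letters, digits and underscore as tag characters, and it leaves out the lookback test discussed above.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_TAG 64

/* Hypothetical per-session state carried from one input line to the next. */
typedef struct
{
    bool in_dolquote;
    char tag[MAX_TAG];          /* full delimiter, e.g. "$foo$" */
} ScanState;

/* Length of a dollar-quote delimiter starting at s, or 0 if there is none. */
static size_t dolquote_len(const char *s)
{
    size_t n = 1;

    if (s[0] != '$')
        return 0;
    if (s[1] != '$')
    {
        if (!(isalpha((unsigned char) s[1]) || s[1] == '_'))
            return 0;
        n = 2;
        while (isalnum((unsigned char) s[n]) || s[n] == '_')
            n++;
        if (s[n] != '$')
            return 0;
    }
    return n + 1;               /* include the closing '$' */
}

/* Scan one line, updating st. Returns true if a ';' was seen outside
 * any dollar quote, i.e. the statement could be sent to the backend. */
static bool scan_line(ScanState *st, const char *line)
{
    bool complete = false;
    size_t i;

    assert(st != NULL && line != NULL);
    for (i = 0; line[i] != '\0'; i++)
    {
        if (st->in_dolquote)
        {
            size_t tlen = strlen(st->tag);

            /* Only the exact opening delimiter closes the quote. */
            if (strncmp(line + i, st->tag, tlen) == 0)
            {
                st->in_dolquote = false;
                i += tlen - 1;
            }
        }
        else
        {
            size_t n = dolquote_len(line + i);

            if (n > 0 && n < MAX_TAG)
            {
                memcpy(st->tag, line + i, n);
                st->tag[n] = '\0';
                st->in_dolquote = true;
                i += n - 1;
            }
            else if (line[i] == ';')
                complete = true;
        }
    }
    return complete;
}
```

Feeding a function definition line by line, the ';' inside the $body$ ... $body$ section is ignored and in_dolquote stays set, which is what would drive the continuation prompt; only the ';' after the closing delimiter ends the statement.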


> There's no way to tell these apart with a single-character lookback,
> or indeed any fixed number of characters of lookback.

I'm still not convinced, although maybe there's something I'm not getting.


> I begin to think that we'll really have to bite the bullet and convert
> psql's input parser to use flex.  If we're not scanning with exactly the
> same rules as the backend uses, we're going to get the wrong answers.

Interacting with lexer states would probably be ... unpleasant. Matching a stream-oriented lexer with a line-oriented CLI would be messy, I suspect.

cheers

andrew

