Tom Lane wrote:

Andrew Dunstan <[EMAIL PROTECTED]> writes:


I ended up not using a regex, which seemed a little heavy-handed, and instead wrote a small custom recognition function that should (and I think does) mimic the pattern recognition the backend lexer uses for these tokens.



I looked at this and realized that it still doesn't do very well at distinguishing $foo$ from other random uses of $. The problem is that looking back at just the immediately preceding character isn't enough context to tell whether a $ is part of an identifier. Consider the input

    a42$foo$

This is a legal identifier according to PG 7.4. But how about

    42$foo$

This is a syntax error in 7.4, and we propose to redefine it as an integer literal '42' followed by a dollar-quote start symbol.


The test in the patch I sent is this:



    else if (!dol_quote && valid_dolquote(line + i) &&
             (i == 0 ||
              !((line[i - prevlen] & 0x80) != 0 ||
                isalnum(line[i - prevlen]) ||
                line[i - prevlen] == '_' ||
                line[i - prevlen] == '$')))


The test should not succeed anywhere in the string '42$foo$'.
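To make that claim easy to check, here is a minimal standalone sketch of the same test. It is not the patch itself: this valid_dolquote() is a hypothetical stand-in that I assume accepts "$$" or "$tag$" where the tag starts with a letter, underscore, or high-bit byte; prev_is_ident_char() and find_dolquote_start() are names invented for illustration; and prevlen is fixed at 1, i.e. a single-byte encoding is assumed.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the patch's valid_dolquote(): true if s begins
 * with a complete delimiter such as "$$" or "$foo$". */
static bool valid_dolquote(const char *s)
{
    const unsigned char *p = (const unsigned char *) s;

    if (*p != '$')
        return false;
    p++;
    if (*p == '$')
        return true;            /* empty tag: "$$" */
    if (!(isalpha(*p) || *p == '_' || (*p & 0x80)))
        return false;           /* a tag may not start with a digit */
    while (isalnum(*p) || *p == '_' || (*p & 0x80))
        p++;
    return *p == '$';
}

/* The single-character lookback from the quoted test: true if the character
 * before position i could be part of an identifier. */
static bool prev_is_ident_char(const char *line, size_t i, size_t prevlen)
{
    unsigned char c = (unsigned char) line[i - prevlen];

    return (c & 0x80) != 0 || isalnum(c) || c == '_' || c == '$';
}

/* Index of the first position where the test succeeds, or -1 if none.
 * Assumes single-byte characters, so prevlen is always 1. */
static long find_dolquote_start(const char *line)
{
    assert(line != NULL);
    size_t len = strlen(line);

    for (size_t i = 0; i < len; i++)
    {
        if (valid_dolquote(line + i) &&
            (i == 0 || !prev_is_ident_char(line, i, 1)))
            return (long) i;
    }
    return -1;
}
```

Under these assumptions find_dolquote_start("42$foo$") returns -1, and so does find_dolquote_start("a42$foo$"), while "SELECT $foo$" is recognized at index 7. In other words, the lookback treats the two disputed inputs identically.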


Note that psql does not change any '$foo$' at all - it just passes it to the backend. The reason we need this at all in psql is that it has to detect the end of a statement, and it has to prompt correctly, and to do that it needs to know if we are in a quote (single, double, dollar) or a comment.

psql does not detect many syntax errors, or even lexical errors - that is the job of the backend - rightly so, I believe.
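As a rough illustration of what psql needs this for, the sketch below tracks only the quoting state and reports whether an unquoted semicolon has been seen. It is not psql's actual code: QuoteState, dolquote_len(), is_ident_char() and statement_complete() are hypothetical names, comments and backslash escapes are deliberately ignored, and single-byte characters are assumed.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical scanner state: which kind of quote, if any, we are inside. */
typedef enum { ST_NONE, ST_SINGLE, ST_DOUBLE, ST_DOLLAR } QuoteState;

/* Length of a dollar-quote delimiter at s ("$$" or "$tag$"), or 0 if none. */
static size_t dolquote_len(const char *s)
{
    const unsigned char *p = (const unsigned char *) s;

    if (*p != '$')
        return 0;
    p++;
    if (*p != '$')
    {
        if (!(isalpha(*p) || *p == '_' || (*p & 0x80)))
            return 0;
        while (isalnum(*p) || *p == '_' || (*p & 0x80))
            p++;
        if (*p != '$')
            return 0;
    }
    return (size_t) (p - (const unsigned char *) s) + 1;
}

/* True if c could be part of an identifier (the one-character lookback). */
static bool is_ident_char(unsigned char c)
{
    return (c & 0x80) != 0 || isalnum(c) || c == '_' || c == '$';
}

/* True once a ';' is seen outside single, double and dollar quotes. */
static bool statement_complete(const char *buf)
{
    QuoteState  st = ST_NONE;
    const char *tag = NULL;     /* opening dollar-quote delimiter */
    size_t      taglen = 0;

    for (size_t i = 0; buf[i] != '\0'; i++)
    {
        char c = buf[i];

        switch (st)
        {
            case ST_SINGLE:
                if (c == '\'')
                    st = ST_NONE;
                break;
            case ST_DOUBLE:
                if (c == '"')
                    st = ST_NONE;
                break;
            case ST_DOLLAR:
                /* only the identical delimiter closes the quote */
                if (c == '$' && strncmp(buf + i, tag, taglen) == 0)
                {
                    st = ST_NONE;
                    i += taglen - 1;
                }
                break;
            case ST_NONE:
                if (c == ';')
                    return true;
                if (c == '\'')
                    st = ST_SINGLE;
                else if (c == '"')
                    st = ST_DOUBLE;
                else if (c == '$' &&
                         (i == 0 || !is_ident_char((unsigned char) buf[i - 1])) &&
                         (taglen = dolquote_len(buf + i)) > 0)
                {
                    tag = buf + i;
                    st = ST_DOLLAR;
                    i += taglen - 1;
                }
                break;
        }
    }
    return false;
}
```

With this sketch, statement_complete("SELECT $foo$ a;b ") is false because the semicolon is inside an open dollar quote, so the caller would keep prompting for more input; once the closing $foo$ and a trailing semicolon arrive, it returns true. Nothing here validates the SQL, which stays the backend's job.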


There's no way to tell these apart with a single-character lookback, or indeed any fixed number of characters of lookback.


I'm still not convinced, although maybe there's something I'm not getting.



I begin to think that we'll really have to bite the bullet and convert psql's input parser to use flex. If we're not scanning with exactly the same rules as the backend uses, we're going to get the wrong answers.




Interacting with lexer states would probably be ... unpleasant. Matching a stream-oriented lexer with a line-oriented CLI would be messy, I suspect.


cheers

andrew

