Re: [PATCHES] Doing psql's lexing with flex
Tom Lane wrote: I got interested enough in the psql-with-flex problem to go off and solve it. Attached is a working patch, which I'm now debating whether to apply. Comments solicited... The patch removes about 200 lines of very spaghetti-ish code in mainloop.c. However, it adds an 875-line flex source file, which might be thought a bad tradeoff :-(. One bright spot is that about half of that total is a direct copy of the main backend lexer, so it's not really as much new, separately maintainable code as all that. Also, Andrew Dunstan's patch for supporting dollar-quoting would add about 100 lines to mainloop.c, versus only a dozen or so lines in the flex implementation. Am I missing something, or is dollar quoting not in this patch? Perhaps you have a followup patch? Once that's taken into account I don't think there is a lot of difference in effective SLOC to maintain. I'm also of the opinion that the new C code in psqlscan.l is much more straightforward than the code removed from mainloop.c, though having just written it, I'm no doubt pretty biased. Bruce was asking about speed. On normal-size queries I cannot measure any difference at all. For testing purposes I made up a file containing a single 750K query (just a SELECT big-honking-string-constant, with the string literal broken into lines of 75 bytes). The client-side (psql) CPU time to run this file looks about like this on my machine: PGCLIENTENCODING UNICODE SJIS CVS tip 1.571.82 flex implementation 0.932.33 The flex implementation is consistently faster than CVS tip when dealing with backend-compatible encodings (such as UTF-8). It's consistently slower when it has to deal with a non-backend-safe encoding such as SJIS or Big5. But for real-world cases the differential is down in the noise either way. I'm inclined to apply this but I can see where a person not comfortable with flex might feel differently. Opinions? In principle this is the Right Thing (tm). We use flex elsewhere, of course, so the fact that it is a flex scanner doesn't seem to me to be any sort of strong argument. If we are going to do this we need to get it in ASAP, so it gets plenty of beating on. cheers andrew ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [PATCHES] Doing psql's lexing with flex
Tom Lane wrote: I got interested enough in the psql-with-flex problem to go off and solve it. Attached is a working patch, which I'm now debating whether to apply. Comments solicited... The patch removes about 200 lines of very spaghetti-ish code in mainloop.c. However, it adds an 875-line flex source file, which might be thought a bad tradeoff :-(. One bright spot is that about half of that total is a direct copy of the main backend lexer, so it's not really as much new, separately maintainable code as all that. Also, Andrew Dunstan's patch for supporting dollar-quoting would add about 100 lines to mainloop.c, versus only a dozen or so lines in the flex implementation. Once that's taken into account I don't think there is a lot of difference in effective SLOC to maintain. I'm also of the opinion that the new C code in psqlscan.l is much more straightforward than the code removed from mainloop.c, though having just written it, I'm no doubt pretty biased. Bruce was asking about speed. On normal-size queries I cannot measure any difference at all. For testing purposes I made up a file containing a single 750K query (just a SELECT big-honking-string-constant, with the string literal broken into lines of 75 bytes). The client-side (psql) CPU time to run this file looks about like this on my machine: PGCLIENTENCODING UNICODE SJIS CVS tip 1.571.82 flex implementation 0.932.33 Looks good. I withdraw my performance concerns. Thanks for testing. -- Bruce Momjian| http://candle.pha.pa.us [EMAIL PROTECTED] | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup.| Newtown Square, Pennsylvania 19073 ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [PATCHES] Doing psql's lexing with flex
Tom Lane [EMAIL PROTECTED] writes: I'm inclined to apply this but I can see where a person not comfortable with flex might feel differently. Opinions? Looks good to me. The psql cleanup is nice, and ISTM that much of the flex code is comments or flex boilerplate anyway, so the actual LOC increase isn't too bad. -Neil ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faqs/FAQ.html
Re: [PATCHES] Doing psql's lexing with flex
Neil Conway [EMAIL PROTECTED] writes: Tom Lane [EMAIL PROTECTED] writes: I'm inclined to apply this but I can see where a person not comfortable with flex might feel differently. Opinions? Looks good to me. The psql cleanup is nice, and ISTM that much of the flex code is comments or flex boilerplate anyway, so the actual LOC increase isn't too bad. I just realized that something I had planned to look at later is actually an essential step if this is to be applied. I had wanted to see if lexing of backslash command tokens could be folded into the flex lexer as well, but thought I could leave it for later. The case where this doesn't preserve the previous behavior is if the expansion of a psql variable includes both a backslash command and some part of its arguments. As patched, HandleSlashCommand will only look to the original input string and not see the contents of any psql variables (because those are only seen inside the lexer, I'm not physically inserting them into the line buffer as the old code did). Imagine for example \set foo '\c mydb' blah :foo bar The existing code would interpret this as blah \c mydb bar but my patch as it stands would behave very strangely --- the \c command would see bar as its argument and then 'mydb' would be regurgitated after HandleSlashCommand finishes. The existing code is pretty darn inconsistent on this point anyway when you look at it carefully. AFAICS a variable reference is handled quite differently in the arguments of a backslash command than elsewhere; it can't supply general text but only a single option value. The same variable expanded before control reaches HandleSlashCommand is going to look indistinguishable from hand-entered text. If we did translate it into a flex thing the behavior would be different in corner cases like this. Peter, any comments on this? Is the current behavior actually intentional, or did it just fall out of the implementation techniques? regards, tom lane ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
Re: [PATCHES] Doing psql's lexing with flex
Andrew Dunstan [EMAIL PROTECTED] writes: Am I missing something, or is dollar quoting not in this patch? It is not. If we go this way, then we'd add essentially identical flex patches to backend and psql to implement dollar quoting (plus perhaps a few more lines in psql to support signaling dollar quoting in the prompt, but that's pretty trivial). My intent with the given patch was just to replicate existing functionality with flex. regards, tom lane ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [PATCHES] Doing psql's lexing with flex
Tom Lane wrote: I got interested enough in the psql-with-flex problem to go off and solve it. Attached is a working patch, which I'm now debating whether to apply. Comments solicited... That should teach me a lesson not to leave random comments lying around in the source code. Years later someone might take them seriously. :) ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]