Re: ECPG cleanup and fix for clang compile-time problem
I wrote: > Thanks, done. Here's a revised patchset. The cfbot points out that I should probably have marked progname as "static" in 0008. I'm not going to repost the patchset just for that, though. regards, tom lane
Re: ECPG cleanup and fix for clang compile-time problem
On 15.08.24 02:43, Tom Lane wrote: I wrote: Here's a rebased but otherwise identical patchset. I also added an 0007 that removes check_rules.pl as threatened. I've done some more work on this and hope to post an updated patchset tomorrow. Before that though, is there any objection to going ahead with pushing the 0001 patch (pgindent'ing ecpg's lexer and parser files)? It's pretty bulky yet of no intellectual interest, so I'd like to stop carrying it forward. The indentation patch looks good to me and it would be good to get it out of the way.
Re: ECPG cleanup and fix for clang compile-time problem
I wrote: > Here's a rebased but otherwise identical patchset. I also added > an 0007 that removes check_rules.pl as threatened. I've done some more work on this and hope to post an updated patchset tomorrow. Before that though, is there any objection to going ahead with pushing the 0001 patch (pgindent'ing ecpg's lexer and parser files)? It's pretty bulky yet of no intellectual interest, so I'd like to stop carrying it forward. regards, tom lane
Re: ECPG cleanup and fix for clang compile-time problem
On Fri, Jul 5, 2024 at 10:59 PM Tom Lane wrote: > > The cfbot noticed that this patchset had a conflict with d35cd0619, > so here's a rebase. It's just a rebase of v1, no other changes. Hi Tom, I started looking at the earlier cleanup patches. 0001 seems straightforward. Note: It doesn't apply cleanly anymore, but does with 'patch'. 0002 LGTM, just a couple minor comments: --- a/src/interfaces/ecpg/preproc/parse.pl +++ b/src/interfaces/ecpg/preproc/parse.pl @@ -1,7 +1,13 @@ #!/usr/bin/perl # src/interfaces/ecpg/preproc/parse.pl # parser generator for ecpg version 2 -# call with backend parser as stdin +# +# See README.parser for some explanation of what this does. Doesn't this patch set put us up to version 3? ;-) Looking in the history, a very long time ago a separate "parse2.pl" was committed for some reason, but that was reconciled some time later. This patch doesn't need to get rid of that meaningless version number, but I find it distracting. + # There may be multiple ECPG: lines and then multiple lines of code. + # Each block of code needs to be added to all prior ECPG records. This took me a while to parse at first. Some places in this script put quotes around words-with-colons, and that seems good for readability. 0003: Looks a heck of a lot better, but I didn't try to understand everything in the script, either before or after. + # Emit the target part of the rule. + # Note: the leading space is just to match + # the old, rather weird output logic. + $tstr = ' ' . $non_term_id . ':'; + add_to_buffer('rules', $tstr); Removing the leading space (or making it two spaces) has no effect on the output -- does that get normalized elsewhere? That's all I have for now.
Re: ECPG cleanup and fix for clang compile-time problem
On Fri, Apr 19, 2024 at 10:21 PM Tom Lane wrote: > > One other bit of randomness that I noticed: ecpg's parse.pl has > this undocumented bit of logic: > > if ($a eq 'IDENT' && $prior eq '%nonassoc') > { > > # add more tokens to the list > $str = $str . "\n%nonassoc CSTRING"; > } > preproc.c has > > %nonassoc UNBOUNDED NESTED > %nonassoc IDENT > %nonassoc CSTRING PARTITION RANGE ROWS GROUPS PRECEDING FOLLOWING CUBE ROLLUP > SET KEYS OBJECT_P SCALAR VALUE_P WITH WITHOUT PATH > %left Op OPERATOR > > If you don't find that scary as heck, I suggest reading the very long > comment just in front of the cited lines of gram.y. The argument why > assigning these keywords a precedence at all is OK depends heavily > on it being the same precedence as IDENT, yet here's ECPG randomly > breaking that. Before 7f380c59f (Reduce size of backend scanner's tables), it was even more spread out: # add two more tokens to the list $str = $str . "\n%nonassoc CSTRING\n%nonassoc UIDENT"; ...giving: %nonassoc UNBOUNDED %nonassoc IDENT %nonassoc CSTRING %nonassoc UIDENT GENERATED NULL_P PARTITION RANGE ROWS GROUPS PRECEDING FOLLOWING CUBE ROLLUP > We seem to have avoided problems though, because if I fix things > by manually editing preproc.y to re-join the lines: > > %nonassoc IDENT CSTRING PARTITION RANGE ROWS GROUPS PRECEDING FOLLOWING CUBE > ROLLUP > > the generated preproc.c doesn't change at all. On a whim I tried rejoining on v12 and the .c doesn't change there, either. > Actually, I can > take CSTRING out of this list altogether and it still doesn't > change the results ... although looking at how CSTRING is used, > it looks safer to give it the same precedence as IDENT. Doing that on v12 on top of rejoining results in a shift-reduce conflict, so I imagine that's why it's there. Maybe it's outdated, but this backs up your inclination that it's safer to keep.
Re: ECPG cleanup and fix for clang compile-time problem
One other bit of randomness that I noticed: ecpg's parse.pl has this undocumented bit of logic: if ($a eq 'IDENT' && $prior eq '%nonassoc') { # add more tokens to the list $str = $str . "\n%nonassoc CSTRING"; } The net effect of that is that, where gram.y writes %nonassocUNBOUNDED NESTED /* ideally would have same precedence as IDENT */ %nonassocIDENT PARTITION RANGE ROWS GROUPS PRECEDING FOLLOWING CUBE ROLLUP SET KEYS OBJECT_P SCALAR VALUE_P WITH WITHOUT PATH %leftOp OPERATOR /* multi-character ops and user-defined operators */ preproc.c has %nonassoc UNBOUNDED NESTED %nonassoc IDENT %nonassoc CSTRING PARTITION RANGE ROWS GROUPS PRECEDING FOLLOWING CUBE ROLLUP SET KEYS OBJECT_P SCALAR VALUE_P WITH WITHOUT PATH %left Op OPERATOR If you don't find that scary as heck, I suggest reading the very long comment just in front of the cited lines of gram.y. The argument why assigning these keywords a precedence at all is OK depends heavily on it being the same precedence as IDENT, yet here's ECPG randomly breaking that. We seem to have avoided problems though, because if I fix things by manually editing preproc.y to re-join the lines: %nonassoc IDENT CSTRING PARTITION RANGE ROWS GROUPS PRECEDING FOLLOWING CUBE ROLLUP the generated preproc.c doesn't change at all. Actually, I can take CSTRING out of this list altogether and it still doesn't change the results ... although looking at how CSTRING is used, it looks safer to give it the same precedence as IDENT. I think we should change parse.pl to give one or the other of these results before something more serious breaks there. regards, tom lane
Re: ECPG cleanup and fix for clang compile-time problem
Andres Freund writes: > On 2024-04-18 23:11:52 -0400, Tom Lane wrote: >> I was just looking locally at what I got by removing that, and sadly >> I don't think I believe it: there are a lot of places where it claims >> we hit lines we don't, and vice versa. That might be partially >> blamable on old tools on my RHEL8 workstation, but it sure seems >> that flex output confuses lcov to some extent. > Hm. Here it mostly looks reasonable, except that at least things seem off by > 1. Yeah, now that you mention it what I'm seeing looks like the line numbering might be off-by-one. Time for a bug report? regards, tom lane
Re: ECPG cleanup and fix for clang compile-time problem
On 2024-04-18 23:11:52 -0400, Tom Lane wrote: > Andres Freund writes: > > On 2024-04-18 22:18:34 -0400, Tom Lane wrote: > >> (If our code coverage tools worked on bison/flex stuff, > >> maybe this'd be less scary ... but they don't.) > > > For bison coverage seems to work, see e.g.: > > Yeah, I'd just noticed that --- I had it in my head that we'd put > LCOV_EXCL_START/STOP into bison files too, but nope they are only > in flex files. That's good for this specific problem, because the > code I'm worried about is all in the bison file. At least locally the coverage seems to make sense too, both for the main grammar and for ecpg's. > > around the scanner "body". Without that I get reasonable-looking, albeit > > not > > very comforting, coverage for pgc.l as well. > > I was just looking locally at what I got by removing that, and sadly > I don't think I believe it: there are a lot of places where it claims > we hit lines we don't, and vice versa. That might be partially > blamable on old tools on my RHEL8 workstation, but it sure seems > that flex output confuses lcov to some extent. Hm. Here it mostly looks reasonable, except that at least things seem off by 1. And sure enough, if I look at pgc.l it has code like case 2: YY_RULE_SETUP #line 465 "/home/andres/src/postgresql/src/interfaces/ecpg/preproc/pgc.l" { token_start = yytext; state_before_str_start = YYSTATE; However line 465 is actually the "token_start" line. Further down this seems to get worse, by "<>" it's off by 4 lines. $ apt policy flex flex: Installed: 2.6.4-8.2+b2 Candidate: 2.6.4-8.2+b2 Version table: *** 2.6.4-8.2+b2 500 500 http://mirrors.ocf.berkeley.edu/debian unstable/main amd64 Packages 100 /var/lib/dpkg/status Greetings, Andres Freund
Re: ECPG cleanup and fix for clang compile-time problem
Andres Freund writes: > On 2024-04-18 22:18:34 -0400, Tom Lane wrote: >> (If our code coverage tools worked on bison/flex stuff, >> maybe this'd be less scary ... but they don't.) > For bison coverage seems to work, see e.g.: Yeah, I'd just noticed that --- I had it in my head that we'd put LCOV_EXCL_START/STOP into bison files too, but nope they are only in flex files. That's good for this specific problem, because the code I'm worried about is all in the bison file. > around the scanner "body". Without that I get reasonable-looking, albeit not > very comforting, coverage for pgc.l as well. I was just looking locally at what I got by removing that, and sadly I don't think I believe it: there are a lot of places where it claims we hit lines we don't, and vice versa. That might be partially blamable on old tools on my RHEL8 workstation, but it sure seems that flex output confuses lcov to some extent. regards, tom lane
Re: ECPG cleanup and fix for clang compile-time problem
Hi, On 2024-04-18 22:18:34 -0400, Tom Lane wrote: > Here is a patch series that aims to clean up ecpg's preprocessor > code a little and fix the compile time problems we're seeing with > late-model clang [1]. I guess whether it's a cleanup is in the eye of > the beholder, but it definitely succeeds at fixing compile time: for > me, the time needed to compile preproc.o with clang 16 drops from > 104 seconds to less than 1 second. It might be a little faster at > processing input too, though that wasn't the primary goal. Nice! I'll look at this more later. For now I just wanted to point one minor detail: > (If our code coverage tools worked on bison/flex stuff, > maybe this'd be less scary ... but they don't.) For bison coverage seems to work, see e.g.: https://coverage.postgresql.org/src/interfaces/ecpg/preproc/preproc.y.gcov.html#10638 I think the only reason it doesn't work for flex is that we have /* LCOV_EXCL_START */ /* LCOV_EXCL_STOP */ around the scanner "body". Without that I get reasonable-looking, albeit not very comforting, coverage for pgc.l as well. |Lines |Functions|Branches Filename |RateNum|Rate Num|Rate Num src/interfaces/ecpg/preproc/pgc.l |65.9% 748|87.5% 8|-0 src/interfaces/ecpg/preproc/preproc.y |29.9% 4964|66.7% 15|-0 This has been introduced by commit 421167362242ce1fb46d6d720798787e7cd65aad Author: Peter Eisentraut Date: 2017-08-10 23:33:47 -0400 Exclude flex-generated code from coverage testing Flex generates a lot of functions that are not actually used. In order to avoid coverage figures being ruined by that, mark up the part of the .l files where the generated code appears by lcov exclusion markers. That way, lcov will typically only reported on coverage for the .l file, which is under our control, but not for the .c file. Reviewed-by: Michael Paquier but I don't think it's working as intended, as it's also preventing coverage for the actual scanner definition. Greetings, Andres Freund