Re: ECPG cleanup and fix for clang compile-time problem

2024-04-19 Thread Tom Lane
One other bit of randomness that I noticed: ecpg's parse.pl has
this undocumented bit of logic:

if ($a eq 'IDENT' && $prior eq '%nonassoc')
{

# add more tokens to the list
$str = $str . "\n%nonassoc CSTRING";
}

The net effect of that is that, where gram.y writes

%nonassocUNBOUNDED NESTED /* ideally would have same precedence as IDENT */
%nonassocIDENT PARTITION RANGE ROWS GROUPS PRECEDING FOLLOWING CUBE ROLLUP
 SET KEYS OBJECT_P SCALAR VALUE_P WITH WITHOUT PATH
%leftOp OPERATOR  /* multi-character ops and user-defined operators 
*/

preproc.c has

 %nonassoc UNBOUNDED NESTED
 %nonassoc IDENT
%nonassoc CSTRING PARTITION RANGE ROWS GROUPS PRECEDING FOLLOWING CUBE ROLLUP
 SET KEYS OBJECT_P SCALAR VALUE_P WITH WITHOUT PATH
 %left Op OPERATOR

If you don't find that scary as heck, I suggest reading the very long
comment just in front of the cited lines of gram.y.  The argument why
assigning these keywords a precedence at all is OK depends heavily
on it being the same precedence as IDENT, yet here's ECPG randomly
breaking that.

We seem to have avoided problems though, because if I fix things
by manually editing preproc.y to re-join the lines:

 %nonassoc IDENT CSTRING PARTITION RANGE ROWS GROUPS PRECEDING FOLLOWING CUBE 
ROLLUP

the generated preproc.c doesn't change at all.  Actually, I can
take CSTRING out of this list altogether and it still doesn't
change the results ... although looking at how CSTRING is used,
it looks safer to give it the same precedence as IDENT.

I think we should change parse.pl to give one or the other of these
results before something more serious breaks there.

regards, tom lane




Re: ECPG cleanup and fix for clang compile-time problem

2024-04-18 Thread Tom Lane
Andres Freund  writes:
> On 2024-04-18 23:11:52 -0400, Tom Lane wrote:
>> I was just looking locally at what I got by removing that, and sadly
>> I don't think I believe it: there are a lot of places where it claims
>> we hit lines we don't, and vice versa.  That might be partially
>> blamable on old tools on my RHEL8 workstation, but it sure seems
>> that flex output confuses lcov to some extent.

> Hm. Here it mostly looks reasonable, except that at least things seem off by
> 1.

Yeah, now that you mention it what I'm seeing looks like the line
numbering might be off-by-one.  Time for a bug report?

regards, tom lane




Re: ECPG cleanup and fix for clang compile-time problem

2024-04-18 Thread Andres Freund
On 2024-04-18 23:11:52 -0400, Tom Lane wrote:
> Andres Freund  writes:
> > On 2024-04-18 22:18:34 -0400, Tom Lane wrote:
> >> (If our code coverage tools worked on bison/flex stuff,
> >> maybe this'd be less scary ... but they don't.)
>
> > For bison coverage seems to work, see e.g.:
>
> Yeah, I'd just noticed that --- I had it in my head that we'd put
> LCOV_EXCL_START/STOP into bison files too, but nope they are only
> in flex files.  That's good for this specific problem, because the
> code I'm worried about is all in the bison file.

At least locally the coverage seems to make sense too, both for the main
grammar and for ecpg's.


> > around the scanner "body".  Without that I get reasonable-looking, albeit 
> > not
> > very comforting, coverage for pgc.l as well.
>
> I was just looking locally at what I got by removing that, and sadly
> I don't think I believe it: there are a lot of places where it claims
> we hit lines we don't, and vice versa.  That might be partially
> blamable on old tools on my RHEL8 workstation, but it sure seems
> that flex output confuses lcov to some extent.

Hm. Here it mostly looks reasonable, except that at least things seem off by
1. And sure enough, if I look at pgc.l it has code like

case 2:
YY_RULE_SETUP
#line 465 "/home/andres/src/postgresql/src/interfaces/ecpg/preproc/pgc.l"
{
token_start = yytext;
state_before_str_start = YYSTATE;

However line 465 is actually the "token_start" line.

Further down this seems to get worse, by "<>" it's off by 4 lines.


$ apt policy flex
flex:
  Installed: 2.6.4-8.2+b2
  Candidate: 2.6.4-8.2+b2
  Version table:
 *** 2.6.4-8.2+b2 500
500 http://mirrors.ocf.berkeley.edu/debian unstable/main amd64 Packages
100 /var/lib/dpkg/status

Greetings,

Andres Freund




Re: ECPG cleanup and fix for clang compile-time problem

2024-04-18 Thread Tom Lane
Andres Freund  writes:
> On 2024-04-18 22:18:34 -0400, Tom Lane wrote:
>> (If our code coverage tools worked on bison/flex stuff,
>> maybe this'd be less scary ... but they don't.)

> For bison coverage seems to work, see e.g.:

Yeah, I'd just noticed that --- I had it in my head that we'd put
LCOV_EXCL_START/STOP into bison files too, but nope they are only
in flex files.  That's good for this specific problem, because the
code I'm worried about is all in the bison file.

> around the scanner "body".  Without that I get reasonable-looking, albeit not
> very comforting, coverage for pgc.l as well.

I was just looking locally at what I got by removing that, and sadly
I don't think I believe it: there are a lot of places where it claims
we hit lines we don't, and vice versa.  That might be partially
blamable on old tools on my RHEL8 workstation, but it sure seems
that flex output confuses lcov to some extent.

regards, tom lane




Re: ECPG cleanup and fix for clang compile-time problem

2024-04-18 Thread Andres Freund
Hi,

On 2024-04-18 22:18:34 -0400, Tom Lane wrote:
> Here is a patch series that aims to clean up ecpg's preprocessor
> code a little and fix the compile time problems we're seeing with
> late-model clang [1].  I guess whether it's a cleanup is in the eye of
> the beholder, but it definitely succeeds at fixing compile time: for
> me, the time needed to compile preproc.o with clang 16 drops from
> 104 seconds to less than 1 second.  It might be a little faster at
> processing input too, though that wasn't the primary goal.

Nice! I'll look at this more later.


For now I just wanted to point one minor detail:

> (If our code coverage tools worked on bison/flex stuff,
> maybe this'd be less scary ... but they don't.)

For bison coverage seems to work, see e.g.:

https://coverage.postgresql.org/src/interfaces/ecpg/preproc/preproc.y.gcov.html#10638

I think the only reason it doesn't work for flex is that we have
/* LCOV_EXCL_START */
/* LCOV_EXCL_STOP */

around the scanner "body".  Without that I get reasonable-looking, albeit not
very comforting, coverage for pgc.l as well.

   |Lines  |Functions|Branches
Filename   |RateNum|Rate  Num|Rate   Num
src/interfaces/ecpg/preproc/pgc.l  |65.9%   748|87.5%   8|-0
src/interfaces/ecpg/preproc/preproc.y  |29.9%  4964|66.7%  15|-0


This has been introduced by

commit 421167362242ce1fb46d6d720798787e7cd65aad
Author: Peter Eisentraut 
Date:   2017-08-10 23:33:47 -0400

Exclude flex-generated code from coverage testing

Flex generates a lot of functions that are not actually used.  In order
to avoid coverage figures being ruined by that, mark up the part of the
.l files where the generated code appears by lcov exclusion markers.
That way, lcov will typically only reported on coverage for the .l file,
which is under our control, but not for the .c file.

Reviewed-by: Michael Paquier 


but I don't think it's working as intended, as it's also preventing coverage
for the actual scanner definition.

Greetings,

Andres Freund