On Mon, Sep 5, 2022 at 4:11 AM Andres Freund <and...@anarazel.de> wrote: > > Hi, > > On 2022-09-04 13:12:52 +0700, John Naylor wrote: > > On Fri, Sep 2, 2022 at 11:35 PM Andres Freund <and...@anarazel.de> wrote: > > > > > > Hi, > > > > > > On 2022-09-02 14:17:26 +0700, John Naylor wrote: > > > > On Thu, Sep 1, 2022 at 1:12 AM Andres Freund <and...@anarazel.de> wrote: > > > > > [v12] > > > > > > > > +# Build a small utility static lib for the parser. This makes it > > > > easier to not > > > > +# depend on gram.h already having been generated for most of the other > > > > code > > > > +# (which depends on generated headers having been generated). The > > > > generation > > > > +# of the parser is slow... > > > > > > > > It's not obvious whether this is intended to be a Meson-only > > > > optimization or a workaround for something awkward to specify. > > > > > > It is an optimization. The parser generation is by far the slowest part > > > of a > > > build. If other files can only be compiled once gram.h is generated, > > > there's a > > > long initial period where little can happen. So instead of having all .c > > > files > > > have a dependency on gram.h having been generated, the above makes only > > > scan.c, gram.c compilation depend on gram.h. It only matters for the > > > first > > > compilation, because such dependencies are added as order-only > > > dependencies, > > > supplanted by more precise compiler generated dependencies after. > > > > Okay, I think the comment could include some of this info for clarity. > > Working on that. > > > > > It's still pretty annoying that so much of the build is initially idle, > > > waiting for genbki.pl to finish. > > > > > > Part of that is due to some ugly dependencies of src/common on backend > > > headers > > > that IMO probably shouldn't exist (e.g. src/common/relpath.c includes > > > catalog/pg_tablespace_d.h). > > > > Technically, *_d.h headers are not backend, that's why it's safe to > > include them anywhere. relpath.c in its current form has to know the > > tablespace OIDs, which I guess is what you think is ugly. (I agree > > it's not great) > > Yea, I'm not saying it's unsafe in a produces-wrong-results way, just that it > seems architecturally dubious / circular. > > > > > Looks like it'd not be hard to get at least the > > > _shlib version of src/common and libpq build without waiting for that. > > > But for > > > all the backend code I don't really see a way, so it'd be nice to make > > > genbki > > > faster at some point. > > > > The attached gets me a ~15% reduction in clock time by having > > Catalog.pm parse the .dat files in one sweep, when we don't care about > > formatting, i.e. most of the time: > > Cool. Seems worthwhile.
Okay, here's a cleaned up version with more idiomatic style and a new copy of the perlcritic exception. Note that the indentation hasn't changed. My thought there: perltidy will be run again next year, at which time it will be part of a listed whitespace-only commit. Any objections, since that could confuse someone before then? -- John Naylor EDB: http://www.enterprisedb.com
diff --git a/src/backend/catalog/Catalog.pm b/src/backend/catalog/Catalog.pm index e91a8e10a8..a71f5b05a8 100644 --- a/src/backend/catalog/Catalog.pm +++ b/src/backend/catalog/Catalog.pm @@ -287,6 +287,8 @@ sub ParseData my $catname = $1; my $data = []; + if ($preserve_formatting) + { # Scan the input file. while (<$ifd>) { @@ -346,11 +348,25 @@ sub ParseData { push @$data, $hash_ref if !$hash_ref->{autogenerated}; } - elsif ($preserve_formatting) + else { push @$data, $_; } } + } + else + { + # When we only care about the contents, it's faster to read and eval + # the whole file at once. + local $/; + my $full_file = <$ifd>; + eval '$data = ' . $full_file; ## no critic (ProhibitStringyEval) + foreach my $hash_ref (@{$data}) + { + AddDefaultValues($hash_ref, $schema, $catname); + } + } + close $ifd; # If this is pg_type, auto-generate array types too.