On Mon, Sep 5, 2022 at 4:11 AM Andres Freund <and...@anarazel.de> wrote:
>
> Hi,
>
> On 2022-09-04 13:12:52 +0700, John Naylor wrote:
> > On Fri, Sep 2, 2022 at 11:35 PM Andres Freund <and...@anarazel.de> wrote:
> > >
> > > Hi,
> > >
> > > On 2022-09-02 14:17:26 +0700, John Naylor wrote:
> > > > On Thu, Sep 1, 2022 at 1:12 AM Andres Freund <and...@anarazel.de> wrote:
> > > > > [v12]
> > > >
> > > > +# Build a small utility static lib for the parser. This makes it 
> > > > easier to not
> > > > +# depend on gram.h already having been generated for most of the other 
> > > > code
> > > > +# (which depends on generated headers having been generated). The 
> > > > generation
> > > > +# of the parser is slow...
> > > >
> > > > It's not obvious whether this is intended to be a Meson-only
> > > > optimization or a workaround for something awkward to specify.
> > >
> > > It is an optimization. The parser generation is by far the slowest part 
> > > of a
> > > build. If other files can only be compiled once gram.h is generated, 
> > > there's a
> > > long initial period where little can happen. So instead of having all .c 
> > > files
> > > have a dependency on gram.h having been generated, the above makes only
> > > scan.c, gram.c compilation depend on gram.h.  It only matters for the 
> > > first
> > > compilation, because such dependencies are added as order-only 
> > > dependencies,
> > > supplanted by more precise compiler generated dependencies after.
> >
> > Okay, I think the comment could include some of this info for clarity.
>
> Working on that.
>
>
> > > It's still pretty annoying that so much of the build is initially idle,
> > > waiting for genbki.pl to finish.
> > >
> > > Part of that is due to some ugly dependencies of src/common on backend 
> > > headers
> > > that IMO probably shouldn't exist (e.g. src/common/relpath.c includes
> > > catalog/pg_tablespace_d.h).
> >
> > Technically, *_d.h headers are not backend, that's why it's safe to
> > include them anywhere. relpath.c in its current form has to know the
> > tablespace OIDs, which I guess is what you think is ugly. (I agree
> > it's not great)
>
> Yea, I'm not saying it's unsafe in a produces-wrong-results way, just that it
> seems architecturally dubious / circular.
>
>
> > > Looks like it'd not be hard to get at least the
> > > _shlib version of src/common and libpq build without waiting for that. 
> > > But for
> > > all the backend code I don't really see a way, so it'd be nice to make 
> > > genbki
> > > faster at some point.
> >
> > The attached gets me a ~15% reduction in clock time by having
> > Catalog.pm parse the .dat files in one sweep, when we don't care about
> > formatting, i.e. most of the time:
>
> Cool. Seems worthwhile.

Okay, here's a cleaned up version with more idiomatic style and a new
copy of the perlcritic exception.

Note that the indentation hasn't changed. My thought there: perltidy
will be run again next year, at which time it will be part of a listed
whitespace-only commit. Any objections, since that could confuse
someone before then?

-- 
John Naylor
EDB: http://www.enterprisedb.com
diff --git a/src/backend/catalog/Catalog.pm b/src/backend/catalog/Catalog.pm
index e91a8e10a8..a71f5b05a8 100644
--- a/src/backend/catalog/Catalog.pm
+++ b/src/backend/catalog/Catalog.pm
@@ -287,6 +287,8 @@ sub ParseData
 	my $catname = $1;
 	my $data    = [];
 
+	if ($preserve_formatting)
+	{
 	# Scan the input file.
 	while (<$ifd>)
 	{
@@ -346,11 +348,25 @@ sub ParseData
 		{
 			push @$data, $hash_ref if !$hash_ref->{autogenerated};
 		}
-		elsif ($preserve_formatting)
+		else
 		{
 			push @$data, $_;
 		}
 	}
+	}
+	else
+	{
+		# When we only care about the contents, it's faster to read and eval
+		# the whole file at once.
+		local $/;
+		my $full_file = <$ifd>;
+		eval '$data = ' . $full_file;    ## no critic (ProhibitStringyEval)
+		foreach my $hash_ref (@{$data})
+		{
+			AddDefaultValues($hash_ref, $schema, $catname);
+		}
+	}
+
 	close $ifd;
 
 	# If this is pg_type, auto-generate array types too.

Reply via email to