Re: inconsistency and inefficiency in setup_conversion()

2019-01-03 Thread Tom Lane
John Naylor  writes:
> Since it's been a few months since last discussion, I'd like to
> summarize the purpose of this patch and advocate for its inclusion in
> v12:

Pushed after some review and correction.  Notably, I didn't like
leaving the info about the encoding lookup out of bki.sgml, so
I changed that.

regards, tom lane



Re: inconsistency and inefficiency in setup_conversion()

2018-12-14 Thread John Naylor
On 12/1/18, Dmitry Dolgov <9erthali...@gmail.com> wrote:
> I see that the author keeps patch updated, but I'm a bit worried because of
> lack of full review since probably May. I'm moving it to the next CF, let's
> see
> if there would be more feedback.
>
> P.S. adding Daniel, since he is assigned as a reviewer.

Having heard nothing in a while, I've removed Daniel as a reviewer to
make room for someone else. He is, of course free to re-add himself.
v8 is attached.

Since it's been a few months since last discussion, I'd like to
summarize the purpose of this patch and advocate for its inclusion in
v12:

1. Correctness
In the intro thread [1], I showed that object comments on some
conversions are wrong, and hard to fix given the current setup. This
is a documentation bug of sorts.

2. Clean-up
Currently, utils/mb/conversion_procs/Makefile has an ad-hoc script to
generate the SQL file, which has to be duplicated in the MSVC tooling,
and executed by initdb.c. Storing the conversions in .dat files
removes the need for any of that.

3. Performance
This patch shaves 5-6% off of initdb. Not as much as hoped, but still
a nice bonus.

[1] 
https://www.postgresql.org/message-id/CAJVSVGWtUqxpfAaxS88vEGvi%2BjKzWZb2EStu5io-UPc4p9rSJg%40mail.gmail.com


--John Naylor
From bdbdb36129708d1d417f8db9009b377c3d4e3e90 Mon Sep 17 00:00:00 2001
From: John Naylor 
Date: Fri, 14 Dec 2018 15:19:54 -0500
Subject: [PATCH v8 1/2] Add pg_language lookup.

This didn't seem worth doing before, but an upcoming commit will add
88 entries to pg_proc whose languge is 'c'. In addition, we can
now use 'internal' in Gen_fmgr.pl instead of looking up the OID.
This simplifies that script and its Makefile a bit.
---
 doc/src/sgml/bki.sgml|  6 +--
 src/backend/catalog/genbki.pl|  8 +++
 src/backend/utils/Gen_fmgrtab.pl |  7 ++-
 src/backend/utils/Makefile   |  8 +--
 src/include/catalog/pg_proc.dat  | 92 
 src/include/catalog/pg_proc.h|  2 +-
 src/tools/msvc/Solution.pm   |  4 +-
 7 files changed, 64 insertions(+), 63 deletions(-)

diff --git a/doc/src/sgml/bki.sgml b/doc/src/sgml/bki.sgml
index 786abb95d4..37bb7a2346 100644
--- a/doc/src/sgml/bki.sgml
+++ b/doc/src/sgml/bki.sgml
@@ -409,8 +409,8 @@
 that's error-prone and hard to understand, so for frequently-referenced
 catalogs, genbki.pl provides mechanisms to write
 symbolic references instead.  Currently this is possible for references
-to access methods, functions, operators, opclasses, opfamilies, and
-types.  The rules are as follows:
+to access methods, functions, languages, operators, opclasses, opfamilies,
+and types.  The rules are as follows:

 

@@ -420,7 +420,7 @@
   Use of symbolic references is enabled in a particular catalog column
   by attaching BKI_LOOKUP(lookuprule)
   to the column's definition, where lookuprule
-  is pg_am, pg_proc,
+  is pg_am, pg_proc, pg_language,
   pg_operator, pg_opclass,
   pg_opfamily, or pg_type.
   BKI_LOOKUP can be attached to columns of
diff --git a/src/backend/catalog/genbki.pl b/src/backend/catalog/genbki.pl
index 8e2a2480be..8ccfcb7724 100644
--- a/src/backend/catalog/genbki.pl
+++ b/src/backend/catalog/genbki.pl
@@ -182,6 +182,13 @@ foreach my $row (@{ $catalog_data{pg_am} })
 	$amoids{ $row->{amname} } = $row->{oid};
 }
 
+# language OID lookup
+my %langoids;
+foreach my $row (@{ $catalog_data{pg_language} })
+{
+	$langoids{ $row->{lanname} } = $row->{oid};
+}
+
 # opclass OID lookup
 my %opcoids;
 foreach my $row (@{ $catalog_data{pg_opclass} })
@@ -256,6 +263,7 @@ foreach my $row (@{ $catalog_data{pg_type} })
 # Map catalog name to OID lookup.
 my %lookup_kind = (
 	pg_am   => \%amoids,
+	pg_language => \%langoids,
 	pg_opclass  => \%opcoids,
 	pg_operator => \%operoids,
 	pg_opfamily => \%opfoids,
diff --git a/src/backend/utils/Gen_fmgrtab.pl b/src/backend/utils/Gen_fmgrtab.pl
index ed16737a6a..b6809c7277 100644
--- a/src/backend/utils/Gen_fmgrtab.pl
+++ b/src/backend/utils/Gen_fmgrtab.pl
@@ -59,6 +59,8 @@ die "No include path; you must specify -I.\n" if !$include_path;
 # Note: We pass data file names as arguments and then look for matching
 # headers to parse the schema from. This is backwards from genbki.pl,
 # but the Makefile dependencies look more sensible this way.
+# We currently only need pg_proc, but retain the possibility of reading
+# more than one data file.
 my %catalogs;
 my %catalog_data;
 foreach my $datfile (@input_files)
@@ -82,9 +84,6 @@ foreach my $datfile (@input_files)
 my $FirstGenbkiObjectId =
   Catalog::FindDefinedSymbol('access/transam.h', $include_path,
 	'FirstGenbkiObjectId');
-my $INTERNALlanguageId =
-  Catalog::FindDefinedSymbolFromData($catalog_data{pg_language},
-	'INTERNALlanguageId');
 
 # Collect certain fields from pg_proc.dat.
 my @fmgr = ();
@@ -94,7 +93,7 @@ foreach my $row (@{ $catalog_data{pg_proc} })
 	my %bki_values = %$row;
 
 	# Select out just the rows for 

Re: inconsistency and inefficiency in setup_conversion()

2018-12-01 Thread Dmitry Dolgov
> On Mon, Nov 26, 2018 at 9:31 AM John Naylor  wrote:
>
> v7 is a rebase over recent catalog changes.

Thanks for working on this,

I see that the author keeps patch updated, but I'm a bit worried because of
lack of full review since probably May. I'm moving it to the next CF, let's see
if there would be more feedback.

P.S. adding Daniel, since he is assigned as a reviewer.



Re: inconsistency and inefficiency in setup_conversion()

2018-11-26 Thread John Naylor
v7 is a rebase over recent catalog changes.

-John Naylor
From 52aff43dcc91733c1b941fed4582d9c0602004d0 Mon Sep 17 00:00:00 2001
From: John Naylor 
Date: Sat, 13 Oct 2018 19:28:08 +0700
Subject: [PATCH v7 1/2] Add pg_language lookup.

This didn't seem worth doing before, but an upcoming commit will add
88 entries to pg_proc whose languge is 'c'. In addition, we can
now use 'internal' in Gen_fmgr.pl instead of looking up the OID.
This simplifies that script and its Makefile a bit.
---
 doc/src/sgml/bki.sgml|  6 +--
 src/backend/catalog/genbki.pl|  8 +++
 src/backend/utils/Gen_fmgrtab.pl |  9 ++--
 src/backend/utils/Makefile   |  8 +--
 src/include/catalog/pg_proc.dat  | 92 
 src/include/catalog/pg_proc.h|  2 +-
 src/tools/msvc/Solution.pm   |  4 +-
 7 files changed, 65 insertions(+), 64 deletions(-)

diff --git a/doc/src/sgml/bki.sgml b/doc/src/sgml/bki.sgml
index 786abb95d4..37bb7a2346 100644
--- a/doc/src/sgml/bki.sgml
+++ b/doc/src/sgml/bki.sgml
@@ -409,8 +409,8 @@
 that's error-prone and hard to understand, so for frequently-referenced
 catalogs, genbki.pl provides mechanisms to write
 symbolic references instead.  Currently this is possible for references
-to access methods, functions, operators, opclasses, opfamilies, and
-types.  The rules are as follows:
+to access methods, functions, languages, operators, opclasses, opfamilies,
+and types.  The rules are as follows:

 

@@ -420,7 +420,7 @@
   Use of symbolic references is enabled in a particular catalog column
   by attaching BKI_LOOKUP(lookuprule)
   to the column's definition, where lookuprule
-  is pg_am, pg_proc,
+  is pg_am, pg_proc, pg_language,
   pg_operator, pg_opclass,
   pg_opfamily, or pg_type.
   BKI_LOOKUP can be attached to columns of
diff --git a/src/backend/catalog/genbki.pl b/src/backend/catalog/genbki.pl
index edc8ea9f53..3410816882 100644
--- a/src/backend/catalog/genbki.pl
+++ b/src/backend/catalog/genbki.pl
@@ -175,6 +175,13 @@ foreach my $row (@{ $catalog_data{pg_am} })
 	$amoids{ $row->{amname} } = $row->{oid};
 }
 
+# language OID lookup
+my %langoids;
+foreach my $row (@{ $catalog_data{pg_language} })
+{
+	$langoids{ $row->{lanname} } = $row->{oid};
+}
+
 # opclass OID lookup
 my %opcoids;
 foreach my $row (@{ $catalog_data{pg_opclass} })
@@ -249,6 +256,7 @@ foreach my $row (@{ $catalog_data{pg_type} })
 # Map catalog name to OID lookup.
 my %lookup_kind = (
 	pg_am   => \%amoids,
+	pg_language => \%langoids,
 	pg_opclass  => \%opcoids,
 	pg_operator => \%operoids,
 	pg_opfamily => \%opfoids,
diff --git a/src/backend/utils/Gen_fmgrtab.pl b/src/backend/utils/Gen_fmgrtab.pl
index ca28291355..87d250f318 100644
--- a/src/backend/utils/Gen_fmgrtab.pl
+++ b/src/backend/utils/Gen_fmgrtab.pl
@@ -59,6 +59,8 @@ die "No include path; you must specify -I.\n" if !$include_path;
 # Note: We pass data file names as arguments and then look for matching
 # headers to parse the schema from. This is backwards from genbki.pl,
 # but the Makefile dependencies look more sensible this way.
+# We currently only need pg_proc, but retain the possibility of reading
+# more than one data file.
 my %catalogs;
 my %catalog_data;
 foreach my $datfile (@input_files)
@@ -78,13 +80,10 @@ foreach my $datfile (@input_files)
 	$catalog_data{$catname} = Catalog::ParseData($datfile, $schema, 0);
 }
 
-# Fetch some values for later.
+# Fetch a value for later.
 my $FirstBootstrapObjectId =
   Catalog::FindDefinedSymbol('access/transam.h', $include_path,
 	'FirstBootstrapObjectId');
-my $INTERNALlanguageId =
-  Catalog::FindDefinedSymbolFromData($catalog_data{pg_language},
-	'INTERNALlanguageId');
 
 # Collect certain fields from pg_proc.dat.
 my @fmgr = ();
@@ -94,7 +93,7 @@ foreach my $row (@{ $catalog_data{pg_proc} })
 	my %bki_values = %$row;
 
 	# Select out just the rows for internal-language procedures.
-	next if $bki_values{prolang} ne $INTERNALlanguageId;
+	next if $bki_values{prolang} ne 'internal';
 
 	push @fmgr,
 	  {
diff --git a/src/backend/utils/Makefile b/src/backend/utils/Makefile
index e797539d09..da40f6b6c4 100644
--- a/src/backend/utils/Makefile
+++ b/src/backend/utils/Makefile
@@ -31,15 +31,11 @@ generated-header-symlinks: $(top_builddir)/src/include/utils/header-stamp $(top_
 
 $(SUBDIRS:%=%-recursive): fmgr-stamp errcodes.h
 
-FMGR_DATA := $(addprefix $(top_srcdir)/src/include/catalog/,\
-	pg_language.dat pg_proc.dat \
-	)
-
 # fmgr-stamp records the last time we ran Gen_fmgrtab.pl.  We don't rely on
 # the timestamps of the individual output files, because the Perl script
 # won't update them if they didn't change (to avoid unnecessary recompiles).
-fmgr-stamp: Gen_fmgrtab.pl $(catalogdir)/Catalog.pm $(FMGR_DATA) $(top_srcdir)/src/include/access/transam.h
-	$(PERL) -I $(catalogdir) $< -I $(top_srcdir)/src/include/ $(FMGR_DATA)
+fmgr-stamp: Gen_fmgrtab.pl $(catalogdir)/Catalog.pm 

Re: inconsistency and inefficiency in setup_conversion()

2018-10-13 Thread John Naylor
Attached is v6, a simple rebase.

-John Naylor
From e37cb80ab8e7baaa5231fc3b8dbc9d96ec253018 Mon Sep 17 00:00:00 2001
From: John Naylor 
Date: Sat, 13 Oct 2018 19:28:08 +0700
Subject: [PATCH v6 1/2] Add pg_language lookup.

This didn't seem worth doing before, but an upcoming commit will add
88 entries to pg_proc whose languge is 'c'. In addition, we can
now use 'internal' in Gen_fmgr.pl instead of looking up the OID.
This simplifies that script and its Makefile a bit.
---
 doc/src/sgml/bki.sgml|  6 +--
 src/backend/catalog/genbki.pl|  8 +++
 src/backend/utils/Gen_fmgrtab.pl |  9 ++--
 src/backend/utils/Makefile   |  8 +--
 src/include/catalog/pg_proc.dat  | 92 
 src/include/catalog/pg_proc.h|  2 +-
 src/tools/msvc/Solution.pm   |  4 +-
 7 files changed, 65 insertions(+), 64 deletions(-)

diff --git a/doc/src/sgml/bki.sgml b/doc/src/sgml/bki.sgml
index 0fb309a1bd..176fa8bf4d 100644
--- a/doc/src/sgml/bki.sgml
+++ b/doc/src/sgml/bki.sgml
@@ -408,8 +408,8 @@
 that's error-prone and hard to understand, so for frequently-referenced
 catalogs, genbki.pl provides mechanisms to write
 symbolic references instead.  Currently this is possible for references
-to access methods, functions, operators, opclasses, opfamilies, and
-types.  The rules are as follows:
+to access methods, functions, languages, operators, opclasses, opfamilies,
+and types.  The rules are as follows:

 

@@ -419,7 +419,7 @@
   Use of symbolic references is enabled in a particular catalog column
   by attaching BKI_LOOKUP(lookuprule)
   to the column's definition, where lookuprule
-  is pg_am, pg_proc,
+  is pg_am, pg_proc, pg_language,
   pg_operator, pg_opclass,
   pg_opfamily, or pg_type.
   BKI_LOOKUP can be attached to columns of
diff --git a/src/backend/catalog/genbki.pl b/src/backend/catalog/genbki.pl
index 649200260a..f321662695 100644
--- a/src/backend/catalog/genbki.pl
+++ b/src/backend/catalog/genbki.pl
@@ -167,6 +167,13 @@ foreach my $row (@{ $catalog_data{pg_am} })
 	$amoids{ $row->{amname} } = $row->{oid};
 }
 
+# language OID lookup
+my %langoids;
+foreach my $row (@{ $catalog_data{pg_language} })
+{
+	$langoids{ $row->{lanname} } = $row->{oid};
+}
+
 # opclass OID lookup
 my %opcoids;
 foreach my $row (@{ $catalog_data{pg_opclass} })
@@ -241,6 +248,7 @@ foreach my $row (@{ $catalog_data{pg_type} })
 # Map catalog name to OID lookup.
 my %lookup_kind = (
 	pg_am   => \%amoids,
+	pg_language => \%langoids,
 	pg_opclass  => \%opcoids,
 	pg_operator => \%operoids,
 	pg_opfamily => \%opfoids,
diff --git a/src/backend/utils/Gen_fmgrtab.pl b/src/backend/utils/Gen_fmgrtab.pl
index fa30436895..b288c5e5fd 100644
--- a/src/backend/utils/Gen_fmgrtab.pl
+++ b/src/backend/utils/Gen_fmgrtab.pl
@@ -59,6 +59,8 @@ die "No include path; you must specify -I.\n" if !$include_path;
 # Note: We pass data file names as arguments and then look for matching
 # headers to parse the schema from. This is backwards from genbki.pl,
 # but the Makefile dependencies look more sensible this way.
+# We currently only need pg_proc, but retain the possibility of reading
+# more than one data file.
 my %catalogs;
 my %catalog_data;
 foreach my $datfile (@input_files)
@@ -78,13 +80,10 @@ foreach my $datfile (@input_files)
 	$catalog_data{$catname} = Catalog::ParseData($datfile, $schema, 0);
 }
 
-# Fetch some values for later.
+# Fetch a value for later.
 my $FirstBootstrapObjectId =
   Catalog::FindDefinedSymbol('access/transam.h', $include_path,
 	'FirstBootstrapObjectId');
-my $INTERNALlanguageId =
-  Catalog::FindDefinedSymbolFromData($catalog_data{pg_language},
-	'INTERNALlanguageId');
 
 # Collect certain fields from pg_proc.dat.
 my @fmgr = ();
@@ -94,7 +93,7 @@ foreach my $row (@{ $catalog_data{pg_proc} })
 	my %bki_values = %$row;
 
 	# Select out just the rows for internal-language procedures.
-	next if $bki_values{prolang} ne $INTERNALlanguageId;
+	next if $bki_values{prolang} ne 'internal';
 
 	push @fmgr,
 	  {
diff --git a/src/backend/utils/Makefile b/src/backend/utils/Makefile
index e797539d09..da40f6b6c4 100644
--- a/src/backend/utils/Makefile
+++ b/src/backend/utils/Makefile
@@ -31,15 +31,11 @@ generated-header-symlinks: $(top_builddir)/src/include/utils/header-stamp $(top_
 
 $(SUBDIRS:%=%-recursive): fmgr-stamp errcodes.h
 
-FMGR_DATA := $(addprefix $(top_srcdir)/src/include/catalog/,\
-	pg_language.dat pg_proc.dat \
-	)
-
 # fmgr-stamp records the last time we ran Gen_fmgrtab.pl.  We don't rely on
 # the timestamps of the individual output files, because the Perl script
 # won't update them if they didn't change (to avoid unnecessary recompiles).
-fmgr-stamp: Gen_fmgrtab.pl $(catalogdir)/Catalog.pm $(FMGR_DATA) $(top_srcdir)/src/include/access/transam.h
-	$(PERL) -I $(catalogdir) $< -I $(top_srcdir)/src/include/ $(FMGR_DATA)
+fmgr-stamp: Gen_fmgrtab.pl $(catalogdir)/Catalog.pm 

Re: inconsistency and inefficiency in setup_conversion()

2018-10-02 Thread John Naylor
On 10/2/18, Michael Paquier  wrote:
> v4 does not apply anymore.  I am moving this patch to next commit fest,
> waiting on author.

v5 attached.

-John Naylor
From ea0a180bde325b0383ce7f0b3d48d1ce9e941393 Mon Sep 17 00:00:00 2001
From: John Naylor 
Date: Mon, 2 Jul 2018 12:52:07 +0700
Subject: [PATCH v5 1/2] Add pg_language lookup.

This didn't seem worth doing before, but an upcoming commit will add
88 entries to pg_proc whose languge is 'c'. In addition, we can
now use 'internal' in Gen_fmgr.pl instead of looking up the OID.
This simplifies that script and its Makefile a bit.
---
 doc/src/sgml/bki.sgml|  6 +-
 src/backend/catalog/genbki.pl|  8 +++
 src/backend/utils/Gen_fmgrtab.pl |  9 ++-
 src/backend/utils/Makefile   |  8 +--
 src/include/catalog/pg_proc.dat  | 98 
 src/include/catalog/pg_proc.h|  2 +-
 src/tools/msvc/Solution.pm   |  4 +-
 7 files changed, 68 insertions(+), 67 deletions(-)

diff --git a/doc/src/sgml/bki.sgml b/doc/src/sgml/bki.sgml
index 0fb309a1bd..176fa8bf4d 100644
--- a/doc/src/sgml/bki.sgml
+++ b/doc/src/sgml/bki.sgml
@@ -408,8 +408,8 @@
 that's error-prone and hard to understand, so for frequently-referenced
 catalogs, genbki.pl provides mechanisms to write
 symbolic references instead.  Currently this is possible for references
-to access methods, functions, operators, opclasses, opfamilies, and
-types.  The rules are as follows:
+to access methods, functions, languages, operators, opclasses, opfamilies,
+and types.  The rules are as follows:

 

@@ -419,7 +419,7 @@
   Use of symbolic references is enabled in a particular catalog column
   by attaching BKI_LOOKUP(lookuprule)
   to the column's definition, where lookuprule
-  is pg_am, pg_proc,
+  is pg_am, pg_proc, pg_language,
   pg_operator, pg_opclass,
   pg_opfamily, or pg_type.
   BKI_LOOKUP can be attached to columns of
diff --git a/src/backend/catalog/genbki.pl b/src/backend/catalog/genbki.pl
index 649200260a..f321662695 100644
--- a/src/backend/catalog/genbki.pl
+++ b/src/backend/catalog/genbki.pl
@@ -167,6 +167,13 @@ foreach my $row (@{ $catalog_data{pg_am} })
 	$amoids{ $row->{amname} } = $row->{oid};
 }
 
+# language OID lookup
+my %langoids;
+foreach my $row (@{ $catalog_data{pg_language} })
+{
+	$langoids{ $row->{lanname} } = $row->{oid};
+}
+
 # opclass OID lookup
 my %opcoids;
 foreach my $row (@{ $catalog_data{pg_opclass} })
@@ -241,6 +248,7 @@ foreach my $row (@{ $catalog_data{pg_type} })
 # Map catalog name to OID lookup.
 my %lookup_kind = (
 	pg_am   => \%amoids,
+	pg_language => \%langoids,
 	pg_opclass  => \%opcoids,
 	pg_operator => \%operoids,
 	pg_opfamily => \%opfoids,
diff --git a/src/backend/utils/Gen_fmgrtab.pl b/src/backend/utils/Gen_fmgrtab.pl
index fa30436895..b288c5e5fd 100644
--- a/src/backend/utils/Gen_fmgrtab.pl
+++ b/src/backend/utils/Gen_fmgrtab.pl
@@ -59,6 +59,8 @@ die "No include path; you must specify -I.\n" if !$include_path;
 # Note: We pass data file names as arguments and then look for matching
 # headers to parse the schema from. This is backwards from genbki.pl,
 # but the Makefile dependencies look more sensible this way.
+# We currently only need pg_proc, but retain the possibility of reading
+# more than one data file.
 my %catalogs;
 my %catalog_data;
 foreach my $datfile (@input_files)
@@ -78,13 +80,10 @@ foreach my $datfile (@input_files)
 	$catalog_data{$catname} = Catalog::ParseData($datfile, $schema, 0);
 }
 
-# Fetch some values for later.
+# Fetch a value for later.
 my $FirstBootstrapObjectId =
   Catalog::FindDefinedSymbol('access/transam.h', $include_path,
 	'FirstBootstrapObjectId');
-my $INTERNALlanguageId =
-  Catalog::FindDefinedSymbolFromData($catalog_data{pg_language},
-	'INTERNALlanguageId');
 
 # Collect certain fields from pg_proc.dat.
 my @fmgr = ();
@@ -94,7 +93,7 @@ foreach my $row (@{ $catalog_data{pg_proc} })
 	my %bki_values = %$row;
 
 	# Select out just the rows for internal-language procedures.
-	next if $bki_values{prolang} ne $INTERNALlanguageId;
+	next if $bki_values{prolang} ne 'internal';
 
 	push @fmgr,
 	  {
diff --git a/src/backend/utils/Makefile b/src/backend/utils/Makefile
index e797539d09..da40f6b6c4 100644
--- a/src/backend/utils/Makefile
+++ b/src/backend/utils/Makefile
@@ -31,15 +31,11 @@ generated-header-symlinks: $(top_builddir)/src/include/utils/header-stamp $(top_
 
 $(SUBDIRS:%=%-recursive): fmgr-stamp errcodes.h
 
-FMGR_DATA := $(addprefix $(top_srcdir)/src/include/catalog/,\
-	pg_language.dat pg_proc.dat \
-	)
-
 # fmgr-stamp records the last time we ran Gen_fmgrtab.pl.  We don't rely on
 # the timestamps of the individual output files, because the Perl script
 # won't update them if they didn't change (to avoid unnecessary recompiles).
-fmgr-stamp: Gen_fmgrtab.pl $(catalogdir)/Catalog.pm $(FMGR_DATA) $(top_srcdir)/src/include/access/transam.h
-	$(PERL) -I $(catalogdir) $< -I 

Re: inconsistency and inefficiency in setup_conversion()

2018-10-01 Thread Michael Paquier
Hi John,

On Mon, Jul 02, 2018 at 01:59:03PM +0700, John Naylor wrote:
> I've attached v4, which is a rebase plus some comment revisions.

v4 does not apply anymore.  I am moving this patch to next commit fest,
waiting on author.
--
Michael


signature.asc
Description: PGP signature


Re: inconsistency and inefficiency in setup_conversion()

2018-07-02 Thread John Naylor
I've attached v4, which is a rebase plus some comment revisions.

-John Naylor
From 4d1cb1d40c5c79c732e5433e95f8560fb41e20bd Mon Sep 17 00:00:00 2001
From: John Naylor 
Date: Mon, 2 Jul 2018 12:52:07 +0700
Subject: [PATCH v4 1/2] Add pg_language lookup.

This didn't seem worth doing before, but an upcoming commit will add
88 entries to pg_proc whose languge is 'c'. In addition, we can
now use 'internal' in Gen_fmgr.pl instead of looking up the OID.
This simplifies that script and its Makefile a bit.
---
 doc/src/sgml/bki.sgml|  6 +--
 src/backend/catalog/genbki.pl|  8 
 src/backend/utils/Gen_fmgrtab.pl |  9 ++--
 src/backend/utils/Makefile   |  8 +---
 src/include/catalog/pg_proc.dat  | 98 
 src/include/catalog/pg_proc.h|  2 +-
 src/tools/msvc/Solution.pm   |  4 +-
 7 files changed, 68 insertions(+), 67 deletions(-)

diff --git a/doc/src/sgml/bki.sgml b/doc/src/sgml/bki.sgml
index e3ba73a..ec7e617 100644
--- a/doc/src/sgml/bki.sgml
+++ b/doc/src/sgml/bki.sgml
@@ -405,8 +405,8 @@
 that's error-prone and hard to understand, so for frequently-referenced
 catalogs, genbki.pl provides mechanisms to write
 symbolic references instead.  Currently this is possible for references
-to access methods, functions, operators, opclasses, opfamilies, and
-types.  The rules are as follows:
+to access methods, functions, languages, operators, opclasses, opfamilies,
+and types.  The rules are as follows:

 

@@ -416,7 +416,7 @@
   Use of symbolic references is enabled in a particular catalog column
   by attaching BKI_LOOKUP(lookuprule)
   to the column's definition, where lookuprule
-  is pg_am, pg_proc,
+  is pg_am, pg_proc, pg_language,
   pg_operator, pg_opclass,
   pg_opfamily, or pg_type.
   BKI_LOOKUP can be attached to columns of
diff --git a/src/backend/catalog/genbki.pl b/src/backend/catalog/genbki.pl
index 9be51d2..8b0bcb9 100644
--- a/src/backend/catalog/genbki.pl
+++ b/src/backend/catalog/genbki.pl
@@ -167,6 +167,13 @@ foreach my $row (@{ $catalog_data{pg_am} })
 	$amoids{ $row->{amname} } = $row->{oid};
 }
 
+# language OID lookup
+my %langoids;
+foreach my $row (@{ $catalog_data{pg_language} })
+{
+	$langoids{ $row->{lanname} } = $row->{oid};
+}
+
 # opclass OID lookup
 my %opcoids;
 foreach my $row (@{ $catalog_data{pg_opclass} })
@@ -241,6 +248,7 @@ foreach my $row (@{ $catalog_data{pg_type} })
 # Map catalog name to OID lookup.
 my %lookup_kind = (
 	pg_am   => \%amoids,
+	pg_language => \%langoids,
 	pg_opclass  => \%opcoids,
 	pg_operator => \%operoids,
 	pg_opfamily => \%opfoids,
diff --git a/src/backend/utils/Gen_fmgrtab.pl b/src/backend/utils/Gen_fmgrtab.pl
index fa30436..d456e0b 100644
--- a/src/backend/utils/Gen_fmgrtab.pl
+++ b/src/backend/utils/Gen_fmgrtab.pl
@@ -59,6 +59,8 @@ die "No include path; you must specify -I.\n" if !$include_path;
 # Note: We pass data file names as arguments and then look for matching
 # headers to parse the schema from. This is backwards from genbki.pl,
 # but the Makefile dependencies look more sensible this way.
+# We currently only need pg_proc, but retain the possibility of reading
+# more than one data file.
 my %catalogs;
 my %catalog_data;
 foreach my $datfile (@input_files)
@@ -78,13 +80,10 @@ foreach my $datfile (@input_files)
 	$catalog_data{$catname} = Catalog::ParseData($datfile, $schema, 0);
 }
 
-# Fetch some values for later.
+# Fetch a value for later.
 my $FirstBootstrapObjectId =
   Catalog::FindDefinedSymbol('access/transam.h', $include_path,
 	'FirstBootstrapObjectId');
-my $INTERNALlanguageId =
-  Catalog::FindDefinedSymbolFromData($catalog_data{pg_language},
-	'INTERNALlanguageId');
 
 # Collect certain fields from pg_proc.dat.
 my @fmgr = ();
@@ -94,7 +93,7 @@ foreach my $row (@{ $catalog_data{pg_proc} })
 	my %bki_values = %$row;
 
 	# Select out just the rows for internal-language procedures.
-	next if $bki_values{prolang} ne $INTERNALlanguageId;
+	next if $bki_values{prolang} ne 'internal';
 
 	push @fmgr,
 	  {
diff --git a/src/backend/utils/Makefile b/src/backend/utils/Makefile
index e797539..da40f6b 100644
--- a/src/backend/utils/Makefile
+++ b/src/backend/utils/Makefile
@@ -31,15 +31,11 @@ generated-header-symlinks: $(top_builddir)/src/include/utils/header-stamp $(top_
 
 $(SUBDIRS:%=%-recursive): fmgr-stamp errcodes.h
 
-FMGR_DATA := $(addprefix $(top_srcdir)/src/include/catalog/,\
-	pg_language.dat pg_proc.dat \
-	)
-
 # fmgr-stamp records the last time we ran Gen_fmgrtab.pl.  We don't rely on
 # the timestamps of the individual output files, because the Perl script
 # won't update them if they didn't change (to avoid unnecessary recompiles).
-fmgr-stamp: Gen_fmgrtab.pl $(catalogdir)/Catalog.pm $(FMGR_DATA) $(top_srcdir)/src/include/access/transam.h
-	$(PERL) -I $(catalogdir) $< -I $(top_srcdir)/src/include/ $(FMGR_DATA)
+fmgr-stamp: Gen_fmgrtab.pl $(catalogdir)/Catalog.pm 

Re: inconsistency and inefficiency in setup_conversion()

2018-05-18 Thread Thomas Munro
On Fri, May 18, 2018 at 10:53 PM, John Naylor  wrote:
> On 5/17/18, Thomas Munro  wrote:
>> Hi John,
>>
>> This failed for my patch testing robot on Windows, with this message[1]:
> ...
>> I see that you changed src/backend/catalog/Makefile to pass the new -I
>> switch to genbki.pl.  I think for Windows you might need to add it to
>> the line in src/tools/msvc/Solution.pm that invokes genbki.pl via
>> system()?
>
> Yes, you're quite right. Thanks for the report. I've attached an
> updated v2 patchset, with some additional revisions:

Looks good on Windows[1] but now it's broken on Linux[2] (also
reproducible on my macOS laptop):

make[2]: *** No rule to make target
`../../../src/include/pg_proc.dat', needed by `fmgr-stamp'.  Stop.

[1] https://ci.appveyor.com/project/postgresql-cfbot/postgresql/build/1.0.75
[2] https://travis-ci.org/postgresql-cfbot/postgresql/builds/380898356

-- 
Thomas Munro
http://www.enterprisedb.com



Re: inconsistency and inefficiency in setup_conversion()

2018-05-18 Thread John Naylor
On 5/17/18, Thomas Munro  wrote:
> Hi John,
>
> This failed for my patch testing robot on Windows, with this message[1]:
...
> I see that you changed src/backend/catalog/Makefile to pass the new -I
> switch to genbki.pl.  I think for Windows you might need to add it to
> the line in src/tools/msvc/Solution.pm that invokes genbki.pl via
> system()?

Yes, you're quite right. Thanks for the report. I've attached an
updated v2 patchset, with some additional revisions:

I wrote:

> -I have not done performance testing of initdb yet. I'll do so at a
> later date unless someone is excited enough to beat me to it.

Tom Lane reported [1] that setup_conversion() took ~70ms out of
1300ms. With this patch, on hardware with a similar total runtime, the
total is ~65ms faster, as one would expect based on the number of new
entries.

> -I piggy-backed on the OID lookup machinery for the encoding lookup,
> but haven't changed all the comments that refer only to catalogs and
> OIDs.

I considered changing the SGML documentation referring to OID
references [2], since this patch invalidates some details, but in the
end I thought that would make the docs harder to follow for the sake
of a small corner case. Instead, I added some comments around the
encoding lookups to alert the reader we're repurposing the OID lookup
machinery. I can revisit this later if desired.

> -With the 88 pg_proc entries with prolang=13 along with the 50 or so
> with prolang=14, it might be worth it to create a language lookup.
> This patch does not do so, however.

In version 2, patch 0001 adds a pg_language lookup. It turns out doing
so has a side benefit of simplifying Gen_fmgr.pl and its Makefile,
too.

--
[1] https://www.postgresql.org/message-id/7408.1525812528%40sss.pgh.pa.us
[2] 
https://www.postgresql.org/docs/devel/static/system-catalog-initial-data.html#SYSTEM-CATALOG-OID-REFERENCES


-John Naylor
From 686dd18ae238e1660ed3b2a7f6071cd5af20369d Mon Sep 17 00:00:00 2001
From: John Naylor 
Date: Fri, 18 May 2018 16:43:22 +0700
Subject: [PATCH v2 1/2] Add pg_language lookup.

This didn't seem worth doing before, but an upcoming commit will add
88 entries to pg_proc whose languge is 'c'. In addition, we can
now use 'internal' in Gen_fmgr.pl instead of looking up the OID.
This simplifies that script and its Makefile a bit.
---
 doc/src/sgml/bki.sgml|  6 +--
 src/backend/catalog/genbki.pl|  8 
 src/backend/utils/Gen_fmgrtab.pl |  9 ++--
 src/backend/utils/Makefile   |  8 +---
 src/include/catalog/pg_proc.dat  | 98 
 src/include/catalog/pg_proc.h|  2 +-
 src/tools/msvc/Solution.pm   |  4 +-
 7 files changed, 68 insertions(+), 67 deletions(-)

diff --git a/doc/src/sgml/bki.sgml b/doc/src/sgml/bki.sgml
index 5b557ff..fa96bd3 100644
--- a/doc/src/sgml/bki.sgml
+++ b/doc/src/sgml/bki.sgml
@@ -403,8 +403,8 @@
 that's error-prone and hard to understand, so for frequently-referenced
 catalogs, genbki.pl provides mechanisms to write
 symbolic references instead.  Currently this is possible for references
-to access methods, functions, operators, opclasses, opfamilies, and
-types.  The rules are as follows:
+to access methods, functions, languages, operators, opclasses, opfamilies,
+and types.  The rules are as follows:

 

@@ -414,7 +414,7 @@
   Use of symbolic references is enabled in a particular catalog column
   by attaching BKI_LOOKUP(lookuprule)
   to the column's definition, where lookuprule
-  is pg_am, pg_proc,
+  is pg_am, pg_proc, pg_language,
   pg_operator, pg_opclass,
   pg_opfamily, or pg_type.
   BKI_LOOKUP can be attached to columns of
diff --git a/src/backend/catalog/genbki.pl b/src/backend/catalog/genbki.pl
index fb61db0..0440fdc 100644
--- a/src/backend/catalog/genbki.pl
+++ b/src/backend/catalog/genbki.pl
@@ -167,6 +167,13 @@ foreach my $row (@{ $catalog_data{pg_am} })
 	$amoids{ $row->{amname} } = $row->{oid};
 }
 
+# language OID lookup
+my %langoids;
+foreach my $row (@{ $catalog_data{pg_language} })
+{
+	$langoids{ $row->{lanname} } = $row->{oid};
+}
+
 # opclass OID lookup
 my %opcoids;
 foreach my $row (@{ $catalog_data{pg_opclass} })
@@ -241,6 +248,7 @@ foreach my $row (@{ $catalog_data{pg_type} })
 # Map catalog name to OID lookup.
 my %lookup_kind = (
 	pg_am   => \%amoids,
+	pg_language => \%langoids,
 	pg_opclass  => \%opcoids,
 	pg_operator => \%operoids,
 	pg_opfamily => \%opfoids,
diff --git a/src/backend/utils/Gen_fmgrtab.pl b/src/backend/utils/Gen_fmgrtab.pl
index 5fd5313..0d7bd83 100644
--- a/src/backend/utils/Gen_fmgrtab.pl
+++ b/src/backend/utils/Gen_fmgrtab.pl
@@ -59,6 +59,8 @@ die "No include path; you must specify -I at least once.\n" if !@include_path;
 # Note: We pass data file names as arguments and then look for matching
 # headers to parse the schema from. This is backwards from genbki.pl,
 # but the Makefile 

Re: inconsistency and inefficiency in setup_conversion()

2018-05-17 Thread Thomas Munro
On Thu, May 3, 2018 at 12:19 AM, John Naylor  wrote:
> Attached is a draft patch to do this, along with the conversion script
> used to create the entries. In writing this, a few points came up that
> are worth bringing up:

Hi John,

This failed for my patch testing robot on Windows, with this message[1]:

Generating pg_config_paths.h...
No include path; you must specify -I.
Could not open src/backend/catalog/schemapg.h at
src/tools/msvc/Mkvcbuild.pm line 832.

I see that you changed src/backend/catalog/Makefile to pass the new -I
switch to genbki.pl.  I think for Windows you might need to add it to
the line in src/tools/msvc/Solution.pm that invokes genbki.pl via
system()?

[1] https://ci.appveyor.com/project/postgresql-cfbot/postgresql/build/1.0.48

-- 
Thomas Munro
http://www.enterprisedb.com



Re: inconsistency and inefficiency in setup_conversion()

2018-05-02 Thread John Naylor
On 4/28/18, Tom Lane  wrote:
> John Naylor  writes:
>> Solution #1 - As alluded to in [1], turn the conversions into
>> pg_proc.dat and pg_conversion.dat entries. Teach genbki.pl to parse
>> pg_wchar.h to map conversion names to numbers.
>> Pros:
>> -likely easy to do
>> -allows for the removal of an install target in the Makefile as well
>> as ad hoc logic in MSVC
>> -uses a format that developers need to use anyway
>> Cons:
>> -immediately burns up 88 hard-coded OIDs and one for each time a
>> conversion proc is created
>> -would require editing data in two catalogs every time a conversion
>> proc is created
>
> Given the rate at which new conversion procs have been created
> historically (ie, next door to zero, after the initial feature addition),
> I don't think that second "con" argument has any force.  Eating a batch
> of manually-assigned OIDs seems risky mainly just in that it might force
> adjustment of pending patches --- but we deal with that all the time.
> So I like this answer, I think.

Attached is a draft patch to do this, along with the conversion script
used to create the entries. In writing this, a few points came up that
are worth bringing up:

-In the original SQL file the functions were not declared with an
explicit volatility, so by default they are 'volatile'. That seems
wrong for this kind of function, so I changed it to 'immutable'. It
seems the CREATE CONVERSION facility was created shortly after the
volatility classes were created, and I couldn't find any discussion
about it.

-I have not done performance testing of initdb yet. I'll do so at a
later date unless someone is excited enough to beat me to it.

-I piggy-backed on the OID lookup machinery for the encoding lookup,
but haven't changed all the comments that refer only to catalogs and
OIDs.

-With the 88 pg_proc entries with prolang=13 along with the 50 or so
with prolang=14, it might be worth it to create a language lookup.
This patch does not do so, however.

-This actually uses up 220 OIDs (88 + 132), since the conversions need
them for their comments to be loaded.

> However, there is a "con" you didn't mention that perhaps ought to be
> accounted for.  The way things are done now, neither these C functions
> nor the pg_conversion entries are "pinned"; it's possible to drop and/or
> recreate them.  That perhaps had significant value during development
> of the conversions feature, but I'm doubtful that it's worth much
> anymore.  Still, it's worth pointing out in case somebody disagrees.

-For this draft, I let them get pinned, and changed the sanity test to
reflect that. It'd be easy enough to add exceptions to setup_depend(),
though. (one for pg_conversion, and one to change the pg_proc query to
exclude C language functions)

I'll create a commitfest entry soon.

-John Naylor
From 27605cd5ced2fd0a69598de93491cfb20a74836c Mon Sep 17 00:00:00 2001
From: John Naylor 
Date: Wed, 2 May 2018 17:50:53 +0700
Subject: [PATCH v1] Replace ad hoc format for conversion functions

Convert info for conversion functions into entries in pg_proc.dat
and pg_conversion.dat. This fixes wrong comments on the functions
and removes cruft from utils/mb/conversion_procs/Makefile,
initdb.c, and msvc/Install.pm.

Functional changes:
1. Conversions are now pinned. This can be reverted, but it's not
clear there would be any benefit in doing so.
2. The functions are now declared IMMUTABLE.
---
 src/backend/catalog/Makefile   |   2 +-
 src/backend/catalog/genbki.pl  |  56 +++-
 src/backend/utils/mb/conversion_procs/Makefile | 177 +--
 src/bin/initdb/initdb.c|  26 --
 src/include/catalog/pg_conversion.dat  | 417 +
 src/include/catalog/pg_conversion.h|  42 +--
 src/include/catalog/pg_proc.dat| 408 
 src/test/regress/expected/misc_sanity.out  |   1 -
 src/tools/msvc/Install.pm  |  38 ---
 9 files changed, 906 insertions(+), 261 deletions(-)
 create mode 100644 src/include/catalog/pg_conversion.dat

diff --git a/src/backend/catalog/Makefile b/src/backend/catalog/Makefile
index a54197d..c987ca0 100644
--- a/src/backend/catalog/Makefile
+++ b/src/backend/catalog/Makefile
@@ -85,7 +85,7 @@ generated-header-symlinks: $(top_builddir)/src/include/catalog/header-stamp
 # instead is cheating a bit, but it will achieve the goal of updating the
 # version number when it changes.
 bki-stamp: genbki.pl Catalog.pm $(POSTGRES_BKI_SRCS) $(POSTGRES_BKI_DATA) $(top_srcdir)/configure.in
-	$(PERL) -I $(catalogdir) $< --set-version=$(MAJORVERSION) $(POSTGRES_BKI_SRCS)
+	$(PERL) -I $(catalogdir) $< -I $(top_srcdir)/src/include/ --set-version=$(MAJORVERSION) $(POSTGRES_BKI_SRCS)
 	touch $@
 
 # The generated headers must all be symlinked into builddir/src/include/,
diff --git a/src/backend/catalog/genbki.pl b/src/backend/catalog/genbki.pl
index 

Re: inconsistency and inefficiency in setup_conversion()

2018-04-28 Thread Tom Lane
John Naylor  writes:
> Taking a close look at the result of setup_conversion(), wrong or at
> least confusing comments are applied to the functions.

Ugh.  Between that and the large chunk of initdb runtime eaten by
setup_conversion(), that seems like plenty of reason to redo it.

> Solution #1 - As alluded to in [1], turn the conversions into
> pg_proc.dat and pg_conversion.dat entries. Teach genbki.pl to parse
> pg_wchar.h to map conversion names to numbers.
> Pros:
> -likely easy to do
> -allows for the removal of an install target in the Makefile as well
> as ad hoc logic in MSVC
> -uses a format that developers need to use anyway
> Cons:
> -immediately burns up 88 hard-coded OIDs and one for each time a
> conversion proc is created
> -would require editing data in two catalogs every time a conversion
> proc is created

Given the rate at which new conversion procs have been created
historically (ie, next door to zero, after the initial feature addition),
I don't think that second "con" argument has any force.  Eating a batch
of manually-assigned OIDs seems risky mainly just in that it might force
adjustment of pending patches --- but we deal with that all the time.
So I like this answer, I think.

However, there is a "con" you didn't mention that perhaps ought to be
accounted for.  The way things are done now, neither these C functions
nor the pg_conversion entries are "pinned"; it's possible to drop and/or
recreate them.  That perhaps had significant value during development
of the conversions feature, but I'm doubtful that it's worth much
anymore.  Still, it's worth pointing out in case somebody disagrees.

regards, tom lane