Hi, On Mon, 16 Feb 2026 at 03:47, Hayato Kuroda (Fujitsu) <[email protected]> wrote: > > Dear Nazir, > > I created a similar patch with yours because I did not recognize this thread. > Not sure you are still interested in, but here are my comments based on my > experience.
Thank you for looking into this! > 00. > Needs rebase. Done. > 01. > I found that postgres core was built when I ran `meson test --suite doc` > command, but ideally it's not needed. Per my experiment, we must clarify > dependencies for the test, something like below. > > ``` > --- a/doc/src/sgml/meson.build > +++ b/doc/src/sgml/meson.build > @@ -337,7 +337,8 @@ test( > 'doc' / 'sgml_syntax_check', > meson_bin, > args: ['compile', postgres_full_xml_target], > - suite: 'doc' > + suite: 'doc', > + depends: doc_generated, > ) > ``` Sorry, I didn't understand what you meant, could you please elaborate? 'postgres_full_xml_target' target already depends on 'doc_generated' and 'sgml_check_stamp' targets. > 02. > Missing update of copyright. Done. > 03. > I found that for the autoconf build system, syntax checking cannot work again > if nothing is changed. This behavior is different from the HEAD. Per my > experiment, below can fix the issue. > > ``` > --- a/doc/src/sgml/Makefile > +++ b/doc/src/sgml/Makefile > @@ -201,6 +201,7 @@ MAKEINFO = makeinfo > ## > > # Quick syntax check without style processing > +.PHONY: postgres-full.xml > check: postgres-full.xml > ``` Yes, there is a behavior change but I don't think this is an issue. If nothing is changed, syntax check just reports it as: ``` $ make check make: Nothing to be done for 'check'. ``` > 04. > `make check` is slightly slower than HEAD. I think it is OK but others can > also > confirm and can test on their env. > > HEAD: > ``` > $ time make check > ... > > real 0m0.592s > user 0m0.507s > sys 0m0.084s > ``` > > patched: > ``` > $ time make check > ... > > real 0m1.217s > user 0m0.973s > sys 0m0.237s > ``` It is slower on my end too: HEAD: real 0m0.344s user 0m0.301s sys 0m0.041s Patched: real 0m0.451s user 0m0.373s sys 0m0.076s The difference is not that big on my end, though. I think this is expected and acceptable but I would like to hear other people's thoughts too. -- Regards, Nazir Bilal Yavuz Microsoft
From 71071e4ce5eacce5d63ec25508029425716911bf Mon Sep 17 00:00:00 2001 From: Nazir Bilal Yavuz <[email protected]> Date: Mon, 16 Feb 2026 09:54:13 +0300 Subject: [PATCH v9] Improve docs syntax checking Move the checks out of the Makefile into a perl script that can be called from both the Makefile and meson.build. The set of files checked is simplified, so it is just all the sgml and xsl files found in docs/src/sgml directory tree. Along the way make some adjustments to .cirrus.tasks.yml to support this better in CI. Also ensure that the checks are run while generating the postgres-full.xml. Author: Nazir Bilal Yavuz <[email protected]> Co-Author: Andrew Dunstan <[email protected]> Reviewed-by: Peter Eisentraut <[email protected]> Reviewed-by: Hayato Kuroda <[email protected]> Discussion: https://postgr.es/m/CAN55FZ1qzoDcaKqsR3DwE%3DX6FL%2Bwpm%2B%3DKLvH6ahrRXNhjU53DQ%40mail.gmail.com --- .cirrus.tasks.yml | 3 + doc/src/sgml/Makefile | 16 +---- doc/src/sgml/meson.build | 35 ++++++++- doc/src/sgml/sgml_syntax_check.pl | 115 ++++++++++++++++++++++++++++++ 4 files changed, 154 insertions(+), 15 deletions(-) create mode 100755 doc/src/sgml/sgml_syntax_check.pl diff --git a/.cirrus.tasks.yml b/.cirrus.tasks.yml index 2a821593ce5..78f167656f8 100644 --- a/.cirrus.tasks.yml +++ b/.cirrus.tasks.yml @@ -640,6 +640,8 @@ task: TEST_JOBS: 8 IMAGE: ghcr.io/cirruslabs/macos-runner:sequoia + XML_CATALOG_FILES: /opt/local/share/xml/docbook/4.5/catalog.xml + CIRRUS_WORKING_DIR: ${HOME}/pgsql/ CCACHE_DIR: ${HOME}/ccache MACPORTS_CACHE: ${HOME}/macports-cache @@ -654,6 +656,7 @@ task: MACOS_PACKAGE_LIST: >- ccache + docbook-xml-4.5 icu kerberos5 lz4 diff --git a/doc/src/sgml/Makefile b/doc/src/sgml/Makefile index b53b2694a6b..50de8fab931 100644 --- a/doc/src/sgml/Makefile +++ b/doc/src/sgml/Makefile @@ -69,6 +69,7 @@ ALL_IMAGES := $(wildcard $(srcdir)/images/*.svg) # files into one big file). This helps tools that don't understand # vpath builds (such as dbtoepub). postgres-full.xml: postgres.sgml $(ALL_SGML) + $(PERL) $(srcdir)/sgml_syntax_check.pl --srcdir $(srcdir) $(XMLLINT) $(XMLINCLUDE) --output $@ --noent --valid $< @@ -200,8 +201,7 @@ MAKEINFO = makeinfo ## # Quick syntax check without style processing -check: postgres.sgml $(ALL_SGML) check-tabs check-nbsp - $(XMLLINT) $(XMLINCLUDE) --noout --valid $< +check: postgres-full.xml ## @@ -261,18 +261,6 @@ clean-man: endif # sqlmansectnum != 7 -# tabs are harmless, but it is best to avoid them in SGML files -check-tabs: - @( ! grep ' ' $(wildcard $(srcdir)/*.sgml $(srcdir)/func/*.sgml $(srcdir)/ref/*.sgml $(srcdir)/*.xsl) ) || \ - (echo "Tabs appear in SGML/XML files" 1>&2; exit 1) - -# Non-breaking spaces are harmless, but it is best to avoid them in SGML files. -# Use perl command because non-GNU grep or sed could not have hex escape sequence. -check-nbsp: - @ ( $(PERL) -ne '/\xC2\xA0/ and print("$$ARGV:$$_"),$$n++; END {exit($$n>0)}' \ - $(wildcard $(srcdir)/*.sgml $(srcdir)/func/*.sgml $(srcdir)/ref/*.sgml $(srcdir)/*.xsl $(srcdir)/images/*.xsl) ) || \ - (echo "Non-breaking spaces appear in SGML/XML files" 1>&2; exit 1) - ## ## Clean ## diff --git a/doc/src/sgml/meson.build b/doc/src/sgml/meson.build index a1ae5c54ed6..a0051b97825 100644 --- a/doc/src/sgml/meson.build +++ b/doc/src/sgml/meson.build @@ -98,6 +98,26 @@ xmllint = xmltools_wrapper + [ '--tool', xmllint_bin, '--', ] +sgml_syntax_check = files( + 'sgml_syntax_check.pl' +) + +# Create a stamp file for sgml syntax check and make postgres_full_xml +# depend on it. So, this check will run before the generation of +# postgres_full_xml. +sgml_check_stamp = custom_target('sgml-check-stamp', + output: 'sgml-check-stamp', + depfile: 'sgml-check-stamp.d', + command: [ + perl, sgml_syntax_check, + '--srcdir', meson.current_source_dir(), + '--builddir', meson.current_build_dir(), + '--dep_file', '@DEPFILE@', + '--stamp_file', '@OUTPUT@', + ], + build_by_default: false, +) + # Run validation only once, common to all subsequent targets. While # we're at it, also resolve all entities (that is, copy all included # files into one big file). This helps tools that don't understand @@ -108,12 +128,15 @@ postgres_full_xml = custom_target('postgres-full.xml', depfile: 'postgres-full.xml.d', command: [xmllint, '--nonet', '--noent', '--valid', '--path', '@OUTDIR@', '-o', '@OUTPUT@', '@INPUT@'], - depends: doc_generated, + depends: [doc_generated, sgml_check_stamp], build_by_default: false, ) docs += postgres_full_xml alldocs += postgres_full_xml +postgres_full_xml_target = 'postgres_full_xml' +alias_target(postgres_full_xml_target, postgres_full_xml) + if xsltproc_bin.found() xsltproc_flags = [ @@ -308,3 +331,13 @@ endif if alldocs.length() != 0 alias_target('alldocs', alldocs) endif + +# Test creation of 'postgres_full_xml'. +test( + 'doc' / 'sgml_syntax_check', + meson_bin, + args: ['compile', postgres_full_xml_target], + suite: 'doc' +) + +testprep_targets += doc_generated diff --git a/doc/src/sgml/sgml_syntax_check.pl b/doc/src/sgml/sgml_syntax_check.pl new file mode 100755 index 00000000000..61b4cd203c0 --- /dev/null +++ b/doc/src/sgml/sgml_syntax_check.pl @@ -0,0 +1,115 @@ +#!/usr/bin/perl +# +# Perl script that checks SGML/XSL syntax and formatting issues. +# +# doc/src/sgml/sgml_syntax_check.pl +# +# Copyright (c) 2026, PostgreSQL Global Development Group + +use strict; +use warnings FATAL => 'all'; +use Getopt::Long; + +use File::Find; + +my $srcdir; +my $builddir; +my $dep_file; +my $stamp_file; + +GetOptions( + 'srcdir:s' => \$srcdir, + 'builddir:s' => \$builddir, + 'dep_file:s' => \$dep_file, + 'stamp_file:s' => \$stamp_file) or die "$0: wrong arguments"; + +die "$0: --srcdir must be specified\n" unless defined $srcdir; + +# Find files to process - all the sgml and xsl files (including in subdirectories) +my @files_to_process; +my @dirs_to_search = ($srcdir); +push @dirs_to_search, $builddir if defined $builddir; +find( + sub { + return unless -f $_; + return if $_ !~ /\.(sgml|xsl)$/; + push @files_to_process, $File::Find::name; + }, + @dirs_to_search,); + +# tabs and non-breaking spaces are harmless, but it is best to avoid them in SGML files +sub check_tabs_and_nbsp +{ + my $errors = 0; + for my $f (@files_to_process) + { + open my $fh, "<:encoding(UTF-8)", $f or die "Can't open $f: $!"; + my $line_no = 0; + while (<$fh>) + { + $line_no++; + if (/\t/) + { + print STDERR "Tab found in $f:$line_no\n"; + $errors++; + } + if (/\xC2\xA0/) + { + print STDERR "$f:$line_no: contains non-breaking space\n"; + $errors++; + } + } + close($fh); + } + + if ($errors) + { + remove_stamp_file(); + die "Tabs and/or non-breaking spaces appear in SGML/XML files\n"; + } +} + +sub write_dep_file() +{ + open my $fh, '>', $dep_file or die "can't open $dep_file: $!"; + + foreach my $file (@files_to_process) + { + print $fh "$dep_file : $file\n"; + } + + close $fh; +} + +sub create_stamp_file +{ + # Avoid touching existing stamp file to prevent unnecessary rebuilds + if (!(-f $stamp_file)) + { + open my $fh, '>', $stamp_file + or die "can't open $stamp_file: $!"; + close $fh; + } +} + +sub remove_stamp_file +{ + if ($stamp_file) + { + unlink $stamp_file; + } +} + +# Create a dependency file for meson build +if ($dep_file) +{ + write_dep_file(); +} + +check_tabs_and_nbsp(); + +# All checks are passed, we can create a stamp file meson build now +if ($stamp_file) +{ + create_stamp_file(); +} -- 2.47.3
