On 10/26/17 16:10, Tom Lane wrote:
> Peter Eisentraut <peter.eisentr...@2ndquadrant.com> writes:
>> On 10/16/17 03:19, Thomas Kellerer wrote:
>>> I don't know if this is intentional, but the Postgres 10 manual started to
>>> use lowercase IDs as anchors in the manual.
>
>> Here is a patch that can be applied to PG 10 to put the upper case
>> anchors back.
>> The question perhaps is whether we want to maintain this patch
>> indefinitely, or whether a clean break is better.
>
> In view of commit 1ff01b390, aren't we more or less locked into
> lower-case anchors going forward? I'm not sure I see the point
> of changing v10 back to the old way if v11 will be incompatible
> anyhow.

The details are more complicated.

The IDs in DocBook documents have two purposes. One is to ensure non-broken links between things like <sect1 id="foo"> and <xref linkend="foo">. This is set up in the DTD and checked during parsing (validation, more precisely). In DocBook SGML, many things, including tag names, attribute names, and IDs, are case insensitive. But in DocBook XML, everything is case sensitive. So in order to make things compatible for a conversion, we had to consolidate some variant spellings that had accumulated in our sources. For simplicity, I have converted everything to lower case.

The other purpose is that the DocBook XSL and DSSSL stylesheets use the IDs for creating anchors in HTML documents (and also for the HTML file names themselves). This is merely a useful choice of those stylesheets.

In PG 9.6 and earlier, we used a straight SGML toolchain, using Jade and DSSSL. The internal representation of a DocBook SGML document after parsing converts all the case-insensitive bits to upper case. (This might be configured somewhere; I'm not sure.) So the stylesheets see all the IDs as upper case to begin with, and that's why all the anchors come out in upper case in the HTML output.

In PG 10, the build first converts the SGML sources to XML, redeclares them as DocBook XML, then builds using XSLT. Because DocBook XML requires lower-case tags and attribute names, we have to use the osx -x lower option to convert all the case-insensitive bits to lower case instead of the default upper case. That's why the XSLT stylesheets see the IDs as lower case, and that's why they are like that in the output. (If there were options more detailed than -x lower, that could have been useful.)

The proposed patch works much later in the build process and converts IDs to upper case only when they are being considered for making an HTML anchor. The structure of the document as far as the XML parser is concerned stays the same.

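To make the difference between the two toolchains concrete, here is a rough sketch in Python. It is only a model of the case handling described above, not the build code: the function names, the toolchain labels, and the mixed-case ID "SQL-Select" are all made up for illustration.

```python
# Illustrative model only. These functions are hypothetical and do not
# correspond to anything in the PostgreSQL documentation build.


def id_seen_by_stylesheet(source_id: str, toolchain: str) -> str:
    """Return the ID as the stylesheets would see it after parsing."""
    if toolchain == "jade":
        # PG 9.6 and earlier: the SGML parse folds case-insensitive
        # names, including IDs, to upper case.
        return source_id.upper()
    if toolchain == "osx-lower":
        # PG 10: the SGML-to-XML conversion with "osx -x lower" folds
        # them to lower case instead.
        return source_id.lower()
    raise ValueError(f"unknown toolchain: {toolchain!r}")


def html_anchor(stylesheet_id: str, uppercase_patch: bool = False) -> str:
    """Return the anchor name written to the HTML output.

    With uppercase_patch=True this models the proposed patch: the ID is
    upper-cased only at the point where the anchor is generated, and the
    ID in the parsed document is left untouched.
    """
    return stylesheet_id.upper() if uppercase_patch else stylesheet_id


if __name__ == "__main__":
    source_id = "SQL-Select"  # made-up mixed-case spelling in the source

    pg96 = html_anchor(id_seen_by_stylesheet(source_id, "jade"))
    pg10 = html_anchor(id_seen_by_stylesheet(source_id, "osx-lower"))
    pg10_patched = html_anchor(
        id_seen_by_stylesheet(source_id, "osx-lower"), uppercase_patch=True
    )
    print(pg96, pg10, pg10_patched)
```

In this model the patched PG 10 build produces the same anchor as the old Jade build, while the ID that the parser validates links against is still the lower-case one.
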
For PG 11, the idea is to convert the sources to a pure XML document. XML is case sensitive, so the XML parser would see the IDs as what they are. Without the commit mentioned above, which converts all IDs to lower case in the source, the XSL processor would see the IDs in whatever case they are written, and anchors would end up in the HTML output in that same case. So the conversion to lower case in the source also ensures anchor compatibility with PG 10. Otherwise, someone might well have complained in a similar manner a year from now.

Applying the proposed patch to master/PG 11 would have the same effect as in PG 10. It would convert anchors to upper case in the HTML output but leave the logical and physical structure of the XML document alone.

So the options are simply

1) Use the patch and keep it indefinitely, keeping anchors compatible with all earlier releases and with future ones.

2) Don't use the patch, breaking anchors from <=9.6, but keeping them compatible going forward.

Considering how small the patch is compared to some other customizations we carry, #1 seems reasonable to me. I just didn't know to what extent people had actually bookmarked fragment links.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services