On 20.04.2016 20:41, Simon Riggs wrote:
On 20 April 2016 at 15:30, Jürgen Purtz <juer...@purtz.de> wrote:

    What I have done so far is:

      * Conversion of sgml files to valid xml syntax with a Perl
        script. I failed to use 'osx' or 'spam'.
      * Conversion of these xml files to Docbook5.x format using
        xsltproc and Docbooks xslt-migration scripts.
      * Creation of html files using xsltproc and Docbooks xslt scripts.
      * Creation of fo files using xsltproc and Docbooks xslt scripts.
      * Creation of pdf files using fop.
      * The conversion needs less than 10 minutes on an Intel i5
        processor.

So you believe you have/can convert between the two formats accurately, so we can change things in a single commit?

What verification is offered? Possible?

And that is ready to go now? Will you post your perl script, or the patch? Other projects use the same file formats, e.g. Slony, XL etc

If an automatic migration is possible do we need to change at all?

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Hi,

so far I have only done a first rough round trip to check that there is no showstopper for my plans. If the community reaches a consensus that this work is valuable for the Postgres documentation, I will continue to work on it in the near future. To answer your questions:

 * "do we need to change at all?". This question has to be discussed in
   the community. I tried to use the recommended tools like 'osx' and
   'spam' and failed (not completely, but in details like newline
   processing). This may be my fault, or it may result from the fact
   that we still use sgml instead of xml. But over time this task will
   get harder and harder: sgml knowledge gets lost, sgml tools are no
   longer actively developed, xml moves forward, ...
 * Currently I don't see any showstopper. Therefore I believe that the
   conversion from Docbook 4 to 5 is manageable. The plan is that we
   will have one xml file in db5 format for every sgml file in db4 format.
 * To keep the repository history continuous we should do something
   like 'git mv file.sgml file.xml', put the new content into 'file.xml'
   and 'git commit'. Additionally, the newlines must be kept during all
   conversion steps.
 * Some individual (manual) steps may be necessary, but it should be
   possible to script these as well. Therefore the conversion should
   run fast, and a single commit should cover the complete
   documentation.
 * There are no special "Postgres" tasks in the Perl script or in any
   other place. It depends on Docbook only. Therefore other projects
   can use it in the same way. Of course I will publish all sources.
 * Currently I am trying to generate well-formed xml. Validation against
   the Docbook 5 schema will follow.
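
The two checks mentioned above (well-formed xml, and newlines kept across the conversion) could be scripted roughly as follows. This is only a minimal sketch in Python, not the Perl script discussed in this thread: the function names and the two sample strings are made up for illustration, and comparing line counts is just a cheap proxy for "newlines were kept". It does not validate against the Docbook 5 schema.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text: str) -> bool:
    """Return True if xml_text parses as well-formed XML."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True


def same_line_count(before: str, after: str) -> bool:
    """Cheap proxy for 'newlines were kept': both texts have equally many lines."""
    return len(before.splitlines()) == len(after.splitlines())


# Hypothetical samples: SGML allows an empty element without a closing
# slash, XML does not.
sgml_sample = '<para>\nSee <xref linkend="intro">.\n</para>\n'
xml_sample = '<para>\nSee <xref linkend="intro"/>.\n</para>\n'

if __name__ == "__main__":
    print("sgml well-formed as xml:", is_well_formed(sgml_sample))
    print("converted well-formed:  ", is_well_formed(xml_sample))
    print("line count preserved:   ", same_line_count(sgml_sample, xml_sample))
```

A check like this would only catch syntax-level problems; whether the converted content is equivalent to the original still needs a separate comparison, for example of the generated html.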



Alexander Law posted additional suggestions and questions:

   Hello Jürgen,

   Please look at the discussion that we had some time ago:
   http://www.postgresql.org/message-id/56337365.2080...@postgrespro.ru

   And we (postgrespro) still have plans to migrate to XML as soon as
   we get documentation translated.
   We had no issues with SGML->XML conversion, "make postgres.xml"
   creates XML (with entities and alike), which we use.

   When you are talking about "conversion of html, fo, pdf, ..." do you
   mean using docs/sgml/Makefile or some other scripts?

   As to the conversion of SGML to XML, we need to decide whether to
   generate a single XML, or a set of XMLs (corresponding to current SGMLs).
   In the latter case - how to include XML-fragments into the main
   document (as entities or with xi:include)?

   Please, can you explain what are "Docbooks xslt-migration scripts"?
   Is Docbook 4.x incompatible with Docbook 5.x and we need to convert
   it additionally?


   Best regards,
   Alexander

   -----
   Alexander Lakhin
   Postgres Professional: http://www.postgrespro.com
   The Russian Postgres Company


My answers:

 * Docbook 4 and 5 are not compatible. There are new elements, others
   are gone and have been replaced by more generic ones. But the Docbook
   project offers xslt stylesheets to convert Docbook 4 xml files to
   Docbook 5 xml files.
 * There are pros and cons to using postgres.xml as a starting point. PRO:
   well-formed (and valid?) xml format. Entities are preserved. No more
   "<![CDATA[", "<![%include" and similar sgml constructs. CON: Only
   one file. Ugly line break algorithm.
 * Currently I don't use the existing Makefile. I start Perl, xsltproc
   and fop from a separate script. If I continue the work, I will have
   to change the Makefile.
 * "how to include XML-fragments into the main document (as entities or
   with xi:include)?". As described above, I prefer one file per
   existing sgml file. But some of those sgml files have more than one
   root element. In such situations (and without further processing)
   the resulting xml files will be fragments. In general it will be
   more "Docbook 5 compliant" to use xi:include instead of entities.
 * "Docbooks xslt-migration scripts": see:
   http://docbook.org/docs/howto/#convert4to5
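
To illustrate the xi:include variant, here is a minimal sketch in Python (standard library only). The file name "intro.xml", the book content and the in-memory loader are assumptions made up for this example; they are not taken from the Postgres sources. The main document references one chapter file via xi:include, and the include is resolved into a single tree.

```python
import xml.etree.ElementTree as ET
from xml.etree import ElementInclude

DOCBOOK_NS = "http://docbook.org/ns/docbook"

# Hypothetical per-chapter file, kept in memory so the sketch is self-contained.
FILES = {
    "intro.xml": (
        f'<chapter xmlns="{DOCBOOK_NS}">'
        "<title>Introduction</title>"
        "</chapter>"
    ),
}

# Hypothetical main document that pulls in the chapter with xi:include.
MAIN = (
    f'<book xmlns="{DOCBOOK_NS}" '
    'xmlns:xi="http://www.w3.org/2001/XInclude">'
    "<title>Example</title>"
    '<xi:include href="intro.xml"/>'
    "</book>"
)


def load_from_memory(href, parse, encoding=None):
    """Loader for ElementInclude that reads from FILES instead of the disk."""
    if parse != "xml":
        raise ValueError("only parse='xml' is handled in this sketch")
    return ET.fromstring(FILES[href])


def resolve_includes(main_xml: str) -> ET.Element:
    """Parse main_xml and replace every xi:include by the referenced tree."""
    root = ET.fromstring(main_xml)
    ElementInclude.include(root, loader=load_from_memory)
    return root


if __name__ == "__main__":
    book = resolve_includes(MAIN)
    print([child.tag for child in book])
```

Each included file has to be a well-formed document with exactly one root element, which is why the sgml files with more than one root element need extra handling.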


Kind regards
Jürgen Purtz

