Re: [docbook-apps] Preserving entities and language translation

Warren Block Tue, 19 Jan 2016 07:12:25 -0800

On Mon, 18 Jan 2016, Richard Hamilton wrote:

Hi Warren,


I'm not sure if I can come up with a standard, or truly elegant way of doing 
this, but here are a few possibilities:

1) If the only entities in your files are the ones that shouldn't be 
translated, just don't have xmllint resolve the entities when you normalize the 
file, then use it again to expand entities when you get the translated files 
back. Of course, this option won't work if you have other entities that contain 
content you need to translate.

Unfortunately, we use lots of entities for lots of things, and almost all of them need to be translated.

2) Place the large blocks in files that would be included using xinclude. You 
could then translate the file before resolving the inclusions. Of course, this 
one doesn't work if you have other xincludes that contain content you need to 
translate.

The good aspect of that is that we currently don't have anything that uses xinclude. It seems like we would not be able to use xinclude for anything else afterward, though.

3) Create a simple XSL stylesheet or script (Perl, PHP, ...) that would convert 
the code as you have shown below, leaving everything else as is, then another 
script that converts the translated file back. You could then run xmllint to 
resolve those entities. This is a five-step process:

 - Run the first script to turn the entities between the translate off/on 
processing instructions into plain text (non-entities).
 - Run xmllint to normalize everything else
 - Translate
 - Run the second script on the translated file to restore the entities inside 
the translate instructions
 - Run xmllint to resolve those entities.
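
As a rough illustration of the two conversion scripts in the steps above, here 
is a minimal Python sketch. It assumes the <?translate off?> / <?translate on?> 
processing instructions proposed in this thread (they are not an existing 
xmllint or DocBook feature), the "do not translate: name" placeholder wording 
from the original question, and hypothetical function names protect_entities 
and restore_entities. It works on the raw text with regular expressions rather 
than an XML parser, since a parser would typically try to resolve the entity 
references before they could be rewritten.

```python
import re

# Entity reference such as &section.pgpkeys-officers;
ENTITY_RE = re.compile(r"&([A-Za-z_][\w.\-]*);")
# Placeholder text shown to translators (wording is an assumption).
PLACEHOLDER_RE = re.compile(r"do not translate: ([A-Za-z_][\w.\-]*)")
# Everything between the hypothetical translate off/on processing instructions.
BLOCK_RE = re.compile(
    r"(<\?translate off\?>)(.*?)(<\?translate on\?>)", re.DOTALL
)


def protect_entities(xml_text):
    """Turn entity references inside translate-off blocks into placeholder text."""
    def fix_block(match):
        body = ENTITY_RE.sub(r"do not translate: \1", match.group(2))
        return match.group(1) + body + match.group(3)
    return BLOCK_RE.sub(fix_block, xml_text)


def restore_entities(xml_text):
    """Turn placeholder text inside translate-off blocks back into entity references."""
    def fix_block(match):
        body = PLACEHOLDER_RE.sub(r"&\1;", match.group(2))
        return match.group(1) + body + match.group(3)
    return BLOCK_RE.sub(fix_block, xml_text)
```

Entities outside the marked blocks are left alone, so xmllint can still expand 
them for translation. The round trip only works if the translation tools keep 
the processing instructions and the placeholder text unchanged, which would 
need to be checked against the PO tools actually in use.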

Right. It would be really nice if xmllint could call external programs to handle processing instruction filters. Then the entity-to-text and text-to-entity conversion would be done in memory.

Otherwise, we would have to temporarily modify the original source files on disk (!), run xmllint, then restore them:


  make backup copies of document XML files
  transform marked entities to text in original files
  create normalized file with xmllint
  restore backups
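
One way to make the backup-and-restore sequence above less risky is to wrap it 
so the originals are restored even if a later step fails. The following is only 
a sketch: the name temporarily_transformed is hypothetical, the transform 
argument stands for whatever entity-to-text conversion is chosen, and UTF-8 is 
assumed for the source files (adjust the encoding if the documents use 
something else).

```python
import shutil
import tempfile
from contextlib import contextmanager
from pathlib import Path


@contextmanager
def temporarily_transformed(paths, transform, encoding="utf-8"):
    """Back up each file, rewrite it with transform(text), restore it on exit."""
    backup_dir = Path(tempfile.mkdtemp(prefix="xml-backup-"))
    backups = []
    try:
        for index, path in enumerate(map(Path, paths)):
            backup = backup_dir / f"{index}-{path.name}"
            shutil.copy2(path, backup)       # make backup copy first
            backups.append((path, backup))
            text = path.read_text(encoding=encoding)
            path.write_text(transform(text), encoding=encoding)
        yield                                # caller runs xmllint here
    finally:
        for path, backup in backups:
            shutil.copy2(backup, path)       # restore the original file
        shutil.rmtree(backup_dir, ignore_errors=True)
```

The normalization step would run inside the with block, for example by calling 
xmllint through subprocess.run with the --noent option to expand the remaining 
entities. The exact command line depends on the existing build setup and is not 
shown here.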

I hope that helps.


It does, yes.  Thank you!

PS: I added my FreeBSD.org address to Cc. Apologies if my other address bounced anyone's mail.

On Jan 18, 2016, at 20:42, Warren Block <[email protected]> wrote:

Some of the articles in the FreeBSD documentation use entities to include large 
blocks of data.  For example, one article is just a very large list of PGP keys 
for developers: 
https://www.freebsd.org/doc/en_US.ISO8859-1/articles/pgpkeys/index.html

The DocBook article.xml is only 2K, because it does things like this:

 <sect1 xml:id="pgpkeys-officers">
   <title>Officers</title>

   &section.pgpkeys-officers;
 </sect1>

We use xmllint to normalize the article into a single XML file for use with PO 
translation tools.  Of course, all entities are expanded into text at that 
point.

It would be really nice to mark that particular entity as one that should be 
preserved in the translated file.  Is it possible to do that with processing 
instructions or some other method?  For example:

 <sect1 xml:id="pgpkeys-officers">
   <title>Officers</title>

   <?translate off?>
   &section.pgpkeys-officers;
   <?translate on?>
 </sect1>

In the normalized file, it could be a string to indicate to translators that it 
should be left alone:

 <sect1 xml:id="pgpkeys-officers">
   <title>Officers</title>

   <?translate off?>
   do not translate: section.pgpkeys-officers
   <?translate on?>
 </sect1>

Of course, it has to be changed back when the translated XML file is generated.

Is there a standard or elegant way to do this?

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

