On Mon, 18 Jan 2016, Richard Hamilton wrote:
Hi Warren,
I'm not sure if I can come up with a standard, or truly elegant way of doing
this, but here are a few possibilities:
1) If the only entities in your files are the ones that shouldn't be
translated, just don't have xmllint resolve the entities when you normalize the
file, then use it again to expand entities when you get the translated files
back. Of course, this option won't work if you have other entities that contain
content you need to translate.
Unfortunately, we use lots of entities for lots of things, and almost
all of them need to be translated.
2) Place the large blocks in files that would be included using xinclude. You
could then translate the file before resolving the inclusions. Of course, this
one doesn't work if you have other xincludes that contain content you need to
translate.
The good part is that we currently don't have anything that
uses xinclude. It seems like we would not be able to use xinclude
for anything else afterward, though.
3) Create a simple XSL stylesheet or script (Perl, PHP, ...) that would convert
the code as you have shown below, leaving everything else as is, then another
script that converts the translated file back. You could then run xmllint to
resolve those entities. This is a five-step process:
- Run the first script to convert the entities between the translate off/on
processing instructions into plain-text placeholders.
- Run xmllint to normalize everything else
- Translate
- Run the second script on the translated file to restore the entities inside
the translate instructions
- Run xmllint to resolve those entities.
Right. It would be really nice if xmllint could call external programs
to handle processing instruction filters. Then the entity-to-text and
text-to-entity conversion would be done in memory.
Otherwise, we would have to temporarily modify the original source files
on disk (!), run xmllint, then restore them:
- make backup copies of document XML files
- transform marked entities to text in original files
- create normalized file with xmllint
- restore backups
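
For illustration only, the backup/transform/restore sequence above could be
wrapped so that the originals are restored even if a step fails. This is a
rough Python sketch, not something from our tree: the helper name, the ".orig"
suffix, and the default encoding are all assumptions, and the transform is
whatever function does the entity-to-text conversion.

```python
import contextlib
import os
import shutil
from pathlib import Path


@contextlib.contextmanager
def temporarily_transformed(paths, transform, suffix=".orig", encoding="utf-8"):
    """Rewrite each file as transform(text); put the originals back on exit.

    The helper name, the backup suffix, and the default encoding are
    hypothetical; pass the encoding the source files really use.
    """
    backups = []
    try:
        for path in map(Path, paths):
            backup = path.with_name(path.name + suffix)
            shutil.copy2(path, backup)            # make backup copy
            backups.append((path, backup))
            text = path.read_text(encoding=encoding)
            path.write_text(transform(text), encoding=encoding)
        yield                                     # caller runs xmllint here
    finally:
        for path, backup in backups:
            os.replace(backup, path)              # restore backup over the edit
```

Inside the with block, the normalized file could be produced with something
like subprocess.run(["xmllint", "--noent", "--output", "normalized.xml",
"article.xml"], check=True), where both file names are placeholders.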
I hope that helps.
It does, yes. Thank you!
PS: I added my FreeBSD.org address to Cc. Apologies if my other address
bounced anyone's mail.
On Jan 18, 2016, at 20:42, Warren Block <[email protected]> wrote:
Some of the articles in the FreeBSD documentation use entities to include large
blocks of data. For example, one article is just a very large list of PGP keys
for developers:
https://www.freebsd.org/doc/en_US.ISO8859-1/articles/pgpkeys/index.html
The DocBook article.xml is only 2K, because it does things like this:
<sect1 xml:id="pgpkeys-officers">
<title>Officers</title>
&section.pgpkeys-officers;
</sect1>
We use xmllint to normalize the article into a single XML file for use with PO
translation tools. Of course, all entities are expanded into text at that
point.
It would be really nice to mark that particular entity as one that should be
preserved in the translated file. Is it possible to do that with processing
instructions or some other method? For example:
<sect1 xml:id="pgpkeys-officers">
<title>Officers</title>
<?translate off?>
&section.pgpkeys-officers;
<?translate on?>
</sect1>
In the normalized file, it could be a string to indicate to translators that it
should be left alone:
<sect1 xml:id="pgpkeys-officers">
<title>Officers</title>
<?translate off?>
do not translate: section.pgpkeys-officers
<?translate on?>
</sect1>
Of course, it has to be changed back when the translated XML file is generated.
Is there a standard or elegant way to do this?
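
To make the idea concrete, here is a rough Python sketch of the round trip
described above. It is only an illustration under assumptions: the function
names are made up, the "do not translate: " wording is taken from the example,
and the regular expressions assume simple entity names and that the placeholder
text is not followed directly by punctuation.

```python
import re

# Region between the two processing instructions from the example above.
PROTECTED = re.compile(r"(<\?translate off\?>)(.*?)(<\?translate on\?>)", re.DOTALL)
# A general entity reference such as &section.pgpkeys-officers;
ENTITY = re.compile(r"&([A-Za-z_][\w.\-]*);")
# The placeholder text shown to translators.
PLACEHOLDER = re.compile(r"do not translate: ([A-Za-z_][\w.\-]*)")
# Predefined XML entities are not file inclusions, so leave them alone.
PREDEFINED = {"amp", "lt", "gt", "quot", "apos"}


def _rewrite_protected(text, pattern, replacement):
    """Apply pattern.sub(replacement, ...) only inside protected regions."""
    def fix_region(match):
        start, body, end = match.groups()
        return start + pattern.sub(replacement, body) + end
    return PROTECTED.sub(fix_region, text)


def entities_to_placeholders(text):
    """Replace marked entities with text so xmllint will not expand them."""
    def to_text(match):
        name = match.group(1)
        if name in PREDEFINED:
            return match.group(0)
        return "do not translate: " + name
    return _rewrite_protected(text, ENTITY, to_text)


def placeholders_to_entities(text):
    """Turn the placeholders in a translated file back into entities."""
    return _rewrite_protected(
        text, PLACEHOLDER, lambda match: "&" + match.group(1) + ";"
    )
```

Entities outside the translate off/on pair are untouched, so xmllint would
still expand those as it does now.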
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
---------------------------------------------------------------------