Re: [docbook-apps] Preserving entities and language translation

Fekete, Róbert Tue, 19 Jan 2016 23:09:08 -0800

Hi,

Where we need to translate DocBook content, we keep each language
version in its own branch or subdirectory, and our build system takes
care of finding the right content.
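
A minimal sketch of that kind of lookup, in Python. The layout (one
subdirectory per language), the fallback to a default language when a
translation is missing, and the names find_source and DEFAULT_LANG are
all assumptions for illustration, not a description of any particular
build system:

```python
from pathlib import Path

# Assumed default language; the directory naming style is borrowed from
# the FreeBSD doc tree URLs quoted below.
DEFAULT_LANG = "en_US.ISO8859-1"


def find_source(root: Path, lang: str, relpath: str) -> Path:
    """Return the file for `lang` if it exists, else the default-language file."""
    for candidate_lang in (lang, DEFAULT_LANG):
        candidate = root / candidate_lang / relpath
        if candidate.is_file():
            return candidate
    raise FileNotFoundError(f"{relpath} not found for {lang} or {DEFAULT_LANG}")
```

With separate branches instead of subdirectories, the same idea would
apply to the checkout chosen by the build rather than to a path prefix.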


HTH,

Robert

On Tue, Jan 19, 2016 at 11:18 PM, Warren Block <[email protected]> wrote:

> On Tue, 19 Jan 2016, Shaun McCance wrote:
>
>> I'm not sure if you've successfully made the switch to itstool yet for
>> your PO round-tripping. I think we talked a bit about entity expansion
>> at the Open Help Conference last year.
>>
>
> Yes, we have:
>
> https://www.freebsd.org/news/status/report-2015-07-2015-09.html#PO-Translation-Project
>
> You are even mentioned there. :)
>
>> The default behavior in itstool is that it expands entities, but does
>> not do XInclude. So I recommend using entities for words, phrases, and
>> other mid-sentence substitutions. Use XInclude for entire blocks or
>> sections.
>>
>
> But again, doesn't that limit when we can use XInclude?  Right now, we use
> entities for entire chapters in books.  Those chapters can include inline
> entities that need to be translated, too.  Here is our documentation manual
> in the repository:
> https://svnweb.freebsd.org/doc/head/en_US.ISO8859-1/books/fdp-primer/
>
> The chapters are defined as system entities here:
>
> https://svnweb.freebsd.org/doc/head/en_US.ISO8859-1/books/fdp-primer/chapters.ent?view=markup
>
> The main book.xml uses those entities at the end:
>
> https://svnweb.freebsd.org/doc/head/en_US.ISO8859-1/books/fdp-primer/book.xml?view=markup
>
>> The idea is that translators have a difficult time with mid-sentence
>> substitutions when they need to do inflections and declensions on words.
>> For example:
>>
>> <!ENTITY b "button">
>> <para>Click the &b;.</para>
>> <para>The &b; is blue.</para>
>>
>> A translator can't reliably translate this without entity substitution,
>> because she might need to translate "button" differently in each case.
>> You can override this with the -k option, but I don't recommend it.
>>
>
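
To make the expansion concrete, here is a small Python sketch. It uses
the standard library's ElementTree parser rather than itstool, and the
<article> wrapper and the function name are only there for
illustration. The parser expands the internal entity, so what a
translator would be handed is the full sentence:

```python
import xml.etree.ElementTree as ET

# The entity example from above, wrapped in a minimal document with an
# internal DTD subset so the parser knows what &b; stands for.
DOC = """<!DOCTYPE article [
  <!ENTITY b "button">
]>
<article>
  <para>Click the &b;.</para>
  <para>The &b; is blue.</para>
</article>"""


def expanded_paragraphs(xml_text: str) -> list[str]:
    """Parse the document and return the text of each <para>, entities expanded."""
    root = ET.fromstring(xml_text)
    return [para.text for para in root.iter("para")]
```

Here expanded_paragraphs(DOC) returns
['Click the button.', 'The button is blue.'], so each sentence can be
translated with whatever inflection the target language needs.
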
> I never did get -k to work.  It left entities unexpanded but then would
> choke on them, possibly due to our use of XML catalogs.  With that and the
> translation issues, we just went with expanding all entities.  That is fine
> except for these special-case uses for big lists of things that should
> *not* be translated.  Like the PGP keys entities:
> https://svnweb.freebsd.org/doc/head/share/pgpkeys/pgpkeys.ent?view=markup
>
>> If we could somehow figure out which entities are inline and which are
>> block-level, that might help. Perhaps not expanding SYSTEM entities
>> (likely used for large blocks), expanding regular entities (likely used
>> for words and phrases), and expanding SYSTEM parameter entities (likely
>> used to pull in entity definitions).
>>
>
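
The heuristic above could be sketched roughly like this in Python. It
is only a regex over the text of entity declarations, not anything
itstool or libxml2 provides, and the function name and the sample
declarations in the comments are made up for illustration:

```python
import re

# Matches the start of an entity declaration: an optional "%" for
# parameter entities, the entity name, and an optional external-ID keyword.
ENTITY_DECL = re.compile(
    r"<!ENTITY\s+(?P<param>%\s+)?(?P<name>\S+)\s+(?P<external>(?:SYSTEM|PUBLIC)\b)?"
)


def should_expand(declaration: str) -> bool:
    """Apply the proposed rule to one <!ENTITY ...> declaration.

    - internal entities (words, phrases): expand
    - external parameter entities (pull in entity definitions): expand
    - external general entities (large blocks): leave unexpanded
    """
    match = ENTITY_DECL.match(declaration.strip())
    if match is None:
        raise ValueError(f"not an entity declaration: {declaration!r}")
    is_parameter = match.group("param") is not None
    is_external = match.group("external") is not None
    return is_parameter or not is_external


# Hypothetical declarations:
#   should_expand('<!ENTITY b "button">')                         -> True
#   should_expand('<!ENTITY chap.intro SYSTEM "intro.xml">')      -> False
#   should_expand('<!ENTITY % chapters SYSTEM "chapters.ent">')   -> True
```
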
> Given the way our stuff is defined, that might work for some things.
>
>> I don't know if libxml2's API allows for that. libxml2 can do a lot that
>> xmllint doesn't expose.
>>
>
> For us, xmllint is not so much a linter as an XML processing tool. Maybe
> for others, too.  A mechanism for it to use PIs to say "transform these
> entities this way" seems logical.  (Er, that's "transform" in the text or
> PCRE sense, not in the XSLT "dude, let's make XML into a programming
> language" sense.)  Right now it only has --noent, so it's all or nothing.
>
> Even being able to give xmllint a list of entities to preserve/not expand
> would be fine.  Those could be changed to non-entities by postprocessing
> the output file before the PO information is extracted.
>
>
>> On Mon, 2016-01-18 at 21:42 -0700, Warren Block wrote:
>>
>>> Some of the articles in the FreeBSD documentation use entities to
>>> include large blocks of data.  For example, one article is just a very
>>> large list of PGP keys for developers:
>>> https://www.freebsd.org/doc/en_US.ISO8859-1/articles/pgpkeys/index.html
>>>
>>> The DocBook article.xml is only 2K, because it does things like this:
>>>
>>>    <sect1 xml:id="pgpkeys-officers">
>>>      <title>Officers</title>
>>>
>>>      &section.pgpkeys-officers;
>>>    </sect1>
>>>
>>> We use xmllint to normalize the article into a single XML file for use
>>> with PO translation tools.  Of course, all entities are expanded into
>>> text at that point.
>>>
>>> It would be really nice to mark that particular entity as one that
>>> should be preserved in the translated file.  Is it possible to do that
>>> with processing instructions or some other method?  For example:
>>>
>>>    <sect1 xml:id="pgpkeys-officers">
>>>      <title>Officers</title>
>>>
>>>      <?translate off?>
>>>      &section.pgpkeys-officers;
>>>      <?translate on?>
>>>    </sect1>
>>>
>>> In the normalized file, it could be a string to indicate to translators
>>> that it should be left alone:
>>>
>>>    <sect1 xml:id="pgpkeys-officers">
>>>      <title>Officers</title>
>>>
>>>      <?translate off?>
>>>      do not translate: section.pgpkeys-officers
>>>      <?translate on?>
>>>    </sect1>
>>>
>>> Of course, it has to be changed back when the translated XML file is
>>> generated.
>>>
>>> Is there a standard or elegant way to do this?
>>>
>>
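
The round trip described above can be approximated with plain text
substitution outside the XML tools. This is only a sketch: it is not a
feature of xmllint or itstool, the function names are made up, and it
assumes the listed entity references are swapped for the marker string
before xmllint expands everything else, then swapped back once the
translated XML has been generated:

```python
MARKER = "do not translate: "


def protect(xml_text: str, names: list[str]) -> str:
    """Replace references to the listed entities with a marker string."""
    for name in names:
        xml_text = xml_text.replace(f"&{name};", f"{MARKER}{name}")
    return xml_text


def restore(xml_text: str, names: list[str]) -> str:
    """Turn marker strings back into entity references.

    Longer names go first so that a name which is a prefix of another
    name cannot match part of the longer marker.
    """
    for name in sorted(names, key=len, reverse=True):
        xml_text = xml_text.replace(f"{MARKER}{name}", f"&{name};")
    return xml_text
```

For example, protect(text, ["section.pgpkeys-officers"]) turns
&section.pgpkeys-officers; into the "do not translate:" line shown
above, and restore() with the same list reverses it, provided the
translators left the marker string untouched.
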
