[docbook-apps] RE: [External Email] Re: [docbook-apps] UI strings vs manual strings ?

2022-12-20 Thread Riffel, Patrick
Hi all,

the issue of keeping UI strings and documentation in sync sounds familiar, and 
I'd like to let you know how we've been doing it for the last 12 years or so. 
It is not perfect, but it is robust enough to publish several dozen documents 
of up to 1300 pages in two (formerly four) languages as PDF and webhelp.

The software we are writing documentation for is using dozens of XML files for 
its configuration and user interface. Each UI string comes with an ID that is 
more or less unique, and we can therefore reuse it in our documentation. 
However, almost every configuration file has a slightly different content model 
to the one before and therefore we have to pre-process everything before we can 
actually use it.

In this pre-processing step, a script pulls together all the UI strings from 
all the different configurations into a "normalized" XML file. This step also 
pulls in the translated versions of the strings. You end up with a rather big 
XML file ("alltext.xml"), in which you have the ID, the original string and 
its translations in one place. If a translation is missing, we add a 
placeholder text.
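
A minimal sketch of such a normalization step, in Python rather than whatever 
the real script uses. Everything concrete here is an assumption made for 
illustration: the two sample configurations, their element and attribute names, 
the layout of the resulting file (alltext/string/text) and the placeholder 
text.

```python
import xml.etree.ElementTree as ET

PLACEHOLDER = "[TRANSLATION MISSING]"  # hypothetical placeholder text


def extract_strings(config_xml, item_path, id_attr):
    """Collect {id: text} from one config file with its own content model."""
    root = ET.fromstring(config_xml)
    return {
        el.get(id_attr): (el.text or "").strip()
        for el in root.iterfind(item_path)
        if el.get(id_attr)
    }


def normalize(originals, translations, source_lang="de"):
    """Build one <alltext> tree holding ID, original string and translations."""
    root = ET.Element("alltext")
    for string_id in sorted(originals):
        entry = ET.SubElement(root, "string", id=string_id)
        ET.SubElement(entry, "text", lang=source_lang).text = originals[string_id]
        for lang in sorted(translations):
            # A missing translation gets the placeholder instead of being dropped.
            translated = translations[lang].get(string_id, PLACEHOLDER)
            ET.SubElement(entry, "text", lang=lang).text = translated
    return root


# Two made-up configurations with different content models.
menu_cfg = '<menu><item key="menu.open">Datei oeffnen</item></menu>'
dialog_cfg = '<dialog><field><label uid="dlg.save">Speichern</label></field></dialog>'

originals = {}
originals.update(extract_strings(menu_cfg, ".//item", "key"))
originals.update(extract_strings(dialog_cfg, ".//label", "uid"))

alltext = normalize(originals, {"en": {"menu.open": "Open file"}})
print(ET.tostring(alltext, encoding="unicode"))
```

Each configuration gets its own path and ID attribute, which is one way to 
cope with the differing content models. The output always has the same shape.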

On the DocBook side, we are using guilabel/guimenu/guimenuitem to tag UI text. 
By our convention, each of these elements must have a @remap attribute with its 
value set to one of the IDs you can find in the software configuration files 
(or rather in that big xml file from the previous step). Referencing the ID is 
the important bit for the automation. For readability, the Technical Writer 
would also use the UI string as content for guilabel/guimenu/guimenuitem. 
However, this isn't strictly necessary as the element's content will be 
replaced at a later stage.

Our publishing process is automated via Ant/Jenkins. Once all XIncludes have 
been resolved, we run some XSLT on the resulting temporary file and replace the 
UI strings with the most recent versions from "alltext.xml". And that's more or 
less it - an ID-based lookup and string replacement.
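
The lookup and replacement can be sketched roughly as follows. This is Python 
standing in for the XSLT mentioned above, not the stylesheet itself. The layout 
of the lookup file (string elements with an id attribute and text children with 
a lang attribute) and the sample IDs are assumptions. Elements are matched by 
local name so that the sketch works with or without the DocBook namespace.

```python
import xml.etree.ElementTree as ET

GUI_ELEMENTS = {"guilabel", "guimenu", "guimenuitem"}


def local_name(tag):
    """Strip a '{namespace}' prefix from an ElementTree tag name."""
    return tag.rsplit("}", 1)[-1]


def load_lookup(alltext_root, lang):
    """Map string ID -> text for one language (assumed file layout)."""
    lookup = {}
    for entry in alltext_root.iter("string"):
        for text in entry.iter("text"):
            if text.get("lang") == lang:
                lookup[entry.get("id")] = text.text or ""
    return lookup


def replace_ui_strings(doc_root, lookup):
    """Overwrite the content of GUI elements whose @remap is a known ID."""
    targets = [
        el for el in doc_root.iter()
        if isinstance(el.tag, str) and local_name(el.tag) in GUI_ELEMENTS
    ]
    replaced = 0
    for el in targets:
        string_id = el.get("remap")
        if string_id in lookup:
            del el[:]                      # drop any child elements
            el.text = lookup[string_id]    # el.tail is left untouched
            replaced += 1
    return replaced


alltext = ET.fromstring(
    '<alltext><string id="menu.open">'
    '<text lang="de">Datei oeffnen</text><text lang="en">Open file</text>'
    '</string></alltext>'
)
doc = ET.fromstring(
    '<para xmlns="http://docbook.org/ns/docbook">Click '
    '<guilabel remap="menu.open">Oeffnen</guilabel> now. '
    '<guilabel remap="not.in.lookup">Keep me</guilabel></para>'
)
count = replace_ui_strings(doc, load_lookup(alltext, "en"))
```

In this sketch an element with an unknown ID keeps its authored content. 
Whether to keep it, warn or fail the build at that point is a design choice.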

To assist our Technical Writers, we also provide modified versions of the 
software configuration. These configs can be used to run the software in a way 
that displays the internal ID alongside the UI text. This gives the Technical 
Writer sufficient context to find the right ID to use for @remap. It also helps 
to avoid confusion where the configuration files have several copies of the 
same UI string but with different ID values.

Since the UI strings are translated separately and keep their original ID, the 
remapping also works when we are publishing in other languages (our primary 
language is German, and we translate into English. This process has been used 
in the past to translate into French, Italian and Russian as well).

A word on the authoring process: We are using oXygen XML and have put together 
some Schematron rules that flag a missing remap attribute or a guilabel whose 
content differs from the current value in the lookup file. This was done as a 
proof of concept and isn't required for authoring, but we encourage using it. 
We are also experimenting with Schematron quick fixes to replace UI strings 
where necessary. This is experimental too and not a feature we use all the 
time, but it comes in handy every now and then.
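
For readers without Schematron at hand, the two checks can be approximated in a 
few lines of Python. This is only an illustration of the logic, not the rules 
themselves. The lookup dictionary, the sample document and the problem codes 
are made up, and the "unknown-id" check is an extra that is not mentioned 
above.

```python
import xml.etree.ElementTree as ET

GUI_ELEMENTS = {"guilabel", "guimenu", "guimenuitem"}


def check_gui_elements(doc_root, lookup):
    """Return (problem, element name, detail) tuples for suspicious GUI text."""
    problems = []
    for el in doc_root.iter():
        if not isinstance(el.tag, str):
            continue  # skip comments and processing instructions
        name = el.tag.rsplit("}", 1)[-1]  # ignore any namespace prefix
        if name not in GUI_ELEMENTS:
            continue
        string_id = el.get("remap")
        content = "".join(el.itertext()).strip()
        if string_id is None:
            problems.append(("missing-remap", name, content))
        elif string_id not in lookup:
            problems.append(("unknown-id", name, string_id))
        elif content != lookup[string_id]:
            problems.append(("stale-text", name, string_id))
    return problems


lookup = {"menu.open": "Open file", "dlg.save": "Save"}
doc = ET.fromstring(
    "<para>"
    '<guilabel remap="menu.open">Open file</guilabel>'
    '<guilabel remap="dlg.save">Save as</guilabel>'
    "<guimenu>File</guimenu>"
    '<guimenuitem remap="typo.id">Exit</guimenuitem>'
    "</para>"
)
problems = check_gui_elements(doc, lookup)
```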

We do most of our translation in-house (we used OmegaT in the past, but not 
anymore, sorry!). The documentation and the UI strings are translated 
separately (by the same Translator) into two separate translation memories. 
When translating the documentation, all guilabel/guimenu/guimenuitem elements 
are set to be non-translatable. However, our Translator sees the element 
content and thus has all the context information they need. Once they export 
their files to the target language, the UI strings will still be in German. The 
German text is then replaced via a combination of @lang and @remap when 
publishing to the respective output format.

For those cases where UI text does need to be translated/localized, a Technical 
Writer can set a @translate attribute to "yes" on guilabel elements and 
override the replacement. The TMS will unlock those elements and the Translator 
can modify their content.
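
The decision between "replace from the lookup" and "keep what the Translator 
wrote" comes down to something like the sketch below. The function name, the 
shape of the lookups dictionary and the fallback to the existing content for an 
unknown ID are assumptions made for illustration.

```python
def choose_label_text(content, remap_id, translate, lang, lookups):
    """Return the text a GUI element should carry in the published output."""
    if translate == "yes":
        # The writer opted out of replacement: keep the translated content.
        return content
    # Otherwise the ID wins; fall back to the existing content if unknown.
    return lookups.get(lang, {}).get(remap_id, content)


lookups = {"en": {"menu.open": "Open file"}}

# Normal case: German content is replaced by the English UI string.
replaced = choose_label_text("Datei oeffnen", "menu.open", None, "en", lookups)
# Override: the Translator's wording survives.
kept = choose_label_text("Opening a file", "menu.open", "yes", "en", lookups)
```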

I am aware that this process might be problematic with certain languages and/or 
if the source material is not well maintained, and probably for many other 
reasons, too. We had trouble in the past with UI text that was split into 
several individual strings or that used placeholders. With regard to mnemonics 
and shortcuts - those are indeed annoying. Our specific problem is that the 
software configuration wasn't designed to be translation friendly, and we 
basically have to strip things away before going into translation. I do believe 
that this problem could be solved if we put some effort into redesigning the 
configuration files. Screenshots also matter and need manual updates when the 
UI text changes.

Bottom line is: We manage to keep our 

Re: [docbook-apps] UI strings vs manual strings ?

2022-12-16 Thread Jean-Christophe Helary



> On Dec 13, 2022, at 19:28, Tony Graham  wrote:
> 
> What result are you looking for?

I am looking for an authoring process where software UI strings can 
easily be handled in the documentation.

I'm imagining that there would be an editor that uses a UI strings 
"library" as reference and calls its contents when required in the doc 
during the build process.

What would be the best way to achieve that in a DocBook centered process?

> Are you treating one language (say, English) as the main language (which
> has empty elements for value lookups) and the other languages as end
> products (which have all text filled in), where you'd use OmegaT's
> translation memory to keep translations consistent across revisions?

I'm not sure I understand the above question, even though I've been 
using OmegaT almost daily for the past 20 years.

> Or do you want the other languages to be structurally equivalent to the
> main version (apart from inline elements moved around because of
> sentence structure), where elements containing text are turned back into
> empty elements?

I don't understand the second part "where elements containing text are 
turned back into empty elements?".

Jean-Christophe 

> 
> Regards,
> 
> 
> Tony Graham.
> -- 
> Senior Architect
> XML Division
> Antenna House, Inc.
> 
> Skerries, Ireland
> tgra...@antenna.co.jp
> 
> -
> To unsubscribe, e-mail: docbook-apps-unsubscr...@lists.oasis-open.org
> For additional commands, e-mail: docbook-apps-h...@lists.oasis-open.org
> 

-- 
Jean-Christophe Helary @jchel...@emacs.ch
https://traductaire-libre.org
https://mac4translators.blogspot.com
https://sr.ht/~brandelune/omegat-as-a-book/





Re: [docbook-apps] UI strings vs manual strings ?

2022-12-13 Thread Tony Graham

What result are you looking for?

Are you treating one language (say, English) as the main language (which
has empty elements for value lookups) and the other languages as end
products (which have all text filled in), where you'd use OmegaT's
translation memory to keep translations consistent across revisions?

Or do you want the other languages to be structurally equivalent to the
main version (apart from inline elements moved around because of
sentence structure), where elements containing text are turned back into
empty elements?

Regards,


Tony Graham.
--
Senior Architect
XML Division
Antenna House, Inc.

Skerries, Ireland
tgra...@antenna.co.jp




Re: [docbook-apps] UI strings vs manual strings ?

2022-12-06 Thread Jean-Christophe Helary
Thank you all for the replies so far.

Let me reply in one mail.

> On Dec 6, 2022, at 21:44, Tony Graham  wrote:
> 
>> Problem at hand:
>> - a Java application with ~2k UI strings (not all user-facing), in
>> a Bundle.properties file
> 
> Java also has an XML format for properties files.

Interesting. It could be part of a solution (esp. considering Florimond's 
reply).

>> - a ~80K words DocBook manual
>> It is not trivial to keep track of the whole string set (searches, etc.)
>> Also, the l10n process takes place on the DocBook sources, not on
>> the HTML output, so tricks like  don't work because 
>> translators don't see the target terms.
> 
> Before translation, replace each  with the replacement text from
> the XML properties file wrapped in a well-known element that still
> carries the identifier for the properties file entry.
> 
> After translation, if necessary, convert the well-known elements back
> into  and also do something to handle the strings that have been
> translated differently in different places.

The problem is that it's not possible to do that for a lot of languages. There 
are inflected forms that transform the text of the "endterm" part and the 
translation targets 3 dozen languages, including BiDi documents.

That process would add another layer of transformation+verification.

Or maybe I missed something?

> Once you have the properties file for a second language, you could
> insert the translated strings in place of  when preparing for
> translation.  Alternatively, or as well, you could set up your
> computer-aided translation tool to not translate the well-known elements
> for the strings and insert the translated strings after everything else
> is translated.

It looks feasible but only with a small set of target languages.

>> I'm left with having to rewrite the strings explicitly and that's a pain, 
>> and also adds risks of mistakes in translations.
> 
> The more that you can automate, the better.

Hence the question ;-)


> On Dec 6, 2022, at 22:04, Alemps Florimond  wrote:
> 
> Hello,
> 
> I would transform the bundle.properties into a document (article, book or 
> section, whatever).
> Each line of the file corresponds to something like:
> My message
> 
> One simpara element per guilabel is not needed as such: it is just there to 
> make the file readable by a DocBook parser.

Interesting.

Considering that Java properties can also be expressed as XML there could be 
some automation here.

> In the document, you include the message - something like :
> You should see  xpointer="messageId"> after clicking on the button.
> 
> The French, English, German version of the document will take advantage of 
> the corresponding translated version of bundle.properties.xml

Why only those 3 languages?

My understanding of xi:include is that it is not required to be resolved before 
the actual documentation build process.

Which means that the document to translate (and the way it is displayed in the 
tool) is actually

> You should see  after clicking on the 
> button.

Which is not different from what we have now with 

> As long as no message ID starts with a digit (xml:id must be an NCName), you 
> are OK.
> With an XSLT 2.0 processor, it might even be possible to transform the 
> bundle.properties into XML.

It looks like Java properties can be expressed as XML natively (see above) so 
there is something to explore here.
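
For reference, the XML form of Java properties is a properties root element 
with one entry element per key, and java.util.Properties can write it itself 
via storeToXML. The Python sketch below only illustrates the shape of that 
conversion. Its parser is deliberately simplified (no line continuations, no 
backslash or \uXXXX escapes, no whitespace-only separators) and the sample 
keys are made up.

```python
import xml.etree.ElementTree as ET


def parse_properties(text):
    """Very simplified .properties reader: 'key=value' or 'key:value' lines."""
    entries = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line[0] in "#!":
            continue  # blank line or comment
        cuts = [i for i in (line.find("="), line.find(":")) if i != -1]
        if not cuts:
            entries[line] = ""  # key without a value
            continue
        cut = min(cuts)
        entries[line[:cut].strip()] = line[cut + 1:].strip()
    return entries


def to_properties_xml(entries):
    """Serialize as <properties><entry key="...">value</entry>...</properties>."""
    root = ET.Element("properties")
    for key, value in entries.items():
        ET.SubElement(root, "entry", key=key).text = value
    return ET.tostring(root, encoding="unicode")


sample = "# made-up keys\nMENU_OPEN=&Open...\nWINDOW_TITLE : Main window\n"
entries = parse_properties(sample)
xml_text = to_properties_xml(entries)
print(xml_text)
```

Note that the mnemonic "&" ends up escaped as "&amp;" in the XML, so it is 
still there for any later step that needs to find or remove it.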

> On Dec 7, 2022, at 5:13, Jan Tosovsky  wrote:
> 
> On 05/12/2022 23:05, Jean-Christophe Helary wrote:
>> What's the best way in a DocBook centered process to ensure that the 
>> list of terms used in a software UI is (semi-automatically?) taken 
>> into account in the DocBook sources that describe that software?
> 
> In your document you can use  and other  which can indicate the content must match the GUI label. You can then
> instruct the localization agency to follow this rule.
> But there is no way to avoid human error so this still has to be checked
> manually which is inefficient. 

The problem is not the instructions. The goal is to lower the burden on 
translators by explicitly displaying the strings in the DocBook sources.

Creating a normative glossary from the UI strings first could be something, but 
there are Windows/Linux mnemonic characters (&) in the strings, so we'd need to 
remove them to create that glossary, and that would add another step (which can 
be automated, I guess).
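
That extra step could look roughly like this. It is a sketch under two 
assumptions: a single "&" marks the mnemonic for the character that follows, 
and a doubled "&&" stands for a literal ampersand. The keys and strings are 
made up, and the output is just a list of source/target pairs that could be 
written out as tab-separated lines.

```python
import re


def strip_mnemonic(label):
    """Remove mnemonic markers: '&File' -> 'File', '&&' -> literal '&'."""
    return re.sub(r"&(.)", r"\1", label)


def build_glossary(source, target):
    """Pair cleaned source and target strings for keys present in both."""
    rows = []
    for key in sorted(source):
        if key in target:
            rows.append((strip_mnemonic(source[key]), strip_mnemonic(target[key])))
    return rows


source = {"MENU_OPEN": "&Open", "MENU_EXIT": "E&xit", "MENU_SAVE": "Save && Close"}
target = {"MENU_OPEN": "&Ouvrir", "MENU_SAVE": "Enregistrer && fermer"}

rows = build_glossary(source, target)
for src, tgt in rows:
    print(f"{src}\t{tgt}")
```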

Full disclosure: the manual is for OmegaT, a free software solution for 
translators that supports DocBook and Java properties out of the box. I am the 
project leader and also in charge of the manual. I rewrote it almost completely 
this summer/fall to prepare for our next release, but I know that the solution 
I chose (link linkend endterm) is not optimal because the link content/target 
is not available for the inflected forms some languages require. (And I also 
happen to be a translation company, so I understand those issues quite well, 
but it was my first time 

RE: [docbook-apps] UI strings vs manual strings ?

2022-12-06 Thread Jan Tosovsky
On 05/12/2022 23:05, Jean-Christophe Helary wrote:
> What's the best way in a DocBook centered process to ensure that the 
> list of terms used in a software UI is (semi-automatically?) taken 
> into account in the DocBook sources that describe that software?

In your document you can use  and other 

Re: [docbook-apps] UI strings vs manual strings ?

2022-12-06 Thread Alemps Florimond
Hello,

I would transform the bundle.properties into a document (article, book or 
section, whatever).
Each line of the file corresponds to something like:
My message

One simpara element per guilabel is not needed as such: it is just there to 
make the file readable by a DocBook parser.

In the document, you include the message - something like:
You should see  after clicking on the button.

The French, English, German version of the document will take advantage of the 
corresponding translated version of bundle.properties.xml

As long as no message ID starts with a digit (xml:id must be an NCName), you 
are OK.
With an XSLT 2.0 processor, it might even be possible to transform the 
bundle.properties into XML.

Regards,
Florimond

On Tuesday, 6 December 2022 at 00:05:49 UTC+1, Jean-Christophe Helary wrote:
 
 What's the best way in a DocBook centered process to ensure that the list of 
terms used in a software UI is (semi-automatically?) taken into account in the 
DocBook sources that describe that software?

Problem at hand:

- a Java application with ~2k UI strings (not all user-facing), in a 
Bundle.properties file
- a ~80K words DocBook manual

It is not trivial to keep track of the whole string set (searches, etc.)

Also, the l10n process takes place on the DocBook sources, not on the HTML 
output, so tricks like  don't work because translators 
don't see the target terms.

I'm left with having to rewrite the strings explicitly and that's a pain, and 
also adds risks of mistakes in translations.

-- 
Jean-Christophe Helary @jchel...@emacs.ch
https://traductaire-libre.org
https://mac4translators.blogspot.com
https://sr.ht/~brandelune/omegat-as-a-book/



  

Re: [docbook-apps] UI strings vs manual strings ?

2022-12-06 Thread Tony Graham

On 05/12/2022 23:05, Jean-Christophe Helary wrote:
> What's the best way in a DocBook centered process to ensure that the 
> list of terms used in a software UI is (semi-automatically?) taken 
> into account in the DocBook sources that describe that software?

I haven't had to do this, but since no-one else has responded yet...

> Problem at hand:
> 
> - a Java application with ~2k UI strings (not all user-facing), in
> a Bundle.properties file

Java also has an XML format for properties files.

> - a ~80K words DocBook manual
> 
> It is not trivial to keep track of the whole string set (searches, 
> etc.)
> 
> Also, the l10n process takes place on the DocBook sources, not on
> the HTML output, so tricks like  don't work 
> because translators don't see the target terms.

Before translation, replace each  with the replacement text from
the XML properties file wrapped in a well-known element that still
carries the identifier for the properties file entry.

After translation, if necessary, convert the well-known elements back
into  and also do something to handle the strings that have been
translated differently in different places.

Once you have the properties file for a second language, you could
insert the translated strings in place of  when preparing for
translation.  Alternatively, or as well, you could set up your
computer-aided translation tool to not translate the well-known elements
for the strings and insert the translated strings after everything else
is translated.

> I'm left with having to rewrite the strings explicitly and that's a 
> pain, and also adds risks of mistakes in translations.

The more that you can automate, the better.

Regards,


Tony Graham.
--
Senior Architect
XML Division
Antenna House, Inc.

Skerries, Ireland
tgra...@antenna.co.jp
