[docbook-apps] RE: [External Email] Re: [docbook-apps] UI strings vs manual strings ?
Hi all,

the issue of keeping UI strings and documentation in sync sounds familiar, and I'd like to let you know how we've been doing it for the last 12 years or so. It's not perfect, but robust enough to publish several dozen documents of up to 1300 pages in two (was four) languages as PDF and webhelp.

The software we are writing documentation for uses dozens of XML files for its configuration and user interface. Each UI string comes with an ID that is more or less unique, so we can reuse it in our documentation. However, almost every configuration file has a slightly different content model from the one before, so we have to pre-process everything before we can actually use it. In this pre-processing step, a script pulls together all the UI strings from all the different configurations into a "normalized" XML file. This step also pulls in the translated versions of the strings. You end up with a rather big XML file ("alltext.xml") that holds the ID, the original string and the translations in one place. If a translation is missing, we add a placeholder text.

On the DocBook side, we use guilabel/guimenu/guimenuitem to tag UI text. By our convention, each of these elements must have a @remap attribute whose value is one of the IDs found in the software configuration files (or rather in that big XML file from the previous step). Referencing the ID is the important bit for the automation. For readability, the Technical Writer would also use the UI string as content for guilabel/guimenu/guimenuitem. However, this isn't strictly necessary, as the element's content will be replaced at a later stage.

Our publishing process is automated via ant/jenkins. Once all xincludes have been resolved, we run some XSLT on the resulting temporary file and replace the UI strings with the most recent versions from "alltext.xml". And that's more or less it - an ID-based lookup and string replacement.
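To make the ID-based lookup a bit more concrete, here is a minimal sketch of the replacement step. The process described above does this in XSLT; Python is used here only to keep the example self-contained. The layout of "alltext.xml" (`string`/`text` elements with `id` and `lang` attributes), the sample IDs and the function names are assumptions made for illustration, not the real file format.

```python
import xml.etree.ElementTree as ET

GUI_TAGS = {"guilabel", "guimenu", "guimenuitem"}


def local_name(tag):
    """Strip a '{namespace}' prefix so namespaced and plain DocBook both match."""
    return tag.rsplit("}", 1)[-1]


def load_lookup(alltext_root, lang):
    """Build {id: text} for one language from the (assumed) lookup layout."""
    lookup = {}
    for string in alltext_root.iter("string"):
        for text in string.iter("text"):
            if text.get("lang") == lang:
                lookup[string.get("id")] = text.text or ""
    return lookup


def replace_ui_strings(doc_root, lookup):
    """Overwrite the text of GUI elements whose @remap is a known ID."""
    replaced = 0
    for elem in doc_root.iter():
        if local_name(elem.tag) not in GUI_TAGS:
            continue
        key = elem.get("remap")
        if key in lookup:
            elem.text = lookup[key]
            replaced += 1
    return replaced


# Hypothetical sample data, not taken from any real configuration.
ALLTEXT = """<alltext>
  <string id="btn.save">
    <text lang="de">Speichern</text>
    <text lang="en">Save</text>
  </string>
</alltext>"""

DOC = '<para>Klicken Sie auf <guilabel remap="btn.save">Sichern</guilabel>.</para>'

doc = ET.fromstring(DOC)
count = replace_ui_strings(doc, load_lookup(ET.fromstring(ALLTEXT), "en"))
result = ET.tostring(doc, encoding="unicode")
print(result)
```

Elements without @remap, or with an ID that is not in the lookup, are left untouched in this sketch, so whatever the writer typed stays visible.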
To assist our Technical Writers, we also provide modified versions of the software configuration. These configs can be used to run the software in a way that displays the internal ID alongside the UI text. This gives the Technical Writer sufficient context to find the right ID to use for @remap. It also helps to avoid confusion where the configuration files have several copies of the same UI string but with different ID values.

Since the UI strings are translated separately and keep their original ID, the remapping also works when we are publishing in other languages (our primary language is German, and we translate into English; this process has been used in the past to translate into French, Italian and Russian as well).

A word on the authoring process: We are using oXygen XML and have put together some schematron rules that flag when a remap attribute is missing or when the content of a guilabel differs from the current value in the lookup file. This was done as a proof-of-concept and isn't required for authoring, but we encourage using it. We are also experimenting with schematron quickfixes to replace UI strings where necessary - this, too, is experimental and not a feature we use all the time, but it comes in handy every now and then.

We do most of our translation in-house (we used OmegaT in the past, but not anymore, sorry!). The documentation and the UI strings are translated separately (by the same Translator) into two separate translation memories. When translating the documentation, all guilabel/guimenu/guimenuitem are set to be non-translatable. However, our Translator sees the element content and thus has all the context information they need. Once they export their files to the target language, the UI strings will still be in German. The German text is then replaced via a combination of @lang and @remap when publishing to the respective output format.
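The two checks mentioned above (missing @remap, content that differs from the lookup) can be sketched outside of schematron as well. This is only an illustration in Python, not the actual rules; the lookup dictionary, the sample IDs and the function name are made up, and the "unknown ID" case is an extra check that is not part of the description above.

```python
import xml.etree.ElementTree as ET

GUI_TAGS = {"guilabel", "guimenu", "guimenuitem"}


def check_gui_elements(doc_root, lookup):
    """Return (tag, remap, problem) tuples for GUI elements that look wrong."""
    problems = []
    for elem in doc_root.iter():
        tag = elem.tag.rsplit("}", 1)[-1]  # ignore a namespace prefix, if any
        if tag not in GUI_TAGS:
            continue
        key = elem.get("remap")
        text = "".join(elem.itertext()).strip()
        if key is None:
            problems.append((tag, None, "missing remap"))
        elif key not in lookup:
            # Extra check, assumed useful: the ID does not exist at all.
            problems.append((tag, key, "unknown ID"))
        elif text != lookup[key]:
            problems.append((tag, key, "content differs from lookup"))
    return problems


# Hypothetical lookup and document fragment.
LOOKUP = {"btn.save": "Speichern", "menu.file": "Datei"}
DOC = """<para>
  <guimenu remap="menu.file">Datei</guimenu>
  <guilabel remap="btn.save">Sichern</guilabel>
  <guilabel>Abbrechen</guilabel>
</para>"""

problems = check_gui_elements(ET.fromstring(DOC), LOOKUP)
for problem in problems:
    print(problem)
```

A check like this could run in the build as a warning-only step, which matches the "encouraged but not required" status described above.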
For those cases where UI text does need to be translated/localized, a Technical Writer can set a @translate attribute to "yes" on guilabel elements and override the replacement. The TMS will unlock those elements and the Translator can modify their content.

I am aware that this process might be problematic with certain languages and/or if the source material is not well maintained, and probably for many other reasons, too. We had trouble in the past with UI text that was split into several individual strings or when placeholders were being used.

With regards to mnemonics and shortcuts - those are indeed annoying. Our specific problem is that the software configuration wasn't designed to be translation friendly, and we basically have to strip away things before going into translation. I do believe that this problem could be solved if we put some effort into redesigning the configuration files.

Screenshots also matter and need manual updates when the UI text changes.

Bottom line is: We manage to keep our
Re: [docbook-apps] UI strings vs manual strings ?
> On Dec 13, 2022, at 19:28, Tony Graham wrote:
>
> What result are you looking for?

I am looking for an authoring process where software UI strings can easily be handled in the documentation. I'm imagining that there would be an editor that uses a UI strings "library" as reference and calls its contents when required in the doc during the build process. What would be the best way to achieve that in a DocBook centered process?

> Are you treating one language (say, English) as the main language (which
> has empty elements for value lookups) and the other languages as end
> products (which have all text filled in), where you'd use OmegaT's
> translation memory to keep translations consistent across revisions?

I'm not sure I understand the above question, even though I've been using OmegaT almost daily for the past 20 years.

> Or do you want the other languages to be structurally equivalent to the
> main version (apart from inline elements moved around because of
> sentence structure), where elements containing text are turned back into
> empty elements?

I don't understand the second part "where elements containing text are turned back into empty elements?".

Jean-Christophe

--
Jean-Christophe Helary @jchel...@emacs.ch
https://traductaire-libre.org
https://mac4translators.blogspot.com
https://sr.ht/~brandelune/omegat-as-a-book/

-
To unsubscribe, e-mail: docbook-apps-unsubscr...@lists.oasis-open.org
For additional commands, e-mail: docbook-apps-h...@lists.oasis-open.org
Re: [docbook-apps] UI strings vs manual strings ?
What result are you looking for?

Are you treating one language (say, English) as the main language (which has empty elements for value lookups) and the other languages as end products (which have all text filled in), where you'd use OmegaT's translation memory to keep translations consistent across revisions?

Or do you want the other languages to be structurally equivalent to the main version (apart from inline elements moved around because of sentence structure), where elements containing text are turned back into empty elements?

Regards,

Tony Graham.
--
Senior Architect
XML Division
Antenna House, Inc.

Skerries, Ireland
tgra...@antenna.co.jp
Re: [docbook-apps] UI strings vs manual strings ?
Thank you all for the replies so far. Let me reply in one mail.

> On Dec 6, 2022, at 21:44, Tony Graham wrote:
>
>> Problem at hand:
>> - a Java application with ~2k UI strings (not all users facing), in
>> a Bundle.properties file
>
> Java also has an XML format for properties files.

Interesting. It could be part of a solution (esp. considering Florimond's reply).

>> - a ~80K words DocBook manual
>> It is not trivial to keep track of the whole string set (searches, etc.)
>> Also, the l10n process takes place on the DocBook sources, not on
>> the HTML output, so tricks like don't work because
>> translators don't see the target terms.
>
> Before translation, replace each with the replacement text from
> the XML properties file wrapped in a well-known element that still
> carries the identifier for the properties file entry.
>
> After translation, if necessary, convert the well-known elements back
> into and also do something to handle the strings that have been
> translated differently in different places.

The problem is that it's not possible to do that for a lot of languages. There are inflected forms that transform the text of the "endterm" part, and the translation targets 3 dozen languages, including BiDi documents. That process would add another layer of transformation+verification. Or maybe I missed something?

> Once you have the properties file for a second language, you could
> insert the translated strings in place of when preparing for
> translation. Alternatively, or as well, you could set up your
> computer-aided translation tool to not translate the well-known elements
> for the strings and insert the translated strings after everything else
> is translated.

It looks feasible, but only with a small set of target languages.

>> I'm left with having to rewrite the strings explicitly and that's a pain,
>> and also adds risks of mistakes in translations.
>
> The more that you can automate, the better.

Hence the question ;-)

> On Dec 6, 2022, at 22:04, Alemps Florimond wrote:
>
> Hello,
>
> I would transform the bundle.properties in a document (article, book or
> section whatever)
> Each line of the file corresponds to something like :
> My message
>
> One element simpara for one guilabel is useless : it is just to make it
> readable in a DocBook parse.

Interesting. Considering that Java properties can also be expressed as XML, there could be some automation here.

> In the document, you include the message - something like :
> You should see xpointer="messageId"> after clicking on the button.
>
> The French, English, German version of the document will take advantage of
> the corresponding translated version of bundle.properties.xml

Why only those 3 languages?

My understanding of xi:include is that it is not required to be resolved before the actual documentation build process. Which means that the document to translate (and the way it is displayed in the tool) is actually

> You should see after clicking on the
> button.

Which is not different from what we have now with

> As far as no id message starts with a number (NC Name for xml:id) you are ok.
> With an XSLT 2.0 processor, it might even be possible to transform the
> bundle.properties in XML.

It looks like Java properties can be expressed as XML natively (see above), so there is something to explore here.

> On Dec 7, 2022, at 5:13, Jan Tosovsky wrote:
>
> On 05/12/2022 23:05, Jean-Christophe Helary wrote:
>> What's the best way in a DocBook centered process to ensure that the
>> list of terms used in a software UI is (semi-automatically?) taken
>> into account in the DocBook sources that describe that software?
>
> In your document you can use and other which can indicate the content must match the GUI label. You can then
> instruct the localization agency to follow this rule.
> But there is no way to avoid human error so this still has to be checked
> manually which is inefficient.

The problem is not instructions; the problem is to lower the burden on the translators by explicitly displaying the strings in the DocBook sources.

Creating a normative glossary from the UI strings first could be something, but there are Windows/Linux mnemonic (&) characters in the strings, so we'd need to remove them to create that glossary, and that would add another step (which can be automated, I guess).

Full disclosure: the manual is for OmegaT, a free software solution for translators that supports DocBook out of the box, and Java properties too. I am the project leader, also in charge of the manual. I made a close to full rewrite of the thing this summer/fall to prepare for our next release, but I know that the solution that I chose (link linkend endterm) is not optimal because the link contents/target is not available for the inflected modifications required in some languages. (And I also happen to be a translation company, so I understand those issues quite well, but it was my first time
RE: [docbook-apps] UI strings vs manual strings ?
On 05/12/2022 23:05, Jean-Christophe Helary wrote:
> What's the best way in a DocBook centered process to ensure that the
> list of terms used in a software UI is (semi-automatically?) taken
> into account in the DocBook sources that describe that software?

In your document you can use and other
Re: [docbook-apps] UI strings vs manual strings ?
Hello,

I would transform the bundle.properties into a document (article, book or section, whatever).
Each line of the file corresponds to something like: My message

Having one simpara element per guilabel serves no purpose in itself: it is just there to make the file readable by a DocBook parser.

In the document, you include the message - something like:
You should see after clicking on the button.

The French, English and German versions of the document will take advantage of the corresponding translated version of bundle.properties.xml.

As long as no message ID starts with a digit (xml:id must be an NCName), you are OK.
With an XSLT 2.0 processor, it might even be possible to transform the bundle.properties into XML.

Regards,
Florimond

On Tuesday, December 6, 2022 at 00:05:49 UTC+1, Jean-Christophe Helary wrote:

> What's the best way in a DocBook centered process to ensure that the list of terms used in a software UI is (semi-automatically?) taken into account in the DocBook sources that describe that software?
>
> Problem at hand:
> - a Java application with ~2k UI strings (not all users facing), in a Bundle.properties file
> - a ~80K words DocBook manual
>
> It is not trivial to keep track of the whole string set (searches, etc.)
>
> Also, the l10n process takes place on the DocBook sources, not on the HTML output, so tricks like don't work because translators don't see the target terms.
>
> I'm left with having to rewrite the strings explicitly and that's a pain, and also adds risks of mistakes in translations.
>
> --
> Jean-Christophe Helary @jchel...@emacs.ch
> https://traductaire-libre.org
> https://mac4translators.blogspot.com
> https://sr.ht/~brandelune/omegat-as-a-book/
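The conversion suggested in this message (bundle.properties to a DocBook document with one guilabel per message) could be sketched roughly as follows. This is an illustration in Python rather than XSLT 2.0, and it makes several assumptions: the output shape (a section of simpara/guilabel pairs carrying xml:id) is inferred from the description, the sample keys are made up, the NCName test is simplified, and the parser ignores line continuations and backslash escapes that real .properties files may contain.

```python
import re
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr

# key/value separator: '=' or ':' with optional spaces, or plain whitespace
_SEPARATOR = re.compile(r"\s*[=:]\s*|\s+")
# Simplified NCName check: no leading digit, no colon (not the full XML rule).
_NCNAME = re.compile(r"[A-Za-z_][\w.\-]*")


def parse_properties(lines):
    """Very small .properties reader: no continuations, no escapes."""
    entries = {}
    for raw in lines:
        line = raw.strip()
        if not line or line[0] in "#!":  # blank line or comment
            continue
        parts = _SEPARATOR.split(line, maxsplit=1)
        entries[parts[0]] = parts[1] if len(parts) > 1 else ""
    return entries


def properties_to_docbook(lines):
    """Emit a DocBook 5 section with one <simpara><guilabel> per message."""
    out = [
        '<section xmlns="http://docbook.org/ns/docbook" version="5.0"'
        ' xml:id="ui-strings">',
        "  <title>UI strings</title>",
    ]
    for key, value in parse_properties(lines).items():
        if not _NCNAME.fullmatch(key):
            raise ValueError(f"key cannot be used as xml:id: {key!r}")
        out.append(
            f"  <simpara><guilabel xml:id={quoteattr(key)}>"
            f"{escape(value)}</guilabel></simpara>"
        )
    out.append("</section>")
    return "\n".join(out)


# Made-up sample; the keys are hypothetical.
SAMPLE = """# Bundle.properties (sample)
TF_MENU_FILE=&File
TF_MENU_FILE_OPEN = Open <Project>...
"""

docbook = properties_to_docbook(SAMPLE.splitlines())
root = ET.fromstring(docbook)  # sanity check: the result is well-formed XML
print(docbook)
```

Running the same conversion on each translated bundle would give one such file per language, which is what the per-language include in the message relies on.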
Re: [docbook-apps] UI strings vs manual strings ?
On 05/12/2022 23:05, Jean-Christophe Helary wrote:
> What's the best way in a DocBook centered process to ensure that the
> list of terms used in a software UI is (semi-automatically?) taken
> into account in the DocBook sources that describe that software?

I haven't had to do this, but since no-one else has responded yet...

> Problem at hand:
> - a Java application with ~2k UI strings (not all users facing), in
> a Bundle.properties file

Java also has an XML format for properties files.

> - a ~80K words DocBook manual
> It is not trivial to keep track of the whole string set (searches, etc.)
> Also, the l10n process takes place on the DocBook sources, not on
> the HTML output, so tricks like don't work because
> translators don't see the target terms.

Before translation, replace each with the replacement text from the XML properties file wrapped in a well-known element that still carries the identifier for the properties file entry.

After translation, if necessary, convert the well-known elements back into and also do something to handle the strings that have been translated differently in different places.

Once you have the properties file for a second language, you could insert the translated strings in place of when preparing for translation. Alternatively, or as well, you could set up your computer-aided translation tool to not translate the well-known elements for the strings and insert the translated strings after everything else is translated.

> I'm left with having to rewrite the strings explicitly and that's a pain,
> and also adds risks of mistakes in translations.

The more that you can automate, the better.

Regards,

Tony Graham.
--
Senior Architect
XML Division
Antenna House, Inc.

Skerries, Ireland
tgra...@antenna.co.jp
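As a rough illustration of the "before translation" step suggested in this message, the sketch below reads Java's XML properties format (a `properties` root with `entry key="..."` children) and expands empty references into an element that shows the text while keeping the key. The choices here are assumptions, not something the message specifies: the reference is taken to be an empty `link` with an `endterm` attribute, the "well-known element" is taken to be `guilabel` with the key in `remap`, and the sample key and function names are made up. Python is used only for the sake of a short example.

```python
import xml.etree.ElementTree as ET


def load_xml_properties(xml_text):
    """Read Java's XML properties format into a {key: value} dict."""
    root = ET.fromstring(xml_text)
    return {entry.get("key"): (entry.text or "") for entry in root.iter("entry")}


def expand_for_translation(doc_root, strings):
    """Turn empty <link endterm="KEY"/> into <guilabel remap="KEY">text</guilabel>.

    The element and attribute choices are hypothetical. The key stays on the
    element so the step can be reversed or re-run after translation.
    """
    expanded = 0
    for elem in doc_root.iter("link"):
        key = elem.get("endterm")
        is_empty = len(elem) == 0 and not (elem.text or "").strip()
        if key in strings and is_empty:
            elem.tag = "guilabel"
            elem.attrib.clear()
            elem.set("remap", key)
            elem.text = strings[key]
            expanded += 1
    return expanded


# Made-up sample data; files written by Java also carry a DOCTYPE line,
# which is left out here for brevity.
PROPS = """<properties>
  <entry key="menu.file.open">Open Project...</entry>
</properties>"""

DOC = (
    '<para>Choose <link linkend="menu.file.open" endterm="menu.file.open"/>'
    " to start.</para>"
)

doc = ET.fromstring(DOC)
expanded = expand_for_translation(doc, load_xml_properties(PROPS))
result = ET.tostring(doc, encoding="unicode")
print(result)
```

With the text visible in the source, the translator sees the UI term in context, and the CAT tool can still be told to protect the element by name.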