[authors] OD format and localizable contents

JC Helary Tue, 20 Dec 2005 18:52:50 -0800

Sorry to bug Authors with that, but I can't seem to get a reaction on OD-users...

Since Authors is also involved in the process of translating, I thought we could discuss that as a separate thread, not so much directly related to the contents of a document but to access to its contents for translation.

I wrote this mail with OmegaT in mind since it seems to be used more and more within the OOo community to translate OOo produced documents, and not so much imported documents from the MS world.

The Italian documentation translation group (I think) seems to be using OmegaT, and thanks to their work we have discovered glitches in the OOo tags support. I think the reason is that before, when OmegaT was used to translate mainly texts imported from MS, OOo filtering was very conservative and would produce a very small quantity of OOo tags, which were all properly handled by OmegaT. Now that people use OOo to create documents (and it is especially true now for the OOo documentation process), authors are free to use the full creative options of OOo and the generated documents are now much more complex.

As written in the mail below (originally 3 mails sent to xml-dev, recombined into one sent to OD-users), most xml based translation standards expect three types of string formats in a document:

-block level format -> equivalent of <text:p> in OOo or <p> in HTML, sets a property for a whole block of string
-inline level format -> equivalent of <text:span> in OOo or <span> in HTML, sets a property for a subset of the block
-subflows -> would be alternative text for a picture (that appears in a box when the mouse comes on the picture etc.), mostly is _within_ the tag, as an attribute value, like <whatever:whatever name="alternate text"> in OOo, similar to <h2 id="identity of this title"> in HTML.
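
To make the three categories concrete, here is a rough Python sketch that sorts the pieces of a tiny OD-like fragment into block, inline and subflow. The lookup tables (BLOCK, INLINE, SUBFLOW_ATTRS) and the choice of draw:name as a localizable attribute are my own assumptions for illustration; they are not taken from the OD specification, which is exactly the information I am missing.

```python
import xml.etree.ElementTree as ET

TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"
DRAW_NS = "urn:oasis:names:tc:opendocument:xmlns:drawing:1.0"

# Assumed classification tables (hypothetical, not from the OD specification).
BLOCK = {f"{{{TEXT_NS}}}p", f"{{{TEXT_NS}}}h"}
INLINE = {f"{{{TEXT_NS}}}span"}
SUBFLOW_ATTRS = {f"{{{DRAW_NS}}}name"}

# A made-up fragment: one paragraph, one span, one frame with a name attribute.
SAMPLE = (
    f'<text:p xmlns:text="{TEXT_NS}" xmlns:draw="{DRAW_NS}">'
    'Click <text:span>here</text:span>'
    '<draw:frame draw:name="alternate text"/>'
    '</text:p>'
)


def classify(root):
    """Walk the tree in document order and label what a translator would see."""
    found = []
    for elem in root.iter():
        local_name = elem.tag.split("}")[-1]
        if elem.tag in BLOCK:
            found.append(("block", local_name))
        elif elem.tag in INLINE:
            found.append(("inline", local_name))
        # Subflows live inside the tag, as attribute values.
        for attr, value in elem.attrib.items():
            if attr in SUBFLOW_ATTRS:
                found.append(("subflow", value))
    return found


if __name__ == "__main__":
    for kind, what in classify(ET.fromstring(SAMPLE)):
        print(kind, what)
```

Any element or attribute missing from the tables is silently ignored here, which is where a real filter would lose or mangle text.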

So, that was the background of the mail. If you have any interest in such topics please go on reading :)


Regards,
Jean-Christophe Helary

==================================

(3 mails combined in one. Already sent to xml-dev on OOo, but I thought maybe this list was more relevant)


I would like to know if there is a "simple" way to identify:

-block level textual information
-inline level textual information
-localizable subflows present in tag attribute values.

This is to parse an OD file and be able to properly segment the text for use in a CAT tool.
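
As an illustration of the kind of segmentation I have in mind, here is a minimal Python sketch: each block element becomes one segment, and every child element inside it is shown as a numbered placeholder pair. The <t0>...</t0> placeholder syntax, the BLOCK table and the wrapping text:section in the sample are assumptions made up for this example; this is not how OmegaT or any other tool actually does it.

```python
import xml.etree.ElementTree as ET

TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"

# Assumed list of block-level elements (hypothetical, not from the OD spec).
BLOCK = {f"{{{TEXT_NS}}}p", f"{{{TEXT_NS}}}h"}

# A made-up fragment with two paragraphs, the first containing a span.
SAMPLE = (
    f'<text:section xmlns:text="{TEXT_NS}">'
    '<text:p>Hello <text:span>bold</text:span> world</text:p>'
    '<text:p>Second</text:p>'
    '</text:section>'
)


def flatten(elem, counter):
    """Return the text of elem with child elements as numbered placeholders."""
    parts = [elem.text or ""]
    for child in elem:
        n = counter[0]
        counter[0] += 1
        parts.append(f"<t{n}>{flatten(child, counter)}</t{n}>")
        parts.append(child.tail or "")  # text that follows the child tag
    return "".join(parts)


def segments(root):
    """One segment per block element; placeholder numbering restarts each time."""
    return [flatten(block, [0]) for block in root.iter() if block.tag in BLOCK]


if __name__ == "__main__":
    for segment in segments(ET.fromstring(SAMPLE)):
        print(segment)
```

The sketch treats every child of a block as inline and does not handle blocks nested inside other blocks, which is precisely why a reliable list of what is block and what is inline would help.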

By "subflows" I mean information that is not between tags but inside tags as attribute values ( text:name="something like alternative text for graphic items for ex").

It looks like there is a very wide range of possible <text:"attributes" and I found that <text:sequence seems to be inline as well as <text:user-defined.

I am looking for an extensive list of such attributes and their characteristics (block or inline).

The point is that OD needs proper parsing of its localisable data so that computer assisted translation tools (the apps translators use every day) give proper access to the proper data within their translation framework.

It would be nice if OD provided such "meta" information as to what is localizable, what is not, and how that fits in the block/inline/subflow model that most translation related standards (TMX/XLIFF etc.) are based on.

Of course, developers can always check the specification and take guesses as to what is what, just like what I am trying to do right now, but I'd say it could be part of the OD specification to provide easy access to such meta information, to make sure there is proper implementation on the CAT tool development side.


Regards,
Jean-Christophe Helary
