Hi Julen,

Thanks for your proposal. I would like to clarify a few points before
we decide whether we can assign a mentor to this project, and who that
might be.

Julen wrote:

> Hello,
> 
> I'm writing to make a proposal for the upcoming Google Summer of Code.
> I'm not sure whether this is the right mailing list to send this
> information, so my apologies if I'm posting in the wrong place.
> 
> This would be, more or less, my proposal:
> 
> Project Title
> TM based CAT tool for Writer
> 
> Summary
> Develop a TM (Translation Memory) based CAT (Computer-aided
> Translation) tool as an extension for Writer. This would be something
> similar to the MS Word-based proprietary add-on Wordfast[1] or to
> OmegaT[2], an open source tool written in Java which works as a
> desktop application.
> 
> Abstract
> TM programs store previously translated source and target texts in a
> database in order to reuse them when translating new texts. Source
> text is split into translation units called segments. TMs are easily
> exportable and can be exchanged using an open standard format called
> TMX (Translation Memory eXchange)[3], which is implemented on top of
> XML.
> Any text file OpenOffice.org can open could be translated with this
> tool just by applying the appropriate segmentation rules for each
> filetype.

If I understand TMX correctly, this approach will lose structural
information and attributes of the translated text. It would be nice to
have an extension that can retain as much of that information as
possible. This would require a solution that utilizes our new text
checking and markup API and stores some information along with the
generated TMX files that enables the extension to reestablish the
document structure and attributes. More or less it would mean that the
number and order of paragraphs could be stored along with the text that
is going to be translated. What do you think?
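
To make the idea concrete, here is a minimal sketch in Python of what I
mean, under several assumptions. It uses only the standard library and
does not touch the OpenOffice.org API at all. It relies on the TMX
<prop> element for tool-specific data; the property names
"x-paragraph-index" and "x-segment-index", the tool name in the header
and both function names are made up for illustration. Character
attributes are not covered, only the number and order of paragraphs.

```python
import xml.etree.ElementTree as ET

# ElementTree's spelling of the xml:lang attribute.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def build_tmx(paragraphs, src_lang, tgt_lang):
    """Build a TMX tree from a list of paragraphs.

    Each paragraph is a list of (source, target) segment pairs. The
    paragraph and segment positions are stored in <prop> elements whose
    type names are hypothetical, not part of any standard.
    """
    tmx = ET.Element("tmx", version="1.4")
    ET.SubElement(tmx, "header", {
        "creationtool": "writer-cat-sketch",  # hypothetical tool name
        "creationtoolversion": "0.1",
        "segtype": "sentence",
        "o-tmf": "none",
        "adminlang": "en",
        "srclang": src_lang,
        "datatype": "plaintext",
    })
    body = ET.SubElement(tmx, "body")
    for p_idx, segments in enumerate(paragraphs):
        for s_idx, (source, target) in enumerate(segments):
            tu = ET.SubElement(body, "tu")
            for name, value in (("x-paragraph-index", p_idx),
                                ("x-segment-index", s_idx)):
                ET.SubElement(tu, "prop", type=name).text = str(value)
            for lang, text in ((src_lang, source), (tgt_lang, target)):
                tuv = ET.SubElement(tu, "tuv", {XML_LANG: lang})
                ET.SubElement(tuv, "seg").text = text
    return tmx


def rebuild_paragraphs(tmx, lang):
    """Reassemble the paragraphs of one language from the stored indices."""
    rows = []
    for tu in tmx.iter("tu"):
        props = {p.get("type"): int(p.text) for p in tu.findall("prop")}
        for tuv in tu.findall("tuv"):
            if tuv.get(XML_LANG) == lang:
                rows.append((props["x-paragraph-index"],
                             props["x-segment-index"],
                             tuv.findtext("seg")))
    rows.sort(key=lambda row: (row[0], row[1]))
    grouped = {}
    for p_idx, _, text in rows:
        grouped.setdefault(p_idx, []).append(text)
    return [" ".join(grouped[i]) for i in sorted(grouped)]


if __name__ == "__main__":
    sample = [
        [("Hello.", "Hallo."), ("How are you?", "Wie geht es dir?")],
        [("Goodbye.", "Auf Wiedersehen.")],
    ]
    data = ET.tostring(build_tmx(sample, "en", "de"), encoding="utf-8")
    for paragraph in rebuild_paragraphs(ET.fromstring(data), "de"):
        print(paragraph)
```

Other TMX tools should still be able to read such a file, because the
extra information sits in <prop> elements rather than in the segments;
whether a given tool keeps those properties when it rewrites the file
is something that would have to be checked.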

> Professional translators have used CAT tools for many years, taking
> advantage of new technologies applied to natural language and
> finding in these tools a significant help for their day-to-day work.
> Nowadays, translators have a wide variety of documents to translate,
> including text documents or even files related to software
> localization. Since many translators work in an office environment,
> Word-based solutions are widely used, e.g. Wordfast.
> OpenOffice.org lacks this kind of tool, so such an extension would
> open a door for translators into the open source community. This
> would benefit translators and, especially, OpenOffice.org by
> extending its popularity.

Are there any Open Source translation tools available that use TMX, at
least in development? For a GSOC project it would be better to have
something directly usable in the end. It could also create a bridge
between the Open Source communities.

Best regards,
Mathias

-- 
Mathias Bauer (mba) - Project Lead OpenOffice.org Writer
OpenOffice.org Engineering at Sun: http://blogs.sun.com/GullFOSS
Please don't reply to "[EMAIL PROTECTED]".
I use it for the OOo lists and only rarely read other mails sent to it.
