On 2018-02-23 18:59, Antonio Toral wrote:
Dear apertiumers,
I have an idea for GSoC about subtitles. Before putting it on the wiki,
I wanted to check with _yous_, as it seems that there has been
previous related work [1].
In a nutshell, it is about translating subtitles from OpenSubtitles
with Apertium, for closely-related pairs of languages A and B such
that (i) there are mature A-->B systems in Apertium and (ii) on
OpenSubtitles there are many subtitles for A and very few for B. An
example is A=ES, B=CA. There could be 2 tasks:
1. Development of a tool to translate subtitles. Given a translation
direction A-->B:
1.1. Use OpenSubtitles' API to find subtitles S in A that have not
yet been translated into B.
1.2. Translate S from A to B using Apertium's API.
1.3. Upload the translated subtitles to OpenSubtitles. These
subtitles could have a preamble such as "Warning: this subtitle is
machine translated! Powered by Apertium.org"
2. Evaluation of Apertium's quality for subtitles (for 1 or 2
translation directions) and improvements/modifications in Apertium's
systems for those directions based on that evaluation.
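As a rough illustration of step 1.2 (plus the preamble from 1.3), here is a
minimal sketch that assumes the subtitles are in SubRip (.srt) format. The
`translate` argument is a hypothetical callable standing in for whatever call
to Apertium ends up being used; the function name `translate_srt` and the
two-second timing of the warning cue are also assumptions, not anything from
OpenSubtitles or Apertium. Only the text lines are translated; cue timings are
copied unchanged and cues are renumbered to make room for the warning.

```python
import re
from typing import Callable

WARNING = "Warning: this subtitle is machine translated! Powered by Apertium.org"


def translate_srt(srt_text: str, translate: Callable[[str], str],
                  warning: str = WARNING) -> str:
    """Translate the text lines of SubRip content, leaving timings untouched.

    `translate` is a placeholder for an Apertium call (assumption).
    """
    # The warning becomes cue 1; the two-second duration is an arbitrary
    # choice and may overlap with an early first cue.
    cues = ["1\n00:00:00,000 --> 00:00:02,000\n" + warning]
    # Cues are separated by one or more blank lines.
    blocks = re.split(r"\n\s*\n", srt_text.strip())
    for number, block in enumerate(blocks, start=2):
        lines = block.splitlines()
        # Expected layout: index line, timing line, then text lines.
        if len(lines) < 2 or "-->" not in lines[1]:
            raise ValueError(f"malformed cue: {block!r}")
        timing, text = lines[1], lines[2:]
        translated = [translate(line) for line in text]
        cues.append("\n".join([str(number), timing] + translated))
    return "\n\n".join(cues) + "\n"
```

Translating line by line keeps the sketch simple, but a sentence split across
two lines of a cue would lose context, so a real tool would probably join the
lines of a cue before translating and re-wrap afterwards.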
I'd be happy to hear opinions, experiences from related previous work,
criticism, etc :)
My main question would be about licensing: as far as I'm aware, most of
the content on OpenSubtitles is not available under a free licence.
Other comments:
* The type of text in OpenSubtitles is typically from a very different
  domain than what Apertium is usually used for, so it might result in a
  lot of weirdness in expressions (unless extra work is done on the MT
  systems to cope with dialogue-type texts).
* It would be a nice way to advertise Apertium.
Fran