On Tue, Jun 9, 2009 at 23:42, Brian <[email protected]> wrote:
> Google has built in support for using its machine translation technology to
> help bootstrap human translations of Wikipedia articles.
>
> http://translate.google.com/toolkit/docupload
>
> The benefit to Google is clear - they need sentence-aligned text in multiple
> languages in order to bootstrap their automated system.
>
> This is a great example of machines helping people help machines help
> people, etc... I'm sure this is now the most efficient way to produce high
> quality translations of Wikipedia articles en masse.
>
> We should take the ToS to make sure the translated text can be CC-BY-SA
> licensed.
OK, after a bit of drama in this discussion, i actually tried this toolkit.

First i tried to translate the Hebrew article [[שלום גד]] into English (that's Shalom Gad, one of my favorite Israeli musicians). Apparently, it can only translate from English. I am more interested in translating Wikipedia articles from Hebrew into English, so it was quite disappointing, but they'll probably fix it soon enough.

Then i tried to translate [[Art critic]] from English into Hebrew. There were a few pleasant surprises, but on the whole the machine translation was bad to the point of being unusable. It is much easier to translate it using vi.

Google want side-by-side translations. That is not quite possible. The grammar of a language is not just subjects, objects, tenses and adjectives. Google seem to ignore [[Text linguistics]]: rules which apply way beyond the word and the sentence. And these are *grammar rules*, not just "style". (Disclaimer: The Department of Linguistics in the Hebrew University of Jerusalem, where i study, is very keen on this subject.)

I *had* to make very deep changes to paragraph structure (not to mention sentence structure), and not just because the Hebrew Wikipedia has a different MOS, but because this structure is the basis of the Hebrew language. A text without these changes would be next to unreadable.

I doubt that a document which is changed so deeply is very useful to Google at this point. I certainly know that it is not useful to me - i gave up after two paragraphs.

So yes, Google can revise the legalese of their TOS, but this is not a very urgent problem. The uselessness of the technology makes the TOS pretty irrelevant.

--
אמיר אלישע אהרוני
Amir Elisha Aharoni

http://aharoni.wordpress.com

"We're living in pieces,
I want to live in peace." - T. Moore

_______________________________________________
foundation-l mailing list
[email protected]
Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/foundation-l
