On Thu, 29 Aug 2013 at 10:13 +0200, Per Tunedal wrote:
> Hi,
> the design of Apertium has some resemblance to the outdated
> word-to-word statistical translation models, especially the simplest:
> IBM model 1:
> 1. The translation is made word by word.
> 2. The most probable translation of a word is chosen (developers are
> advised to have only one translation in the bidix - the most common).
> 3. The translation is supposed to work best for closely related
> languages.
>
> Point 2 makes Apertium quite similar to IBM model 1 without the language
> model: then only the most probable word is chosen. Unfortunately, this
> often leads to terrible translations.

Except:

* You can use the lexical selection module, which can give equivalent
  results to using a target-language model.
* In IBM model 1 there is no reordering.

> Thus, adding the language model to ensure "fluent" output should
> outperform Apertium.

And it does. On closely related languages.

> I've written my own IBM model 1 training program and decoder
> (translator). I trained on the Block World Corpus and built 3-gram
> language models with IRSTLM (available at
> http://www.tunedal.nu/download/block_world_corpus/). Finally I
> translated the evaluation files (available at the above site) from da to
> sv (and the other way around) and from sv to en (and the other way
> around).
>
> Results:
> 1. The translation between Swedish and English is mostly terrible (to a
> large extent because IBM 1 doesn't use any fertility, i.e. one word
> only produces one translated word).
> 2. The translation between Swedish and Danish is in most cases
> acceptable. Only a few sentences are terrible. On the whole it looks
> much better than the translations from Apertium - in spite of my efforts
> since last year.

The English data for the corpus is kind of weird (borderline
ungrammatical) in some places.

Your efforts since last year have mostly made the pair worse, not better.
This is probably unintentional, but was my impression last time I looked
at it.

Fran
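
For illustration, the difference discussed in this thread between picking the most probable word in isolation (point 2) and rescoring the candidates with a target-language model can be sketched roughly as below. This is only a toy sketch, not Per's program and not how Apertium works: the translation table `T`, the bigram scores in `BIGRAM`, the floor value and both function names are invented for the example, and a bigram model stands in for the 3-gram models mentioned above. Like IBM model 1 as described in the thread, it translates one word to one word with no reordering and no fertility.

```python
import math
from itertools import product

# Hypothetical lexical translation table t(target | source); numbers are invented.
T = {
    "de": {"they": 0.7, "the": 0.3},
    "är": {"is": 0.6, "are": 0.4},
    "här": {"here": 1.0},
}

# Hypothetical bigram language model P(word | previous word); numbers are invented.
BIGRAM = {
    ("<s>", "they"): 0.3,
    ("<s>", "the"): 0.3,
    ("they", "are"): 0.5,
    ("they", "is"): 0.01,
    ("are", "here"): 0.2,
    ("is", "here"): 0.2,
}
FLOOR = 1e-3  # assumed probability for unseen bigrams


def decode_greedy(source):
    """Pick the single most probable translation of each word, ignoring context."""
    return [max(T[word], key=T[word].get) for word in source]


def decode_with_lm(source):
    """Score every word-by-word candidate with translation and bigram log-probabilities."""
    best, best_score = None, -math.inf
    for candidate in product(*(T[word] for word in source)):
        score = 0.0
        previous = "<s>"
        for src_word, tgt_word in zip(source, candidate):
            score += math.log(T[src_word][tgt_word])
            score += math.log(BIGRAM.get((previous, tgt_word), FLOOR))
            previous = tgt_word
        if score > best_score:
            best, best_score = list(candidate), score
    return best


sentence = ["de", "är", "här"]
print(decode_greedy(sentence))
print(decode_with_lm(sentence))
```

With these made-up numbers the context-free choice prefers "is" after "they", while the language model pushes the output towards "are". The exhaustive search over all candidates is only workable for toy input; a real decoder would use a beam or dynamic programming.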
