Hello all!

Out of the 39 or so language pairs that we have in trunk/, only two or three could be considered to offer "state of the art" performance with respect to vocabulary coverage and translation quality -- where there is a competitor.
Spanish-Catalan
Norwegian Bokmål-Norwegian Nynorsk
(Spanish-Portuguese)

The objective of this task would be to take another language pair, one that is already quite developed, and make it "Google-beating", or, failing that, to improve coverage by at least 10-15% on a range of corpora and reduce WER by at least 10-15%.

This would be a lot of work, but given a "well resourced" language (e.g. English, Spanish, French, Catalan, Italian, Portuguese), I don't think it would be much more than the language pairs students do each year. In general we try to have no more than 50% of our GSOC projects be language pairs, which I think is a pretty good idea. If we think that this task is a good idea, then we could decide to make at least one language pair a "state of the art" one.

Any thoughts?

Fran

_______________________________________________
Apertium-stuff mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/apertium-stuff
