Hello all!

Out of the 39 or so language pairs that we have in trunk/, only two or
three could be considered to offer "state of the art" performance with
respect to vocabulary coverage and translation quality, at least where
there is a competitor to compare against:

Spanish-Catalan
Norwegian Bokmål-Norwegian Nynorsk 
(Spanish-Portuguese)

The objective of this task would be to take another language pair, one
that is already quite developed, and make it "Google-beating". If not
Google-beating, then the aim would be at least to improve coverage by
10-15% or more on a range of corpora, and to reduce WER (word error
rate) by 10-15% or more.
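
To make the two targets concrete, here is a rough sketch of how they
could be measured. It is only an illustration: the function names are
made up for this example, tokenisation is plain whitespace splitting,
and I am assuming that unknown words are marked with a leading "*" in
the output. It also treats the 10-15% figure as a relative change,
which is one possible reading.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between ref[:i-1] and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # word missing from the hypothesis
                curr[j - 1] + 1,     # extra word in the hypothesis
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


def naive_coverage(text, unknown_marker="*"):
    """Fraction of whitespace tokens NOT flagged as unknown.

    Assumption: unknown words carry a leading marker such as "*".
    """
    tokens = text.split()
    if not tokens:
        raise ValueError("text must contain at least one token")
    known = sum(1 for tok in tokens if not tok.startswith(unknown_marker))
    return known / len(tokens)


def relative_change(before, after):
    """Relative change from `before` to `after` (negative = reduction)."""
    return (after - before) / before
```

With these, going from a WER of 0.20 to 0.17 gives a relative change of
-0.15, i.e. a 15% reduction. Whether the target should be read as
relative or as absolute percentage points would need to be agreed on.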

This would be a lot of work, but given a "well resourced" language
(e.g. English, Spanish, French, Catalan, Italian, Portuguese), I don't
think it would be much more than the work the language pair students
do each year.

In general we try to keep language pairs to no more than 50% of our
GSoC projects, which I think is a pretty good idea. If we think that
this task is a good idea, then we could decide to make at least one
language pair a "state of the art" one.

Any thoughts?

Fran


_______________________________________________
Apertium-stuff mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/apertium-stuff