On 2018-03-12 12:10, Marc Riera Irigoyen wrote:
Have you done any evaluation? How does it compare to other systems
(and the old system too)? :)
The pair works fairly well with encyclopedia-like texts, and has a
good Wikipedia coverage (92% for English and 87% for Catalan). The
reference translation (an English article on Greece not used during
development) shows a WER/PER of 51%/35%, better than the old pair's
56%/40% with the same text. Yandex is slightly better than Apertium,
with 56%/34%, and Google stands with the best results (43%/26%). I
have not really evaluated translations from Catalan (most of the
development has taken place in the other direction), but it should be
more or less the same as the old pair.
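
For anyone who wants to reproduce figures like the WER/PER above on their own test text, here is a minimal sketch in Python. It assumes the usual definitions: WER is the word-level edit distance divided by the reference length, and PER is the same idea with word order ignored (bag-of-words matching). The function names are made up for illustration, and the thread does not say which tool produced the numbers above, so that tool's tokenisation and exact formulas may differ.

```python
from collections import Counter


def edit_distance(ref, hyp):
    """Word-level Levenshtein distance between two token lists."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate: edit distance over reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must not be empty")
    return edit_distance(ref, hyp) / len(ref)


def per(reference, hypothesis):
    """Position-independent error rate: like WER, but ignoring word order."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must not be empty")
    # Multiset intersection counts words shared by both sides.
    matches = sum((Counter(ref) & Counter(hyp)).values())
    return (max(len(ref), len(hyp)) - matches) / len(ref)


if __name__ == "__main__":
    reference = "the cat sat on the mat"
    hypothesis = "on the mat the cat sat"
    print(f"WER: {wer(reference, hypothesis):.2f}")
    print(f"PER: {per(reference, hypothesis):.2f}")
```

Since PER ignores word order, it is never higher than WER for the same pair of texts, which matches the pattern in all the figures quoted above.
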
Good to know that we are approaching the quality of Yandex! :)
What kind of effort/work do you think needs to be done to approach
Google's quality?
What kind of lexical coverage do Google/Yandex have?
While the pair still needs a lot of work and love, the rewrite has
eased development. With good taggers on both sides, trained with
diverse texts (including dialogues to reflect oral language
constructions), as well as a reorganization/rewrite of the transfer
rules (inherited from the messy old pair), we should have a very
decent and useful language pair.
What would you say the main needs are now?
Fran