Call for Participation

Similar Language Translation Task at WMT 2020
(co-located with EMNLP 2020)

URL: http://www.statmt.org/wmt20/similar.html

The training/dev sets are available. The test data will be released on June 8, 2020. Please visit the website for more information.

Task Description

Within the MT and NLP communities, English is by far the most resource-rich language. MT systems are most often trained to translate texts from and to English, or they use English as a pivot language to translate between resource-poorer languages. The interest in English is reflected, for example, in the WMT translation tasks (e.g. News, Biomedical), which have always included language pairs in which texts are translated to and/or from English.

With the widespread use of MT technology, there is growing interest in training systems to translate between languages other than English. One indication of this is the need to translate directly between pairs of similar languages. The main challenge is to exploit the similarity between the languages to compensate for the small amount of available parallel data and still produce accurate output.

Given the community's interest in this topic, we are organizing, for the second time at WMT, the shared task on "Similar Language Translation" to evaluate the performance of state-of-the-art translation systems on translating between pairs of languages from the same language family. This year we provide participants with training and testing data in five language pairs from the three language families listed below. Evaluation will be carried out using automatic evaluation metrics and human evaluation.

Language Pairs

This year we have five pairs of similar languages from three different language families: Indo-Aryan, Romance, and South-Slavic. Translations will be evaluated in both directions (e.g. from Spanish to Catalan and from Catalan to Spanish).
- Indo-Aryan Languages: Hindi - Marathi
- Romance Languages: Spanish - Catalan and Spanish - Portuguese
- South-Slavic Languages: Slovene - Croatian and Slovene - Serbian

Organizers

Marta Costa-jussà, Universitat Politècnica de Catalunya
Magdalena Biesialska, Universitat Politècnica de Catalunya
Santanu Pal, Wipro AI Lab
Nikola Ljubešić, Jožef Stefan Institute and University of Zagreb
Marcos Zampieri, Rochester Institute of Technology

Contact

martaruizcostajussa(at)gmail.com
