WMT 2021 Shared Task:
Machine Translation using Terminologies

November 10-11, 2021
Punta Cana, Dominican Republic

Language domains that require very careful use of terminology are abundant.
The need to adequately translate within such domains is undeniable, as
shown by e.g. the different WMT shared tasks on biomedical translation.

More interestingly, as the abundance of research on domain adaptation
shows, such language domains are (a) not adequately covered by existing
data and models, while (b) new (or “surge”) domains arise and models need
to be adapted, often with significant downstream implications: consider the
new COVID-19 domain and the large efforts for translation of critical
information regarding pandemic handling and infection prevention strategies.

In the case of newly developed domains, while parallel data are hard to
come by, it is fairly straightforward to create word- or phrase-level
terminologies, which can be used to guide professional translators and
ensure both accuracy and consistency.

This shared task will replicate such a scenario, and invites participants
to explore methods to incorporate terminologies into either the training or
the inference process, in order to improve both the accuracy and
consistency of MT systems on a new domain.

IMPORTANT DATES

Release of training data and terminologies: April 2021
Surprise languages announced: June 28, 2021
Test set available: July 19, 2021
Submission of translations: July 23, 2021
System descriptions due: August 5, 2021
Camera-ready for system descriptions: September 15, 2021
Conference in Punta Cana: November 10-11, 2021

SETTINGS

In this shared task, we will distinguish submissions that use the
terminology only at inference time (e.g., for constrained decoding or
something similar) from submissions that use the terminology at training
time (e.g., for data selection, data augmentation, explicit training, etc.).
Note that basic linguistic tools such as taggers, parsers, or morphological
analyzers are allowed in the constrained condition.
The submission report should highlight in which ways participants’ methods
and data differ from the standard MT approach. It should make clear which
tools were used, and which training sets were used.

LANGUAGE PAIRS

The shared task will focus on four language pairs, with systems evaluated on:
- English to French
- English to Chinese
- Two surprise language pairs English-X (announced 3 weeks before the
  start of the evaluation period)

We will provide training/development data and terminologies for the above
language pairs. Test sets will be released at the beginning of the
evaluation period. The goal of this setting (with both development and
surprise language pairs) is to avoid approaches that overfit on language
selection, and instead evaluate the more realistic scenario of needing to
tackle the new domain in a new language in a limited amount of time. The
surprise language pairs will be announced 3 weeks before the start of the
evaluation campaign. At the same time we will provide training data and
terminologies for the surprise language pairs.
You may participate in any or all of the language pairs.

ORGANIZERS

Antonis Anastasopoulos, George Mason University
Md Mahfuz ibn Alam, George Mason University
Laurent Besacier, NAVER
James Cross, Facebook
Georgiana Dinu, AWS
Marcello Federico, AWS
Matthias Gallé, NAVER
Philipp Koehn, Facebook / Johns Hopkins University
Vassilina Nikoulina, NAVER
Kweon Woo Jung, NAVER