[Apologies for cross-posting] We would like to invite submissions to the first shared task on machine translation robustness as part of the ACL 2019 Fourth Conference on Machine Translation (WMT19).
Non-standard, noisy text of the kind found in social media and on the internet is ubiquitous. Yet existing machine translation systems struggle to handle the idiosyncrasies of this type of input. The goal of this shared task is to provide a testbed for improving MT models' robustness to orthographic variations, grammar errors, and other linguistic phenomena common in noisy, user-generated content, via better modelling, adaptation techniques, or leveraging monolingual training data.

Specifically, the shared task aims to bring improvements on the following challenges:

- To improve NMT's robustness to orthographic variations, grammatical errors, informal language and other linguistic phenomena or noise common on social media.
- To explore effective approaches to leverage abundant out-of-domain parallel data.
- To explore novel approaches to leverage abundant monolingual data on the Web (e.g. tweets, Reddit comments, commoncrawl, etc.).
- To thoroughly investigate and understand challenges in translating social media text and identify promising future directions.

Important dates:

- Release of training/dev data: January 21, 2019
- Test data released: April 12, 2019
- Translation submission deadline: April 19, 2019
- System description paper submission deadline: May 3, 2019
- End of evaluation: July 2, 2019

More information is available at http://www.statmt.org/wmt19/robustness.html.

As in other WMT tasks, intending participants are encouraged to register at https://groups.google.com/forum/#!forum/wmt-tasks for discussions and announcements.

Best regards,
Juan Pino (on behalf of the organizers)