CALL FOR PARTICIPATION in the
Third Automatic Post-Editing (APE) shared task
at the Second Conference on Machine Translation (WMT17)

--------------------------------------------------------------------

OVERVIEW

The third round of the APE shared task
<http://www.statmt.org/wmt17/ape-task.html> follows the success of the
two previous rounds organised in 2015 and 2016. The aim is to examine
automatic methods for correcting errors produced by an unknown machine
translation (MT) system. This has to be done by exploiting knowledge
acquired from human post-edits, which are provided as training
material.

The general evaluation setting is similar to the last round, but this
year the task will include a new language direction and a new domain.
In addition to the usual *English-German* data covering the IT domain,
the APE 2017 task will also cover *German-English* data covering the
Medical domain. In both cases, the source sentences have been
translated into the target language by an MT system that is unknown to
participants and then manually post-edited by professional
translators. At the training stage, the collected human post-edits
have to be used to learn correction rules for the APE systems. At the
test stage, they will be used for system evaluation with automatic
metrics (TER and BLEU).

--------------------------------------------------------------------

GOALS

The aim of the APE task is to improve MT output in black-box
scenarios, in which the MT system is used "as is" and cannot be
modified. From the application point of view, APE components would
make it possible to:

- Cope with systematic errors of an MT system whose decoding process
  is not accessible;
- Provide professional translators with improved MT output quality to
  reduce (human) post-editing effort;
- Adapt the output of a general-purpose system to the lexicon/style
  requested in a specific application domain.

--------------------------------------------------------------------

DATA & EVALUATION

Training, development and test data consist of English-German and
German-English triplets (source, target, and post-edit), belonging to
the IT and Medical domains respectively. All data is already tokenized
and is provided by the EU project QT21 (http://www.qt21.eu/).

Systems' performance will be evaluated with respect to their
capability to reduce the distance that separates an automatic
translation from its human-revised version. This distance will be
measured in terms of TER, which will be computed between automatic and
human post-edits in case-sensitive mode. BLEU will also be taken into
consideration as a secondary evaluation metric. To gain further
insight into final output quality, a subset of the outputs of the
submitted systems will also be manually evaluated.

--------------------------------------------------------------------

DIFFERENCES FROM THE 2016 ROUND OF THE APE TASK

Compared to the 2016 round of the APE task, the main differences are:

- Additional domain (Medical);
- Additional language direction (German-English);
- Larger data set.

--------------------------------------------------------------------

IMPORTANT DATES

Release of training data: February 16, 2017
Release of test data: April 10, 2017
Submission deadline: May 6, 2017
Paper submission deadline: June 2, 2017
Manual evaluation: TBD
Notification of acceptance: June 30, 2017
Camera-ready deadline: July 14, 2017

--
Best Regards,
The APE task organizers
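P.S. Participants who want to sanity-check their development-set
scores before submission can do so with the open-source sacrebleu
library. The snippet below is a minimal, unofficial sketch, not the
task's official scoring tooling; the file names (dev.ape, dev.pe) are
hypothetical placeholders for your system output and the released
human post-edits.

    # Unofficial sketch: score APE output against human post-edits
    # with sacrebleu. File names below are hypothetical placeholders.
    from sacrebleu.metrics import BLEU, TER

    def read_lines(path):
        with open(path, encoding="utf-8") as f:
            return [line.rstrip("\n") for line in f]

    ape_output = read_lines("dev.ape")  # your system's corrected MT
    post_edits = read_lines("dev.pe")   # human post-edits (references)

    # TER in case-sensitive mode, matching the task's primary metric.
    ter = TER(case_sensitive=True)
    # The released data is already tokenized, so skip sacrebleu's own
    # tokenizer for BLEU.
    bleu = BLEU(tokenize="none")

    print(ter.corpus_score(ape_output, [post_edits]))
    print(bleu.corpus_score(ape_output, [post_edits]))

Lower TER is better (fewer edits needed to reach the post-edit), while
higher BLEU is better; note that the official evaluation may differ in
its exact normalization settings.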
