Final Call for Participation
WMT 2020 Shared Task
Parallel Corpus Filtering and Alignment for Low-Resource Conditions

Deadline: Saturday, August 1, 2020
http://www.statmt.org/wmt20/parallel-corpus-filtering.html

We announce and call for participation in the WMT 2020 shared task on assessing the quality of sentence pairs in a parallel corpus.

- In the WMT18 shared task on parallel corpus filtering <http://www.statmt.org/wmt18/parallel-corpus-filtering.html>, we posed the challenge of a noisy web-crawled parallel corpus for German-English and asked participants to score each sentence pair. These quality scores were used to select subsets of the corpus consisting of the highest-scoring sentence pairs, to train statistical and neural machine translation systems on those subsets, and to evaluate the systems on a set of test sets.

- In the WMT19 shared task on parallel corpus filtering for low-resource conditions <http://www.statmt.org/wmt19/parallel-corpus-filtering.html>, we followed the same protocol, but this time for Nepali-English and Sinhala-English. For low-resource language pairs like these, both the existing clean parallel corpora and the to-be-scored noisy web-crawled data come in smaller amounts and at lower quality.

This year, we pose two different language pairs, Khmer-English and Pashto-English. In addition to the task of computing quality scores for the purpose of filtering, we also allow for the re-alignment of sentence pairs from document pairs.

DEADLINES

Submission deadline for subsampled sets    August 1, 2020
System descriptions due                    August 15, 2020
Announcement of results                    August 29, 2020
Paper notification                         September 29, 2020
Camera-ready for system descriptions       October 10, 2020