Hi, I fixed the link ;)
But Matthias is right: The WMT web pages are the best place to look for links to corpora / test sets.

-phi

On Mon, Apr 27, 2015 at 10:56 AM, Graham Neubig <[email protected]> wrote:
> Hi Matthias,
>
> Thank you, that's exactly what I was looking for!
> And also thanks for sending the WMT mailing list, I'll send any further
> questions about the evaluation matrix there from now on.
>
> Graham
>
> On Mon, Apr 27, 2015 at 11:31 PM, Matthias Huck <[email protected]> wrote:
>>
>> Hi Graham,
>>
>> Did you have a look at the tarballs that were distributed last year?
>> http://www.statmt.org/wmt14/translation-task.html
>>
>> There are three different versions:
>>
>> - Test sets (5.2 MB) These are the source sgm files with extra "filler"
>> sentences. They were the actual files released for the campaign.
>> http://www.statmt.org/wmt14/test.tgz
>>
>> - Filtered Test sets (3.2 MB) These are the source and reference sgm
>> files used to evaluate, i.e. the Test sets without the "filler"
>> sentences. If you want to reproduce results from the campaign, use
>> these.
>> http://www.statmt.org/wmt14/test-filtered.tgz
>>
>> - Cleaned Test sets (3.2 MB) These include fixes to minor encoding
>> errors, and reinstate around 10% of the en-de data which was excluded
>> from the evaluation. For further research, use these.
>> http://www.statmt.org/wmt14/test-full.tgz
>>
>> WMT has a Google Group:
>> https://groups.google.com/forum/#!forum/wmt-tasks
>>
>> Cheers,
>> Matthias
>>
>>
>> On Mon, 2015-04-27 at 22:14 +0900, Graham Neubig wrote:
>> > Hi Moses List,
>> >
>> > Sorry about this being a bit off topic, but I have a question about the
>> > files on matrix.statmt.org, and couldn't find any information about who
>> > to contact on the site and assumed that here would be the next-best
>> > place to ask.
>> >
>> > Specifically, I'm looking for the SGM files for newstest2014 in the same
>> > order as the system outputs on matrix.statmt.org. On the "test sets"
>> > page, in the place where there should be a link to newstest2014, it
>> > seems like the link actually points to newstest2013:
>> > http://matrix.statmt.org/test_sets/list
>> >
>> > And the ones downloadable from the WMT 2015 site seem to be in a
>> > different order, and it'd be a bit of a pain (although possible) to
>> > match the lines properly:
>> > http://www.statmt.org/wmt15/translation-task.html
>> >
>> > If possible, could someone help out with this, or tell me who's in
>> > charge of the evaluation matrix so I can contact them directly?
>> >
>> > Graham
>> > _______________________________________________
>> > Moses-support mailing list
>> > [email protected]
>> > http://mailman.mit.edu/mailman/listinfo/moses-support
>>
>> --
>> The University of Edinburgh is a charitable body, registered in
>> Scotland, with registration number SC005336.
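
For anyone who finds this thread and still has to line up two SGM files that list the same segments in a different order (the "bit of a pain" case in the original question), here is a minimal sketch. It assumes both files carry matching `docid` attributes on `<doc>` and matching `id` attributes on `<seg>`, which is not confirmed anywhere in this thread for the WMT14 and WMT15 releases, so check that before relying on it. The function names `parse_sgm` and `reorder_like` are made up for illustration, and the parsing is regex-based rather than a real SGML parser.

```python
import re
from typing import List, Tuple

# Assumption: <doc ... docid="..."> opens a document and each segment is
# written as <seg id="...">text</seg>.
DOC_RE = re.compile(r'<doc\b[^>]*\bdocid="([^"]+)"', re.IGNORECASE)
SEG_RE = re.compile(r'<seg\b[^>]*\bid="([^"]+)"[^>]*>(.*?)</seg>',
                    re.IGNORECASE | re.DOTALL)

Key = Tuple[str, str]  # (docid, segment id)


def parse_sgm(text: str) -> List[Tuple[Key, str]]:
    """Return [((docid, segid), segment_text), ...] in file order."""
    segments = []
    docs = list(DOC_RE.finditer(text))
    for i, doc in enumerate(docs):
        # A document's body runs up to the next <doc ...> tag (or end of file).
        end = docs[i + 1].start() if i + 1 < len(docs) else len(text)
        body = text[doc.end():end]
        for seg in SEG_RE.finditer(body):
            segments.append(((doc.group(1), seg.group(1)), seg.group(2).strip()))
    return segments


def reorder_like(target_sgm: str, other_sgm: str) -> List[str]:
    """Return the segments of other_sgm in the segment order of target_sgm.

    Raises KeyError if target_sgm contains a (docid, segid) pair that
    other_sgm lacks, e.g. "filler" sentences present in only one file.
    """
    lookup = dict(parse_sgm(other_sgm))
    return [lookup[key] for key, _ in parse_sgm(target_sgm)]
```

Read both files into strings (for example with `open(path, encoding="utf-8").read()`) and call `reorder_like(matrix_order_sgm, wmt15_sgm)` to get one segment per list entry in the first file's order. If the IDs turn out not to match between releases, this approach does not apply and the filtered WMT14 tarball mentioned above is the simpler route.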
