Hi Marcin,

Wow, that would be really excellent. I'm looking forward to it!
Graham

On Fri, Jan 22, 2016 at 10:36 AM, Marcin Junczys-Dowmunt <[email protected]> wrote:

> Hi Graham,
>
> At the UN we are now working to release an official version of our data.
> As a bonus to the pair-wise alignment, it will contain a 6-way fully
> aligned subcorpus for English, French, Spanish, Russian, Chinese, Arabic;
> about 13M segments per language. We are waiting for some LREC feedback and
> the official greenlight from UN officials, but that should be a matter of a
> couple of weeks now (maybe one, maybe two, maybe four). Once it is ready I
> can make an announcement here.
>
> Best,
> Marcin
>
> On 22.01.2016 at 16:26, Graham Neubig wrote:
>
> > Dear Moses Mailing List,
> >
> > This is not directly related to Moses, but I was wondering if there are
> > any high-quality, multilingually sentence-aligned corpora available (i.e.
> > 3 or more languages with aligned sentences). We're aware of the Europarl
> > and Bible corpora, but Europarl only covers European languages, and the
> > Bible corpus is quite small in MT terms.
> >
> > TED and MULTI-UN are options, but as far as I know the data is only
> > bilingually aligned at the moment, and it can be a bit hard to get a clean
> > multilingual corpus from them. If anyone has any experience with this, or
> > resources available, I'd love some info.
> >
> > Thanks in advance,
> > Graham
_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support
