Hi Marcin,

Wow, that would be really excellent. I'm looking forward to it!

Graham

On Fri, Jan 22, 2016 at 10:36 AM, Marcin Junczys-Dowmunt <[email protected]> wrote:

> Hi Graham,
> At the UN we are now working to release an official version of our data.
> As a bonus to the pair-wise alignment, it will contain a 6-way fully
> aligned subcorpus for English, French, Spanish, Russian, Chinese, Arabic;
> about 13M segments per language. We are waiting for some LREC feedback and
> the official greenlight from UN officials, but that should be a matter of a
> couple of weeks now (maybe one, maybe two, maybe four). Once it is ready I
> can make an announcement here.
> Best,
> Marcin
>
> On 22.01.2016 at 16:26, Graham Neubig wrote:
>
> Dear Moses Mailing List,
>
> This is not directly related to Moses, but I was wondering if there are
> any high-quality, multilingually sentence-aligned corpora available (i.e.
> 3 or more languages with aligned sentences). We're aware of the Europarl
> and Bible corpora, but Europarl only covers European languages, and the
> Bible corpus is quite small in MT terms.
>
> TED and MULTI-UN are options, but as far as I know the data is only
> bilingually aligned at the moment, and it can be a bit hard to get a clean
> multi-lingual corpus from them. If anyone has any experience with this, or
> knows of any available resources, I'd love some info.
>
> Thanks in advance,
> Graham
>
>
> _______________________________________________
> Moses-support mailing list
> [email protected]
> http://mailman.mit.edu/mailman/listinfo/moses-support