> Just in case, let me tell you that there seem to be several corpora (and
> acquis) published by the JRC. The one I was referring to in my previous
> message can be downloaded here: http://langtech.jrc.it/DGT-TM.html#Download .

The corpus that I meant is the JRC-Acquis Multilingual Parallel Corpus
(http://langtech.jrc.it/JRC-Acquis.html).

I wasn't talking so much about technical difficulties or corpus text
bugs, and I'm aware of the Koehn/Birch/Steinberger paper "462 Machine
Translation Systems for Europe" -- but rather, has anyone reached
unexpected conclusions on this corpus, for instance something (like a
method of improving the SMT output) that worked on, say, Europarl and
didn't work on the same (or other) language pairs on the JRC-Acquis
parallel corpus?

Thanks in advance,
Mark & Heiki



>> Dear readers,
>>
>> we keep getting strange, unexpected and sometimes illogical results in
>> more than one series of SMT experiments using the JRC Acquis parallel
>> corpus. Often the same methods work fine on Europarl. Our question is
>
> Hi Mark,
>
> We have been using the JRC Acquis corpus *extensively* and I can assure you
> that we had no big problems. Some colleagues, who used the program that comes
> with the corpus, did have some slight problems. I chose to unzip the volumes
> manually and never had such problems. For this corpus, as for others, some
> characters can derail the training. We have developed Moses for Mere Mortals
> (http://code.google.com/p/moses-for-mere-mortals/), which provides a Windows
> add-in (Extract_TMX_Corpus) that helps to clean such things and creates
> corpora that you can feed directly to Moses (UTF-8, Linux newlines, removal of
> control characters and so on). Therefore, I can assure you that the JRC Acquis
> definitely works. It seems to me that the Moses team has already published data
> about their experiments with this corpus. It covers most, if not all, of the
> language pairs of the European Union, which is a plus.
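
For readers who want to try the kind of clean-up described above (UTF-8, Linux
newlines, removal of control characters), here is a minimal Python sketch. It
is not the Extract_TMX_Corpus implementation; the function names clean_line and
clean_parallel are hypothetical, and dropping a sentence pair when either side
ends up empty is an assumption made here to keep the two sides aligned.

```python
import unicodedata


def clean_line(raw: bytes) -> str:
    """Decode one corpus line as UTF-8 and remove control characters.

    Hypothetical helper: undecodable bytes become U+FFFD rather than
    raising, tabs become spaces, and every character in Unicode
    category Cc (which includes CR and LF) is dropped.
    """
    text = raw.decode("utf-8", errors="replace")
    text = text.replace("\t", " ")
    text = "".join(ch for ch in text if unicodedata.category(ch) != "Cc")
    # Collapse runs of whitespace and trim both ends.
    return " ".join(text.split())


def clean_parallel(src_lines, tgt_lines):
    """Clean two aligned lists of raw lines and return (source, target) pairs.

    Assumption: a pair is discarded when either side is empty after
    cleaning, so the remaining pairs stay aligned.
    """
    if len(src_lines) != len(tgt_lines):
        raise ValueError("source and target must have the same number of lines")
    pairs = []
    for s_raw, t_raw in zip(src_lines, tgt_lines):
        s, t = clean_line(s_raw), clean_line(t_raw)
        if s and t:
            pairs.append((s, t))
    return pairs
```

Writing each side of the returned pairs back out joined with "\n" and encoded
as UTF-8 would give files with Linux newlines only.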
>
> Greetings,
>
> João
>
> _______________________________________________
> Moses-support mailing list
> [email protected]
> http://mailman.mit.edu/mailman/listinfo/moses-support
>

