No... my mistake, though: I was comparing against the per-year files
linked from this page: http://www.statmt.org/wmt15/translation-task.html

What is the difference between those and the wmt11 files?



On 04/10/2016 at 21:46, Barry Haddow wrote:
> Hi Vincent
>
> Are you comparing compressed with uncompressed files?
>
> cheers - Barry
>
> On 04/10/16 14:40, Vincent Nguyen wrote:
>> Hi,
>>
>> on this link:
>>
>> http://www.statmt.org/wmt11/translation-task.html
>>
>> on the download section for monolingual data, there is :
>>
>> one big file : http://www.statmt.org/wmt11/training-monolingual.tgz
>>
>> And separate files, including news crawls per year.
>>
>> However, when you take a single file for a specific year, it is not the
>> same size as the file of the same name in the big download.
>>
>> Expanded size for the English corpus:
>>
>> news2008: 4.3GB vs 1.6GB for single download
>> news2009: 5.3GB vs 1.8GB for single download
>>
>> etc...
>>
>> can someone please explain the difference ?
>>
>> thanks
>>
>> Vincent.
>>
>>
>> _______________________________________________
>> Moses-support mailing list
>> [email protected]
>> http://mailman.mit.edu/mailman/listinfo/moses-support
>>
>
>
