No... but my mistake: for the per-year files I was comparing against this link: http://www.statmt.org/wmt15/translation-task.html

What is the difference (compared with the wmt11 files)?

On 04/10/2016 at 21:46, Barry Haddow wrote:
> Hi Vincent
>
> Are you comparing compressed with uncompressed files?
>
> cheers - Barry
>
> On 04/10/16 14:40, Vincent Nguyen wrote:
>> Hi,
>>
>> On this link:
>>
>> http://www.statmt.org/wmt11/translation-task.html
>>
>> in the download section for monolingual data, there is:
>>
>> one big file: http://www.statmt.org/wmt11/training-monolingual.tgz
>>
>> and separate files, including news crawls per year.
>>
>> However, when you take a single file for a specific year, it is not the
>> same size as the file of the same name in the big download.
>>
>> Expanded size for the English corpus:
>>
>> news2008: 4.3GB vs 1.6GB for single download
>> news2009: 5.3GB vs 1.8GB for single download
>>
>> etc...
>>
>> Can someone please explain the difference?
>>
>> thanks
>>
>> Vincent.
>>
>> _______________________________________________
>> Moses-support mailing list
>> [email protected]
>> http://mailman.mit.edu/mailman/listinfo/moses-support
