Hi,

The Block World Corpus is a small corpus that can be used when developing new language pairs. It originates from the Department of Linguistics and Philology at Uppsala University, Sweden, www.lingfil.uu.se
With the kind consent of Professor Jörg Tiedemann, I have permission to use the corpus, translate it into new languages and licence it as I see fit.

The corpus illustrates some linguistic features that are a nuisance to machine translation. It consists of two parts:

1. Files for training/developing a translation model - these files are named blockworld.parallel + language suffix, e.g. blockworld.parallel.en
2. Files for building a language model - these files are named blockworld.full + language suffix, e.g. blockworld.full.en

The corpus was originally intended for experiments with statistical machine translation, but it can just as well be used with rule-based systems, e.g. shallow-transfer systems like Apertium.

The original corpus is in English (en) and Swedish (sv). It would be useful to have the corpus translated into more languages. I would very much appreciate it if you translated the corpus into your language and sent the files to me.

You can find the files in the download folder at my site tunedal.nu:
http://www.tunedal.nu/download/block_world_corpus/

License: GPL v.3 (same as for my programs)

I have tried to translate it into Danish with the help of Apertium. If you know Danish, I would be very grateful if you checked the translated file and sent me the corrections. You can find the file here:
http://www.tunedal.nu/download/block_world_corpus/blockworld.full.da

You'll find more info on my blog:
http://www.tradera.tunedal.nu/

Yours,
Per Tunedal

_______________________________________________
Apertium-stuff mailing list
https://lists.sourceforge.net/lists/listinfo/apertium-stuff
