Hi,
The Block World Corpus is a small corpus that can be used when
developing new language pairs. It originates from the Department of
Linguistics and Philology at Uppsala University, Sweden,
www.lingfil.uu.se

With the kind consent of Professor Jörg Tiedemann, I have permission
to use the corpus, translate it into new languages and license it as I
see fit.


The corpus illustrates some linguistic features that are a nuisance for
machine translation. It consists of two parts:

1. Files for training/developing a translation model

- these files are named blockworld.parallel plus a language suffix, e.g.
blockworld.parallel.en

2. Files for building a language model

- these files are named blockworld.full plus a language suffix, e.g.
blockworld.full.en
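
The naming scheme above can be sketched in Python as follows. This is
only an illustration, not part of the corpus distribution: the helper
names corpus_path and read_parallel are hypothetical, and the sketch
assumes that the files are UTF-8 encoded and that the parallel files
hold one sentence per line, aligned line by line.

```python
from pathlib import Path


def corpus_path(directory, part, lang):
    """Build a corpus file path such as blockworld.parallel.en.

    `part` is "parallel" (translation model) or "full" (language model).
    """
    if part not in ("parallel", "full"):
        raise ValueError(f"unknown corpus part: {part!r}")
    return Path(directory) / f"blockworld.{part}.{lang}"


def read_parallel(directory, src, tgt):
    """Return (source, target) sentence pairs from the parallel files.

    Assumption: UTF-8 text, one sentence per line, line-aligned.
    """
    src_lines = corpus_path(directory, "parallel", src).read_text(
        encoding="utf-8").splitlines()
    tgt_lines = corpus_path(directory, "parallel", tgt).read_text(
        encoding="utf-8").splitlines()
    if len(src_lines) != len(tgt_lines):
        # A mismatch means the two files are not aligned line by line.
        raise ValueError(
            f"{src} has {len(src_lines)} lines, {tgt} has {len(tgt_lines)}")
    return list(zip(src_lines, tgt_lines))
```

A check like the line-count comparison may be handy after translating
the corpus into a new language, since a dropped or merged line would
shift every later sentence pair.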


Originally the corpus was intended for experiments with statistical
machine translation, but it can just as well be used with rule-based
systems, e.g. shallow-transfer systems like Apertium.

The original corpus is in English (en) and Swedish (sv). It would be
useful to have the corpus translated into more languages. I would very
much appreciate it if you translated the corpus into your language and
sent the files to me. You can find the files in the download folder on
my site tunedal.nu: http://www.tunedal.nu/download/block_world_corpus/
License: GPL v3 (the same as for my programs)

I have tried translating it into Danish with the help of Apertium. If
you know Danish, I would be very grateful if you checked the translated
file and sent me your corrections. You can find the file here:
http://www.tunedal.nu/download/block_world_corpus/blockworld.full.da
You'll find more info on my blog: http://www.tradera.tunedal.nu/

Yours,
Per Tunedal

_______________________________________________
Apertium-stuff mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/apertium-stuff
