I wonder if there would be a demand for ready-made phrase tables generated from the data.

________________________________________
From: [email protected] [[email protected]] on behalf of Jorg Tiedemann [[email protected]]
Sent: 02 November 2013 18:54
To: [email protected]
Subject: [Moses-support] 10 years of OPUS

After attending the 20-years-of-bitext workshop at EMNLP I suddenly realized that OPUS (http://opus.lingfil.uu.se) also has its 10-year anniversary this year (send me some champagne if you like). I will celebrate this anniversary by sending out this e-mail with some recent news and highlights.

OPUS is a growing collection of parallel corpora for many languages and various domains. The collection has become pretty big and includes a variety of data sets and tools that are useful not only for statistical machine translation. OPUS has been extended a lot since its first appearance in 2003.

Actually, the best birthday present would be if someone decided to start a mirror of OPUS. Let me know if you are interested.

Here are some of the highlights:

- over 150 languages and language variants
- over 5 billion aligned translation units
- downloads in XML/XCES, plain text (Moses/SMT) and TMX
- raw, tokenized and machine-annotated data
- monolingual data sets (for language modeling)
- search interfaces

Some recent news and data sets:

- EUbookshop: a large but noisy corpus (converted from PDF)
- Tatoeba: a small but clean corpus with many languages
- OpenSubtitles2012: an improved version of the 2011 version
- coming soon: OpenSubtitles2013, an extension of OpenSubtitles2012
- UN, MultiUN, Europarl v7: aligned for all language combinations
- word alignments and phrase tables for the majority of bitexts

The Web Site: http://opus.lingfil.uu.se
More information: http://opus.lingfil.uu.se/trac/wiki

Feedback is very welcome!
And, be nice to our server!

Jörg Tiedemann
[email protected]

_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support
