[Mt-list] ELRA - Language Resources Catalogue - Update

ELRA ELDA Information Wed, 12 Jul 2017 04:20:03 -0700

[Our apologies if you have received multiple copies of this announcement.]


*****************************************************************
ELRA - Language Resources Catalogue - Update
*****************************************************************

We are happy to announce that 3 new Written Corpora and 1 new Desktop/Microphone Speech Resource are now available in our catalogue.


*ELRA-W0118 English-Persian parallel corpus*
*ISLRN: 074-825-114-781-7 <http://islrn.org/resources/074-825-114-781-7/>*

The English-Persian parallel corpus contains more than 200,000 aligned sentences across a variety of text types from the domains of art, law, culture, science, religion, literature, medicine, idioms, politics and others. It is an extension of the English-Persian parallel corpus already distributed by ELRA (Catalogue Reference: ELRA-W0051). This new version of the corpus is distributed with a concordance program.
For more information, see: http://catalog.elra.info/product_info.php?products_id=1306


*ELRA-W0119 Helsinki Corpus of Swahili*
*ISLRN: 941-187-059-145-7 <http://islrn.org/resources/941-187-059-145-7/>*

This is a 25-million-word text corpus of Swahili, annotated for part-of-speech, morphology and syntax. The corpus contains prose text from domains such as fiction, news media and government documents, from the period between 1953 and 2016.
For more information, see: http://catalog.elra.info/product_info.php?products_id=1308


*ELRA-W0120 NUM 5M Mongolian written corpus*
*ISLRN: 492-817-146-504-9 <http://islrn.org/resources/492-817-146-504-9/>*

This is a corpus of Mongolian text, mostly from domains such as online or printed daily newspapers, literature, and laws. Part of this corpus, about 2,800 sentences with 100,000 words, has been POS-tagged manually and stored in TEI format.
For more information, see: http://catalog.elra.info/product_info.php?products_id=1309


*ELRA-S0393 Persian Speech Corpus*
*ISLRN: 068-845-898-304-0 <http://islrn.org/resources/068-845-898-304-0/>*

This speech corpus was recorded through a "Blubbery" model microphone by one male speaker in Persian (Tehrani accent) in a professional studio. Speech synthesized using this corpus has produced a high quality, natural voice. It consists of 399 utterances for a total of about 2.5 hours, with orthographic and phonetic transcriptions.
For more information, see: http://catalog.elra.info/product_info.php?products_id=1307

For more information on the catalogue, please contact Valérie Mapelli mailto:[email protected]

If you would like to enquire about having your resources distributed by ELRA, please do not hesitate to contact us.


Visit our On-line Catalogue: http://catalog.elra.info
Visit the Universal Catalogue: http://universal.elra.info

Archives of ELRA Language Resources Catalogue Updates: http://www.elra.info/en/catalogues/language-resources-announcements/

_______________________________________________
Mt-list site list
[email protected]
http://lists.eamt.org/mailman/listinfo/mt-list
