Dear Sir,

Toos Institute of International Information Communication (MARINA) is a center in Iran that manages activities related to academic translation and machine translation systems. I am writing to express our interest in a “Persian-English reciprocal statistical translation system” and to request that the Moses research group become a partner in our efforts to develop the project.
We have developed an English-Persian parallel corpus that includes papers, academic proposals, books, dictionaries, glossaries and other resources, prepared and translated over a period of 25 years (free translation models and procedures have been avoided). The corpus, which is currently unaligned, consists of about 6,000,000 English and Persian words (3 million for each language) and is to be aligned at the sentence level (approximately 270,000 sentences). Alignment started in March and is expected to be completed in May. The corpus is in XML format.

The texts in the corpus cover a variety of fields: management, aerospace engineering, accounting, medicine, IT and computer science, law and contracts, and so on. Unlike other parallel corpora, which primarily focus on literature, politics, and culture, it covers almost all academic disciplines. This English-Persian reciprocal parallel corpus can also be used to develop, strengthen and improve translation aids, including automatic dictionary extraction, creation of translation memories, example-based machine translation and language modeling.

The English-Persian Parallel Corpus is part of a larger corpus that is being produced for the rule-based machine translation (1997-2010) and statistical translation (2010-2012) systems developed by our programming team. The larger corpus also contains the following corpora:
- a non-technical corpus (about 470,000 English words and 530,000 Persian words, aligned at sentence level [about 83,000 sentences], used for example-based machine translation)
- a 1,500,000-word collection (dictionaries, glossaries).

Our center would like to become involved in your MT project (including the use of Moses) for Persian, a language that has received less attention in related studies. We feel this would benefit the Iranian community around the world with regard to machine translation and provide an opportunity to participate in this community.
Best regards,
Rahim Jelini
http://www.w3-i.com/about/rahimjelini.htm
Deputy Manager
Toos Institute of International Information Communication (MARINA)
[email protected] [email protected]
Websites: http://www.w3-i.com http://www.translation.bz http://www.translation24.org http://www.tarjomeh.in
_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support
