Sharing a few SMT resources for Indian languages.
The Center for Indian Language Technology <http://www.cfilt.iitb.ac.in>, IIT
Bombay, hosts Shata-Anuvaadak (100 Translators), a Statistical Machine
Translation system for Indian languages. It currently supports translation
between 11 languages (10 Indian languages and English):
- Indo-Aryan languages: Hindi, Urdu, Bengali, Gujarati, Punjabi,
Marathi, Konkani
- Dravidian languages: Tamil, Telugu, Malayalam
- English
It is a Phrase-Based MT system with pre-processing and post-processing
extensions. The pre-processing includes source-side reordering for English
to Indian language translation. The post-processing includes
transliteration between Indian languages for OOV words. The system can be
accessed at:
http://www.cfilt.iitb.ac.in/indic-translator
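As a rough illustration of the flow described above (this is not the actual
system code), here is a minimal Python sketch. Everything in it is an
assumption made for illustration: the function names, the toy "move the
second token to the end" reordering rule, the word-by-word phrase table
lookup standing in for the decoder, and the Unicode block-offset mapping
(Devanagari to Bengali) standing in for a trained transliteration model.

```python
# Toy sketch only: all names, rules and tables here are hypothetical and are
# not taken from the Shata-Anuvaadak implementation.

DEVANAGARI_START = 0x0900  # start of the Unicode Devanagari block
BENGALI_START = 0x0980     # start of the Unicode Bengali block
BLOCK_SIZE = 0x80          # each of these script blocks spans 128 code points


def reorder_source(tokens):
    """Stand-in for source-side reordering: move the second token (assumed
    to be the verb) to the end, turning a toy SVO sentence into SOV order."""
    if len(tokens) >= 3:
        return [tokens[0]] + tokens[2:] + [tokens[1]]
    return list(tokens)


def decode(tokens, phrase_table):
    """Stand-in for the phrase-based decoder: word-by-word table lookup.
    Unknown words are copied through and their positions recorded as OOV."""
    output, oov_positions = [], []
    for i, tok in enumerate(tokens):
        if tok in phrase_table:
            output.append(phrase_table[tok])
        else:
            output.append(tok)
            oov_positions.append(i)
    return output, oov_positions


def transliterate(word, src_start=DEVANAGARI_START, tgt_start=BENGALI_START):
    """Naive script conversion: shift each character of the source script
    block by the offset between the two blocks; leave other characters as-is."""
    chars = []
    for ch in word:
        offset = ord(ch) - src_start
        chars.append(chr(tgt_start + offset) if 0 <= offset < BLOCK_SIZE else ch)
    return "".join(chars)


def translate(sentence, phrase_table, source_is_english):
    """Pre-process (reorder) for English sources, decode, then post-process
    (transliterate OOV words) for Indian-language sources."""
    tokens = sentence.split()
    if source_is_english:
        tokens = reorder_source(tokens)
    output, oov_positions = decode(tokens, phrase_table)
    if not source_is_english:
        for i in oov_positions:
            output[i] = transliterate(output[i])
    return " ".join(output)
```

The two extensions are kept on separate branches because, as described above,
reordering applies to English sources while OOV transliteration applies
between Indian languages.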
For more details, see the following publication:
Anoop Kunchukuttan, Abhijit Mishra, Rajen Chatterjee, Ritesh Shah, Pushpak
Bhattacharyya. 2014. *Shata-Anuvadak: Tackling Multiway Translation of
Indian Languages*. Language Resources and Evaluation Conference *(LREC
2014)*.
We are also making available software and resources developed in the Center
for the system and for ongoing research. These are available under an open
source license for research use. These include:
*Software*
- Indian language NLP tools: common NLP tools for Indian languages that
are useful for machine translation, including Unicode normalizers,
tokenizers, morphological analysers and transliteration systems.
- Source-side reordering system for SMT
- A simple experiment management system for Moses
*Resources*
- Translation models for phrase-based SMT systems for all language pairs in
Shata-Anuvaadak
- Language models for all languages in Shata-Anuvaadak
- Transliteration models for some language pairs (Moses-based)
You can access these resources at:
http://www.cfilt.iitb.ac.in/static/download.html
Regards,
Anoop.
http://www.cse.iitb.ac.in/~anoopk