Re "Moses does not have tokenizers for Tamil": actually, there is a Tamil
nonbreaking prefix file at
scripts/share/nonbreaking_prefixes/nonbreaking_prefix.ta. You might want
to start simple with the scripts/tokenizer/tokenizer.perl script.
Then, once you see how it works, move on to Anoop's suggestions.
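
As a rough sketch of that first step, the Python snippet below builds (and can run) the tokenizer command. The -l option selects the language, which is how tokenizer.perl picks the matching nonbreaking prefix file. The checkout path, the file names, and the helper names tokenizer_command and tokenize_file are assumptions for illustration only.

```python
import subprocess
from pathlib import Path


def tokenizer_command(moses_root, lang="ta"):
    """Build the argument list for Moses' tokenizer.perl for one language code."""
    script = Path(moses_root) / "scripts" / "tokenizer" / "tokenizer.perl"
    return ["perl", str(script), "-l", lang]


def tokenize_file(moses_root, src_path, out_path, lang="ta"):
    """Run the tokenizer: raw text is read on stdin, tokens are written to stdout."""
    with open(src_path, "rb") as src, open(out_path, "wb") as out:
        subprocess.run(tokenizer_command(moses_root, lang),
                       stdin=src, stdout=out, check=True)


if __name__ == "__main__":
    # Hypothetical checkout location; adjust it to your own setup.
    print(" ".join(tokenizer_command("/path/to/mosesdecoder")))
```

Calling tokenize_file with your Moses directory, a raw Tamil file, and an output file name would then produce the tokenized corpus, assuming perl is installed.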
Tom
On 7/14/2016 5:14 PM, moses-support-requ...@mit.edu wrote:
Date: Thu, 14 Jul 2016 13:02:57 +0530
From: Anoop <anoop.kunchukut...@gmail.com>
Subject: Re: [Moses-support] help regarding languages used for translation
To: Selva Nalladurai <selva...@gmail.com>
Cc: moses-support <moses-support@mit.edu>
Hi Selva,
Moses is language-independent, so you can use it for any language pair as
long as you have a parallel corpus. That said, you may have to do
language-specific pre- and post-processing. For instance:
- Moses does not have tokenizers for Tamil.
- Tamil is agglutinative, so you may want to segment the words to reduce
data sparsity as an additional pre-processing step.
You can use the Indic NLP library for some simple tokenization as well as
to segment the text (http://anoopkunchukuttan.github.io/indic_nlp_library).
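
To give a feel for what "simple tokenization" means here, below is a minimal standard-library sketch that only separates ASCII punctuation from words. It is not the Indic NLP library's implementation or API; the function name simple_tokenize is made up for this example. Note that it deliberately avoids the regex class \w, since Python's \w does not match Tamil vowel signs and would split words in the middle.

```python
import re
import string

# Match any single ASCII punctuation character so it can be split off.
_PUNCT = re.compile("([" + re.escape(string.punctuation) + "])")


def simple_tokenize(text):
    """Put spaces around punctuation marks, then split on whitespace."""
    return _PUNCT.sub(r" \1 ", text).split()


if __name__ == "__main__":
    print(simple_tokenize("நான் வீட்டுக்கு போகிறேன்."))
```

Morphological segmentation of agglutinated Tamil words is a separate and harder step that this sketch does not attempt.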
Since English and Tamil have different word orders, you should try
syntax-based models (which are implemented in the Moses package). Another
option is to pre-order the English sentences into Tamil word order before
training a phrase-based system. You can use this for pre-ordering:
http://www.cfilt.iitb.ac.in/~moses/download/cfilt_preorder/register.html
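
As a toy illustration of the pre-ordering idea (not the CFILT tool linked above, which works from a syntactic parse), the sketch below moves verb chunks to the end of an English clause, since Tamil is verb-final. The hand-labelled chunks, the role labels, and the function name preorder_verb_final are all assumptions for this example.

```python
def preorder_verb_final(chunks):
    """Reorder (text, role) chunks so that chunks with role "V" come last.

    The relative order of the remaining chunks is preserved.
    """
    non_verbs = [chunk for chunk in chunks if chunk[1] != "V"]
    verbs = [chunk for chunk in chunks if chunk[1] == "V"]
    return non_verbs + verbs


if __name__ == "__main__":
    # Hand-labelled English clause: subject, verb, object.
    clause = [("the boy", "S"), ("reads", "V"), ("a book", "O")]
    print(" ".join(text for text, _ in preorder_verb_final(clause)))
```

The reordered English side is then what you would feed to phrase-based training, so that source and target phrases line up more monotonically.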
Regards,
Anoop.
On Thu, Jul 14, 2016 at 11:43 AM, Selva Nalladurai<selva...@gmail.com>
wrote:
>Hello Team,
> I am Selva Nalladurai, doing my ME (Master of
>Engineering) in CSE, and I am from India. I am doing my research in SMT and
>would like to know whether Moses can be used for translating English to
>Tamil (a regional language spoken in our country). Can I follow the same
>steps as for other languages when uploading my corpus?
> Thank you.
>
> Regards,
> Selva Nalladurai
>
>_______________________________________________
>Moses-support mailing list
>Moses-support@mit.edu
>http://mailman.mit.edu/mailman/listinfo/moses-support
>
>