Thanks Tom, I didn't know there was one in Moses for Tamil!

Regards,
Anoop.
On Thu, Jul 14, 2016 at 4:17 PM, Tom Hoar <tah...@pttools.net> wrote:

> Re "Moses does not have tokenizers for Tamil", actually there is a Tamil
> nonbreaking prefix file,
> scripts/share/nonbreaking_prefixes/nonbreaking_prefix.ta. You might want to
> start simple with the scripts/tokenizer/tokenizer.perl file. Then, after
> you see how it works, escalate to Anoop's suggestions.
>
> Tom
>
> On 7/14/2016 5:14 PM, moses-support-requ...@mit.edu wrote:
>
> Date: Thu, 14 Jul 2016 13:02:57 +0530
> From: Anoop <anoop.kunchukut...@gmail.com>
> Subject: Re: [Moses-support] help regarding languages used for translation
> To: Selva Nalladurai <selva...@gmail.com>
> Cc: moses-support <moses-support@mit.edu>
>
> Hi Selva,
>
> Moses is language-independent, so you can use it for any language pair as
> long as you have a parallel corpus. That said, you may have to do
> language-specific pre- and post-processing. For instance,
>
> - Moses does not have tokenizers for Tamil.
> - Tamil is agglutinative, so you may want to segment the words to reduce
>   data sparsity as an additional pre-processing step.
>
> You can use the Indic NLP library for some simple tokenization as well as
> to segment the text ( http://anoopkunchukuttan.github.io/indic_nlp_library ).
>
> Since English and Tamil have different word order, you should try
> syntax-based models (which are implemented in the Moses package). Another
> option is to pre-order the English sentence to the Tamil word order before
> training a phrase-based system. You can use this for pre-ordering:
> http://www.cfilt.iitb.ac.in/~moses/download/cfilt_preorder/register.html
>
> Regards,
> Anoop.
>
> On Thu, Jul 14, 2016 at 11:43 AM, Selva Nalladurai <selva...@gmail.com>
> wrote:
>
> > > Hello Team,
> > >
> > > I am Selva Nalladurai from India, doing my ME (Master of Engineering)
> > > in CSE. I am doing my research in SMT and would like to know whether
> > > Moses can be used for translating English to Tamil (a regional
> > > language spoken in our country). Can I follow the same steps as for
> > > other languages when uploading my corpus?
> > >
> > > Thank you.
> > >
> > > Regards,
> > > Selva Nalladurai
> > >
> > > _______________________________________________
> > > Moses-support mailing list
> > > Moses-support@mit.edu
> > > http://mailman.mit.edu/mailman/listinfo/moses-support

--
I claim to be a simple individual liable to err like any other fellow mortal. I own, however, that I have humility enough to confess my errors and to retrace my steps.
http://flightsofthought.blogspot.com
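
For anyone following Tom's suggestion to start with scripts/tokenizer/tokenizer.perl, here is a minimal Python sketch of calling it on a Tamil file. It assumes perl is on the PATH and that you pass in the root of your own Moses checkout. The helper names build_tokenizer_command and tokenize_file are hypothetical and not part of Moses. The only tokenizer option used is -l, the language code. As far as I know, the script uses that code to look up the matching nonbreaking_prefix file, so "ta" should pick up the Tamil file mentioned above.

```python
import subprocess
from pathlib import Path


def build_tokenizer_command(moses_root, lang="ta"):
    """Return the argument list for running the Moses tokenizer on stdin.

    moses_root is the root of a Moses checkout (an assumption: adjust to
    wherever Moses lives on your machine).
    """
    script = Path(moses_root) / "scripts" / "tokenizer" / "tokenizer.perl"
    # -l selects the language code, e.g. "ta" for Tamil or "en" for English.
    return ["perl", str(script), "-l", lang]


def tokenize_file(moses_root, src_path, out_path, lang="ta"):
    """Tokenize src_path (one sentence per line) and write the result to out_path.

    The tokenizer reads from standard input and writes to standard output,
    so both files are simply wired to the child process.
    """
    cmd = build_tokenizer_command(moses_root, lang)
    with open(src_path, "rb") as src, open(out_path, "wb") as out:
        # check=True raises CalledProcessError if perl exits with an error.
        subprocess.run(cmd, stdin=src, stdout=out, check=True)
```

A call such as tokenize_file("/path/to/mosesdecoder", "corpus.ta", "corpus.tok.ta") would tokenize the Tamil side, and the same call with lang="en" would handle the English side. Word segmentation and pre-ordering, as Anoop suggests, would be separate steps after this one.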