Thanks Tom, I didn't know there was one in Moses for Tamil!

Regards
Anoop.

On Thu, Jul 14, 2016 at 4:17 PM, Tom Hoar <tah...@pttools.net> wrote:

> Re "Moses does not have tokenizers for Tamil": actually, there is a Tamil
> nonbreaking prefix file at
> scripts/share/nonbreaking_prefixes/nonbreaking_prefix.ta. You might want to
> start simple with the scripts/tokenizer/tokenizer.perl script. Then, once
> you see how it works, move on to Anoop's suggestions.
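
A minimal sketch of that first step, assuming a Moses checkout at some local path and perl on the PATH. The helper names and the example paths are made up for illustration; the only things taken from Moses are the script location and its -l language option.

```python
import subprocess
from pathlib import Path


def build_tokenizer_command(moses_root, lang="ta"):
    """Return the argv list that runs Moses' tokenizer.perl for one language."""
    # moses_root is a hypothetical local checkout of the Moses source tree.
    script = Path(moses_root) / "scripts" / "tokenizer" / "tokenizer.perl"
    # -l passes the language code, which the script uses to pick the
    # matching nonbreaking prefix file (nonbreaking_prefix.ta for Tamil).
    return ["perl", str(script), "-l", lang]


def tokenize_file(moses_root, src_path, out_path, lang="ta"):
    """Pipe src_path through tokenizer.perl and write the result to out_path."""
    cmd = build_tokenizer_command(moses_root, lang)
    with open(src_path, "rb") as src, open(out_path, "wb") as out:
        # check=True raises CalledProcessError if the script exits non-zero.
        subprocess.run(cmd, stdin=src, stdout=out, check=True)
```

Something like tokenize_file("/path/to/mosesdecoder", "corpus.ta", "corpus.tok.ta") would then tokenize the Tamil side of the corpus; the file names here are placeholders.
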
>
> Tom
>
>
>
> On 7/14/2016 5:14 PM, moses-support-requ...@mit.edu wrote:
>
> Date: Thu, 14 Jul 2016 13:02:57 +0530
> From: Anoop <anoop.kunchukut...@gmail.com>
> Subject: Re: [Moses-support] help regarding languages used for translation
> To: Selva Nalladurai <selva...@gmail.com>
> Cc: moses-support <moses-support@mit.edu>
>
> Hi Selva,
>
> Moses is language-independent, so you can use it for any language pair as
> long as you have a parallel corpus. That said, you may have to do
> language-specific pre- and post-processing. For instance,
>
> - Moses does not have tokenizers for Tamil.
> - Tamil is agglutinative, so you may want to segment the words to reduce
> data sparsity as an additional pre-processing step.
>
> You can use the Indic NLP library for some simple tokenization as well as
> to segment the text ( http://anoopkunchukuttan.github.io/indic_nlp_library ).
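
To make "simple tokenization" concrete, here is a minimal sketch using only the Python standard library. It splits ASCII punctuation away from whitespace-separated words. It is a stand-in for illustration, not the Indic NLP library's implementation; the function name simple_tokenize and the sample sentence are assumptions.

```python
import re
import string

# Character class matching any single ASCII punctuation mark.
_PUNCT = re.compile("([" + re.escape(string.punctuation) + "])")


def simple_tokenize(text):
    """Split text on whitespace, treating each ASCII punctuation mark as a token."""
    # Pad every punctuation mark with spaces so split() isolates it.
    # Tamil letters and vowel signs are not in string.punctuation,
    # so words written in Tamil script are left intact.
    return _PUNCT.sub(r" \1 ", text).split()


if __name__ == "__main__":
    print(simple_tokenize("நான் வீட்டுக்கு போகிறேன்."))
```

This does nothing about agglutination; splitting words into smaller units is a separate step that needs a real segmenter.
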
>
> Since English and Tamil have different word order, you should try
> syntax-based models (which are implemented in the Moses package). Another
> option is to pre-order the English sentence into Tamil word order before
> training a phrase-based system. You can use this for pre-ordering:
> http://www.cfilt.iitb.ac.in/~moses/download/cfilt_preorder/register.html
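
As a toy illustration of what pre-ordering means (not how the tool linked above works), the sketch below moves the verb of an English subject-verb-object sentence to the end, giving the subject-object-verb order that Tamil typically uses. The role labels are supplied by hand here, and the function name is hypothetical; a real pre-ordering system would derive the reordering from a parse of the sentence.

```python
def preorder_svo_to_sov(chunks):
    """Reorder hand-labelled chunks from S V O to S O V.

    chunks is a list of (role, text) pairs where role is "S", "V" or "O".
    """
    subjects = [text for role, text in chunks if role == "S"]
    objects = [text for role, text in chunks if role == "O"]
    verbs = [text for role, text in chunks if role == "V"]
    # Emit subject first, then object, then verb.
    return " ".join(subjects + objects + verbs)


if __name__ == "__main__":
    sentence = [("S", "the boy"), ("V", "reads"), ("O", "a book")]
    print(preorder_svo_to_sov(sentence))
```

The reordered English is then what gets paired with the Tamil side when training the phrase-based system, so the two sides line up more monotonically.
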
>
> Regards,
> Anoop.
>
>
> On Thu, Jul 14, 2016 at 11:43 AM, Selva Nalladurai <selva...@gmail.com>
> wrote:
>
>
> > Hello Team,
> >
> > I am Selva Nalladurai, doing my ME (Master of Engineering) in CSE, and I
> > am from India. I am doing my research in SMT and would like to know
> > whether Moses can be used for translating English to Tamil (a regional
> > language spoken in our country). Can I follow the same steps as for other
> > languages when uploading my corpus?
> >
> > Thank you.
> >
> > Regards,
> > Selva Nalladurai
> >
> > _______________________________________________
> > Moses-support mailing list
> > Moses-support@mit.edu
> > http://mailman.mit.edu/mailman/listinfo/moses-support
>


-- 
I claim to be a simple individual liable to err like any other fellow
mortal. I own, however, that I have humility enough to confess my errors
and to retrace my steps.

http://flightsofthought.blogspot.com
