If you are really interested, drop me a mail. Are you familiar with Perl programming?

On Sun, Jul 26, 2009 at 10:29 PM, Varewoolf <[email protected]> wrote:
>
> so what might be the next step??
>
> On Sat, Jul 25, 2009 at 10:31 AM, JAGANADH G<[email protected]> wrote:
> >
> > On Sat, Jul 25, 2009 at 12:41 AM, Rajeev J Sebastian
> > <[email protected]> wrote:
> >>
> >> On Fri, Jul 24, 2009 at 7:02 PM, JAGANADH G<[email protected]> wrote:
> >> >
> >> > On Fri, Jul 24, 2009 at 5:29 PM, Rajeev J Sebastian
> >> > <[email protected]> wrote:
> >> >>
> >> >> On Fri, Jul 24, 2009 at 5:19 PM, Varewoolf<[email protected]> wrote:
> >> >> >
> >> >> > i am so much interested to make this happen... i am always interested
> >> >> > in linguistics...
> >> >> > anybody tell me what are the things we need primarily??
> >> >>
> >> >> How about ...
> >> >>
> >> >> 1) 50+ years of research (actually, 2000 if you consider Panini)
> >> >
> >> > It is history? If you can work hard you can reduce the zero from it.
> >>
> >> Huh?
> >>
> >> >>
> >> >> 2) Extremely large corpus ... if you want to make a practical system
> >> >
> >> > Only if you adopt a corpus-based model. That is not going to be practical
> >> > right now in the case of English to Malayalam translation.
> >>
> >> It is not practical to make *anything* without a corpus. Even if you
> >> use a non-corpus based methodology to perform translation, you still
> >> need a large corpus to *validate* that your method works for more than
> >> toy examples. This is the biggest problem that faces any NLP work for
> >> Indic languages, and one that some glorified institutions in India
> >> neither build up nor share, most probably because all their systems
> >> are capable of is translating toy examples.
> >
> > I know that there are non-free systems under development which are more
> > advanced than the Google Translate service (English-Hindi). But when they
> > will release it, I don't know.
> >
> >>
> >> >>
> >> >> 3) Large and talented team good in computational linguistics
> >> >
> >> > Where is it? We can build this up.
> >>
> >> Best of Luck.
> >>
> >> >>
> >> >> 4) a very practical theory that can model language effectively for
> >> >> your purposes (seriously lacking for even small use cases in even
> >> >> major languages)
> >> >
> >> > A perfect grammar for Malayalam is required, especially in syntax and
> >> > morphology. Malayalam really lacks such studies.
> >>
> >> I don't think any language has such an in-depth model that could be
> >> used for generic MT. There are, of course, special case models ...
> >> which can be used for special cases.
> >
> > The Sanskrit grammar is a perfect model.
> >
> >>
> >> >>
> >> >> 5) since you want to do MT, you need one more theory to handle the
> >> >> target language ... maybe even an IL model if you go that route
> >> >> instead of direct translation.
> >> >
> >> > First of all we need a good English to Malayalam dictionary in e-format,
> >> > one which gives exact meaning, POS, etc. Not like one saying
> >> > Science - ശാസ്ത്രം, തര്ക്കശാസ്ത്രം.
> >>
> >> A POS tagged dataset is just one component of a complete corpus.
> >
> > A POS tagged corpus is a variety of corpus.
> >
> >>
> >> Regards
> >> Rajeev J Sebastian
> >>
> >
> > --
> > **********************************
> > JAGANADH G
> > http://jaganadhg.freeflux.net/blog
> >

--
**********************************
JAGANADH G
http://jaganadhg.freeflux.net/blog

--~--~---------~--~----~------------~-------~--~----~
"Freedom is the only law".
"Freedom Unplugged"
http://www.ilug-tvm.org

You received this message because you are subscribed to the Google
Groups "ilug-tvm" group.
To post to this group, send email to [email protected]
To unsubscribe from this group, send email to [email protected]

For details visit the website: www.ilug-tvm.org or the google group page:
http://groups.google.com/group/ilug-tvm?hl=en
-~----------~----~----~----~------~----~------~--~---
