So what might be the next step?

On Sat, Jul 25, 2009 at 10:31 AM, JAGANADH G<[email protected]> wrote:
>
> On Sat, Jul 25, 2009 at 12:41 AM, Rajeev J Sebastian
> <[email protected]> wrote:
>>
>> On Fri, Jul 24, 2009 at 7:02 PM, JAGANADH G<[email protected]> wrote:
>> >
>> > On Fri, Jul 24, 2009 at 5:29 PM, Rajeev J Sebastian
>> > <[email protected]> wrote:
>> >>
>> >> On Fri, Jul 24, 2009 at 5:19 PM, Varewoolf<[email protected]> wrote:
>> >> >
>> >> > I am very interested in making this happen... I have always been
>> >> > interested in linguistics...
>> >> > Can anybody tell me what things we primarily need?
>> >>
>> >> How about ...
>> >>
>> >> 1) 50+ years of research (actually, 2000 if you consider Panini)
>> >
>> > Is it history? If you work hard, you can drop the zero from it.
>>
>> Huh ?
>>
>> >>
>> >> 2) Extremely large corpus ... if you want to make a practical system
>> >
>> > Only if you adopt a corpus-based model. That is not going to be
>> > practical right now in the case of English to Malayalam translation.
>>
>> It is not practical to make *anything* without a corpus. Even if you
>> use a non-corpus based methodology to perform translation, you still
>> need a large corpus to *validate* that your method works for more than
>> toy examples. This is the biggest problem that faces any NLP work for
>> Indic languages, and one that some glorified institutions in India
>> neither build up nor share, most probably because all their systems
>> are capable of is translating toy examples.
>
> I know that there are non-free systems under development which are more
> advanced than the Google Translate service (English-Hindi). But when they
> will release them, I don't know.
>
>> >>
>> >> 3) Large and talented team good in computational linguistics
>> >
>> > Where is it? We can build this up.
>>
>> Best of Luck.
>>
>> >>
>> >> 4) a very practical theory that can model language effectively for
>> >> your purposes (seriously lacking for even small use cases in even
>> >> major languages)
>> >
>> > A perfect grammar for Malayalam is required, especially for syntax
>> > and morphology. Malayalam really lacks such studies.
>>
>> I don't think any language has such an in-depth model that could be
>> used for generic MT. There are of course, special case models ...
>> which can be used for special cases.
>
> The Sanskrit grammar is a perfect model.
>
>> >>
>> >> 5) since you want to do MT, you need one more theory to handle the
>> >> target language ... maybe even an IL model if you go that route
>> >> instead of direct translation.
>> >
>> > First of all we need a good English to Malayalam dictionary in
>> > electronic format, one which gives the exact meaning, POS, etc. Not
>> > like one that just says Science - ശാസ്ത്രം, തര്ക്കശാസ്ത്രം.
>>
>> A POS-tagged dataset is just one component of a complete corpus.
>
> A POS-tagged corpus is a variety of corpus.
>
>>
>> Regards
>> Rajeev J Sebastian
>
> --
> **********************************
> JAGANADH G
> http://jaganadhg.freeflux.net/blog
