On Sat, Jul 25, 2009 at 12:41 AM, Rajeev J Sebastian <
[email protected]> wrote:

>
> On Fri, Jul 24, 2009 at 7:02 PM, JAGANADH G<[email protected]> wrote:
> >
> >
> > On Fri, Jul 24, 2009 at 5:29 PM, Rajeev J Sebastian
> > <[email protected]> wrote:
> >>
> >> On Fri, Jul 24, 2009 at 5:19 PM, Varewoolf<[email protected]> wrote:
> >> >
> >> > I am very much interested in making this happen... I have always been
> >> > interested in linguistics...
> >> > Can anybody tell me what things we need primarily?
> >>
> >> How about ...
> >>
> >> 1) 50+ years of research (actually, 2000 if you consider Panini)
> >
> > Is it history? If you work hard, you can remove the zero from it.
>
> Huh ?
>
> >>
> >> 2) Extremely large corpus ... if you want to make a practical system
> >
> > Only if you adopt a corpus-based model. That is not going to be practical
> > right now in the case of English to Malayalam translation.
>
> It is not practical to make *anything* without a corpus. Even if you
> use a non-corpus based methodology to perform translation, you still
> need a large corpus to *validate* that your method works for more than
> toy examples. This is the biggest problem that faces any NLP work for
> Indic languages, and one that some glorified institutions in India
> neither build up nor share, most probably because all their systems
> are capable of is translating toy examples.

I know that there are non-free systems under development which are more
advanced than the Google Translate service (English-Hindi). But I don't know
when they will be released.


>
>
> >>
> >> 3) Large and talented team good in computational linguistics
> >
> > Where is it? We can build this up.
>
> Best of Luck.
>
> >>
> >> 4) a very practical theory that can model language effectively for
> >> your purposes (seriously lacking for even small use cases in even
> >> major languages)
> >
> > A perfect grammar for Malayalam is required, especially in syntax and
> > morphology. Malayalam really lacks such studies.
>
> I don't think any language has such an in-depth model that could be
> used for generic MT. There are of course, special case models ...
> which can be used for special cases.
>
The Sanskrit grammar is a perfect model.


>
> >>
> >> 5) since you want to do MT, you need one more theory to handle the
> >> target language ... maybe even an IL model if you go that route
> >> instead of direct translation.
> >
> > First of all we need a good English to Malayalam dictionary in electronic
> > format, one that gives the exact meaning, POS, etc., not one that just says
> > Science - ശാസ്ത്രം, തര്‍ക്കശാസ്ത്രം and the like.
>
> A POS-tagged dataset is just one component of a complete corpus.
>
A POS-tagged corpus is one kind of corpus.


>
> Regards
> Rajeev J Sebastian
>
> >
>


-- 
**********************************
JAGANADH G
http://jaganadhg.freeflux.net/blog

--~--~---------~--~----~------------~-------~--~----~
"Freedom is the only law". 
"Freedom Unplugged"
http://www.ilug-tvm.org

You received this message because you are subscribed to the Google
Groups "ilug-tvm" group.
To post to this group, send email to [email protected]
To unsubscribe from this group, send email to
[email protected]

For details visit the website: www.ilug-tvm.org or the google group page: 
http://groups.google.com/group/ilug-tvm?hl=en
-~----------~----~----~----~------~----~------~--~---
