> Pardon my ignorance. What do you mean by language model?


A language model is a statistical model that is estimated from a data set
and assigns probabilities to sequences of words. Here I think the OP is
talking about creating a language model for speech processing. An n-gram
model is one kind of language model:
http://en.wikipedia.org/wiki/N-gram
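
To make the idea concrete, here is a minimal sketch of a bigram (2-gram) model built by counting adjacent word pairs. The function name train_bigram_model, the toy sentences, and the <s> and </s> boundary markers are my own illustrative assumptions, not something the OP described. It splits on whitespace only and does no smoothing, so a real Tamil corpus would need proper tokenization and handling of unseen pairs.

```python
from collections import Counter, defaultdict


def train_bigram_model(sentences):
    """Estimate P(next_word | word) from counts of adjacent word pairs."""
    pair_counts = defaultdict(Counter)
    for sentence in sentences:
        # Hypothetical boundary markers so sentence starts and ends are modelled too.
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        for prev, curr in zip(tokens, tokens[1:]):
            pair_counts[prev][curr] += 1

    # Turn raw counts into relative frequencies (maximum-likelihood estimates).
    model = {}
    for prev, followers in pair_counts.items():
        total = sum(followers.values())
        model[prev] = {word: count / total for word, count in followers.items()}
    return model


# Toy data for illustration only; a real model needs a large corpus.
toy_corpus = ["the cat sat", "the cat ran", "the dog sat"]
model = train_bigram_model(toy_corpus)
print(model["the"])
```

Each entry model[w] holds the estimated probability of every word seen right after w in the training text.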

> And by Tamil-corpus do you mean a large collection of Tamil text?

In the context of Natural Language Processing, a corpus is a large
collection of text.

More generally, there are different types of corpora, such as text corpora,
speech corpora, image corpora, etc.

Here the OP needs a text corpus. I think he can use the Tamil Wikipedia dump
as a corpus for his research. He could also build a corpus from
newspaper RSS feeds and Tamil blog feeds.
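
As a rough sketch of the feed idea, the snippet below pulls the title and description text out of RSS items using only the Python standard library. The SAMPLE_RSS string and the rss_to_corpus function are made up for illustration. A real script would first download the feed, and real feeds vary in structure (Atom, namespaces, embedded HTML), so treat this as a starting point under those assumptions.

```python
import re
import xml.etree.ElementTree as ET

# Made-up feed content for illustration; a real feed would be fetched over HTTP.
SAMPLE_RSS = """<rss version="2.0"><channel>
<title>Example feed</title>
<item><title>தமிழ்</title><description>&lt;p&gt;தமிழ் செய்தி&lt;/p&gt;</description></item>
<item><title>செய்தி</title><description>தமிழ் தமிழ்</description></item>
</channel></rss>"""


def rss_to_corpus(rss_xml):
    """Return one cleaned text line per non-empty item title or description."""
    root = ET.fromstring(rss_xml)
    lines = []
    for item in root.iter("item"):
        for tag in ("title", "description"):
            text = item.findtext(tag, default="")
            # Descriptions often carry HTML markup; drop tags crudely.
            text = re.sub(r"<[^>]+>", " ", text)
            # Collapse runs of whitespace left behind by the tag removal.
            text = " ".join(text.split())
            if text:
                lines.append(text)
    return lines


corpus_lines = rss_to_corpus(SAMPLE_RSS)
for line in corpus_lines:
    print(line)
```

The resulting lines can be appended to a plain text file and used as training sentences for a language model.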

-- 
**********************************
JAGANADH G
http://jaganadhg.in
*ILUGCBE*
http://ilugcbe.org.in
_______________________________________________
ILUGC Mailing List:
http://www.ae.iitm.ac.in/mailman/listinfo/ilugc
