Hello!

I am a PhD student, and I am currently studying the Aspell API to see
whether I can use it to build useful (and free) natural language
processing tools. I would like to implement one feature with Aspell:
n-gram based language detection, that is, guessing the language of a
document from its text. That could be a useful feature to add to the
Aspell API too. To do that, I need to build n-gram profiles for each
language.

To build a profile, I have to read the list of all the words in a
given language dictionary and build a table like this:
ngram frequency

where ngram is a sequence of letters (with len(ngram) < 5) and
frequency is the number of times it appears in the words of the
dictionary.
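
To make the table concrete, here is a minimal sketch in Python of the
counting step. It assumes the word list is already available as plain
strings (which is exactly the part I am missing from Aspell), and it
assumes that every occurrence of an n-gram is counted, including
repeats inside a single word. The name build_ngram_profile and the
sample words are only illustrative.

```python
from collections import Counter

# Longest n-gram kept in the profile, from the constraint len(ngram) < 5.
MAX_NGRAM_LEN = 4


def build_ngram_profile(words, max_len=MAX_NGRAM_LEN):
    """Count every letter n-gram of length 1..max_len over a word list.

    Assumption: each occurrence counts, so an n-gram that appears twice
    in one word adds 2 to its frequency.
    """
    profile = Counter()
    for word in words:
        for n in range(1, max_len + 1):
            # Slide a window of width n across the word.
            for start in range(len(word) - n + 1):
                profile[word[start:start + n]] += 1
    return profile


# Hypothetical stand-in for the dictionary word list.
sample_words = ["the", "then"]
profile = build_ngram_profile(sample_words)

# Print the table as "ngram frequency", most frequent first.
for ngram, frequency in profile.most_common():
    print(ngram, frequency)
```

With a real dictionary, sample_words would be replaced by the full
word list of the language, one profile per language.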

To do that using Aspell, I need a function in the Aspell API that can
list all the words of a given dictionary. After reading the whole
aspell.h file, I have come to the conclusion that no such function
exists yet. Am I right? Is there an easy way to do that?

Thanks in advance for your answer, and kudos to all the developers of
this really useful program.

Regards.
-- 
Jean-Rémy Falleri <[EMAIL PROTECTED]>
PhD Student (http://www.lirmm.fr/~falleri)
LIRMM - CNRS - Université Montpellier 2


_______________________________________________
Aspell-devel mailing list
Aspell-devel@gnu.org
http://lists.gnu.org/mailman/listinfo/aspell-devel
