Hello!

I am a PhD student, and I am currently studying the Aspell API to see whether I can use it to build useful (and free) natural language processing tools. One feature I would like to implement with Aspell is n-gram based language detection: given a text, guess the language it is written in. That could be a useful feature to add to the Aspell API too. For this, I need to build an n-gram profile for each language.
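
To make the idea concrete, here is a minimal Python sketch of the kind of n-gram profile meant here. It assumes the word list is already available as an iterable of strings. The function name build_ngram_profile, the max_len parameter, and the sample words are hypothetical and are not part of Aspell. It counts every letter sequence of up to four characters across the words.

```python
from collections import Counter


def build_ngram_profile(words, max_len=4):
    """Count every letter n-gram with 1 <= len(ngram) <= max_len in the words."""
    profile = Counter()
    for word in words:
        word = word.strip().lower()
        for n in range(1, max_len + 1):
            # Slide a window of width n over the word.
            for i in range(len(word) - n + 1):
                profile[word[i:i + n]] += 1
    return profile


# Hypothetical stand-in for a real dictionary word list.
sample_words = ["spell", "spelling", "speller"]
profile = build_ngram_profile(sample_words)

# Show the five most frequent n-grams with their counts.
for ngram, frequency in profile.most_common(5):
    print(ngram, frequency)
```

With a real dictionary, sample_words would be replaced by the full word list of one language, which is the part I do not know how to obtain from the API.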
To build such a profile, I have to read the list of all the words in a given language's dictionary and build a table of the form:

    ngram    frequency

where ngram is a sequence of letters (with len(ngram) < 5) and frequency is the number of times it appears in the words of the dictionary. To do this with Aspell, I need a function in the Aspell API that lists all the words of a given dictionary. After reading the whole aspell.h file, I have come to the conclusion that no such function exists yet. Am I right? Is there an easy way to do this?

Thanks in advance for your answer, and kudos to all the developers of this really useful program.

Regards.

-- 
Jean-Rémy Falleri <[EMAIL PROTECTED]>
PhD Student (http://www.lirmm.fr/~falleri)
LIRMM - CNRS - Université Montpellier 2

_______________________________________________
Aspell-devel mailing list
Aspell-devel@gnu.org
http://lists.gnu.org/mailman/listinfo/aspell-devel