I was wondering if anyone has looked at using libjudy (http://judy.sourceforge.net/) for storing words with aspell?
Libjudy provides a number of sparse array data structures that give very fast lookups, because they are cache aware, with reasonable memory efficiency. The package includes JudySL, a string-indexed array that is quite space efficient because it is prefix compressed.

I don't have a standalone metaphone encoder handy, but just loading /usr/dict/words into JudySL on my Pentium M laptop gives 0.304 µs/word lookups using only 10 MB of core, which is only 2x the size of the file.

It would be easy to write code on top of this data structure that quickly finds the longest match for a word and all other entries with the same match length, which is perhaps something that would be useful in aspell as well...
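To make the longest-match idea concrete, here is a rough sketch in plain C. It does not use libjudy: a strcmp-sorted array with a binary search stands in for the ordered string index, and the names common_prefix, lower_bound and longest_match are my own, made up for illustration. The point is only that in any sorted string index the entries sharing the longest prefix with a query sit next to the query's insertion point, so with JudySL the same walk could presumably be done with its ordered neighbour calls (JudySLFirst, JudySLNext, JudySLPrev) instead of array indexing.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Length of the common prefix of a and b. */
static size_t common_prefix(const char *a, const char *b)
{
    size_t n = 0;
    while (a[n] != '\0' && a[n] == b[n])
        n++;
    return n;
}

/* Index of the first word >= query in a strcmp-sorted array (n if none). */
static size_t lower_bound(const char *const *words, size_t n, const char *query)
{
    size_t lo = 0, hi = n;
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (strcmp(words[mid], query) < 0)
            lo = mid + 1;
        else
            hi = mid;
    }
    return lo;
}

/*
 * Find the entries that share the longest common prefix with query.
 * words must be sorted by strcmp and non-empty.  On return,
 * [*first, *last] is the inclusive index range of those entries and
 * the return value is the shared prefix length (0 means nothing
 * matched, in which case the range covers the whole array).
 */
static size_t longest_match(const char *const *words, size_t n,
                            const char *query, size_t *first, size_t *last)
{
    assert(n > 0);

    size_t pos = lower_bound(words, n, query);
    size_t best = 0, start = 0;

    /* The best match is always one of the two neighbours of pos. */
    if (pos < n) {
        best = common_prefix(words[pos], query);
        start = pos;
    }
    if (pos > 0) {
        size_t left = common_prefix(words[pos - 1], query);
        if (pos == n || left > best) {
            best = left;
            start = pos - 1;
        }
    }

    /* Entries with the same match length are contiguous: widen the range. */
    size_t lo = start, hi = start;
    while (lo > 0 && common_prefix(words[lo - 1], query) == best)
        lo--;
    while (hi + 1 < n && common_prefix(words[hi + 1], query) == best)
        hi++;

    *first = lo;
    *last = hi;
    return best;
}
```

For example, with the sorted list spell, spelled, speller, spelling, spells, spend, a query of "spelt" matches the first five entries on the four-character prefix "spel". The widening scan here recomputes the prefix length for each neighbour, which is fine for a sketch but is not tuned for speed.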