On 10/23/01, David Burry wrote:
> In fact I didn't even compress the inverted index at all, used
> Berkley DB (B+tree) instead which is quite wasteful of space (makes
> it take about 25 megs per version) but it sure is
> ***blazing***fast*** as a result!!!
I've got what may be an "inverted index" for the same purpose, using the Berkeley B+tree database as well. Maybe I just don't understand what an inverted index is, but mine ends up being about 4.3 MB per version for the database. Keys are the words themselves (no punctuation except what may exist within a word), in lower case. Values are packed integer arrays, each integer being the verse index where the word is found. If you're recording more than that, I'd like to know what your database looks like.

Keep in mind I've only taken a handful of programming courses, so I might be a little hazy on some terms.

Jesse

-- 
If we confess our sins, He is faithful and just to forgive us our sins and to cleanse us from all unrighteousness. If we say that we have not sinned, we make Him a liar, and His word is not in us.
http://www.grace-els.org
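
The layout described in the message (lower-cased words as keys, packed integer arrays of verse indexes as values) could look roughly like the sketch below. It is only an illustration under several assumptions: a plain Python dict stands in for the Berkeley DB B+tree, the integers are packed as unsigned 32-bit little-endian values, each verse is recorded at most once per word, and the names `build_index` and `lookup` are hypothetical. None of these details are confirmed by the message.

```python
import re
import struct

def build_index(verses):
    """Map each lower-cased word to a packed array of verse indexes."""
    postings = {}
    for verse_index, text in enumerate(verses):
        # Keep apostrophes and hyphens inside a word; drop other punctuation.
        # A set records each verse once per word (an assumption).
        words = set(re.findall(r"[a-z]+(?:['-][a-z]+)*", text.lower()))
        for word in words:
            postings.setdefault(word, []).append(verse_index)
    # Pack every posting list as unsigned 32-bit little-endian integers.
    # Keys and values are bytes, as they would be in a key/value store.
    return {
        word.encode(): struct.pack(f"<{len(ids)}I", *ids)
        for word, ids in postings.items()
    }

def lookup(index, word):
    """Return the list of verse indexes where the word occurs."""
    packed = index.get(word.lower().encode(), b"")
    return list(struct.unpack(f"<{len(packed) // 4}I", packed))
```

With this layout each occurrence costs four bytes plus the key, so the database stays small as long as nothing beyond the verse index (word position, for instance) is stored per entry.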