On Fri, Jun 27, 2008 at 12:20:28PM -0700, Brooks, Phil scratched on the wall:
> I created my hashes in a perl script:
>
> $hash=md5($key);
> $hash_num = unpack( "%32N*", $hash ) % 4294967295;
>
> so they end up being big 32 bit integer numbers.
>
> This ends up saving a lot of space, but the indexes end
> up taking vastly longer to create than the simple creation of string
> indices.  Perhaps the randomness of the key values?  Or perhaps
> duplication?

  The hash values are going to be very "random."  If the string values
  were somewhat sorted, then the hash indexes will take a lot longer
  to build, since each value has to be put in sorted order as it is
  inserted into the index, and random values land all over the index
  instead of near the previous insert.  Duplicates shouldn't be much
  different (in terms of cost) than the original string duplicates.

  Were you able to increase the cache size?  That will make a big
  difference in the sort process of the index creation.  See:

  PRAGMA cache_size=<size>
  http://www.sqlite.org/pragma.html

  If you have a typical desktop PC, try setting it to 250000 or so.

  Be aware that this pragma isn't "sticky": it only applies to the
  connection that issues it, so you'll need to issue it in the
  specific session used to create the indexes.

   -j

-- 
Jay A. Kreibich < J A Y  @  K R E I B I.C H >

"'People who live in bamboo houses should not throw pandas.' Jesus said that."
   - "The Ninja", www.AskANinja.com, "Special Delivery 10: Pop!Tech 2006"
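
For reference, here is a minimal sketch of the cache_size advice above, written against Python's standard sqlite3 module (the language is my choice, not something from the thread). The table `items`, the column `hash_num`, the index name `idx_items_hash`, and the helper names are all hypothetical, and the demo hash values are just a stand-in for random 32-bit integers, not a port of the Perl snippet. It raises the cache in the same connection that builds the index, then opens a second connection to show that the setting did not carry over.

```python
import hashlib
import os
import sqlite3
import tempfile


def build_index_with_big_cache(db_path, cache_pages=250000):
    """Raise the page cache, then build the index in the same session."""
    conn = sqlite3.connect(db_path)
    try:
        # PRAGMA values cannot be bound as parameters, so format an int in.
        conn.execute(f"PRAGMA cache_size={int(cache_pages)}")
        in_session = conn.execute("PRAGMA cache_size").fetchone()[0]
        # Hypothetical index on a hypothetical table/column.
        conn.execute(
            "CREATE INDEX IF NOT EXISTS idx_items_hash ON items(hash_num)"
        )
        conn.commit()
    finally:
        conn.close()
    return in_session


def current_cache_size(db_path):
    """Report cache_size as seen by a brand-new connection."""
    conn = sqlite3.connect(db_path)
    try:
        return conn.execute("PRAGMA cache_size").fetchone()[0]
    finally:
        conn.close()


def demo():
    """Build a small throwaway database and compare the two sessions."""
    with tempfile.TemporaryDirectory() as tmp:
        db_path = os.path.join(tmp, "demo.db")
        conn = sqlite3.connect(db_path)
        conn.execute("CREATE TABLE items (key TEXT, hash_num INTEGER)")
        rows = []
        for i in range(1000):
            key = f"key-{i}"
            digest = hashlib.md5(key.encode()).digest()
            # Stand-in 32-bit value taken from the first four digest bytes.
            rows.append((key, int.from_bytes(digest[:4], "big")))
        conn.executemany("INSERT INTO items VALUES (?, ?)", rows)
        conn.commit()
        conn.close()

        in_session = build_index_with_big_cache(db_path)
        fresh = current_cache_size(db_path)
        return in_session, fresh


if __name__ == "__main__":
    in_session, fresh = demo()
    print("cache_size in the index-building session:", in_session)
    print("cache_size in a fresh session:", fresh)
```

The second number is whatever default the local SQLite build uses, which is why the pragma has to be issued again in any session that does heavy index work.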