I forgot to mention the hash function... My personal choice would be the 64-bit MurmurHash2 function: http://murmurhash.googlepages.com/ http://en.wikipedia.org/wiki/MurmurHash2 - lots of implementations are linked there
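For anyone who wants to see what such a function looks like, here is a minimal Python sketch that follows the structure of the 64-bit MurmurHash2 variant (usually called MurmurHash64A) as I understand it. The function name `murmurhash64a`, the default seed of 0, and the little-endian block reads are my assumptions, and the code has not been checked against reference test vectors, so treat it as an illustration rather than a drop-in replacement for one of the published implementations.

```python
MASK64 = 0xFFFFFFFFFFFFFFFF   # keep every intermediate value within 64 bits
M = 0xC6A4A7935BD1E995        # multiplication constant of the 64-bit variant
R = 47                        # shift used in the mixing steps


def murmurhash64a(data: bytes, seed: int = 0) -> int:
    """Return an unsigned 64-bit hash of `data` (sketch, assumed little-endian)."""
    length = len(data)
    h = ((seed & MASK64) ^ (length * M)) & MASK64

    # Mix the input eight bytes at a time.
    full = length - (length % 8)
    for i in range(0, full, 8):
        k = int.from_bytes(data[i:i + 8], "little")
        k = (k * M) & MASK64
        k ^= k >> R
        k = (k * M) & MASK64
        h ^= k
        h = (h * M) & MASK64

    # Fold in the remaining 1..7 bytes, if any.
    tail = data[full:]
    if tail:
        h ^= int.from_bytes(tail, "little")
        h = (h * M) & MASK64

    # Final avalanche.
    h ^= h >> R
    h = (h * M) & MASK64
    h ^= h >> R
    return h


if __name__ == "__main__":
    print(hex(murmurhash64a("some dictionary key".encode("utf-8"))))
```

Strings need to be encoded (for example as UTF-8) before hashing. Note also that SQLite stores integers as signed 64-bit values, so a result at or above 2^63 would have to be mapped into the signed range (for example by subtracting 2^64) before it is used as an integer key.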
It's fast (even in managed implementations), has good characteristics, and is free. Don't use CRC64...

P.S. Even with a 64-bit hash, collisions are still possible. Any one new string has a chance on the order of 1 in 10,000,000,000 (about 10^9 / 2^64) of getting the same hash as one of the strings already in a 1 billion entry dictionary, and by the birthday bound the chance that some pair collides somewhere in the whole dictionary is a few percent. So you should probably keep a very small table cached in memory that holds the collision resolutions: each entry maps a string key that collided to a replacement string key that hashes without a collision. That's simple to do and removes the risk of a collision, while the extra check stays very fast because the collision table is so small (I expect it to stay empty or nearly empty).

-- 
View this message in context: http://www.nabble.com/very-large-SQLite-tables-tp24201098p24219678.html
Sent from the SQLite mailing list archive at Nabble.com.
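The small collision table from the P.S. could be sketched roughly as follows. Everything here is hypothetical: the class name `HashedDictionary`, the `rows` dict standing in for the big SQLite table, the choice to keep the original string in each row so that a collision can be detected at insert time, and the way a replacement key is built by appending a counter. The default `hash64` uses truncated BLAKE2b from the Python standard library only as a stand-in for a 64-bit MurmurHash2.

```python
import hashlib


def hash64(s: str) -> int:
    # Stand-in 64-bit hash (assumption); the post recommends MurmurHash2 64-bit.
    digest = hashlib.blake2b(s.encode("utf-8"), digest_size=8).digest()
    return int.from_bytes(digest, "little")


class HashedDictionary:
    """Hash-keyed store with a small in-memory table of remapped keys."""

    def __init__(self, hash_fn=hash64):
        self.hash_fn = hash_fn
        self.rows = {}      # hash -> (original string, value); stands in for the big table
        self.remapped = {}  # original string -> replacement key string (collision table)

    def put(self, s, value):
        candidate = self.remapped.get(s, s)
        attempt = 0
        while True:
            h = self.hash_fn(candidate)
            row = self.rows.get(h)
            if row is None or row[0] == s:
                break  # slot is free, or already belongs to this string
            # Collision with a different string: try a replacement key.
            attempt += 1
            candidate = f"{s}\x00{attempt}"
        if candidate != s:
            self.remapped[s] = candidate
        self.rows[h] = (s, value)

    def get(self, s, default=None):
        # Check the small collision table first, then hash as usual.
        h = self.hash_fn(self.remapped.get(s, s))
        row = self.rows.get(h)
        if row is not None and row[0] == s:
            return row[1]
        return default


if __name__ == "__main__":
    # A deliberately weak hash makes collisions easy to demonstrate.
    d = HashedDictionary(hash_fn=lambda s: len(s) % 4)
    d.put("ab", 1)
    d.put("cd", 2)  # collides with "ab" and gets a replacement key
    print(d.get("ab"), d.get("cd"), len(d.remapped))
```

With a real 64-bit hash the `remapped` table should stay tiny, so the extra dictionary lookup on each access costs very little.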