alright george, i'm open to new ideas. here's what i've got going: a 64-bit linux mint OS on a 2-core laptop with 2 gb of ram. my key is 128 bits, with ~256 bits per record, so my 1 gb file contains ~63 million records and ~32 million keys. about 8% of those keys will be dupes, leaving me with ~30 million keys.
i run a custom-built hash: separate chaining with a vector of bignums. i'm willing to let my chains run up to 5 keys per chain, so i need a vector of 6 million pointers. that's 48 mb for the array, plus another 480 mb for the bignums; call that sum roughly .5 gb.
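
to make the structure concrete, here's a stripped-down sketch of the kind of table i mean (not my actual code; the bucket pick is a plain modulo since the keys are already random):

  #lang racket
  ;; stripped-down sketch of the separate-chaining table: a vector of
  ;; buckets, each bucket a list of bignum keys, sized so chains stay
  ;; around 5 keys on average (~30 million keys / 5 = 6 million buckets)
  (define num-buckets 6000000)
  (define table (make-vector num-buckets '()))

  ;; keys are already random 128-bit integers, so a plain modulo spreads them
  (define (bucket-index key)
    (modulo key num-buckets))

  ;; returns #t if the key was new, #f if it was a dupe and got dropped
  (define (insert! key)
    (define i (bucket-index key))
    (define chain (vector-ref table i))
    (cond [(member key chain) #f]
          [else (vector-set! table i (cons key chain)) #t]))

  (define (lookup key)
    (and (member key (vector-ref table (bucket-index key))) #t))

insert! returning #f is where a dupe would get caught and dropped.
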
i also keep another rather large bignum, about .5 gb, in memory that i use to reduce (but not eliminate) record duplication. i'm attempting to get this thing to run in 2 places, so i need 2 hashes. add it up: .5 + .5 + .5 is 1.5 gb, which puts me at about my memory limit.
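
spelled out in bytes (8-byte pointers, ~16 bytes per stored bignum, which is where the 480 mb comes from):

  #lang racket
  ;; back-of-the-envelope memory budget
  (define bucket-vector-mb (/ (* 6000000 8) 1e6))    ; 48 mb of pointers
  (define bignum-mb (/ (* 30000000 16) 1e6))         ; ~480 mb of bignum keys
  (define one-hash-gb 0.5)                           ; 48 mb + 480 mb, rounded up
  (define dedup-bignum-gb 0.5)                       ; the big dedup bignum
  (define total-gb                                   ; 2 hashes + dedup = 1.5 gb
    (+ one-hash-gb dedup-bignum-gb one-hash-gb))
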
the generated keys are random, but i use one of the associated fields for sorting during the initial write to the hard drive. what goes into each of those files is totally random, but dupes do not run across files. also, the number of keys is >1e25.
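
roughly, the routing during that initial write looks like this (simplified; the struct, its fields, and the file count are made up for illustration, but the point is that the same sort-field value always goes to the same file, so dupes can't cross files):

  #lang racket
  ;; simplified sketch of the initial write: pick the output file from the
  ;; sort field, so two copies of the same record always hit the same file.
  ;; the struct, its fields, and the file count are made up for illustration.
  (struct rec (key sort-field payload) #:prefab)

  (define num-files 16)

  (define (file-for r)
    (modulo (rec-sort-field r) num-files))

  (define (write-record! r)
    (with-output-to-file (format "chunk-~a.dat" (file-for r))
      (lambda () (writeln r))
      #:exists 'append))
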
