gneuner2 (George), you are overthinking this. My 1 GB of test data is only a small 
sample file; I can't hash even that small amount at the time of data creation, 
because the hashed data won't fit in RAM. When I put the redundant data on the 
hard drive, I do some constant-time sorting so that it lands in roughly 200 
usefully sorted files.
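
To make that bucketing pass concrete, here is a rough Racket sketch. It assumes 
line-oriented records; record-key is a stand-in for whatever the real key 
extraction is, and it picks the bucket by hashing the key, though a range or 
prefix rule works the same way:

  #lang racket
  ;; First pass: route each record to one of ~200 bucket files chosen by a
  ;; hash of its key.  Line-oriented records and record-key are assumptions
  ;; standing in for the real data format.
  (define NUM-BUCKETS 200)

  (define (record-key line) line)   ; illustrative: whole line as the key

  (define (bucket-of line)
    ;; modulo keeps the result in [0, NUM-BUCKETS) even for negative hash codes
    (modulo (equal-hash-code (record-key line)) NUM-BUCKETS))

  (define (partition-file in-path out-dir)
    (define outs
      (for/vector ([i (in-range NUM-BUCKETS)])
        (open-output-file (build-path out-dir (format "bucket-~a.txt" i))
                          #:exists 'replace)))
    (call-with-input-file in-path
      (lambda (in)
        (for ([line (in-lines in)])
          (displayln line (vector-ref outs (bucket-of line))))))
    (for ([o (in-vector outs)])
      (close-output-port o)))
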
Some of these files will be small and can be hashed with a single read, hash, 
and write. Some will be massive (the data still won't fit in RAM) and must be 
split further, which produces another type of single read, hash, and write. The 
split files can then be fully hashed, which means a second read, hash, and write.
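
For a bucket that fits in RAM, the single read, hash and write could look 
something like this (again only a sketch: it uses whole lines as keys and 
assumes the point of the hashing is to drop duplicates):

  #lang racket
  ;; Single read, hash and write over one bucket that fits in RAM: an
  ;; in-memory hash drops duplicate records as they stream past.
  (define (dedup-file in-path out-path)
    (define seen (make-hash))
    (call-with-output-file out-path #:exists 'replace
      (lambda (out)
        (call-with-input-file in-path
          (lambda (in)
            (for ([line (in-lines in)]
                  #:unless (hash-ref seen line #f))
              (hash-set! seen line #t)
              (displayln line out)))))))

A bucket that is still too big for RAM just goes back through the same 
partitioning step with a secondary hash or a different bucket count (reusing 
the same modulus would dump every record into one sub-bucket), and each piece 
is then deduplicated the same way.
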
Recombining the second-level files is virtually instantaneous (copy-port) 
relative to the effort spent getting to that point. All of these operations 
take a constant number of passes over the data. It would be nice to cut into 
that big, fat, hard-drive-induced constant C, but I can't get the larger files 
down to a single read and write.
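
The recombination itself is essentially copy-port in a loop, something like:

  #lang racket
  ;; Recombination: append the second-level output files into one result
  ;; with copy-port; sequential byte copies, no per-record work.
  (define (recombine piece-paths out-path)
    (call-with-output-file out-path #:exists 'replace
      (lambda (out)
        (for ([p (in-list piece-paths)])
          (call-with-input-file p
            (lambda (in)
              (copy-port in out)))))))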
