I wrote a table sorting function, which should be quite fast, as it is essentially IO bound. The algorithm produces compact tables, divided into 256*4096 files (buckets). The idea is that lookup should be lightning fast with the help of file system cache or by storing the files on flash memory (copying to flash may be slow)
The structure is keyed from the final chain value, a chain ending at AABBBCCD DDDDD000 will be filed in folder called AA in a file called BBB.dat Each chain occupies 10 bytes in this file. So by calling stat() on the file, the number of entries can be determined. The content of the file is organized as follows: unsigned short index1[entries] uint64 data[entries] The indexes to these two tables correspond, and for the above endpoint, the index1 value would be CC. The finished table would only contain a few hundred entries, and the index1 part should comfortably fit within the first disk block of the file.A quick sequential search would usually eliminate most search values (abort the search) - if the 16 bits of the index matches, then the data entry is read. The data entry is composited of the remaining 13 non-zero bits of the end value, and the 49 non-zero bits of the starting value - thus the start value can be looked up using just the finishing value. In a previous mail it was stated that the sorting time would be several hours, this method should be much faster, resulting in a more usable client. Tarball is available for inspection at: http://traxme.net/a5/ (native endianness is assumed when reading the tables) F _______________________________________________ A51 mailing list [email protected] http://lists.lists.reflextor.com/cgi-bin/mailman/listinfo/a51
