>Importing a 25955 line contents file (2.1 megs) resulted in a 2.3 meg
>database. The c_files table had 25955 records, c_pkgs had 193 records,
>and c_match had 25759 records. Theoretically, c_match should have had
>25955 records, so I'm not 100% sure what happened. I'll have to do some
>more digging in my code.
I'm surprised at the small size overhead. Our own attempts were not that
successful.

Bill asked me about my experiment; it went something like this: I built a
small door-based server which loaded the contents database in-memory. The
principle behind this was that all package transactions would go through
the door server rather than the contents file (easy to manage when all you
use are the *pkg* tools during install time). Rather than rewriting the
file, the daemon would write, with equal transactional safety, a log file
of deltas which simply contained the removed or changed contents lines,
each with a one-byte prefix saying which was which. The daemon could
occasionally write out a new contents file. If the daemon was terminated
at any point in time, all the data was recoverable from the log file and
the old contents file.

The purpose-written in-memory hash was fairly efficient (a small
multiplier over the file size), as opposed to our first attempt, which
consumed hundreds of megabytes. The benefit of the approach is that it
keeps the contents file (and we found many bits depending on it) but does
away with the huge amount of I/O associated with it.

Casper
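
P.S. For concreteness, a minimal sketch of the append-a-delta idea; the
prefix bytes, record format, and file name here are my illustration, not
the actual prototype code, and the door server itself is omitted:

    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>

    /*
     * Append one delta record: a one-byte prefix followed by the
     * affected contents(4) line.  '-' marks a removed line, '=' a
     * changed (replacement) line; the prefix bytes are assumptions.
     */
    static int
    log_delta(int fd, char prefix, const char *line)
    {
        char rec[8192];
        int len;

        len = snprintf(rec, sizeof (rec), "%c%s\n", prefix, line);
        if (len < 0 || len >= (int)sizeof (rec))
            return (-1);
        if (write(fd, rec, len) != len)
            return (-1);
        /*
         * fsync before acknowledging the transaction, so the old
         * contents file plus this log always reconstructs the
         * current state, even if the daemon dies mid-run.
         */
        return (fsync(fd));
    }

    int
    main(void)
    {
        int fd;

        fd = open("contents.log", O_WRONLY | O_CREAT | O_APPEND, 0644);
        if (fd == -1)
            return (1);
        /* A package op replaced a file: log the new contents line. */
        (void) log_delta(fd, '=',
            "/usr/bin/example f none 0555 root bin 1 2 3 SUNWexample");
        (void) close(fd);
        return (0);
    }

Recovery is then just a replay: read the old contents file into the hash,
then apply the log in order, deleting or replacing entries per the prefix.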
