On Mon, 2004-05-10 at 15:28, Jeffrey Stedfast wrote:
> you are forgetting the fact that folders are generally not read-only,
> and so in order to write any new data to the gzip file, you'd have to
> rewrite it from scratch which negates any speed improvements you could
> possibly claim.

ray:~$ echo hello | gzip >test.gz
ray:~$ echo world | gzip >>test.gz
ray:~$ zcat test.gz
hello
world
ray:~$

As long as the archive folders only support appending, there's no need to rewrite the entire file. Further, there's no need to even keep it in one big file (and many good reasons not to). Partition the archives by month, or something.
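For what it's worth, the same append-only trick can be sketched outside the shell. This is only an illustration using Python's standard `gzip` module, not anything Evolution does; the helper names `append_message`, `read_archive` and `demo` are made up for the example, and the temp file stands in for an archive folder.

```python
import gzip
import os
import tempfile


def append_message(path, data):
    # Binary append mode: the existing bytes are left alone and a new,
    # independent gzip member is written at the end of the file.
    with gzip.open(path, "ab") as f:
        f.write(data)


def read_archive(path):
    # Reading treats all concatenated members as one continuous stream,
    # just like zcat does.
    with gzip.open(path, "rb") as f:
        return f.read()


def demo():
    path = os.path.join(tempfile.mkdtemp(), "test.gz")

    append_message(path, b"hello\n")
    with open(path, "rb") as f:
        first_bytes = f.read()

    append_message(path, b"world\n")
    with open(path, "rb") as f:
        prefix = f.read(len(first_bytes))

    # Second value is True if the append did not rewrite the earlier data.
    return read_archive(path), prefix == first_bytes
```

Calling `demo()` returns the decompressed contents together with a flag saying whether the bytes written by the first call survived the second call untouched.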
FWIW there is actually a reason to store them in one compressed stream (vs catting them or using separate files): it will compress a lot better. With one large stream instead of many smaller ones, there is a lot more redundant data to compress, particularly considering the typical size of email messages.
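A rough way to see the effect, as a sketch only: the messages below are synthetic stand-ins invented for the example (real mail will behave differently in degree), and `zlib.compress` is used as a convenient DEFLATE front end rather than the gzip container itself, which would add a little more per-stream overhead on top.

```python
import zlib


def make_messages(count=200):
    # Hypothetical small "emails": near-identical headers, short bodies.
    messages = []
    for i in range(count):
        text = (
            "From: user%d@example.com\r\n"
            "To: list@example.com\r\n"
            "Subject: Re: archive folders (%d)\r\n"
            "Content-Type: text/plain; charset=us-ascii\r\n"
            "\r\n"
            "Message body number %d.\r\n" % (i, i, i)
        )
        messages.append(text.encode("ascii"))
    return messages


def compare_sizes(messages):
    # Total size when every message is compressed on its own...
    separate = sum(len(zlib.compress(m)) for m in messages)
    # ...versus all messages compressed as one stream.
    single = len(zlib.compress(b"".join(messages)))
    return separate, single
```

`compare_sizes(make_messages())` returns the two totals in bytes. With messages this small and this repetitive, the single stream should come out far smaller, since each separately compressed message has to rediscover the same header strings from scratch.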
It also depends on other factors like i/o readahead, async i/o etc. I remember doing an async i/o based GIF decoder on an Amiga 500. It could decode raw GIF at about the speed it could be loaded off floppy (hmm, 7 MHz!). Without async i/o it bit, but with async i/o it was much faster than loading the raw image would have been. Still, compression is usually much more expensive.

> also, as a curiosity, I actually tested this theory and it doesn't hold
> true. reading/inflating a gzip file off disk is no faster than reading
> the non-compressed file off disk, *and* inflating the gzip file pegs the
> cpu so if the app was doing other things then it would negatively impact
> performance of those other operations.

This rather obviously depends on CPU speed versus disk speed, yes? If I had a modern CPU with a device that had a transfer speed of 1 byte a second, compressing the stream is an obvious win. If I have a device with a transfer speed of 1 GB/s, it's an obvious loss.
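The trade-off can be put as a back-of-the-envelope model. This is a deliberately crude sketch: the function names are made up, `ratio` (compressed size divided by original size) and `inflate_rate` (decompressed bytes per second) are assumed inputs rather than measurements, and readahead, caching and seek costs are ignored.

```python
def read_time_raw(size, disk_rate):
    # Seconds to read `size` uncompressed bytes at `disk_rate` bytes/s.
    return size / disk_rate


def read_time_compressed(size, ratio, disk_rate, inflate_rate, overlapped):
    # Time spent transferring the (smaller) compressed data.
    io_time = size * ratio / disk_rate
    # Time spent inflating it back to `size` bytes.
    cpu_time = size / inflate_rate
    if overlapped:
        # Async i/o: transfer and decompression run concurrently,
        # so the slower of the two dominates.
        return max(io_time, cpu_time)
    # Blocking i/o: the two costs simply add up.
    return io_time + cpu_time


def compression_wins(size, ratio, disk_rate, inflate_rate, overlapped=True):
    raw = read_time_raw(size, disk_rate)
    packed = read_time_compressed(size, ratio, disk_rate, inflate_rate, overlapped)
    return packed < raw
```

Plugging in the two extremes from above, with an assumed ratio of 0.5 and an assumed inflate rate of 100 MB/s: at 1 byte/s the compressed read wins easily, and at 1 GB/s it loses because inflating becomes the bottleneck. Everything in between depends on the actual numbers.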
