Apparently there was a snafu in my subscription and my now-less-relevant reply never got posted:

Nitpicking: page cache, not disk cache. If you have sudo privileges on your test machine, 'echo 3 > /proc/sys/vm/drop_caches' will clear the page cache. On the other hand, is clearing it even necessary? Letting the uncompressed file live in the cache should be fine, since you're essentially benchmarking CPU consumption, not I/O, and the compressed output from each run is written fresh anyway.
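If you do want to check, here is a minimal harness for comparing cold- vs. warm-cache runs. It's only a sketch: it assumes root on a Linux box, and the file path and compressor command are placeholders.

import subprocess, time

TEST_FILE = "/tmp/testdata.bin"            # placeholder input file
COMPRESS_CMD = ["gzip", "-c", TEST_FILE]   # placeholder; any compressor that writes to stdout

def drop_page_cache():
    # Flush dirty pages first, then ask the kernel to drop the clean page cache.
    subprocess.check_call(["sync"])
    with open("/proc/sys/vm/drop_caches", "w") as f:
        f.write("3\n")

def time_compression(cold_cache):
    # Time one compression run, optionally starting from a cold page cache.
    if cold_cache:
        drop_page_cache()
    devnull = open("/dev/null", "wb")
    start = time.time()
    subprocess.check_call(COMPRESS_CMD, stdout=devnull)
    elapsed = time.time() - start
    devnull.close()
    return elapsed

print("cold cache: %.2fs" % time_compression(True))
print("warm cache: %.2fs" % time_compression(False))

If the cold and warm numbers barely differ, you're CPU-bound and can skip the cache dance entirely.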
By the way, I guess I should introduce myself. I've been on this list for years but never posted. I'm Stan: BS in computer science in 2007, now a PhD candidate in computer science. I study I/O and storage systems with an emphasis on NVM. Nice to meet you.

Relevant post: what Dr. Bindner says pretty much sums it up. If your network has good bandwidth, the fastest compression/decompression is your goal. If your machines are fast but your network is narrow, compression ratio is your concern, and that will vary by algorithm and file type. Don's suggestion of ranking by the minimum of encode speed, decode speed, and effective network rate is easy to script; a back-of-the-envelope sketch is at the end of this mail [0]. Funny anecdote from transferring scientific computing datasets: the bandwidth of snail mail is king.

On Sat, Jul 9, 2011 at 3:08 PM, Don Bindner <don.bind...@gmail.com> wrote:
> Of course, if your encoder and decoder are fast enough, then the network is
> worth worrying about. But the top 3 had similar ratios. If network really
> is the issue, then lz4, Zhuff and some of the others come back into the
> game.
>
> Of course, then the thing to do would be to sort on the min of the network
> throughput, the encoding speed, and the decoding speed. That's going to
> depend on your network of course.
>
> Don

-- 
Stan Park
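[0] A rough Python sketch of the "sort on the min" ranking Don describes. All numbers are made up purely for illustration (they are not measurements); the effective network rate is the raw link speed scaled by the compression ratio, since less data actually crosses the wire.

NETWORK_MB_S = 100 / 8.0  # hypothetical 100 Mbit/s link, in MB/s

# (name, encode MB/s, decode MB/s, ratio uncompressed:compressed) -- illustrative only
codecs = [
    ("lz4",  400.0, 1800.0, 2.1),
    ("zlib",  60.0,  250.0, 3.0),
    ("xz",     5.0,   90.0, 4.0),
]

def effective_rate(encode, decode, ratio):
    # A pipelined transfer moves uncompressed data no faster than its slowest stage.
    return min(encode, decode, NETWORK_MB_S * ratio)

for name, enc, dec, ratio in sorted(codecs,
        key=lambda c: effective_rate(c[1], c[2], c[3]), reverse=True):
    print("%-5s %8.1f MB/s effective" % (name, effective_rate(enc, dec, ratio)))

Swap in your own measured speeds, ratios, and link bandwidth and the ranking falls out directly.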