If you're backing up a large data set, you're probably not going across a slow network link. You're probably going locally, disk to disk (or disk to tape), and you want the compression rate to keep pace with the hardware I/O.
In my backups, I found that enabling compression makes jobs run slower: at minimum 9% longer (lzop), typically 16% longer (gzip --fast) or 37% longer (gzip), and in the worst case 238% longer (bzip2). The root cause is piping the data stream through a single processor, which compresses serially. So I created threadzip:

http://code.google.com/p/threadzip/

The project is in its infancy, but it is a stable 1.0 release. It's tiny and simple, and for those reasons there's not much room for mistakes in the code yet.

Based on the results below, the clear winners are:

  If you care about speed: threadzip
  If you care about size:  pbzip2

  164s  930MB  tar cf - /usr/share | cat > /dev/null
  167s  377MB  tar cf - /usr/share | threadzip.py -t 4 --fast > /dev/null
  179s  433MB  tar cf - /usr/share | lzop -c > /dev/null
  190s  378MB  tar cf - /usr/share | gzip --fast > /dev/null
  200s  301MB  tar cf - /usr/share | pbzip2 -c > /dev/null
  225s  345MB  tar cf - /usr/share | gzip > /dev/null
  391s  300MB  tar cf - /usr/share | bzip2 -c > /dev/null
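To make the idea concrete, here is a minimal sketch of parallel chunked compression of the kind described above. This is not the threadzip source; it's an illustration that reads stdin in fixed-size chunks, compresses the chunks on a pool of workers, and writes them out in order. It uses multiprocessing rather than threads (an assumption made to sidestep the GIL for CPU-bound work), and the chunk size, worker count, and script name are arbitrary choices for the example. Because each chunk is an independent gzip member, the concatenated output is still a stream that plain gunzip can decompress.

#!/usr/bin/env python3
# parzip_sketch.py -- illustrative only, not the actual threadzip code.
import gzip
import multiprocessing
import sys

CHUNK_SIZE = 4 * 1024 * 1024   # 4 MB per work unit (arbitrary choice)
WORKERS = 4                    # e.g. one worker per core

def read_chunks(stream, size):
    """Yield fixed-size byte chunks from a binary stream until EOF."""
    while True:
        chunk = stream.read(size)
        if not chunk:
            return
        yield chunk

def compress_chunk(chunk):
    """Compress one chunk as a standalone gzip member."""
    return gzip.compress(chunk, compresslevel=1)  # level 1, roughly --fast

def main():
    stdin = sys.stdin.buffer
    stdout = sys.stdout.buffer
    with multiprocessing.Pool(WORKERS) as pool:
        # imap preserves input order, so the output stays in sequence
        # even though chunks finish compressing out of order.
        for compressed in pool.imap(compress_chunk,
                                    read_chunks(stdin, CHUNK_SIZE)):
            stdout.write(compressed)
    stdout.flush()

if __name__ == "__main__":
    main()

Used the same way as the pipelines above, e.g.:

  tar cf - /usr/share | python3 parzip_sketch.py > backup.tar.gz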
