Re: Using a better compression than .gz for one's CPAN modules
On Sat, 20 Nov 2010 23:22:52 +0100, Aristotle Pagaltzis pagalt...@gmx.de said:

> It’s gonna be a lot of work to iron out the entire tool chain to support the newer formats; then it will take a lot of time until the work trickles out far enough that people could start relying on it.

In the case of bzip2 I couldn't resist after having watched bzip2's acceptance for several years. So I prodded all toolchain authors to support bz2. It is now done and seems to work fine.

> For quite piddly gains, in absolute numbers. I really don’t see the point. Gzip is Good Enough.

Agreed, but since bzip2 support is already done we can welcome it when people actually use it.

--
andreas
Re: Using a better compression than .gz for one's CPAN modules
* Andreas J. Koenig andreas.koenig.7os6v...@franz.ak.mind.de [2010-11-22 09:20]:

> Agreed, but since bzip2 support is already done we can welcome it when people actually use it.

I am unwilling to encourage it but I won’t argue if someone wants to use it. And it can be a win for distributions with very large bundled data files so one might as well use it for them since the support exists.

I just don’t want to see a campaign against gzip.

Regards,
--
Aristotle Pagaltzis // http://plasmasturm.org/
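For anyone who wants to check the size difference on their own distribution rather than argue about it, here is a minimal sketch using only the Python standard library. It packs some files into an in-memory tar archive and reports the raw, gzip and bzip2 sizes. The function names `build_tar` and `compare_sizes` and the sample file name are made up for illustration; nothing here is part of any CPAN tool.

```python
import bz2
import gzip
import io
import tarfile


def build_tar(files):
    """Pack a {name: bytes} mapping into an uncompressed tar archive in memory."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name, data in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def compare_sizes(raw):
    """Return the byte counts of the raw data and its gzip/bzip2 forms (level 9)."""
    return {
        "raw": len(raw),
        "gz": len(gzip.compress(raw, 9)),
        "bz2": len(bz2.compress(raw, 9)),
    }


if __name__ == "__main__":
    # Hypothetical stand-in for a distribution's contents.
    sample = {"lib/Foo/Bar.pm": b"package Foo::Bar;\nuse strict;\n1;\n" * 300}
    for kind, size in compare_sizes(build_tar(sample)).items():
        print(f"{kind:>4}: {size} bytes")
```

Which format wins, and by how much, depends on the payload, so the numbers are only meaningful when run against a real distribution's files.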
Reducing rsync cost (was: Re: Using a better compression than .gz for one's CPAN modules)
On 19/11/2010 20:57, dhu...@hudes.org wrote:

> source code, even 100KLOC? Once you go to .gz you're already at better than 2:1. What are you going to save by going to even 3:1, 10Kbytes? compared to the nuisance inflicted, it's nothing.
>
> Over the entire CPAN archive, it'd be significant... I agree on the individual case it's probably not worth worrying about too much. But if it's easy to use .bz2 or something better it wouldn't hurt to get that word out. (And it may be worth making it easy, though I'm not sure about that.)
>
> Daniel T. Staal
>
> Disk space is cheap. Bandwidth is cheap. What's rough is the rsync between mirrors. Compressing to .bz2 won't help that: the stress is doing a stat on every single file in CPAN not the transfer. Work toward optimizing the mirror distribution instead of worrying about bz2 vs gz. Remember not

Yeah, this is the killer. In an ideal world, we would kill the symlinks such as authors/id/*, modules/by-category/*, modules/by-module/* and so on. These could be recreated via shell scripts locally on mirrors for people who wish to maintain these legacies.

Cutting that out would diminish the rsync burden considerably.

David
--
There's bum trash in my hall and my place is ripped
I've totaled another amp, I'm calling in sick
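The idea of rebuilding the convenience links locally, instead of shipping them over rsync, could look roughly like the sketch below. It is written in Python rather than shell, and the layout is an assumption made for illustration: real files sit under `authors/id/` inside the mirror root, and each link goes into `modules/by-module/<top-level namespace>/`. The function name `recreate_by_module` and the `(module, dist_path)` input pairs are hypothetical; a real mirror would have to derive them from whatever index it already has.

```python
import os


def recreate_by_module(mirror_root, packages):
    """Create by-module symlinks locally; return the list of links created.

    packages is an iterable of (module_name, dist_path) pairs, where
    dist_path is relative to <mirror_root>/authors/id (assumed layout).
    """
    created = []
    for module, dist_path in packages:
        top_level = module.split("::")[0]
        link_dir = os.path.join(mirror_root, "modules", "by-module", top_level)
        os.makedirs(link_dir, exist_ok=True)

        target = os.path.join(mirror_root, "authors", "id", dist_path)
        link = os.path.join(link_dir, os.path.basename(dist_path))
        if os.path.lexists(link):
            continue  # already present, so the script can be re-run safely

        # Relative target keeps the tree valid if the mirror root is moved.
        os.symlink(os.path.relpath(target, link_dir), link)
        created.append(link)
    return created
```

Because existing links are skipped, a mirror could run something like this after each sync without rsync ever having to stat or transfer the link trees.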
Re: Reducing rsync cost (was: Re: Using a better compression than .gz for one's CPAN modules)
On Mon, Nov 22, 2010 at 4:37 AM, David Landgren da...@landgren.net wrote:

> Yeah, this is the killer. In an ideal world, we would kill the symlinks such as authors/id/*, modules/by-category/*, modules/by-module/* and so on. These could be recreated via shell scripts locally on mirrors for people who wish to maintain these legacies.
>
> Cutting that out would diminish the rsync burden considerably.
>
> David

or re-engineer CPAN as a sqlite+FTSE database, and re-engineer the mirroring process as a database mirror via a TBD compact database diff protocol

(I have no intention of doing any of this myself; good morning)

--
It is merely a matter of persistence. -- Albert Camus