On Wed, 31 Mar 2010, Ask Bjørn Hansen wrote: <snip>
Everyone who doesn't run mirrors says "oh, who cares - it doesn't bother me". Some of us who do run mirrors say "actually, that sort of thing is important and an actual issue". Others reply "then you're doing it wrong". But nobody came up with something reality based that'd be "right".
Some revisionist history here. I run mirrors (not CPAN) and know full well the limitations and inefficiencies of rsync. To date, not one of you has been able to refute that at this scale rsync is hurting you. But most of you have been obstinately against finding a more efficient way of doing things. I've made a viable suggestion, and offered some time to work on it. But you've made it abundantly clear that it's not welcome.
The main point here is that we can't use 20 inodes per distribution. It's Just Nuts. Sure, it's only something like 400k files/inodes now - but at the rate it's going it'll be a lot more soon enough.
That's a problem, but not likely the biggest drag on server I/O you're suffering. Might that be <ahem> rsync?
HOWEVER: Right now more of those are wasted on other things (.readme files, symlinks, ...) -- some of which have solutions in progress already. I don't think anyone is arguing that we NEED to delete the old distributions; only that they do indeed have a cost to keep around in the main CPAN.
You're right, I'm not arguing the need for the cruft. I've only pointed out the obvious reality that trimming files only postpones the I/O management issues that are likely going to have to be addressed at some point anyway. And that you'll get less bang for the buck (or man hour) by treating the symptoms, not the disease. For the record: if that's what you want to do, have at it. Let's just not be disingenuous about the fact that we're abrogating our responsibilities as technologists by refusing to address the real problems and weaknesses of the platform.

--Arthur Corliss
Live Free or Die