On Mar 26, 2010, at 8:23 PM, Arthur Corliss wrote:
> 
> Sure, I don't run a CPAN mirror, but I do manage many, many terabytes of
> storage as part of my day job.  I think it's a tad presumptuous to disregard
> input just because we're not in your inner sanctum.  As I mentioned in a
> follow-up e-mail:  this is simply a matter of selecting the correct problem
> domain.  I believe that streamlining the mirroring process will provide
> greater gains for less effort.
> 
> That's not to say that pursuing other efficiencies isn't worthwhile, just
> that you need to prioritize.
> 
> But what the hell do I know.  I don't run a *CPAN* mirror, so I must be
> freaking clueless...

Oh, don't be such a drama queen. I rebuilt and helped run nic.funet.fi, the 
canonical mirror for a large number of mirrors, for 2 years. The perspective 
you get from having a few terabytes spinning in storage changes quite 
dramatically when you are actually serving a few terabytes to thousands of 
clients. CPAN grew to be quite a burden on the site, not only because of the 
high demand but also because of the multitude of small files, and I'm sure 
other mirrors feel similarly burdened. 

The sort of pruning Tim brought up has long been an idea, but with the current 
and growing size of the archive, something does need to be done to alleviate 
the burden not only on the canonical mirrors, but also on the random folks who 
want to grab a local mirror for themselves. In my present work environment, 
12 GB isn't a lot of disk space, but it's a lot considering I don't need to 
install Perl modules daily and will likely never use the vast majority of them. 
It would be a kindness to both the mirror operators and to the end-users to 
trim it down to a manageable size. 

As for efficiency, rsync remains a good tool for the job, and it works on 
nearly every platform, which is a rather tall order for any other solution to 
match. 
Relegating the cruft to BackPAN to make the current CPAN slimmer and less 
demanding on all fronts is an idea that would be welcomed by more than just 
mirror ops.
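
To make the pruning idea concrete, here is a rough sketch in Python of how 
one might split a directory listing into "keep on CPAN" and "move to BackPAN". 
It is only an illustration: the function names, the Name-1.23.tar.gz filename 
pattern and the naive dotted-number version comparison are all my assumptions, 
and real CPAN version ordering is more subtle than this.

```python
import re
from collections import defaultdict

# Assumed filename pattern for illustration only: "Name-1.23.tar.gz".
DIST_RE = re.compile(r"^(?P<name>.+)-(?P<version>\d+(?:\.\d+)*)\.tar\.gz$")


def version_key(version):
    """Naive ordering: compare dotted components as integers (an assumption)."""
    return tuple(int(part) for part in version.split("."))


def split_current_and_old(filenames):
    """Return (keep, archive): newest release per distribution vs. the rest."""
    by_name = defaultdict(list)
    unparsed = []
    for filename in filenames:
        match = DIST_RE.match(filename)
        if match is None:
            # Anything we cannot parse is left alone rather than pruned.
            unparsed.append(filename)
            continue
        key = version_key(match.group("version"))
        by_name[match.group("name")].append((key, filename))

    keep = list(unparsed)
    archive = []
    for releases in by_name.values():
        releases.sort()
        keep.append(releases[-1][1])                  # newest stays
        archive.extend(name for _, name in releases[:-1])  # older ones go
    return sorted(keep), sorted(archive)
```

Files that don't match the pattern are deliberately kept, on the theory that 
a pruning pass should never remove something it doesn't understand.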

The only snag I can foresee in trimming back the abundance of modules is the 
case where a module requires a specific version of another module and barfs 
on a mismatched or newer version of it (I bumped into this recently but can't 
remember exactly which module it was). I think that's rare, though, and the 
practice should be discouraged.

e.
