Re: Trimming the CPAN - Automatic Purging

2010-03-27 Thread Jarkko Hietaniemi

On Friday-201003-26 13:20, Arthur Corliss wrote:

On Fri, 26 Mar 2010, Andy Lester wrote:


Absolutely.  This factual info would ideally look like this:

Of the 17,000 distros on CPAN, there are 8,000 that have versions more than a year 
older than the most recent one.  If those distros with versions more than a year out of 
date were purged, the number of files would decrease from 200,000 to 120,000.  This would 
save 7GB out of the 12GB that a full CPAN mirror takes now.  Removing that 7GB would mean 
Benefit X to mirror owners.

Without that, how can module authors be bothered to care?


If you don't mind me interjecting, I still can't be bothered to care.  We
have basically a 12GB data set, and we're worried about that?  I see that as a
small barrier to bringing on new mirrors on constrained pipes, but
ultimately that's not that big a deal.  Hell, there's single versions of
some Linux distros that are bigger than that.


The total size is not the problem.  The number of files is.  Vanilla
rsync is horribly inefficient (not the protocol, which is genius, mind)
because a client coming by and asking for updates basically ends up
requiring the moral equivalent of
find . -type f -print.  Let me repeat that: each client.  Not fun.
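
To make the cost concrete, here is a rough sketch of what that "moral equivalent" amounts to. It is not what rsync actually runs; `list_files` is a made-up name for illustration. The point is only that the work grows with the number of files, not with their total size, and that it is repeated for each client that comes by.

```python
import os


def list_files(root):
    """Return every regular file under root, roughly like `find . -type f -print`."""
    paths = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            # `find -type f` does not report symlinks, so skip them here too.
            if os.path.isfile(full) and not os.path.islink(full):
                paths.append(os.path.relpath(full, root))
    # One list entry per file: 200,000 files means 200,000 entries to build
    # and compare, however small each file is.
    return sorted(paths)
```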



Re: Trimming the CPAN - Automatic Purging

2010-03-27 Thread Jarkko Hietaniemi

On Friday-201003-26 19:02, Arthur Corliss wrote:

On Fri, 26 Mar 2010, Jarkko Hietaniemi wrote:


The total size is not the problem.  The number of files is.  Vanilla
rsync is horribly inefficient (not the protocol, which is genius, mind)
because a client coming by and asking for updates basically ends up
requiring the moral equivalent of
find . -type f -print.  Let me repeat that: each client.  Not fun.


Why use rsync, then?  Why not have checkpointed logs on cpan with
additions/removals logged by date so you can roll forward on the client,
processing only those files?  It would be trivial to set up and a lot more
efficient.
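
For illustration only, the roll-forward idea could be sketched as below. Everything here is an assumption: the one-line-per-change log format (`DATE ACTION PATH`), the `ADD`/`DEL` action names, and the functions `parse_log` and `roll_forward` are hypothetical, and no such log exists on CPAN as far as this thread says.

```python
from datetime import date


def parse_log(lines):
    """Parse hypothetical log lines of the form 'YYYY-MM-DD ADD|DEL path'."""
    entries = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and comments
        stamp, action, path = line.split(None, 2)
        entries.append((date.fromisoformat(stamp), action, path))
    return entries


def roll_forward(entries, checkpoint):
    """Return (to_fetch, to_delete, new_checkpoint) for entries dated after checkpoint."""
    to_fetch, to_delete = set(), set()
    newest = checkpoint
    # sorted() is stable, so entries on the same day keep their log order.
    for stamp, action, path in sorted(entries, key=lambda e: e[0]):
        if stamp <= checkpoint:
            continue  # already applied on an earlier sync
        if action == "ADD":
            to_delete.discard(path)
            to_fetch.add(path)
        elif action == "DEL":
            to_fetch.discard(path)
            to_delete.add(path)
        else:
            raise ValueError("unknown action: %r" % action)
        newest = max(newest, stamp)
    return sorted(to_fetch), sorted(to_delete), newest
```

A client would keep its last checkpoint date, fetch only the paths in `to_fetch`, remove those in `to_delete`, and store the new checkpoint. With day-granular dates, a change logged later on the checkpoint day would be missed, so a real scheme would presumably want finer timestamps or sequence numbers.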


We await your implementation breathlessly.  By the time all the CPAN
mirrors have started using that, we will probably be rather blue in
the face.


--Arthur Corliss
  Live Free or Die





Re: Trimming the CPAN - Automatic Purging

2010-03-27 Thread Jarkko Hietaniemi
Oh, I understand that fully.  And I'd be happy to lend some of my time.  But
you don't make people inclined to help when people are lobbing snarky
comments like we'll wait breathlessly for you to do it.


The time-honored tradition of many open source communities is to talk. 
And talk.  And talk.  The problem is that this solves nothing.  To do, does.


You are free to decide to take this as a personal insult.