so long, CPAN

2010-09-26 Thread Jarkko Hietaniemi

It has become the time for me to admit to what has probably been pretty
obvious for anyone else already for some time - I do not have the time
to give CPAN the attention it deserves.  Time to pass the baton, etc.

First and foremost: CPAN is PAUSE.  So it's actually Andreas that has
been doing most of the work all these years.  All the kudos to him.
The FUNET site is still mirroring some other sources into CPAN, but
they are completely dead and nobody would notice if they stopped.

Secondly: the CPAN mirror database maintenance is very messy,
error-prone, and time-consuming: time to create a ticketing system
for it, where each mirror is a queue, and each mirror maintainer gets
an account?

Thirdly, here's what I've been thinking about who could take over:

brian d foy and Ricardo Signes - project management (policies)
Ask Bjørn Hansen and Robert Spier - tech leads (i.e. running systems)
Henk Penning and David Landgren - the mirror database

Note that I have always named two people - myself being a baaad
example of a single point of failure.  Though: maybe the PAUSE
maintenance could be shared with more people?  Andreas does need
some evenings off.  Likewise, the DNS of .cpan.org is currently
behind Jos.  Not a bad place to be, but again, a single point
of failure sucks.

There are some other smaller parts in CPAN - like maintaining
the FAQ, and maintaining the binaries page.  I don't have any
good ideas on how/whom they should go.

As recipients I chose people who over the years have shown promise
and/or interest in various aspects of CPAN, in reasonably random
order.  Feel free to nominate/denominate yourself/other people.

The concrete first step could be that Ask's develooper starts mirroring
PAUSE directly instead of from FUNET, and then I switch FUNET to mirror
from develooper.  Second step: maybe get kernel.org as the North America
Tier 1 mirror (they have shown interest in the past, and they have the
capacity).  Third step: more Tier 1 mirrors in NA and other continents?
Fourth step: you fill it in.

However Perl 6 will affect CPAN, I leave for younger minds to ponder.

To close off, some random musings, if I may: avoid tight couplings,
like the plague they are.  Avoid single points of failure.
Programming/middleware fads come and go, don't be too eager
to follow them.




Re: so long, CPAN

2010-09-26 Thread Jarkko Hietaniemi

On Sunday-201009-26 8:16, Ask Bjørn Hansen wrote:


On Sep 26, 2010, at 4:49, Jarkko Hietaniemi wrote:


It has become the time for me to admit to what has probably been pretty
obvious for anyone else already for some time - I do not have the time
to give CPAN the attention it deserves.  Time to pass the baton, etc.


Thank you Jarkko -- had it not been for your early invention and work with CPAN 
I don't think many of us would be here or be as productive with Perl as we are.


On a more urgent note: could you and Elaine coordinate on moving/copying
stuff out of gargoyle where e.g. the mirrors.cpan.org runs?
The webster.edu has given us a strong hint of moving out a.s.a.p.
I think the first order of things would be just copying data out of
gargoyle, we can worry about the services later.

On the FUNET side things are not as critical to move out, though their
admins do worry about the insane rsync load.  Whatever the future
system is, direct plain rsync connections should not be recommended:
rsync is just too heavy.  Elaine and I do have accounts on FUNET
and can move stuff in and out (more accounts, though not impossible,
are unlikely).

Regarding the maintenance scripts in FUNET: there isn't much that I
would be, ahem, proud to share: they are mostly dead simple shell /
very early Perl 5 scripts.  For 95% of that stuff I would recommend
writing from scratch.  Perhaps the most important new thing needed
would be some sort of CPAN mirror staleness alerting script, using
Henk Penning's mirror scan results as input.  Over the years I had a few
of those systems; all of them rotted eventually.  As an extension of just
checking the timestamp of the magical timestamp file, it would be nice
to have some sort of random sampling of mirrors: are they really valid,
up-to-date mirrors?
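
A minimal sketch of the staleness check described above, in Python. The input format is an assumption: the thread does not describe the mirror scan output, so `scan` is a hypothetical mapping from mirror name to the time of the timestamp file as last seen, and the two-day threshold is arbitrary.

```python
from datetime import datetime, timedelta, timezone


def stale_mirrors(scan, now, max_age=timedelta(days=2)):
    """Return (mirror, age) pairs older than max_age, most stale first."""
    stale = [(name, now - seen) for name, seen in scan.items()
             if now - seen > max_age]
    return sorted(stale, key=lambda pair: pair[1], reverse=True)


# Hypothetical scan results; the mirror names are placeholders.
now = datetime(2010, 9, 26, 12, 0, tzinfo=timezone.utc)
scan = {
    "mirror-a.example.org": now - timedelta(hours=3),
    "mirror-b.example.org": now - timedelta(days=9),
    "mirror-c.example.org": now - timedelta(days=4),
}

for name, age in stale_mirrors(scan, now):
    print(f"STALE {name}: {age.days} days behind")
```

This covers only the timestamp comparison; the random sampling of mirror contents mentioned above would need a separate check.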



  - ask





Re: Why are versions restricted to 999?

2010-04-21 Thread Jarkko Hietaniemi





999 revisions ought to be enough to anyone


♪ I got 999 revisions but I can't add one ♪

I speak from experience, limiting the version is a bad idea.


Some people version their modules with MMDD.  I can see
bleeding edge development going with added HHMMSS.
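
A small Python illustration of why a cap of 999 clashes with date-stamped versions. It assumes the limit applies to each dotted component, which the thread does not spell out; the version string and the helper name `version_key` are hypothetical.

```python
LIMIT = 999  # assumed per-component cap


def version_key(version):
    """Split a dotted version string into a tuple of integers."""
    return tuple(int(part) for part in version.split("."))


date_stamped = "1.20100421"  # hypothetical date-stamped version
parts = version_key(date_stamped)
over_limit = [part for part in parts if part > LIMIT]

print(parts)
print(over_limit)
```

Any component carrying a date or a time of day exceeds the cap at once.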


Re: Trimming the CPAN - Automatic Purging

2010-03-27 Thread Jarkko Hietaniemi
  Oh, I understand that fully.  And I'd be happy to lend some of my
time.  But you don't make people inclined to help when people are
lobbing snarky comments like "we'll wait breathlessly for you to do it."


The time-honored tradition of many open source communities is to talk. 
And talk.  And talk.  The problem is that this solves nothing.  To do, does.


You are free to decide to take this as a personal insult.



Re: Trimming the CPAN - Automatic Purging

2010-03-26 Thread Jarkko Hietaniemi

On Friday-201003-26 13:20, Arthur Corliss wrote:

On Fri, 26 Mar 2010, Andy Lester wrote:


Absolutely.  This factual info would ideally look like this:

Of the 17,000 distros on CPAN, there are 8,000 that have versions more than a year 
older than the most recent one.  If those distros with versions more than a year out of 
date were purged, the number of files would decrease from 200,000 to 120,000.  This would 
save 7GB out of the 12GB that a full CPAN mirror takes now.  Removing that 7GB would mean 
Benefit X to mirror owners.

Without that, how can module authors be bothered to care?


If you don't mind me interjecting, I still can't be bothered to care.  We
have basically a 12GB data set, and we're worried about that?  I see that as a
small barrier to bringing on new mirrors on constrained pipes, but
ultimately that's not that big a deal.  Hell, there's single versions of
some Linux distros that are bigger than that.


The total size is not the problem.  The number of files is.  Vanilla
rsync is horribly inefficient (not the protocol, which is genius, mind)
because a client coming by and asking for updates basically ends up
requiring the moral equivalent of
find . -type f -print.  Let me repeat that: each client.  Not fun.
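
To make the cost concrete, here is a rough Python counterpart of that `find` walk. It is a sketch only: the directory names and the client count are made up, and `os.walk` also counts non-directory entries other than regular files, so it only approximates `find . -type f`.

```python
import os
import tempfile


def count_files(root):
    """Walk the whole tree and count the file entries found."""
    total = 0
    for _dirpath, _dirnames, filenames in os.walk(root):
        total += len(filenames)
    return total


# Build a tiny throwaway tree to stand in for a mirror.
with tempfile.TemporaryDirectory() as root:
    for sub in ("authors", "modules"):
        os.makedirs(os.path.join(root, sub))
        for i in range(3):
            open(os.path.join(root, sub, f"file{i}.txt"), "w").close()
    files_per_client = count_files(root)

clients = 50  # hypothetical number of clients syncing per day
print(files_per_client * clients, "file entries examined per day")
```

The work grows with the number of files times the number of clients, regardless of how few files actually changed.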



Re: Trimming the CPAN - Automatic Purging

2010-03-26 Thread Jarkko Hietaniemi

On Friday-201003-26 19:02, Arthur Corliss wrote:

On Fri, 26 Mar 2010, Jarkko Hietaniemi wrote:


The total size is not the problem.  The number of files is.  Vanilla
rsync is horribly inefficient (not the protocol, which is genius, mind)
because a client coming by and asking for updates basically ends up
requiring the moral equivalent of
find . -type f -print.  Let me repeat that: each client.  Not fun.


Why use rsync, then?  Why not have checkpointed logs on cpan with
additions/removals logged by date so you can roll forward on the client,
processing only those files?  It would be trivial to set up and a lot more
efficient.


We await your implementation breathlessly.  By the time all the CPAN
mirrors have started using that, we probably will be rather blue in
the face.
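
For reference, the checkpointed-log idea quoted above could be sketched roughly like this in Python. Nothing here is an existing CPAN mechanism: the log layout, the paths, and the function `roll_forward` are all hypothetical.

```python
from datetime import date

# Hypothetical master-side change log: (date, action, path), oldest first.
LOG = [
    (date(2010, 3, 24), "add", "authors/id/A/AA/AAA/Foo-1.00.tar.gz"),
    (date(2010, 3, 25), "add", "authors/id/B/BB/BBB/Bar-0.01.tar.gz"),
    (date(2010, 3, 26), "remove", "authors/id/A/AA/AAA/Foo-1.00.tar.gz"),
    (date(2010, 3, 26), "add", "authors/id/A/AA/AAA/Foo-1.01.tar.gz"),
]


def roll_forward(files, log, checkpoint):
    """Apply entries dated after checkpoint; return (files, new checkpoint)."""
    files = set(files)
    latest = checkpoint
    for when, action, path in log:
        if when <= checkpoint:
            continue  # already processed in an earlier run
        if action == "add":
            files.add(path)  # a real client would fetch the file here
        elif action == "remove":
            files.discard(path)
        latest = max(latest, when)
    return files, latest


client_files = {"authors/id/A/AA/AAA/Foo-1.00.tar.gz"}
client_files, checkpoint = roll_forward(client_files, LOG, date(2010, 3, 24))
print(sorted(client_files), checkpoint)
```

With date-only granularity, an entry logged later on the checkpoint day would be skipped on the next run, so a real design would probably want sequence numbers rather than dates.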


--Arthur Corliss
  Live Free or Die





Re: CPAN vs Perl 6

2010-01-05 Thread Jarkko Hietaniemi
On Tuesday-201001-05 19:48, David Golden wrote:
 On Tue, Jan 5, 2010 at 6:19 PM, Eric Wilhelm enoba...@gmail.com wrote:
 Given the constraint of bootstrap-ability, it seems like you should
 answer "Why not tsv?" before reaching for anything more complicated.
 
 Because META is already multi-dimensional and I don't want to find or
 re-invent a wheel for representing multi-dimensional data in tsv.  The
 YAML/JSON debate is pretty much over as far as the META spec goes and
 JSON wins.
 
 Thus, given the pending arrival of JSON into core for META, I see no

So you are saying screw the older Perl distributions?

 reason not to use JSON for index information as well.  The last thing
 we need is for CPAN/CPANPLUS/Tool-X to all implement their own tsv
 parsers and we don't already have one in core, do we?  (I could be

Yes, it's called <> and split(/\t/).

 wrong there).
 
 This is such a stupid bikeshed conversation anyway.

Wait until you see the color we chose.

 David
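
For flat index data, the read-a-line-and-split-on-tabs approach from the reply above looks roughly like this in Python. The three-column row layout is hypothetical, not an actual CPAN index format.

```python
def parse_tsv(lines):
    """Yield one list of fields per non-empty line, split on tab characters."""
    for line in lines:
        line = line.rstrip("\n")
        if line:
            yield line.split("\t")


# Hypothetical index rows: module name, version, distribution path.
index = [
    "Foo::Bar\t1.01\tA/AA/AAA/Foo-Bar-1.01.tar.gz\n",
    "Baz\t0.02\tB/BB/BBB/Baz-0.02.tar.gz\n",
]

rows = list(parse_tsv(index))
for module, version, path in rows:
    print(module, version, path)
```

This handles flat rows only; the nested META data raised as the objection to tsv would still need some other representation.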
 



Re: CMSP 17. Better formalization of license field

2009-11-04 Thread Jarkko Hietaniemi
I have to say that I really don't care to go down this particular rabbit
hole.  We can argue this to hell and back, but in the end if it comes to
that, the court will decide, for each particular case.

On Wed, Nov 4, 2009 at 12:57 PM, David Cantrell da...@cantrell.org.uk wrote:

 On Tue, Nov 03, 2009 at 02:12:00PM -0500, Jarkko Hietaniemi wrote:
  On Tue, Nov 3, 2009 at 12:41 PM, David Cantrell da...@cantrell.org.uk wrote:
   On Mon, Nov 02, 2009 at 12:45:30PM -0500, Jarkko Hietaniemi wrote:
    +Inf.  Public domain doesn't mean what one might think it means.  Most
    importantly, it doesn't mean much outside U. S. jurisdictions.
   [citation needed]
  The core of the problem is this: "public domain" is a legal term that only
  is defined within the U.S. (and I admit, other Anglo-Saxon law, like UK and
  Australia, etc.). Say, a German author saying "This is public domain" is
  making no sense.  I know for a fact that in Finnish law an author cannot
  give away his rights, and the same applies in other European countries.
 
  Even more importantly, it doesn't work the way most people think.  It
  doesn't e.g. relieve the author of warranties or damage claims.  It's much
  better to choose a minimal license that disclaims warranties, such as the
  MIT one.

 But that doesn't relieve the author of damage claims either, no matter
 what the licence says.  At least not in all jurisdictions.  So on the
 basis that public domain can't be used because it is invalid in
 Germany, MIT can't be used because it is not entirely valid in the UK.

 Likewise the GPL and the Artistic licence.

 --
 David Cantrell | Reality Engineer, Ministry of Information

  Your call is important to me.  To see if it's important to
  you I'm going to make you wait on hold for five minutes.
  All calls are recorded for blackmail and amusement purposes.




-- 
There is this special biologist word we use for 'stable'. It is 'dead'. --
Jack Cohen


Re: CMSP 17. Better formalization of license field

2009-11-03 Thread Jarkko Hietaniemi
Angels, head of a pin, lawyers, three doors down :-)

On Tue, Nov 3, 2009 at 3:02 PM, Zefram zef...@fysh.org wrote:

 Jarkko Hietaniemi wrote:
 If your need is to list the licenses a package contains, in a way there is
 no need to list the public domain bits because there are no strings, err,
 licenses attached.  It is in the public domain.

 Null licensing is not the same as not saying anything about licensing.

   I know for a fact that in Finnish law an author cannot
 give away his rights, and the same applies in other European countries.

 So public domain isn't necessarily even a null license.

 -zefram






Re: CMSP 22. Clarify author field

2009-10-30 Thread Jarkko Hietaniemi
One point about contact points comes to mind: do we currently
allow/mention/encourage *multiple* contact addresses (be they email
addresses or something else)?

People change jobs / email providers / graduate, and to better be able to
contact them, multiple addresses are better than a single one.

On Fri, Oct 30, 2009 at 2:22 PM, David Golden xda...@gmail.com wrote:

 On Fri, Oct 30, 2009 at 12:26 PM, Lars Dɪᴇᴄᴋᴏᴡ
 lars.diec...@googlemail.com wrote:
  Since we have no consensus on a change of semantic, field extension, field
  renaming or deprecation in favour of something better, I came up with a doc
  patch (attached because Github is down) that merely describes the current
  practice in the wild. Some quotations from you that pull into this direction:
 
  • who to spam for problems with this module
  • who to contact with questions or bugs (in the event that there is no
  bugtracker)
  • Author is probably best as contact point.
  • I always feel uneasy to put my name/email address into author when all I'm
  doing is keeping the module in working condition on CPAN.
 
  If you read the patch's prose carefully, it sounds kind of vague as I wanted
  to avoid MUSTs and SHOULDs. Any comments welcome.

 Works for me.  It clarifies the current state, which is consistent
 with the criteria for changes.

 After all patches are integrated, I'll probably do a couple editing
 passes.  There are other sections using "must" and "should" and such,
 so I'd like to harmonize.  For the moment, this works great.

 -- David






Re: CMSP 22. Clarify author field

2009-10-09 Thread Jarkko Hietaniemi
And contact for security stuff.

On Fri, Oct 9, 2009 at 10:38 AM, Steffen Mueller
nj88ud...@sneakemail.com wrote:
 David Golden wrote:

 22. Clarify author field

 Consider that it's currently, practically used as a contact field. I get
 lots of mail that should have gone to a mailing list instead.

 Therefore, I'm for:

 - Remove the ambiguous author field
 - Add contact field. Potentially with a type associated (person or mailing
 list).
 - To compensate, add a copyright holder field in some form. Though I realize
 that this may be impossible due to conflicting copyright of the package
 content. Maybe a field in the spirit of contact for legal stuff.

 The idea is to make it very clear where normal inquiries should go without
 removing the mention of a single person in case of legal issues where
 mailing lists and other media are inappropriate.

 Steffen



