so long, CPAN
The time has come for me to admit what has probably been pretty obvious to everyone else for some time - I do not have the time to give CPAN the attention it deserves. Time to pass the baton, etc.

First and foremost: CPAN is PAUSE. So it's actually Andreas who has been doing most of the work all these years. All the kudos to him. The FUNET site is still mirroring some other sources into CPAN, but they are completely dead and nobody would notice if they stopped.

Secondly: the CPAN mirror database maintenance is very messy, error-prone, and time-consuming: time to create a ticketing system for it, where each mirror is a queue, and each mirror maintainer gets an account?

Thirdly, here's what I've been thinking about who could take over:

  brian d foy, Ricardo Signes - project management (policies)
  Ask Bjørn Hansen, Robert Spier - tech leads (i.e. running systems)
  Henk Penning, David Landgren - the mirror database

Note that I have always named two people - myself being a baaad example of a single point of failure. Though: maybe the PAUSE maintenance could be shared with more people? Andreas does need some evenings off. Likewise, the DNS of .cpan.org is currently behind Jos. Not a bad place to be, but again, a single point of failure sucks.

There are some other smaller parts in CPAN - like maintaining the FAQ, and maintaining the binaries page. I don't have any good ideas on how/to whom they should go.

As recipients I chose people who over the years have shown promise and/or interest in various aspects of CPAN, in reasonably random order. Feel free to nominate/denominate yourself/other people.

The concrete first step could be that Ask's develooper starts mirroring PAUSE directly instead of from FUNET, and then I switch FUNET to mirror from develooper. Second step: maybe get kernel.org as the North America Tier 1 mirror (they have shown interest in the past, and they have the capacity). Third step: more Tier 1 mirrors in NA and other continents? Fourth step: you fill it in.

How Perl 6 will affect CPAN, I leave for younger minds to ponder.

To close off, some random musings, if I may: avoid tight couplings, like the plague they are. Avoid single points of failure. Programming/middleware fads come and go, don't be too eager to follow them.
Re: so long, CPAN
On Sunday-201009-26 8:16, Ask Bjørn Hansen wrote:
On Sep 26, 2010, at 4:49, Jarkko Hietaniemi wrote:

It has become the time for me to admit to what has probably been pretty obvious for anyone else already for some time - I do not have the time to give CPAN the attention it deserves. Time to pass the baton, etc.

Thank you Jarkko -- had it not been for your early invention and work with CPAN I don't think many of us would be here or be as productive with Perl as we are.

On a more urgent note: could you and Elaine coordinate on moving/copying stuff out of gargoyle where e.g. the mirrors.cpan.org runs? The webster.edu has given us a strong hint of moving out a.s.a.p.

I think the first order of things would be just copying data out of gargoyle, we can worry about the services later. On the FUNET side things are not as critical to move out, though their admins do worry about the insane rsync load. Whatever the future system is, direct plain rsync connections should not be recommended: rsync is just too heavy. Elaine and I do have accounts on FUNET and can move stuff in and out (more accounts, though not impossible, are unlikely).

Regarding the maintenance scripts in FUNET: there isn't much that I would be, ahem, proud to share: they are mostly dead simple shell / very early Perl 5 scripts. For 95% of that stuff I would recommend writing from scratch.

Perhaps the most important new thing needed would be some sort of CPAN mirror staleness alerting script, using Henk Penning's mirror scan results as input. I had over the years a few of those systems, all of them rotted eventually. As an extension of just checking the timestamp of the magical timestamp file, it would be nice to have some sort of random sampling of mirrors: are they really valid up-to-date mirrors?

- ask
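The staleness check and the random sampling mentioned above could be sketched roughly like this. It is only an illustration under stated assumptions: the input is taken to be a plain mapping from mirror name to the epoch time found in that mirror's timestamp file, which is a hypothetical stand-in (the real format of the mirror scan results is not described here), and the function names and the two-day threshold are likewise made up for the example.

```python
import random

DAY = 24 * 60 * 60  # seconds


def find_stale_mirrors(last_seen, now, max_age=2 * DAY):
    """Return the names of mirrors whose timestamp is older than max_age.

    last_seen maps mirror name -> epoch seconds read from the mirror's
    timestamp file (hypothetical input format).
    """
    return sorted(name for name, ts in last_seen.items() if now - ts > max_age)


def sample_mirrors(mirrors, k, seed=None):
    """Pick up to k mirrors at random for a deeper content check."""
    rng = random.Random(seed)
    names = sorted(mirrors)
    return rng.sample(names, min(k, len(names)))
```

A caller would alert on whatever `find_stale_mirrors` returns and then spot-check the mirrors chosen by `sample_mirrors`, for instance by comparing a few files against a known-good copy. How that comparison is done is left open here.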
Re: Why are versions restricted to 999?
999 revisions ought to be enough for anyone

♪ I got 999 revisions but I can't add one ♪

I speak from experience: limiting the version is a bad idea. Some people version their modules with MMDD. I can see bleeding-edge development going with added HHMMSS.
Re: Trimming the CPAN - Automatic Purging
Oh, I understand that fully. And I'd be happy to lend some of my time. But you don't make people inclined to help when people are lobbing snarky comments like "we'll wait breathlessly for you to do it".

The time-honored tradition of many open source communities is to talk. And talk. And talk. The problem is that this solves nothing. To do, does. You are free to decide to take this as a personal insult.
Re: Trimming the CPAN - Automatic Purging
On Friday-201003-26 13:20, Arthur Corliss wrote:
On Fri, 26 Mar 2010, Andy Lester wrote:

Absolutely. This factual info would ideally look like this: Of the 17,000 distros on CPAN, there are 8,000 that have versions more than a year older than the most recent one. If those distros with versions more than a year out of date were purged, the number of files would decrease from 200,000 to 120,000. This would save 7GB out of the 12GB that a full CPAN mirror takes now. Removing that 7GB would mean Benefit X to mirror owners. Without that, how can module authors be bothered to care?

If you don't mind me interjecting, I still can't be bothered to care. We have basically a 12GB data set, and we're worried about that? I see that as a small barrier to bringing on new mirrors on constrained pipes, but ultimately that's not that big a deal. Hell, there's single versions of some Linux distros that are bigger than that.

The total size is not the problem. The number of files is. Vanilla rsync is horribly inefficient (not the protocol, which is genius, mind) because a client coming by and asking for updates basically ends up requiring the moral equivalent of find . -type f -print. Let me repeat that: each client. Not fun.
Re: Trimming the CPAN - Automatic Purging
On Friday-201003-26 19:02, Arthur Corliss wrote:
On Fri, 26 Mar 2010, Jarkko Hietaniemi wrote:

The total size is not the problem. The number of files is. Vanilla rsync is horribly inefficient (not the protocol, which is genius, mind) because a client coming by and asking for updates basically ends up requiring the moral equivalent of find . -type f -print. Let me repeat that: each client. Not fun.

Why use rsync, then? Why not have checkpointed logs on cpan with additions/removals logged by date so you can roll forward on the client, processing only those files? It would be trivial to set up and a lot more efficient.

We await your implementation breathlessly. By the time all the CPAN mirrors have started using that, we probably will be rather blue in the face.

--Arthur Corliss
Live Free or Die
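For readers trying to picture the "checkpointed logs" idea, here is a minimal sketch of the client-side roll-forward. Everything concrete in it is an assumption made for illustration: the log is modelled as a list of (date, operation, path) tuples with ISO dates, the operations are named "add" and "remove", and the client is reduced to a set of paths. No such log format exists on CPAN as far as this thread says.

```python
def roll_forward(local_files, log, checkpoint):
    """Apply log entries newer than `checkpoint` to a set of file paths.

    log is a list of (date, op, path) tuples, where date is an ISO
    "YYYY-MM-DD" string and op is "add" or "remove" (hypothetical format).
    Returns (updated_files, new_checkpoint).
    """
    files = set(local_files)
    newest = checkpoint
    # Sort by date only; the sort is stable, so same-day entries keep log order.
    for stamp, op, path in sorted(log, key=lambda entry: entry[0]):
        if stamp <= checkpoint:
            continue  # already processed in an earlier run
        if op == "add":
            files.add(path)
        elif op == "remove":
            files.discard(path)
        else:
            raise ValueError(f"unknown operation: {op!r}")
        newest = stamp
    return files, newest
```

The point of the design is that the client touches only the paths named in entries after its checkpoint, instead of having the server walk the whole tree for every client. A real tool would also have to fetch the added files and cope with a lost or truncated log, which this sketch ignores.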
Re: CPAN vs Perl 6
On Tuesday-201001-05 19:48, David Golden wrote:
On Tue, Jan 5, 2010 at 6:19 PM, Eric Wilhelm enoba...@gmail.com wrote:

Given the constraint of bootstrap-ability, it seems like you should answer Why not tsv? before reaching for anything more complicated.

Because META is already multi-dimensional and I don't want to find or re-invent a wheel for representing multi-dimensional data in tsv. The YAML/JSON debate is pretty much over as far as the META spec goes and JSON wins. Thus, given the pending arrival of JSON into core for META, I see no reason not to use JSON for index information as well.

So you are saying screw the older Perl distributions?

The last thing we need is for CPAN/CPANPLUS/Tool-X to all implement their own tsv parsers and we don't already have one in core, do we? (I could be wrong there).

Yes, it's called and split(/\t/).

This is such a stupid bikeshed conversation anyway.

Wait until you see the color we chose.

David
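To make the trade-off in this exchange concrete, here is a small sketch of both sides. Reading a flat tab-separated index really is little more than a split on tabs, while nested metadata needs a format that can express nesting, such as JSON. The column names, the sample record, and both function names are hypothetical and chosen only for illustration. The thread itself is about Perl; Python is used here purely as a neutral way to show the shape of the data.

```python
import json


def parse_tsv_index(text, columns):
    """Parse a flat tab-separated index into a list of dicts.

    Blank lines and lines starting with '#' are skipped. The column
    layout is an assumption made for this example.
    """
    rows = []
    for line in text.splitlines():
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) != len(columns):
            raise ValueError(f"expected {len(columns)} fields, got {len(fields)}")
        rows.append(dict(zip(columns, fields)))
    return rows


def parse_json_record(text):
    """Parse one JSON record; nesting comes for free."""
    return json.loads(text)
```

With a flat index such as module, version, and path, the TSV parser is all that is needed. A record like `{"name": "Foo-Bar", "requires": {"JSON": "2"}}` already has a nested mapping, which a single line of tab-separated fields cannot carry without some extra convention on top.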
Re: CMSP 17. Better formalization of license field
I have to say that I really don't care for going down this particular rabbit hole. We can argue this to hell and back, but in the end, if it comes to that, the court will decide, for each particular case.

On Wed, Nov 4, 2009 at 12:57 PM, David Cantrell da...@cantrell.org.uk wrote:
On Tue, Nov 03, 2009 at 02:12:00PM -0500, Jarkko Hietaniemi wrote:
On Tue, Nov 3, 2009 at 12:41 PM, David Cantrell da...@cantrell.org.uk wrote:
On Mon, Nov 02, 2009 at 12:45:30PM -0500, Jarkko Hietaniemi wrote:

+Inf. Public domain doesn't mean what one might think it means. Most importantly, it doesn't mean much outside U.S. jurisdictions.

[citation needed]

The core of the problem is this: public domain is a legal term that only is defined within the U.S. (and I admit, other Anglo-Saxon law, like UK and Australia, etc.). Say, a German author saying This is public domain is making no sense. I know for a fact that in Finnish law an author cannot give away his rights, and the same applies in other European countries.

Even more importantly, it doesn't work the way most people think. It doesn't e.g. relieve the author of warranties or damage claims. It's much better to choose a minimal license that disclaims warranties, such as the MIT one.

But that doesn't relieve the author of damage claims either, no matter what the licence says. At least not in all jurisdictions. So on the basis that public domain can't be used because it is invalid in Germany, MIT can't be used because it is not entirely valid in the UK. Likewise the GPL and the Artistic licence.

--
David Cantrell | Reality Engineer, Ministry of Information
Your call is important to me. To see if it's important to you I'm going to make you wait on hold for five minutes. All calls are recorded for blackmail and amusement purposes.

--
There is this special biologist word we use for 'stable'. It is 'dead'. -- Jack Cohen
Re: CMSP 17. Better formalization of license field
Angels, head of a pin, lawyers, three doors down :-)

On Tue, Nov 3, 2009 at 3:02 PM, Zefram zef...@fysh.org wrote:
Jarkko Hietaniemi wrote:

If your need is to list the licenses a package contains, in a way there is no need to list the public domain bits because there are no strings, err, licenses attached. It is in the public domain.

Null licensing is not the same as not saying anything about licensing.

I know for a fact that in Finnish law an author cannot give away his rights, and the same applies in other European countries.

So public domain isn't necessarily even a null license.

-zefram
Re: CMSP 22. Clarify author field
One point about contact points comes to mind: do we currently allow/mention/encourage *multiple* contact addresses (be they email addresses or something else)? People change jobs / email providers / graduate, and to better be able to contact them, multiple addresses are better than a single one.

On Fri, Oct 30, 2009 at 2:22 PM, David Golden xda...@gmail.com wrote:
On Fri, Oct 30, 2009 at 12:26 PM, Lars Dɪᴇᴄᴋᴏᴡ lars.diec...@googlemail.com wrote:

Since we have no consensus on a change of semantic, field extension, field renaming or deprecation in favour of something better, I came up with a doc patch (attached because Github is down) that merely describes the current practice in the wild. Some quotations from you that pull into this direction:

• who to spam for problems with this module
• who to contact with questions or bugs (in the event that there is no bugtracker)
• Author is probably best as contact point.
• I always feel uneasy to put my name/email address into author when all I'm doing is keeping the module in working condition on CPAN.

If you read the patch's prose carefully, it sounds kind of vague as I wanted to avoid MUSTs and SHOULDs. Any comments welcome.

Works for me. It clarifies the current state, which is consistent with the criteria for changes. After all patches are integrated, I'll probably do a couple editing passes. There are other sections using must and should and such, so I'd like to harmonize. For the moment, this works great.

-- David
Re: CMSP 22. Clarify author field
And contact for security stuff.

On Fri, Oct 9, 2009 at 10:38 AM, Steffen Mueller nj88ud...@sneakemail.com wrote:
David Golden wrote:

22. Clarify author field

Consider that it's currently, practically used as a contact field. I get lots of mail that should have gone to a mailing list instead. Therefore, I'm for:

- Remove the ambiguous author field
- Add contact field. Potentially with a type associated (person or mailing list).
- To compensate, add a copyright holder field in some form. Though I realize that this may be impossible due to conflicting copyright of the package content. Maybe a field in the spirit of contact for legal stuff.

The idea is to make it very clear where normal inquiries should go without removing the mention of a single person in case of legal issues where mailing lists and other media are inappropriate.

Steffen