Re: CPAN-river: can graph calculation be modified?

H.Merijn Brand Fri, 02 Feb 2018 14:29:58 -0800

On Fri, 2 Feb 2018 12:44:43 -0500, David Golden <x...@xdg.me> wrote:

> It's possible that an *alternate* simplest thing might be more meaningful:
> count the number of distinct *authors* depended on by any distribution
> (including, for the sake of example, the same author, but only once).
> 
> In the Foo case:
> 
>    - Foo has 3 authors depending on it
>    - Foo-Bar has 3 authors depending on it
>    - Foo-Bar-Noggin and Foo-Bar-Baz have 0 authors depending on them
>    - Foo-Bar-A has 1 author depending on it
> 
> In the Neil's Thing case:
> 
>    - Thing has 2
>    - Plant has 1
>    - Fruit and Banana each have 1
>    - Silver-Banana has 0
> 
> In Tux's Thing case, all the counts just increase by one and Distasteful
> has 0.
> 
> Consider this case:
> Zot (Larry) -> Pow (Moe) -> Splat (Curly) -> Whiff (Moe) -> Oof (Larry)
> 
>    - Zot has 3
>    - Pow has 3
>    - Splat has 2
>    - Whiff has 1
>    - Oof has 0
> 
> The interesting thing about this metric to me is that it focuses on this
> question: "If a module breaks, how many *people* are affected" which sounds
> a lot more like what Jim's asking.
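
For illustration, the distinct-author count described above can be sketched roughly as follows. This is only a minimal sketch, not code from zzz-index-cpan-meta: the AUTHOR and DEPENDENTS tables, and the function names, are hypothetical and just encode the Zot chain from the example, with arrows read as "is depended on by".

```python
from collections import deque

# Hypothetical data for the Zot example: each distribution's author, and
# the distributions that depend directly on it (its downstream neighbours).
AUTHOR = {
    "Zot": "Larry",
    "Pow": "Moe",
    "Splat": "Curly",
    "Whiff": "Moe",
    "Oof": "Larry",
}
DEPENDENTS = {
    "Zot": ["Pow"],
    "Pow": ["Splat"],
    "Splat": ["Whiff"],
    "Whiff": ["Oof"],
    "Oof": [],
}


def downstream(dist, dependents):
    """Return every distribution that depends on dist, directly or transitively."""
    seen = set()
    queue = deque(dependents.get(dist, []))
    while queue:
        current = queue.popleft()
        if current in seen:
            continue
        seen.add(current)
        queue.extend(dependents.get(current, []))
    return seen


def distinct_author_count(dist, dependents, author):
    """Count distinct authors among all downstream distributions.

    The distribution's own author is included (once) if that author also
    maintains something downstream, as in the quoted proposal.
    """
    return len({author[d] for d in downstream(dist, dependents)})


counts = {d: distinct_author_count(d, DEPENDENTS, AUTHOR) for d in AUTHOR}
for name, n in counts.items():
    print(f"{name} has {n}")
```

On this toy data the counts come out as Zot 3, Pow 3, Splat 2, Whiff 1, Oof 0, matching the list in the quoted message.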


No, it tells you how many *authors* are affected (or author groups).

Breaking something up-river of, say, DBI will affect just 3 authors (the
(co)maints), whereas it affects millions of people (the users).

If some brave author maintains two or more up-river modules, it is
still just one author, but uncountable users. (don't count core modules
here, that would make it too hard).

Say we have


  Broum + Brumble - Droki - Blimco - Turf
  ALEX  | BEN       JOKI    FLON     DIY
        |
        + Fruig   - DBI   - DBD::XY
          BEN       HIW     JOCKX

IMHO BEN should be counted twice for Broum, not once

my € 0.02
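
To make the difference concrete, here is a small sketch comparing the two ways of counting for Broum. It assumes one reading of the diagram above, namely that Brumble and Fruig both depend directly on Broum and each heads its own chain; the tables and names below are hypothetical and only encode that reading.

```python
from collections import Counter

# Hypothetical encoding of the diagram: author of each distribution and
# the distributions that depend directly on it.
AUTHOR = {
    "Broum": "ALEX",
    "Brumble": "BEN",
    "Droki": "JOKI",
    "Blimco": "FLON",
    "Turf": "DIY",
    "Fruig": "BEN",
    "DBI": "HIW",
    "DBD::XY": "JOCKX",
}
DEPENDENTS = {
    "Broum": ["Brumble", "Fruig"],
    "Brumble": ["Droki"],
    "Droki": ["Blimco"],
    "Blimco": ["Turf"],
    "Fruig": ["DBI"],
    "DBI": ["DBD::XY"],
}


def downstream(dist):
    """Return every distribution that depends on dist, directly or transitively."""
    seen, stack = set(), list(DEPENDENTS.get(dist, []))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(DEPENDENTS.get(current, []))
    return seen


# How many downstream distributions each author maintains.
per_author = Counter(AUTHOR[d] for d in downstream("Broum"))

authors_counted_once = len(per_author)                 # BEN counts as 1
authors_counted_per_dist = sum(per_author.values())    # BEN counts as 2

print(authors_counted_once, authors_counted_per_dist)
```

With BEN counted once, Broum scores 6; with BEN counted once per downstream distribution, it scores 7.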

> Counting an author as 1 for any downstream by the same author is arbitrary
> -- I think it simplifies the analysis and gives more or less the same
> answer, but it could be done the other way, too, if people preferred.
> 
> David
> 
> On Fri, Feb 2, 2018 at 9:48 AM, James E Keenan <jkee...@pobox.com> wrote:
> 
> > Overall Question:  How can we implement different ways of constructing the
> > CPAN river?
> >
> > Background:
> >
> > Since about this time last year I've had occasion to use the concept of
> > CPAN-river to derive lists of distributions to be tested against whatever
> > Perl 5 blead is of the moment.  In particular, for the last three months
> > I've been creating assessments of the impact of monthly Perl 5 development
> > releases on the "top 1000" of the CPAN river.  (See, e.g.,
> > http://thenceforward.net/perl/misc/cpan-river-1000-perl-5.27-master.psv.gz
> > )
> >
> > To calculate the CPAN river, I've been using the programs developed by
> > David Golden found here:
> >
> > https://github.com/dagolden/zzz-index-cpan-meta
> >
> > ... with one modification:  a local branch for the second of the three
> > programs cited there.  I use a local branch because I'm using Linux and
> > cannot install Ramdisk.
> >
> > Problem:
> >
> > As I've stared at this data over the past year I've become aware that the
> > order in which distros appear in the river is not necessarily the most
> > useful for assessing the real-world impact of changes in blead. Put less
> > charitably, the CPAN river can be "gamed."  It is possible for a person to
> > release a large number of distributions which have dependencies on other
> > distributions by the same author.  That can boost some of those
> > distributions high up into the CPAN river -- into, say, the "top 1000" that
> > I use in my monthly program.
> >
> > But if that author's distributions are not depended upon by *other*
> > authors' distributions then they are arguably less important than those
> > such as Module-Build and DateTime which are depended upon by vast numbers
> > of distros written by people other than those distros' maintainers.
> >
> > Since "testing against blead" programs take hours to run, I would like to
> > have that time spent focusing on what I consider to be more relevant
> > distros.
> >
> > For the 5.29.* development cycle starting in May of this year, I would
> > like to be able to use a ranking of CPAN distros which goes beyond asking:
> >
> > * "How many other distributions depend on this one?"
> >
> > ... to asking:
> >
> > * "How many distributions by other authors/maintainers depend on this one?"
> >
> > Would that be feasible?  Has anyone attempted this already?
> >
> > Thank you very much.
> > Jim Keenan
> >  


-- 
H.Merijn Brand  http://tux.nl   Perl Monger  http://amsterdam.pm.org/
using perl5.00307 .. 5.27   porting perl5 on HP-UX, AIX, and openSUSE
http://mirrors.develooper.com/hpux/        http://www.test-smoke.org/
http://qa.perl.org   http://www.goldmark.org/jeff/stupid-disclaimers/
