Overall Question: How can we implement different ways of constructing
the CPAN river?
Background:
Since about this time last year I've had occasion to use the concept of
the CPAN river to derive lists of distributions to be tested against
whatever Perl 5 blead happens to be at the moment. In particular, for the last
three months I've been creating assessments of the impact of monthly
Perl 5 development releases on the "top 1000" of the CPAN river. (See,
e.g.,
http://thenceforward.net/perl/misc/cpan-river-1000-perl-5.27-master.psv.gz)
To calculate the CPAN river, I've been using the programs developed by
David Golden found here:
https://github.com/dagolden/zzz-index-cpan-meta
... with one modification: a local branch for the second of the three
programs cited there. I use a local branch because I'm using Linux and
cannot install Ramdisk.
Problem:
As I've stared at this data over the past year I've become aware that
the order in which distros appear in the river is not necessarily the
most useful for assessing the real-world impact of changes in blead.
Put less charitably, the CPAN river can be "gamed." It is possible for
a person to release a large number of distributions which have
dependencies on other distributions by the same author. That can boost
some of those distributions high up into the CPAN river -- into, say,
the "top 1000" that I use in my monthly program.
But if that author's distributions are not depended upon by *other*
authors' distributions, then they are arguably less important than
distributions such as Module-Build and DateTime, which are depended upon
by vast numbers of distros written by people other than their own maintainers.
Since "testing against blead" programs take hours to run, I would like
to have that time spent focusing on what I consider to be more relevant
distros.
For the 5.29.* development cycle starting in May of this year, I would
like to be able to use a ranking of CPAN distros which goes beyond asking:
* "How many other distributions depend on this one?"
... to asking:
* "How many distributions by other authors/maintainers depend on this one?"
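To make the difference between those two questions concrete, here is a rough sketch in Python. It is only an illustration under stated assumptions, not how the zzz-index-cpan-meta programs work: the function name river_counts, the two input mappings (depends_on and author_of), and the sample distro and author names are all hypothetical, and it assumes each distro has exactly one author, ignoring co-maintainers.

```python
from collections import defaultdict


def river_counts(depends_on, author_of):
    """Return {distro: (all_dependents, other_author_dependents)}.

    depends_on: hypothetical mapping of distro -> list of direct prerequisites.
    author_of:  hypothetical mapping of distro -> single author ID.
    """
    # Invert the dependency map: for each distro, who depends on it directly?
    dependents = defaultdict(set)
    for distro, deps in depends_on.items():
        for dep in deps:
            dependents[dep].add(distro)

    counts = {}
    for distro in author_of:
        # Walk downstream to collect direct and indirect dependents.
        seen = set()
        stack = [distro]
        while stack:
            current = stack.pop()
            for d in dependents.get(current, ()):
                if d not in seen and d != distro:
                    seen.add(d)
                    stack.append(d)
        # Keep only dependents whose author differs from this distro's author.
        others = {d for d in seen if author_of.get(d) != author_of[distro]}
        counts[distro] = (len(seen), len(others))
    return counts


# Made-up sample data: ALICE's distros depend only on each other,
# while BOB's library is used by two other authors.
depends_on = {
    "A-Core": [],
    "A-Util": ["A-Core"],
    "A-App": ["A-Util"],
    "B-Lib": [],
    "C-Tool": ["B-Lib"],
    "D-Tool": ["B-Lib"],
}
author_of = {
    "A-Core": "ALICE",
    "A-Util": "ALICE",
    "A-App": "ALICE",
    "B-Lib": "BOB",
    "C-Tool": "CAROL",
    "D-Tool": "DAVE",
}

counts = river_counts(depends_on, author_of)
# Rank by dependents from other authors, then by total dependents.
for distro in sorted(counts, key=lambda d: (-counts[d][1], -counts[d][0], d)):
    print(distro, counts[distro])
```

In this made-up data, A-Core and B-Lib each have two downstream dependents, so the first question ranks them equally; under the second question B-Lib keeps both of its dependents while A-Core drops to zero.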
Would that be feasible? Has anyone attempted this already?
Thank you very much.
Jim Keenan