Overall Question: How can we implement different ways of constructing the CPAN river?

Background:

Since about this time last year I've had occasion to use the concept of CPAN-river to derive lists of distributions to be tested against whatever Perl 5 blead is of the moment. In particular, for the last three months I've been creating assessments of the impact of monthly Perl 5 development releases on the "top 1000" of the CPAN river. (See, e.g., http://thenceforward.net/perl/misc/cpan-river-1000-perl-5.27-master.psv.gz)

To calculate the CPAN river, I've been using the programs developed by David Golden found here:

https://github.com/dagolden/zzz-index-cpan-meta

... with one modification: a local branch for the second of the three programs cited there. I use a local branch because I'm using Linux and cannot install Ramdisk.

Problem:

As I've stared at this data over the past year I've become aware that the order in which distros appear in the river is not necessarily the most useful for assessing the real-world impact of changes in blead. Put less charitably, the CPAN river can be "gamed." It is possible for a person to release a large number of distributions which have dependencies on other distributions by the same author. That can boost some of those distributions high up into the CPAN river -- into, say, the "top 1000" that I use in my monthly program.

But if that author's distributions are not depended upon by *other* authors' distributions, then they are arguably less important than distributions such as Module-Build and DateTime, which are depended upon by vast numbers of distros written by people other than their own maintainers.

Since "testing against blead" programs take hours to run, I would like to have that time spent focusing on what I consider to be more relevant distros.

For the 5.29.* development cycle starting in May of this year, I would like to be able to use a ranking of CPAN distros which goes beyond asking:

* "How many other distributions depend on this one?"

... to asking:

* "How many distributions by other authors/maintainers depend on this one?"
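To make the difference between those two questions concrete, here is a minimal sketch in Python. It is not how the zzz-index-cpan-meta programs work; it assumes we already have two hypothetical inputs, `deps` (distribution name to its direct prerequisites) and `author_of` (distribution name to its maintainer), and every distribution and author name in it is invented for illustration. It counts downstream dependents transitively, then counts again keeping only dependents owned by somebody else.

```python
from collections import defaultdict


def downstream(deps):
    """Map each distribution to everything that depends on it,
    directly or transitively."""
    # Reverse the edges: prerequisite -> set of direct dependents.
    direct = defaultdict(set)
    for dist, prereqs in deps.items():
        for prereq in prereqs:
            direct[prereq].add(dist)

    result = {}
    for dist in set(deps) | set(direct):
        seen = set()
        stack = list(direct.get(dist, ()))
        while stack:
            current = stack.pop()
            if current in seen:
                continue
            seen.add(current)
            stack.extend(direct.get(current, ()))
        seen.discard(dist)  # guard against circular dependencies
        result[dist] = seen
    return result


def rank(deps, author_of):
    """Return (dist, all_dependents, dependents_by_other_authors) rows,
    ordered by the 'other authors' count first."""
    rows = []
    for dist, dependents in downstream(deps).items():
        owner = author_of.get(dist)
        by_others = {d for d in dependents if author_of.get(d) != owner}
        rows.append((dist, len(dependents), len(by_others)))
    rows.sort(key=lambda row: (-row[2], -row[1], row[0]))
    return rows


# Invented data: ALICE's distributions depend only on each other,
# while Shared-Util is used by two unrelated authors.
deps = {
    "Alpha-Core": [],
    "Alpha-One": ["Alpha-Core"],
    "Alpha-Two": ["Alpha-Core"],
    "Alpha-Three": ["Alpha-One"],
    "Shared-Util": [],
    "App-X": ["Shared-Util"],
    "App-Y": ["Shared-Util"],
}
author_of = {
    "Alpha-Core": "ALICE", "Alpha-One": "ALICE",
    "Alpha-Two": "ALICE", "Alpha-Three": "ALICE",
    "Shared-Util": "BOB", "App-X": "CAROL", "App-Y": "DAVE",
}

for dist, total, by_others in rank(deps, author_of):
    print(f"{dist:12} total={total} by_other_authors={by_others}")
```

With this toy data, Alpha-Core has the most downstream dependents overall (three), but none of them belong to another author, so Shared-Util (two dependents, both by other authors) moves ahead of it. A real implementation would also have to decide how to treat co-maintainers and distributions whose author is unknown; this sketch treats two unknown authors as the same author.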

Would that be feasible?  Has anyone attempted this already?

Thank you very much.
Jim Keenan
