Since there has been some discussion recently on improving search.cpan.org search results, here's an initial attempt to apply Google's PageRank algorithm to Perl modules, treating module dependencies as links:

http://www.math2.org/david/perlrank
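
For anyone who hasn't seen the method, here is a minimal sketch of PageRank over a dependency graph, written in Python for brevity. It is not the code behind the page above. The damping factor of 0.85, the fixed iteration count, the even spreading of rank from modules with no prerequisites, and the App::Example and Text::Example module names are all assumptions made for illustration. This variant normalizes scores to sum to 1, so its numbers will not be on the same scale as the ones in this post.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Rank modules, where links maps each module to the modules it depends on.

    A dependency A -> B is treated like a hyperlink from A to B,
    so heavily depended-upon modules accumulate rank.
    """
    # Collect every module that appears as a source or as a target.
    nodes = set(links)
    for targets in links.values():
        nodes.update(targets)
    n = len(nodes)

    # Start from a uniform distribution.
    rank = {node: 1.0 / n for node in nodes}

    for _ in range(iterations):
        # Rank held by modules with no prerequisites is spread evenly
        # over all modules (an assumed convention for dangling nodes).
        dangling = sum(rank[node] for node in nodes if not links.get(node))
        base = (1.0 - damping) / n + damping * dangling / n
        new_rank = {node: base for node in nodes}

        # Each module passes an equal share of its rank to its prerequisites.
        for source, targets in links.items():
            if not targets:
                continue
            share = damping * rank[source] / len(targets)
            for target in targets:
                new_rank[target] += share

        rank = new_rank

    return rank


# Toy prerequisite map; App::Example and Text::Example are hypothetical.
prereqs = {
    "App::Example": ["DBI", "File::Spec"],
    "DBI": ["File::Spec"],
    "Text::Example": ["File::Spec"],
    "File::Spec": [],
}

ranks = pagerank(prereqs)
for name in sorted(ranks, key=ranks.get, reverse=True):
    print(name, ":", ranks[name])
```

In this toy graph File::Spec comes out on top because every other module depends on it, directly or indirectly.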

The top-rated modules are listed below:

======== Perl5 top 10 ============

strict : 15.3191537192272
Carp : 6.85461257024917
vars : 5.53458362830022
warnings::register : 5.00715999083099
XSLoader : 3.78995132442526
Exporter : 2.8340416999894
Config : 1.64068111856341
Socket : 1.63421825590883
POSIX : 1.61838700718907
warnings : 1.53924597096102

========= CPAN top 10 =============

File::Spec : 32.2063512256258
Business::OnlinePayment : 13.1125
DBI : 10.6702455651088
Test::Harness : 8.88232758238621
Data::Dumper : 8.87020662128007
Test::Simple : 8.09783592134747
Storable : 7.94579562314204
XML::Parser : 7.68246933302234
Tk : 7.55528753224355
Text::Balanced : 7.17430842306901


I think what would improve the relevance the most is a better data set. The CPAN data set was generated from the 'prereq' keys in the Module::CPANTS::asHash module, and this is only a rough representation of the linking structure. It would have been better to download all the CPAN modules and do source code analysis directly on them.
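
As a rough idea of what that source code analysis could look like, the sketch below pulls module names out of `use` and `require` statements with a regular expression. This is an assumption about one possible approach, not something that has been run against CPAN. The regex and the `extract_dependencies` name are hypothetical, and a pattern like this will miss dynamically loaded modules and can be fooled by POD or string contents, so a real Perl parser would be more reliable.

```python
import re

# Matches lines such as "use Foo::Bar;" or "require Foo;".
# Version-only statements like "use 5.006;" are skipped because a
# module name must start with a letter or underscore.
USE_RE = re.compile(
    r"^\s*(?:use|require)\s+([A-Za-z_]\w*(?:::\w+)*)",
    re.MULTILINE,
)


def extract_dependencies(source):
    """Return the module names a piece of Perl source loads, in order seen."""
    found = []
    for name in USE_RE.findall(source):
        if name not in found:  # keep the first occurrence only
            found.append(name)
    return found


# A made-up module body used only to exercise the function.
sample = """package My::Module;
use strict;
use warnings;
use 5.006;
use File::Spec ();
require Carp;
# use Not::Counted;
"""

deps = extract_dependencies(sample)
print(deps)
```

Running this over every file in a distribution would give one outgoing link per loaded module, which could then replace the 'prereq' data as the link structure.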


-davidm


