A news report entitled "Physics and fame" in PhysicsWeb (http://physicsweb.org/article/news/8/4/12) summarizes a preprint by James P. Bagrow, Hernan D. Rozenfeld, Erik M. Bollt, and Daniel ben-Avraham entitled "How Famous is a Scientist?..." (http://arxiv.org/pdf/cond-mat/0404515).
This paper is related to one that Tim Brody is now preparing for submission, based on his download/citation correlator: http://citebase.eprints.org/analysis/correlation.php

Equating "merit" with the number of papers published and equating "fame" with the number of Google links risks circularity. Google does not count usage "hits" (i.e., downloads); it counts links, modulated by hub/authority weightings (PageRank, etc.). And the correlation is almost tautological: more total items (whether or not by the same author) will lead to more total links to (any of) those items. "Same author" is merely a way of bundling items. I don't think Google's PageRank algorithm controls for this.

The control comparisons (not performed by the authors of the merit/fame study) would also require calculating the correlations between:

(1) the total number of an author's published papers and the average number of citations to that author's work (as calculated by citebase, http://citebase.eprints.org/, or by ISI, not by Google),

(2) the total number of an author's papers and the average number of links to that author's papers (i.e., Google PageRanking),

(3) the total number of arbitrary Google items from the same producer (not research papers) and the average number of links to (any of) those items (i.e., Google ranking), and

(4) the total number of arbitrary Google items, bundled arbitrarily, and the average number of links to (any of) those items (i.e., Google ranking),

and then *partialling out* the pure item-quantity effect (perhaps in a multiple regression equation) to see whether any significant portion of the variance is left that predicts "importance," even after the mere correlation between the quantity of items and the quantity of links is removed.

(Something similar needs to be done with download data too, to partial out the effects of baseline quantity and co-bundling from the specific merits of the material.)

(The Bagrow et al. paper makes comparisons with the time-course of ace pilots' "fame," but it seems to me that drawing any conclusions from that would require more detailed time-course analyses of downloads, citations, and links.)

Stevan Harnad
Chaire de Recherche du Canada
Centre de Neuroscience de la Cognition (CNC)
Universite du Quebec a Montreal
Montreal, Quebec, Canada H3C 3P8
tel: 1-514-987-3000 2461#
fax: 1-514-987-8952
http://www.ecs.soton.ac.uk/~harnad/
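
The partialling-out step proposed above (removing the pure item-quantity effect before asking whether any variance is left) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the analysis used by Bagrow et al. or by citebase: the function names are hypothetical, the data are synthetic, and a simple linear regression stands in for whatever multiple regression model a real study would choose.

```python
import numpy as np

def residuals(y, z):
    """Residuals of y after a least-squares linear fit on z (with intercept)."""
    y = np.asarray(y, dtype=float)
    z = np.asarray(z, dtype=float)
    design = np.column_stack([np.ones_like(z), z])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return y - design @ coef

def partial_corr(x, y, z):
    """Correlation between x and y once the linear effect of z is removed from both."""
    return np.corrcoef(residuals(x, z), residuals(y, z))[0, 1]

# Synthetic illustration only (assumption): both link counts and citation
# counts are driven purely by the number of items in a bundle, plus
# independent noise. No real author or Google data are used here.
rng = np.random.default_rng(0)
n_items = rng.integers(1, 200, size=500).astype(float)   # items per bundle
links = 3.0 * n_items + rng.normal(0.0, 20.0, size=500)
citations = 2.0 * n_items + rng.normal(0.0, 20.0, size=500)

raw = np.corrcoef(links, citations)[0, 1]
partial = partial_corr(links, citations, n_items)
print("raw correlation:", raw)
print("correlation after partialling out item count:", partial)
```

In this constructed case the raw correlation is large simply because both measures grow with the number of items, and it largely disappears once item count is partialled out. With real data, the question is whether a significant residual correlation survives that step.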
