Hello everyone: I was the person who asserted during the most recent Washington Research Evaluation Network (WREN) meeting that "The traditional tools of R&D evaluation (bibliometrics, innovation indices, patent analysis, econometric modeling, etc.) are seriously flawed and promote seriously flawed analyses" and "Because of the above, reports like the 'Gathering Storm' provide seriously flawed analyses and misguided advice to science policy decision makers." I will admit that this was meant to be provocative, but it also reflected my views as a consumer of science policy and research evaluation.
Perhaps I could explain my reasoning and then folks could jump in.

First, the primary reason that I believe bibliometrics, innovation indices, patent analysis and econometric modeling are flawed is that they rely upon the counting of things (papers, money, people, etc.) without understanding the underlying motivations of the actors within the scientific ecosystem. This is a conversation I have had with Fran Narin, Diana Hicks, Caroline Wagner and a host of others, and it comes down to a basic question: what motivates scientists to collaborate? If we cannot come up with a set of business decision rules for the scientific community, then we can never understand optimal levels of national R&D funding, the reasons why institutions collaborate, or a host of other questions that underpin the scientific process and explain the core value proposition behind the scientific endeavor.

Second, what science policy makers want is a set of decision support tools that supplement the existing gold standard (expert judgment) and provide options for the future. When we get down to basics, policy makers need to understand the benefits and effectiveness of their investment decisions in R&D. Currently, policy makers rely on big committee reviews, peer review, and their own best judgment to make those decisions. The current set of tools doesn't provide policy makers with rigorous answers to the benefits/effectiveness questions (see my first point), and the tools are too difficult to use and/or inexplicable to the typical policy maker. The result is the laundry list of "metrics" or "indicators" contained in the "Gathering Storm" or in any of the innovation indices that I have seen to date.

Finally, I don't think we know enough about the functioning of the innovation system to begin making judgments about which metrics/indicators are reliable enough to provide guidance to policy makers.
I believe that we must move to an ecosystem model of innovation, and that if you do, then non-obvious indicators (relative competitiveness/openness of the system, embedded infrastructure, etc.) become much more important than the traditional metrics used by NSF, OECD, the EU and others. In addition, the decision support tools will gravitate away from the static (econometric modeling, patent/bibliometric citations) and toward the dynamic (systems modeling, visual analytics).

These are the kinds of issues that my colleague, Julia Lane, and I have been discussing with other U.S. federal government colleagues as part of the Science of Science Policy Interagency Task Group (SoSP ITG), which was created by the President's Science Advisor, Dr. John Marburger, two years ago. The SoSP ITG has created a research Roadmap that would deal with the three issues (and many more) discussed above as a way to push the envelope in the emerging field of science policy research that Julia supports at NSF. The SoSP ITG is also hosting a major workshop in December in Washington, with WREN, that will discuss the Roadmap and its possible implementation.

Regards,
Bill Valdez
U.S. Department of Energy

-----Original Message-----
From: Subbiah Arunachalam [mailto:[email protected]]
Sent: Tuesday, October 07, 2008 8:01 PM
To: American Scientist Open Access Forum
Subject: New ways of measuring research

Dear Members of the List:

One of the key concerns of the Open Access movement is how the transition from traditional toll-access publishing to papers becoming freely accessible through open access channels (both OA repositories and OA journals) will affect the way we evaluate science. In the days of print-only journals, ISI (now Thomson Reuters) came up with impact factors and other citation-based indicators.
People like Gene Garfield and Henry Small of ISI and colleagues at neighbouring Drexel University in Philadelphia, Derek de Solla Price at Yale, Mike Moravcsik in Oregon, Fran Narin and colleagues at CHI, Tibor Braun and his team in Hungary, Ton van Raan and his colleagues at CWTS, Loet Leydesdorff in Amsterdam, Ben Martin and John Irvine at Sussex, Leo Egghe in Belgium and many others too numerous to list here took advantage of the voluminous data put together by ISI to develop bibliometric indicators. Respected organizations such as the NSF in the USA and the European Union's Directorate of Research (which brought out the European Report on S&T Indicators, similar to the NSF S&T Indicators) recognised bibliometrics as a legitimate tool. A number of scientometrics researchers built citation networks; David Pendlebury at ISI started trying to predict Nobel Prize winners using ISI citation data. When the transition from print to electronic publishing started taking place, the scientometrics community came up with webometrics. When the transition from toll access to open access started taking place, we adapted webometrics to examine whether open access improves visibility and citations. But we are basically still using bibliometrics.

Now I hear from the Washington Research Evaluation Network that "The traditional tools of R&D evaluation (bibliometrics, innovation indices, patent analysis, econometric modeling, etc.) are seriously flawed and promote seriously flawed analyses" and "Because of the above, reports like the 'Gathering Storm' provide seriously flawed analyses and misguided advice to science policy decision makers." Should we rethink our approach to the evaluation of science?

Arun [Subbiah Arunachalam]

----- Original Message ----
From: Alma Swan <[email protected]>
To: [email protected]
Sent: Wednesday, 8 October, 2008 2:36:44
Subject: New ways of measuring research

Barbara Kirsop said:
> 'This exchange of messages is damaging to the List and to OA itself.
> I would like to suggest that those unhappy with any aspect of its operation merely remove themselves from the List. This is the normal practice.'
> A 'vote' is unnecessary and totally inappropriate.

Exactly, Barbara. These attempts to undermine Stevan are entirely misplaced and exceedingly annoying. The nonsense about Stevan resigning, or changing his moderating style, should not continue any further. It's taking up bandwidth, boring everyone to blazes, and getting us precisely nowhere except generating bad blood. Let those who don't like the way Stevan moderates this list resign, as is the norm, and, if they wish, start their own list where they can moderate (or not) and discuss exactly as they think fit, if they believe they can handle things better. Now that they all know who they are (and so do we), let them band together and get on with it. Those who do like the way Stevan moderates this list (his list) can stay and continue discussing the things we, and he, think are important, in the way the list has always been handled. Goodbye, all those who wish things differently. It's a shame that you're going, but we wish you well, and we will be relieved when you cease despoiling this list with your carping.

Can I now appeal to those who opt to stay to start a new thread on something important? I suggest that the issue of research metrics is a prime candidate. I particularly don't want to be too precise about that term 'metrics'. Arun (Subbiah Arunachalam) has just sent out to various people the summary that the Washington Research Evaluation Network has published about - er - research evaluation. One of its conclusions is that bibliometrics are 'flawed'. Many people would agree with that, but with conditions. It is important to me, in the context of a current project, that I understand what possibilities there are for measuring (not necessarily assessing or evaluating, but measuring) THINGS related to research.
One such measurement might be immediate impact, perhaps measured as usual by citations, but I am also interested in other approaches, including long-term ones, for measuring research activities and outcomes. We need not think only in terms of impact but also in terms of outputs, effects, benefits, costs, payoffs, ROI. I would like to hear about things that could be considered as measures of research activity in one form or another. They may be quite 'wacky', and they may be things that are not currently open to empirical analysis yet would seem to be the basis of sensible measures of research outcomes. Any ideas you have, bring 'em on. Then the challenge is whether, in an OA world, people will be able to develop the tools to make the measures measurable. That's the next conversation.

Stevan, your incisive input is very welcome as always. And you may quote/comment as much as you want. That is the unique value that you bring to this list and why the vast majority of us are still here, right behind you.

Alma Swan
Key Perspectives Ltd
Truro, UK
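[Editor's note: as a concrete illustration of the kind of citation-based measure discussed in this thread, here is a minimal sketch of Hirsch's h-index, one of the simplest such indicators. The citation counts below are invented for illustration; they are not data from any of the messages above.]

```python
def h_index(citations):
    """Return the h-index for a list of per-paper citation counts.

    A researcher has index h if h of their papers have at least
    h citations each (Hirsch, 2005).
    """
    counts = sorted(citations, reverse=True)  # most-cited papers first
    h = 0
    for rank, c in enumerate(counts, start=1):
        if c >= rank:   # this paper still has >= rank citations
            h = rank
        else:           # counts are sorted, so no later paper can qualify
            break
    return h

# Invented citation counts for seven hypothetical papers:
papers = [25, 8, 5, 3, 3, 1, 0]
print(h_index(papers))  # prints 3
```

Even this tiny example illustrates Bill Valdez's point above: the metric is pure counting, and says nothing about why the papers were cited or what motivated the work.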
