I think there is some talking at cross purposes going on here. The term `central repository' or CR is a misnomer and has led you astray, because even so-called CRs are distributed repositories in the context of global scholarly work. Better to talk about `subject repository' or SR, to make it clear that the discussion is simply about whether the world is divided up by subject or by institution (or at the moment by both and neither).
Second point: a consortium of universities (even a whole country) can establish a repository, which retains its IR characteristic of being multi-disciplinary. It is an IR in style, and subject to exactly the same benefits and disadvantages as a single-institution IR. There are many examples worldwide, including Australia and the UK, so I hope that this disposes of the small university problem cited in India. Such repositories are collaborative IRs. There is no problem with establishing such collaborative IRs.

The key issues in the discussion between SRs and IRs are that:

(a) Subjects and disciplines do not provide a unique partitioning of world research. Categories overlap and are blurred. The domain is confused.
(b) SRs in general have no secure funding source.
(c) SRs have no possibility of mandating deposit in that discipline. If it occurs, great. If it doesn't, wring your hands.
(d) IRs of all types have mandatory mechanisms available to them.
(e) IRs of all types have secure access to the quite low level of funds required to run them.
(f) IRs do not in general overlap, because they are defined by discrete entities.

If the few thousand research universities in the world had access to an IR, the world's research could be 100% captured.

Summary - Any successful CR is to be applauded. However, CRs do not provide a scalable model for open access. Only IRs do.

Arthur Sale
University of Tasmania

From: American Scientist Open Access Forum [mailto:[email protected]] On Behalf Of Atanu Garai/Lists
Sent: Sunday, 9 March 2008 3:51 AM
To: [email protected]
Subject: Re: [AMERICAN-SCIENTIST-OPEN-ACCESS-FORUM] Central versus institutional self-archiving

Thanks Stevan. These are key points that are coming to my mind.

Stevan Harnad wrote:

On Sat, 8 Mar 2008, Atanu Garai/Lists wrote:

Dear Colleagues

This question is very basic. Institutions all over the world are developing their own repositories to archive papers written by staff.
On the other hand, it is very much feasible to develop thematic and consortia repositories wherein authors all over the world can archive their papers very easily. Both approaches have their own pros and cons. However, having a few big thematic (e.g. subject-based) and/or consortia (e.g. Indian universities archive) repositories is more advantageous than maintaining hundreds of thousands of small IRs, taking cost, management, infrastructure and technology considerations into account. Moreover, knowledge sharing and preservation become easier across the participating individuals and institutions in large IRs. If these advantages are so obvious, it is not understandable why there is so much advocacy for building IRs in all institutions.

Not only are the advantages of central repositories (CRs) over institutional repositories (IRs) not obvious, but the pros of IRs vastly outweigh those of CRs on every count:

This forum must have discussed this issue. Also, the objective of posing this question should be made clear, so that you can find it in the right context and spirit. At one point in time, and still now, we wanted to have dispersed information platforms and databases. But with the emergence of large digitisation projects, notably Google Books, the advantages of having centralised global databases are becoming obvious. A choice between 'central repository' and 'IR' is a policy decision for a university or group of universities, and such a decision is driven by a number of factors. Again, the question is: what are the sequence of events and the rationale that led the open access community to select IRs as the primary archiving mechanism over CRs? Institutions should be able to make a choice of their own, but if you want to advise the institutions, what should be the key criteria for advising them to go for their own IRs over CRs?

(1) The research providers are not a central entity but a worldwide network of independent research institutions (mostly universities).
(2) Those independent institutions share with their own researchers a direct (and even somewhat competitive) interest in archiving, evaluating, showcasing, and maximizing the usage and impact of their own research output. (Most institutions already have IRs, and there are provisional back-up CRs such as Depot for institutionally unaffiliated researchers or those whose institutions don't yet have their own IR.)

http://roar.eprints.org/
http://deposit.depot.edina.ac.uk/

Points 1 and 2 essentially deal with the notion of a self-archiving mandate that the institution may or may not invoke for its researchers. From an institutional point of view, the choice between CR and IR will primarily be driven by the management, impact and effectiveness of the repositories. For universities which produce a high number of research papers annually, creating IRs may be sensible, but there are universities in India that produce only a handful of research papers. My understanding is that for such universities maintaining their own repositories is less effective, even if we take cost considerations alone. The issue of "a direct (and even somewhat competitive) interest in archiving, evaluating, showcasing, and maximizing the usage and impact of their own research output" does not conflict with the choice of having a CR (or rather a global repository). Independent institutions can have both mandated self-archiving and archiving, evaluating, showcasing, maximizing the usage etc. in CRs as well.

(3) The OAI protocol has made all these distributed institutions' repositories interoperable, meaning that their metadata (or data) can all be harvested into multiple central collections, as desired, and searched, navigated and data-mined at that level. (Distributed archiving is also important for mirroring, backup and preservation.)
(4) Deposit takes the same (small) number of keystrokes institutionally or centrally, so there is no difference there; but researchers normally have one IR whereas the potential CRs for their work are multiple. (The only "global" CR is Google, and that's harvested.)

http://eprints.ecs.soton.ac.uk/10688/

Technology is not a constraint in making metadata interoperable, though not without some compromise in data quality. For full-text data, interoperability is challenged by copyright restrictions. These dilemmas are intrinsically avoided in CRs. On the other hand, large-scale CRs have the opportunity to make full-text search and retrieval feasible. The volatility of metadata harvested from IRs is avoided with the implementation of CRs.

(5) The distributed costs of institutional self-archiving are certainly lower than maintaining CRs (how many? for what fields? and who maintains them and pays their costs?), particularly as the costs of a local IR are low, and they can cover all of an institution's research output as well as many other forms of institutional digital assets.

You may like to give some empirical data here to corroborate your statement. The creation and maintenance costs of an IR are minimal, but if you want to advocate and popularise IRs, you will need staff. There are some figures that were submitted to a UK parliamentary committee. CRs absorb all these costs, and institutions may or may not pay the CRs the same amount in subscription costs. Preserving "as well as many other forms of institutional digital assets" was not in the IR's mandate, but obviously CRs can also do that, purely from a technical point of view.
(6) Most important of all, although research funders can reinforce self-archiving mandates, the natural and universal way to ensure that IRs (and hence harvested CRs) are actually filled with all of the world's research output, funded and unfunded, is for institutions to mandate and monitor the self-archiving of their own research output, in their own IRs, rather than hoping it will find its way willy-nilly into external CRs.

http://www.eprints.org/openaccess/policysignup/

Mandated self-archiving is not a technological issue but a regulatory one; hence, it can be done in IRs and/or CRs.

Best
Atanu Garai
Online Networking Specialist
Globethics.net
International Secretariat: 150, route de Ferney
CH-1211 Geneva 2
Switzerland
Tel: 41.22791.6249/67
Fax: 41.22710.2386
New Delhi Contact: Tel: 91.98996.22884
Email: [email protected] [email protected]
Web: www.globethics.net
