Re: How to Compare IRs and CRs

2008-02-18 Thread Arthur Sale

Thomas, what you actually wrote is
  Show me an archive, and a university, who will vouch that for a
  certain period, all that is in the IR  with free full-text
  is equivalent to the university's authors' total research
  papers in the same period. Does such a university exist?

Such a university can never and will never exist if you insist on
every term in the statement. Mainly because no university authority
can ever know all of the university authors' research output with
absolute certainty, unless its staff size is very small (say less
than 50). Maybe the head of a small research institute can be that
sure, but a senior executive simply can't, even for a small
university. A free full text for every item is also impossible given
current publisher requirements, though deposit of a full text is
achievable.

Exactly the same is true of discipline specific repositories, with
the proviso that the repository manager must be even more unsure.

I assumed that you meant the question seriously and would accept
'close to all'. To be reasonably sure that you are capturing close to
all research output requires some audit capability - for example that
there is independently collected data on the university's research
output to compare with the repository. As it happens, such a
situation existed in Australia in 2007 as you probably know. The
HERDC data collection for Government provides such an independent
estimate. The HERDC is spot-audited by Government to prevent
over-claiming.
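
A minimal sketch of such an audit, assuming both the independent
collection and the repository can export a list of publication
identifiers. The function name, the identifier format and the sample
data are illustrative assumptions, not part of HERDC or of any
repository software.

```python
def audit_coverage(reported_ids, deposited_ids):
    """Compare an independent publication list with repository deposits.

    Returns (coverage, missing, unmatched), where coverage is the share of
    independently reported items that are also in the repository.
    """
    reported = set(reported_ids)
    deposited = set(deposited_ids)
    if not reported:
        raise ValueError("the independent list is empty; nothing to audit against")
    missing = reported - deposited      # reported but never deposited
    unmatched = deposited - reported    # deposited but absent from the independent list
    coverage = len(reported & deposited) / len(reported)
    return coverage, missing, unmatched


# Hypothetical identifiers, for illustration only.
coverage, missing, unmatched = audit_coverage(
    ["pub-1", "pub-2", "pub-3", "pub-4"],
    ["pub-1", "pub-2", "pub-3", "pub-9"],
)
print(f"coverage: {coverage:.0%}, missing: {sorted(missing)}")
```

The "missing" set is the useful part in practice: it is the list of
items to chase up with authors or departments.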

Queensland University of Technology
I assert that QUT achieves an acceptable closeness to collecting all
research output in its repository. Indeed Paula Callan is in a good
position to cross-check the two collections against each other, and
does so.

The QUT policy statement is widely known within Australia and
outside, and you can read the current version approved by the
Academic Senate at http://www.mopp.qut.edu.au/F/F_01_03.jsp.

The QUT eprints site is certainly up now because I checked.

University of Queensland
As to UQ, I need not wait until 2009 to know that they will collect
all research output for 2008 by March 2009. They are simply
implementing the usual Australian Government HERDC report through
their repository. In other words the HERDC report will be generated
from the repository contents. That guarantees that they will collect
the same data that the HERDC requires or suffer financially for it by
losing funds from the research block grant. As I wrote, I need no
evidence to know this (nor does any other Australian repository
manager), though it will be worth confirming in 2009.

This policy is weaker than QUT's because it is not necessarily
Immediate Deposit (ID), but it is also stronger since it guarantees
coverage much closer to 100%. There is a financial penalty for losing
publications, often passed down to the department. Of course there may
always be a small number of missing publications in any system. This
may be because of laziness on the part of the authors, mislaid
documentation, illness, or other reasons.

Charles Sturt University and others
BTW, Charles Sturt University has exactly the same intention.
Probably about ten or more other Australian universities are actively
considering the same step as UQ, because it eliminates duplication of
work.

Disagreement
You write
   But I hope that
   we can agree that, from today's perspective, filling IRs
   until we achieve 100% open access will be a very very long
   process.
Sorry, we can't agree. Filling IRs is happening now. The rate varies
by country and situation, of course. I have hopes that IRs in all or
most ~40 Australian universities will be capturing substantially all
their research output within, say, two years. It may not all be open
access, but it will be deposited. And by filling, I don't mean
retrospectivity but that current output is captured and continues to
be captured into the future.

I could agree with you that filling discipline-specific repositories
and covering all disciplines and inter-disciplinary fields will be a
very long process, if that will help.


Arthur Sale
Professor of Computer Science
University of Tasmania

 -Original Message-
 From: American Scientist Open Access Forum
 [mailto:AMERICAN-SCIENTIST-OPEN-ACCESS-FORUM@LISTSERVER.SIGMAXI.ORG]
 On Behalf Of Thomas Krichel
 Sent: Sunday, 17 February 2008 3:10 PM
 To: american-scientist-open-access-fo...@listserver.sigmaxi.org
 Subject: Re: [AMERICAN-SCIENTIST-OPEN-ACCESS-FORUM] How to
 Compare IRs and CRs

   Arthur Sale writes

  In response to Tom's request for one university that will guarantee
  that they collect all their research output, here are two:

  Queensland University of Technology, Australia, since 2004. University
  mandate since 2004. http://eprints.qut.edu.au/ Now in its 5th year!

   The site cannot be reached on February 17 at 09:41:21 NOVT 2008.
   http://qut.edu.au can be, but I don't find such a statement there.

  University of Queensland, Australia, since beginning

Re: How to Compare IRs and CRs

2008-02-18 Thread Paula Callan
Arthur & Thomas

RE:

-Original Message-
From: American Scientist Open Access Forum 
[mailto:american-scientist-open-access-fo...@listserver.sigmaxi.org] On Behalf 
Of Thomas Krichel
Sent: Sunday, 17 February 2008 2:10 PM
To: american-scientist-open-access-fo...@listserver.sigmaxi.org
Subject: Re: How to Compare IRs and CRs

  Arthur Sale writes

 In response to Tom's request for one university that will guarantee that
 they collect all their research output, here are two:

 Queensland University of Technology, Australia, since 2004. University
 mandate since 2004. http://eprints.qut.edu.au/ Now in its 5th year!

  The site cannot be reached on February 17 at 09:41:21 NOVT 2008.
  http://qut.edu.au can be, but I don't find such a statement there.



As the manager of QUT's institutional repository, I can confirm that there is 
no statement on our website that 'guarantees' that we collect fulltext copies 
of ALL our research output.  I don't believe ANY university can give such a 
guarantee at this point in time.  However, we can say that a significant 
proportion of QUT's research output is being deposited (at least 65% for some 
years) and, to a large extent, this is due to the University-wide Eprint 
policy. (http://www.mopp.qut.edu.au/F/F_01_03.jsp)  The policy puts 
'self-archiving' and OA on the agenda within QUT academic departments and 
research centres. The adoption of similar policies by funding bodies will 
reinforce the message that providing open access to research results (not just 
publication) is an integral part of the research process.

In my experience, researchers who are most enthusiastic about our repository 
see it as a useful 'tool' that makes their academic life more effective and 
more efficient.

With multi-disciplinary research becoming increasingly common, I think it makes 
sense to encourage deposit via an IR with the option of including metadata that 
will facilitate later harvesting by CRs.  Alternatively, IRs could include 
options for forwarding the metadata to a CR specified by the depositor. It is 
important that we acknowledge disciplinary differences - one tool will not suit 
all.  But, if I may mix my metaphors, researchers should not have to hammer the 
same nail with two hammers.



Paula Callan
eResearch Access Coordinator
Library
Queensland University of Technology
Brisbane, Australia

PH:  +617  3138 3413



 University of Queensland, Australia, since beginning of 2008.

  That is for just 1 and a half months?

 Now achieving annual government research reporting through their
 IR. This implies 100% coverage of
 course. http://espace.library.uq.edu.au/

  I did not ask you to tell me about them; I asked whether an
  official from an institution would warrant to us that they have
  achieved it. I happen to know a bit about the Queensland University
  of Technology situation: I hold a QUT staff card and know the
  repository manager there. But I don't think that it is worth
  discussing the situation in one particular institution here.

  I am not saying that IRs are not a potentially good development
  and I am not saying that they will never work. But I hope that
  we can agree that, from today's perspective, filling IRs
  until we achieve 100% open access will be a very very long
  process.

  With cheers from Novosibirsk (sunny, -13C),


  Thomas Krichel                    http://openlib.org/home/krichel
RePEc:per:1965-06-05:thomas_krichel
  phone: +7 383 330 6813   skype: thomaskrichel


Re: How to Compare IRs and CRs

2008-02-17 Thread Thomas Krichel
  Arthur Sale writes

 In response to Tom's request for one university that will guarantee that
 they collect all their research output, here are two:

 Queensland University of Technology, Australia, since 2004. University
 mandate since 2004. http://eprints.qut.edu.au/ Now in its 5th year!

  The site cannot be reached on February 17 at 09:41:21 NOVT 2008.
  http://qut.edu.au can be, but I don't find such a statement there.

 University of Queensland, Australia, since beginning of 2008.

  That is for just 1 and a half months?

 Now achieving annual government research reporting through their
 IR. This implies 100% coverage of
 course. http://espace.library.uq.edu.au/

  I did not ask you to tell me about them; I asked whether an
  official from an institution would warrant to us that they have
  achieved it. I happen to know a bit about the Queensland University
  of Technology situation: I hold a QUT staff card and know the
  repository manager there. But I don't think that it is worth
  discussing the situation in one particular institution here.

  I am not saying that IRs are not a potentially good development
  and I am not saying that they will never work. But I hope that
  we can agree that, from today's perspective, filling IRs
  until we achieve 100% open access will be a very very long
  process.

  With cheers from Novosibirsk (sunny, -13C),


  Thomas Krichel                    http://openlib.org/home/krichel
RePEc:per:1965-06-05:thomas_krichel
  phone: +7 383 330 6813   skype: thomaskrichel


Re: How to Compare IRs and CRs

2008-02-10 Thread Arthur Sale
In response to Tom's request for one university that will guarantee that
they collect all their research output, here are two:

Queensland University of Technology, Australia, since 2004. University
mandate since 2004. http://eprints.qut.edu.au/ Now in its 5th year!

University of Queensland, Australia, since beginning of 2008. Now achieving
annual government research reporting through their IR. This implies 100%
coverage of course. http://espace.library.uq.edu.au/

A considerable number of Australian universities, including my own, are
independently going down this latter track.

Note that in neither case is it guaranteed that 100% of deposits are open
access, but it is guaranteed that the deposits are made. In some cases
expiry of an embargo will deal with access to the deposited full texts.

Equally, there is no guarantee that the figure is actually 100%. It may be
99% with a few recalcitrant or lazy non-performers, but that is generally
acceptable. The really active researchers are usually the compliers.

Arthur Sale
University of Tasmania


Re: How to Compare IRs and CRs

2008-02-10 Thread Stevan Harnad
On Sat, 9 Feb 2008, David E. Wojick wrote:

 I disagree Steve (and I am doing staff work for the US Federal
 Interagency Working Group that is grappling with these issues).

Which issues? OA's target content is the 2.5 million annual articles
published in the planet's 25,000 peer-reviewed journals, across all
scholarly and scientific disciplines, in all languages.

 Mind you I am all for OA, but integrating all the web accessible
 science is far from trivial.

I agree. But (1) OA is not about integrating all of web accessible
science; nor is it (2) only about science; nor is it (3) about making
all science web-accessible.

It's first and foremost about making the 2.5 million annual articles
published in the planet's 25,000 peer-reviewed journals, across all
scholarly and scientific disciplines, in all languages freely accessible
on the web.

 Google, Google Scholar, Science.gov, Worldwidescience.org,
 etc., each have large, irrational hunks. It is far from clear that adding
 tens of thousands of independent IRs is going to help.

OA is not about adding tens of thousands of empty IRs to existing web
content. It is about getting the 2.5 million annual articles published
in the planet's 25,000 peer-reviewed journals, across all scholarly and
scientific disciplines, in all languages into their authors' OA IRs.

 Also, journal articles are not my favorite content, because they tend
 to be one to two years after the research and are too short.

But journal articles are OA's target content. And OA means getting them
freely accessible online immediately upon acceptance for publication,
not 1-2 years afterward.

 I prefer conference presentations, reports, even awards and news,
 to journals. We are trying to speed up science and journals are the
 tail end of research.

Those are all fine, and welcome in IRs, over and above OA's target
content; but OA's target content -- the 2.5 million annual articles
published in the planet's 25,000 peer-reviewed journals, across all
scholarly and scientific disciplines, in all languages -- is OA's
immediate priority.

 So OA is a worthy cause but only a small part of the policy
 picture. Findability of key information is the core issue.

Findability may be a problem for other causes, but it is not a problem
for OA (which is the only cause I am talking about). Absence, not
findability, is OA's problem.

 BTW I did some research that suggests that 60-80% of the journal lit,
 or something roughly equivalent, is findable for free if you poke around
 long enough, in some disciplines anyway.

I would be very interested to see that research, to find out in what
fields that is true, and in what time-slice. I am aware of a few fields
(mostly in physics) where it is true, but always happy to learn of more.
Our robot studies, across fields and years, find 5% to 15% of content,
depending on field (and that's using google).

Stevan Harnad

 David Wojick
 
  
  On Sat, 9 Feb 2008, dwoj...@hughes.net wrote:
  
   Steve, I am concerned when you say the following --
It's from the local repositories that the local produce can then be
harvested (the limitations of a mixed metaphor!) to some central
site, if desired, or just straight to an indexer like Google Scholar
or Citebase.
   
   OA in 10's of 1,000's of IRs is virtually worthless without some very
   good, central, global, search capability. How to build this capability
   is far from clear.
   
   David Wojick
   http://www.osti.gov
  
  The answer is as simple as it is certain: OA's problem today is
  *content*, not *search*.
  
  What is missing is 85+% of OA's target content (2.5M annual articles
  in 25K peer-reviewed journals), not the means of searching it! Current
  search power -- both implemented and under development -- is orders of
  magnitude richer than the OA database for which it is intended.
  
  Figure out a way to fill all the world's university IRs with 100% of
  their annual article output, and the rest is a piece of cake.
  
  Keep fussing about the dessert when there's still no main course, and
  you have a recipe for prolonging the hunger of your esteemed guests
  even longer than they've already endured it (for over a decade and a
  half to date).
  
  (The way is already figured out, by the way: it's the institutional
  Green OA Self-Archiving Mandate. What still needs effort is getting the
  universities to go ahead and adopt them, instead of waiting passively,
  while fussing instead about preservation, copyright, publishing reform,
  -- and improved search engines!)
  
  Stevan Harnad
  AMERICAN SCIENTIST OPEN ACCESS FORUM:
  http://amsci-forum.amsci.org/archives/American-Scientist-Open-Access-Forum.html
 http://www.ecs.soton.ac.uk/~harnad/Hypermail/Amsci/
  
  UNIVERSITIES and RESEARCH FUNDERS:
  If you have adopted or plan to adopt a policy of providing Open Access
  to your own research article output, please describe your policy at:
 http://www.eprints.org/signup/sign.php
 

Re: How to Compare IRs and CRs

2008-02-10 Thread Stevan Harnad
On Sun, 10 Feb 2008, dwoj...@hughes.net wrote:

 My point is that one should not consider (and design) OA in isolation.

It is not at all clear why not, David. One does not have to redesign
the web, publishing, or science, to attain 100% OA. One need merely
self-archive in one's IR.

 OA should be viewed as part of a systematic change in the way we do
 science.

But why, when reaching 100% OA is simple and reachable -- just a matter
of a few keystrokes, and the only thing universities and funders need do
is mandate them -- whereas systematically changing the way we do science
is complicated, and not at all within obvious reach?

 Or, to put it another way, OA has to be justified in terms of
 the benefits it will provide. OA is disruptive and costly so the
 benefits must be correspondingly great.

What disruptive and costly effects? IRs cost next to nothing; keystrokes
cost nothing; mandates cost nothing.

Are we speculating, then, about the possible future of journal publishing
after Green OA self-archiving is mandated and reaches 100%? (It will
convert to Gold OA publishing. But what does that have to do with the
scientific and scholarly research community? Publishing is a service
industry and will adapt itself to the needs of research. Is research
instead supposed to adapt itself to the needs of the publishing industry?)
http://www.publications.parliament.uk/pa/cm200304/cmselect/cmsctech/399/399we152.htm

 The benefits of OA in science lie in increased efficiency of
 communication. What I call better, faster science. But access is only
 part of the communication process. I am working the other part --

Agreed that access is only part of it. But it is a necessary part,
indeed an essential prerequisite. And it is an immediately doable
part: The way to do it is for universities and funders to mandate
Green OA self-archiving in the researcher's own OAI-compliant
Institutional Repository (IR).

That's immediately reachable, right now. Then we can worry about other
parts...

[NB: Recall that I am only talking about OA's target content: journal
articles.]

 getting the stuff to the people who need it as efficiently as possible
 (findability). My point is that my part of the system has something to
 say about your part.

But you can't find what's not there: Green OA IR mandates will provide
the missing content, and then we can see whether there's truly any
residual findability problem at all.

 Less metaphorically, OA design issues like IR
 versus CR need to consider the delivery (or findability) issue, perhaps
 even being determined by them.

IF it were the case that direct CR (Central Repository) deposit could
deliver 100% of the target OA content and IF direct CR deposit were also
somehow essential for findability, you would be quite right.

But direct CR deposit cannot and will not deliver 100% of the target
OA content (thematic CRs cannot cover all of research output space,
exhaustively and non-redundantly, and institutions and funders are the
entities that have the interests, and the means, to mandate deposit;
themes are not); and harvesting content to CR search services will
provide the findability. So both the conditional IFs are counterfactual.

 My specific point was that your IR solution to OA looks like it
 creates problems with my delivery solution. Perhaps we can discuss this.

I would be happy to discuss it. My guess is that your delivery solution
calls for richer metadata than OAI. Fine. If the richer metadata really
prove necessary, either CRs can harvest the OAI metadata from the IRs
and enrich them, or, once the IRs are at last capturing all their own
research output, the IRs themselves can be persuaded (by the advantages
of your delivery solution) to enrich their own metadata requirements.
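
As a rough illustration of what harvesting OAI metadata from an IR
involves, the sketch below builds an OAI-PMH ListRecords request URL.
The verb and the metadataPrefix, from and set arguments are part of
OAI-PMH; the base URL and the helper name are assumptions made up for
this example, and no request is actually sent.

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix="oai_dc", from_date=None, set_spec=None):
    """Build an OAI-PMH ListRecords request URL for one repository."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date:
        params["from"] = from_date   # harvest only records changed since this date
    if set_spec:
        params["set"] = set_spec     # optional selective harvesting by set
    return f"{base_url}?{urlencode(params)}"


# Hypothetical repository endpoint, for illustration only.
url = list_records_url("https://repository.example.edu/oai", from_date="2008-01-01")
print(url)
```

A central service would issue a request like this against each IR in
turn and enrich the returned Dublin Core records as needed.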

But direct CR deposit is a nonstarter, either way, because it will not
generate 100% OA content -- and it is totally unnecessary.

[NB: Again, recall that I am only talking about OA's target content:
journal articles.]

 As for the research, it was very preliminary. We just took one issue
 of each of several major journals, in physics and chemistry, and
 manually (intelligently) searched the web for each article. Starting by
 author typically worked better than by title or text. We got a good
 success rate. I should point out that much, perhaps most, of web
 available science is not on Google. It is in the deep web.

Depositing on arbitrary websites, let alone in the deep web, is obviously
nonoptimal. Mandates to deposit in OAI-compliant IRs will solve that.

Stevan Harnad

Re: How to Compare IRs and CRs - or maybe how not to?

2008-02-10 Thread Stevan Harnad
On Sat, 9 Feb 2008, Armbruster, Chris wrote:

 I also have my doubts that IRs, federated IRs and OAI-PMH will do the
 job...

But what job, exactly, is it that you doubt they can do, Chris? Because
searching over nonexistent content cannot be done by anyone or anything!

 but CRs are also sometimes no better. Even assuming that content
 is self-archived, will it be found?

This is rather like asking: But assuming we have a cure for cancer,
how will we distribute it?

The immediate goal is to find a cure for cancer. Let's wait till we have
one before assuming we have a nontrivial distribution problem too!

(Fortunately, in the case of OA, we already know the cure: mandate
self-archiving in OAI-compliant IRs.)

The reason the OA target content cannot be found today is that most
of it isn't there; hence no resource is (or needs to be) developed and
implemented today on the assumption that the target content is all or
mostly there, free for all on the web, and that the only thing we are
missing is a reliable way to find it.

What we need is that nonexistent content, not the content-finder.

(In a parallel reply to David Wojick I address the question of free
content in the deep web, not indexed by Google: The solution there,
too, is to bring it to the reachable, surfable surface, by mandating
that it be deposited in the researcher's OAI-compliant Institutional
Repository [IR].)

 Consider this: It is often assumed that what stands in the way of
 enhanced functionality and quality is the lack of journal articles
 available in open access. However, a critical experiment has shown that
 databases already have problems with coverage even if items are available
 in open access.  It has been found (Bergstrom/Lavaty 2007) that for
 33 key economic journals, ninety percent of articles in the most-cited
 journals had been self-archived and about fifty percent of articles in
 less-cited journals were also available freely online. All of the freely
 available articles were found through Google. Using Google Scholar, they
 found about 10% less. However, when using OAIster they found only 1/4 of
 the freely available articles and results were only marginally better
 for SSRN and RePEc searches.

(0) (It is noteworthy that the BL study is in Economics, one of the
three disciplines, along with Physics and Computer Science, that have
been spontaneously self-archiving for over a decade and a half now. But
the OA problem is with all the other disciplines: They have not followed
this admirable example. Nor have even these three laudable disciplines
come anywhere near depositing 100% of their annual article output.)

(1) But I'm not sure what, exactly, your point is, Chris: If all of
the free articles were indeed found with Google, then find them with
Google! OAIster and Google Scholar will get them too, once they are
deposited in mandated OAI-compliant IRs, as proposed, rather than on
arbitrary websites, as now.

(2) Of course a specific-item Google search only works if you know that
the item is on the web, and you know some or all of the boolean search
words that will pick it out, Google-style. No use expecting much of
that content to pop up in a generic-topic Google search, where you have no
idea what is and isn't out there.

(3) The remedy for that is to have all of it in OAI-compliant IRs. Then
you can restrict the boolean full-text Google search to OA content, and
OA content alone, instead of searching for it in a haystack of at least
30 billion web pages (in Feb 2007).

Here is the sort of thing it would be absurd to expect to succeed
today, on the full web -- but would be a trivial piece of cake if
the full texts of all 2.5 million articles published annually in the
planet's 25,000 peer-reviewed journals were self-archived in an
OAI-compliant IR:

(i) Do a generic boolean search, GB, using content terms, on a dedicated
database, such as PubMed.

(ii) Then take the references for all the P PubMed hits, and first do a
specific-item boolean search, SB, for each of them, item by item, by
reference term, on the full web via Google.

(iii) Let's say the SB search on the web finds W of those P hits as
full texts on the web. W/P is the proportion of the PubMed hits that is
currently available free on the web (apart from the deep web
unreachable by Google).

(iv) Now re-do the generic boolean search GB (i.e., using content
terms rather than each item's reference) this time directly on the web,
via Google.

(v) Of course the result will be a huge and unnavigable mess, despite
the miracle of PageRank. PageRank is good enough for rank-ordering the
single targeted item reference search, but not for the generic boolean
search GB on content terms.

(vi) Why not? Two obvious reasons: (i) The target content that is there,
is embedded in too large a mess of irrelevant content and (ii) most of
the target content is not there.

(vii) Remedy: (i) get all of the target content out there in OAI-compliant
IRs so that (ii) the search can be restricted to all 

How to Compare IRs and CRs

2008-02-09 Thread Stevan Harnad
On Sat, 9 Feb 2008, Leslie Carr wrote:

 On 9 Feb 2008, at 11:35, Thomas Krichel wrote:
 
  Yeah, but E-LIS is really small, looking at it today it tells
  us it has 7253 documents. That IRs struggle to compete with that
  sort of effort demonstrates that IRs don't populate, even in the
  presence of mandates. No amount of Driver summits will change this.
 
 If you go to ROAR you will find 62 Institutional or Departmental
 repositories that are bigger than E-LIS (that's out of a total set of
 562). Admittedly that's just 1 in 8 institutional repositories pulling
 something approximating to their weight, but then there are only 89
 subject repositories listed in total.
 
 It's not a done deal by any means, but I think that the trend is
 looking a lot more positive than you suggest.

It's even a shade more subtle than that:

Not only is comparing IRs to CRs comparing apples to fruit, but the
genus and species have different respective denominators to answer to!

(1) Obviously, we would not be surprised if Harvard (with an output of,
say, 10K journal articles yearly) had a bigger IR than Mercer County
Community College (with a yearly output of 100 journal articles).

(2) But we would be surprised if the yearly deposit rate for Harvard's
10K annual articles was 1% and the yearly deposit rate for MCC was 90%,
even if that meant that Harvard had 100 annual deposits and MCC had only
90.

(3) So the right unit of comparison is not total repository content, of
course, but proportion of annual output self-archived.

(4) The comparison is more revealing (and exacting) when we compare CRs
with IRs: how should we compare Harvard's IR to the CR for Biomedicine
(PubMed Central)?

(5) We are not surprised if the total annual worldwide (or even just US)
output in Biomedicine exceeds the total annual output of Harvard in all
disciplines.

(6) Again, the valid unit of comparison is total annual-deposits divided
by annual-output, and for a discipline, total annual output means all
articles published that year in that discipline, originating from all of
the world's research institutions.
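
The Harvard/MCC example in points (1) to (3) can be worked through in a
few lines. This is only a sketch: the figures are the hypothetical ones
used above, and the function and variable names are made up for the
illustration.

```python
def deposit_rate(annual_deposits, annual_output):
    """Proportion of a year's article output that was self-archived."""
    if annual_output <= 0:
        raise ValueError("annual output must be positive")
    return annual_deposits / annual_output


# Hypothetical figures from the example: Harvard deposits 1% of 10K
# articles, MCC deposits 90% of 100 articles.
repositories = {
    "Harvard": {"deposits": 100, "output": 10_000},
    "MCC": {"deposits": 90, "output": 100},
}

# Ranking by raw totals puts Harvard first; ranking by normalized rate
# reverses the order.
by_total = sorted(
    repositories, key=lambda name: repositories[name]["deposits"], reverse=True
)
by_rate = sorted(
    repositories,
    key=lambda name: deposit_rate(
        repositories[name]["deposits"], repositories[name]["output"]
    ),
    reverse=True,
)
print("by total:", by_total)
print("by rate: ", by_rate)
```

The same function applies to a CR, provided the denominator is the
discipline's total annual output from all institutions.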

And that (if you needed one) is yet another reason why direct IR deposit
is the systematic way to generate 100% OA. It's apples/apples vs
fruit/fruit -- and all the fruit, hence all the apples, oranges, etc.
are sown, grown and stocked locally. It's from the local repositories
that the local produce can then be harvested (the limitations of a
mixed metaphor!) to some central site, if desired, or just straight to
an indexer like Google Scholar or Citebase.

The moral of the story is that we have to normalize IR/IR, IR/CR and
CR/CR comparisons -- and that absolute, non-normalized totals are not
only meaningless, but especially misleading about CRs, which give a
spurious impression of magnitude simply by omitting their even-larger
magnitude denominators!

Stevan Harnad
AMERICAN SCIENTIST OPEN ACCESS FORUM:
http://amsci-forum.amsci.org/archives/American-Scientist-Open-Access-Forum.html
http://www.ecs.soton.ac.uk/~harnad/Hypermail/Amsci/

UNIVERSITIES and RESEARCH FUNDERS:
If you have adopted or plan to adopt a policy of providing Open Access
to your own research article output, please describe your policy at:
http://www.eprints.org/signup/sign.php
http://openaccess.eprints.org/index.php?/archives/71-guid.html
http://openaccess.eprints.org/index.php?/archives/136-guid.html

OPEN-ACCESS-PROVISION POLICY:
BOAI-1 (Green): Publish your article in a suitable toll-access journal
http://romeo.eprints.org/
OR
BOAI-2 (Gold): Publish your article in an open-access journal if/when
a suitable one exists.
http://www.doaj.org/
AND
in BOTH cases self-archive a supplementary version of your article
in your own institutional repository.
http://www.eprints.org/self-faq/
http://archives.eprints.org/
http://openaccess.eprints.org/


Re: How to Compare IRs and CRs

2008-02-09 Thread Thomas Krichel
  Stevan Harnad writes

 (Could Tom please state his evidence for this, comparing the 12 mandated
 IRs so far with unmandated control IRs -- as Arthur Sale did for a subset,
 demonstrating the exact opposite of what Tom here claims.)
 http://fcms.its.utas.edu.au/scieng/comp/project.asp?lProjectId=1830

  Show me an archive, and a university, who will vouch that for a
  certain period, all that is in the IR  with free full-text
  is equivalent to the university's authors' total research
  papers in the same period. Does such a university exist?

 And the question of the *locus* of mandated deposit still needs to
 be sorted out for the funder mandates: they ought to be mandating IR
 deposit and central harvesting rather than going against the tide by
 needlessly mandating direct central deposit.
 http://openaccess.eprints.org/index.php?/archives/136-guid.html

  Central deposit in the funder's archive is better because
  it assures the funder that a copy is and remains available. It
  does not preclude IR archiving.

 (It was my impression that Tom Krichel too was a fan of distributed
 local self-archiving and central harvesting; as I recall, he was one of
 those who warned me off of centralism during my brief fatuous flirtation
 with it.

  I remember still you apologizing to me in a public meeting about
  this. Surely, few readers of this forum will believe it happened, but
  I have witnesses. ;-)

  Now you are just as infatuated with the idea of an institutional
  mandate as a simple solution. You love simple ideas, that
  you then keep on repeating.

 But now Tom seems so comfortable with the continuing spontaneous
 deposit rate of economists

  Where is your evidence for this? I am not comfortable. For a start,
  I am in Siberia at this time. ;-)

 that he does not notice that this spontaneous formula has utterly
 failed to generalize to all other disciplines for well over a decade
 now,

  I may be dumb, but I am not deluded. I do notice.

  The problem is that there are not enough pioneers such as Paul Ginsparg and
  Thomas Krichel. And they don't get enough help. It's time for universities
  to support academics who are interested in leading scholarly
  initiatives for their groups of scholars. Help them with disk space,
  CPU time, open TCP ports etc. In the long run this will generate more
  visibility for the sponsoring institution (per money spent) than
  pure research.

  BTW, I am working on pioneering initiatives (again); if an institution
  is interested in sponsorship (in kind, not money), get in touch.

  Cheers,

  Thomas Krichel                    http://openlib.org/home/krichel
RePEc:per:1965-06-05:thomas_krichel
   skype: thomaskrichel


Re: How to Compare IRs and CRs

2008-02-09 Thread Thomas Krichel
  Leslie Carr writes

 It's not a done deal by any means, but I think that the trend is
 looking a lot more positive than you suggest.

  I am not saying that the trend is not up, but I would like to
  see one successful institutional archive as outlined in the
  other message, before I believe that a mandate really can work.

  I am not saying that mandates and IRs are wrong, but relying
  exclusively on them is failing to realize other opportunities.

  Cheers,

  Thomas Krichel                    http://openlib.org/home/krichel
RePEc:per:1965-06-05:thomas_krichel
   skype: thomaskrichel



Re: How to Compare IRs and CRs

2008-02-09 Thread Stevan Harnad
On Sat, 9 Feb 2008, Thomas Krichel wrote:

  Stevan Harnad writes
 
  (Could Tom please state his evidence for this, comparing the 12 mandated
  IRs so far with unmandated control IRs -- as Arthur Sale did for a
  subset,
  demonstrating the exact opposite of what Tom here claims.)
  http://fcms.its.utas.edu.au/scieng/comp/project.asp?lProjectId=1830
 
  Show me an archive, and a university, who will vouch that for a
  certain period, all that is in the IR  with free full-text
  is equivalent to the university's authors' total research
  papers in the same period. Does such a university exist?

Yes, Les Carr has already provided these data for the first mandate,
Southampton ECS, in the pages of this Forum. CERN had done the same. We
are currently gathering the corresponding data for QUT and Minho. Arthur
Sale's comparative studies have also demonstrated this.

But while we're at it, what's good for the goose is good for the gander
(or, rather, for each genus and species): Show me a discipline-based CR
that normalizes by its own denominator -- i.e., by the total research
output of that discipline from all institutions, worldwide!

  And the question of the *locus* of mandated deposit still needs to
  be sorted out for the funder mandates: they ought to be mandating IR
  deposit and central harvesting rather than going against the tide by
  needlessly mandating direct central deposit.
  http://openaccess.eprints.org/index.php?/archives/136-guid.html
 
  Central deposit in the funder's archive is better because
  it assures the funder that a copy is and remains available. It
  does not preclude IR archiving.

It does not preclude IR archiving, but it doesn't mandate it, it doesn't
help it, and it in fact hinders it, by confusing researchers as to where
self-archiving needs to be done, and how many times.

Simple solution: Both universities *and* funders mandate deposit in the
researcher's IR; then funders can also harvest centrally from the IRs
(or the IRs can -- very easily -- be configured to export to the
designated CRs, where desired or required).

So, No: Central deposit is decidedly *not* better -- it is worse, far
worse, on all counts. It is just something else that is being done
unthinkingly, and the effort is not being made to think it through.

  (It was my impression that Tom Krichel too was a fan of distributed
  local self-archiving and central harvesting; as I recall, he was one of
  those who warned me off of centralism during my brief fatuous flirtation
  with it.
 
  I remember still you apologizing to me in a public meeting about
  this. Surely, few readers of this forum will believe it happened, but
  I have witnesses. ;-)
 
  Now you are just as infatuated with the idea of an institutional
  mandate as a simple solution. You love simple ideas that
  you then keep on repeating.

Tom, I foolishly apologized to you publicly for my foolish brief lapse
from distributed institutional self-archiving to central self-archiving
between 1996 and 1999, and this is the thanks I get for my politeness?

Whereas here you are, defecting (I think!) to central deposit now without
so much as a by-your-leave?

(Alright then, let me put it less charitably: My changes in strategy
were empirically-driven, not opinion-driven. They were always backed
up by reasoning on the best evidence available at the time, and they
continue to be. The empirical sequence was that once self-archiving
became possible (via FTP and then Web), some communities -- notably
Physics, depositing centrally, and Economics, depositing locally --
spontaneously took it up in significant numbers while most didn't. My
first instinct was local deposit (1989-5). But no one listened, while the
growing Physics Arxiv made it seem to me as if central deposit might be
a better way. So we created CogPrints (1997) for central deposit; yet
the other communities still weren't depositing. Then came OAI (1999),
opening up a new, interoperable way to do local depositing, so we created
the generic OAI-IR software (2000), and IRs caught on, globally, yet their
contents were still not growing. Then came Green OA mandates, they worked,
and it became obvious that they were the way to systematically cover all
research output, from all institutions, in all disciplines. So I'm afraid
it was those empirical facts that made me change my mind, Tom, not your
preference for local deposit in 1996, nor your preference for central
deposit in 2008. I am afraid that -- not for the first time -- I was,
in that public posting to which you allude, giving rather more credit
than credit was due. I've done worse. I've fatuously portrayed myself
as playing John the Baptist to someone else's Messiah. I confess to an
occasional weakness for hyperbole and even bathos, but not too often.
Mostly it's the facts and reasoning that prevail...)

  But now Tom seems so comfortable with the continuing spontaneous
  deposit rate of economists
 
  Where is your evidence for this? I am not 

Re: How to Compare IRs and CRs

2008-02-09 Thread Stevan Harnad
On Sat, 9 Feb 2008, dwoj...@hughes.net wrote:

 Steve, I am concerned when you say the following --
  It's from the local repositories that the local produce can then be
  harvested (the limitations of a mixed metaphor!) to some central
  site, if desired, or just straight to an indexer like Google Scholar
  or Citebase.
 
 OA in 10's of 1,000's of IRs is virtually worthless without some very
 good, central, global, search capability. How to build this capability
 is far from clear.
 
 David Wojick
 http://www.osti.gov

The answer is as simple as it is certain: OA's problem today is *content*
not *search*.

What is missing is 85+% of OA's target content (2.5M annual articles
in 25K peer-reviewed journals), not the means of searching it! Current
search power -- both implemented and under development -- is orders of
magnitude richer than the OA database for which it is intended.

Figure out a way to fill all the world's university IRs with 100% of
their annual article output, and the rest is a piece of cake.

Keep fussing about the dessert when there's still no main course, and
you have a recipe for prolonging the hunger of your esteemed guests
even longer than they've already endured it (for over a decade and a
half to date).

(The way is already figured out, by the way: it's the institutional
Green OA Self-Archiving Mandate. What still needs effort is getting the
universities to go ahead and adopt it, instead of waiting passively,
while fussing about preservation, copyright, publishing reform
-- and improved search engines!)

Stevan Harnad
AMERICAN SCIENTIST OPEN ACCESS FORUM:
http://amsci-forum.amsci.org/archives/American-Scientist-Open-Access-Forum.html
http://www.ecs.soton.ac.uk/~harnad/Hypermail/Amsci/

UNIVERSITIES and RESEARCH FUNDERS:
If you have adopted or plan to adopt a policy of providing Open Access
to your own research article output, please describe your policy at:
http://www.eprints.org/signup/sign.php
http://openaccess.eprints.org/index.php?/archives/71-guid.html
http://openaccess.eprints.org/index.php?/archives/136-guid.html

OPEN-ACCESS-PROVISION POLICY:
BOAI-1 (Green): Publish your article in a suitable toll-access journal
http://romeo.eprints.org/
OR
BOAI-2 (Gold): Publish your article in an open-access journal if/when
a suitable one exists.
http://www.doaj.org/
AND
in BOTH cases self-archive a supplementary version of your article
in your own institutional repository.
http://www.eprints.org/self-faq/
http://archives.eprints.org/
http://openaccess.eprints.org/