Re: Central vs. Distributed Archives

2003-11-15 Thread Stevan Harnad
   PubMed Central will host individual OA articles

   PubMed Central http://www.pubmedcentral.gov/index.html
   has launched an About Open Access page
   http://www.pubmedcentral.gov/about/openaccess.html drawing attention
   to the journals that provide open access to their contents through
   PMC. The page also announces an important new policy: [I]n October
   2003, PMC began accepting individual open access articles from
   journals that do not participate in PMC on a routine basis. For
   the specific conditions under which PMC accepts these articles,
   see the relevant PMC agreement (in Microsoft Word format)
   http://www.pubmedcentral.gov/pmcdoc/pmc-openaccs-agree.doc.
   The offer is open to all authors in the life sciences
   willing to release their work to open access as
   defined by the Bethesda Statement on Open Access Publishing
   http://www.earlham.edu/~peters/fos/bethesda.htm. (Thanks to George
   Porter.) Posted to Open Access News 12 November 2003 by Peter Suber
http://www.earlham.edu/~peters/fos/2003_11_09_fosblogarchive.html#a106866889488739033

Relevant Prior Subject Threads:

E-Biomed: Very important NIH Proposal
http://www.ecs.soton.ac.uk/~harnad/Hypermail/Amsci/0240.html
http://www.nih.gov/about/director/ebiomed/com0509.htm

NIH's Public Archive for the Refereed Literature: PUBMED CENTRAL
http://www.ecs.soton.ac.uk/~harnad/Hypermail/Amsci/0372.html

Just two comments:

(1) More central open-access archives in which authors can self-archive
their articles are always welcome and helpful (especially if they are
OAI-interoperable), and it is gratifying to see that what was originally
the E-Biomed proposal -- which at first unfortunately backed away from
individual author self-archiving of toll-access journal articles -- is
now ready to accept author self-archiving at last!

It has to be added, though, that since 1999, with the advent
of distributed eprint archiving, integrated by the glue of
OAI-interoperability http://www.openarchives.org/ , it has become
apparent that institutional self-archiving is a more promising route
than central self-archiving, because researchers and their institutions
share the benefits of maximizing the impact of their own research output,
and share the costs of impact-loss caused by toll-based access-denial
to would-be users everywhere. Institutions also wield the carrot/stick
of publish or perish over their own researchers and are hence
in the position to mandate and monitor compliance with their own
self-archiving policy. Central archives share no such common costs/benefits
with researchers, and are not in a position to mandate self-archiving
or to monitor compliance.
http://www.ecs.soton.ac.uk/~harnad/Temp/archpolnew.html
http://www.ecs.soton.ac.uk/~harnad/Temp/self-archiving_files/Slide0043.gif
http://www.ecs.soton.ac.uk/~harnad/Temp/self-archiving_files/Slide0023.gif
http://www.ecs.soton.ac.uk/~harnad/Temp/self-archiving_files/Slide0044.gif
http://www.ecs.soton.ac.uk/~harnad/Temp/self-archiving_files/Slide0005.gif
http://www.ecs.soton.ac.uk/~harnad/Temp/self-archiving_files/Slide0006.gif
http://www.ecs.soton.ac.uk/~harnad/Temp/self-archiving_files/Slide0013.gif
http://www.ecs.soton.ac.uk/~harnad/Temp/self-archiving_files/Slide0015.gif
http://www.ecs.soton.ac.uk/~harnad/Temp/self-archiving_files/Slide0016.gif
http://www.ecs.soton.ac.uk/~harnad/Temp/self-archiving_files/Slide0018.gif
http://www.ecs.soton.ac.uk/~harnad/Temp/self-archiving_files/Slide0022.gif

(2) The Bethesda statement on open access publishing
http://www.ecs.soton.ac.uk/~harnad/Hypermail/Amsci/2878.html
is indeed a statement on open-access *publishing* and not on *open access,*
i.e., only on the golden and not the green (self-archiving) road to open access.
http://www.ecs.soton.ac.uk/~harnad/Hypermail/Amsci/3147.html

It is a potentially useful document, but only if this one-sidedness
is conscientiously and decisively remedied, for as it stands, the
Bethesda Statement is simply missing out on 95% of the immediate
potential for open access. (In addition, the Bethesda definition of
open access is over-determined, again because of its one-sided focus on
open-access journal publishing alone. All that research
and researchers need is free online full-text access to
all research; the rest comes automatically with the online
territory: See the subject-thread: Free Access vs. Open Access
http://www.ecs.soton.ac.uk/~harnad/Hypermail/Amsci/2956.html )

http://www.ecs.soton.ac.uk/~harnad/Temp/self-archiving_files/Slide0021.gif
http://www.ecs.soton.ac.uk/~harnad/Temp/self-archiving_files/Slide0024.gif
http://www.ecs.soton.ac.uk/~harnad/Temp/self-archiving_files/Slide0026.gif
http://www.ecs.soton.ac.uk/~harnad/Temp/self-archiving_files/Slide0027.gif
http://www.ecs.soton.ac.uk/~harnad/Temp/self-archiving_files/Slide0028.gif
http://www.ecs.soton.ac.uk/~harnad/Temp/self-archiving_files/Slide0029.gif

Stevan Harnad

NOTE: Complete archive of the ongoing discussion of providing open
access to the peer-reviewed research literature online is available at
the American Scientist September Forum (98  99  00  01  02  03):

http://amsci-forum.amsci.org/archives/American-Scientist-Open-Access-Forum.html
http://www.ecs.soton.ac.uk/~harnad/Hypermail/Amsci/index.html

Re: Central vs. Distributed Archives

2003-11-06 Thread Stevan Harnad
Yet another piece of evidence has appeared that seems to confirm that
whereas central archiving was historically the way in which self-archiving
began, it is not the fastest or best form for it to grow and spread today:

The Nature headline is (as usual for the press) an exaggeration:

Critical comments threaten to open libel floodgate for physics
archive

http://www.nature.com/cgi-taf/Dynapage.taf?file=/nature/journal/v426/n6962/full/426007b_fs.html

and so is SciDevNet's:

Legal concerns plague open access physics archive
http://www.scidev.net/news/index.cfm?fuseaction=readnewsitemid=1087language=1

but the facts seem to be that, across the years, some papers that
contained plagiarism or libel might have found their way into ArXiv's vast
(250,000 papers) and unvetted collection.  http://www.arxiv.org

I said unvetted, but of course almost all those papers are
also submitted to peer-reviewed journals, which *do* vet them,
and when there have been any corrections to the unrefereed
preprint, the authors self-archive the refereed postprint too:
http://opcit.eprints.org/tdb198/opcit/

So the (tiny) problem of plagiarism and libel is with papers that have
*not* been peer-reviewed.

ArXiv can make an effort to vet its daily submissions for plagiarism or
libel, but at nearly 4000 per month, this would be quite a task:
http://arxiv.org/show_monthly_submissions

So the natural conclusions to draw from this seem to be the following:

(1) OAI-interoperability has now made all OAI-compliant archives
equivalent: They can all be harvested and jointly searched. It no
longer makes any difference which archive a paper is actually deposited
in: http://oaister.umdl.umich.edu/o/oaister/
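
As a concrete illustration of that interoperability, here is a minimal
sketch of an OAI-PMH harvest (the endpoint URL below is a placeholder,
not a real archive's address; substitute any repository's actual OAI-PMH
base URL):

    # Minimal OAI-PMH harvesting sketch; the endpoint below is hypothetical.
    import urllib.request
    import xml.etree.ElementTree as ET

    BASE_URL = "https://example.org/oai2"   # placeholder OAI-PMH endpoint
    OAI = "{http://www.openarchives.org/OAI/2.0/}"
    DC = "{http://purl.org/dc/elements/1.1/}"

    def list_titles(base_url):
        # ListRecords with the mandatory Dublin Core (oai_dc) metadata format
        url = base_url + "?verb=ListRecords&metadataPrefix=oai_dc"
        with urllib.request.urlopen(url) as response:
            tree = ET.parse(response)
        # Each <record> carries Dublin Core metadata; print the titles found.
        for record in tree.iter(OAI + "record"):
            for title in record.iter(DC + "title"):
                print(title.text)

    list_titles(BASE_URL)

The same few lines work against any compliant archive, which is why it no
longer matters, for searching, where a given paper happens to be deposited.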

(2) Not only are institutions in the best position to vet their own
research output before approving deposits in their own institutional
archives (probably on a departmental basis, optimally)
http://www.ecs.soton.ac.uk/~harnad/Temp/archpolnew.html
but this vetting load is much better shouldered in a distributed way,
rather than having one centralized vetter for all of the planet's research
output (in physics, mathematics, or other disciplines).

(3) Having institutionally self-archived research output housed in the
institution's own archives also immunizes the archive against external
liabilities (such as plagiarizers from other institutions), but it also
makes it even more clear that -- contrary to what the Nature article
says it is, and perhaps contrary even to what the Physics ArXiv *thinks*
it is -- open-access archives are not *publishers*! They are merely a
means of providing open access to (refereed) publications (as well as
to their precursor unrefereed preprints).

Garfield: 'Acknowledged Self-Archiving is Not Prior Publication'
http://www.ecs.soton.ac.uk/~harnad/Hypermail/Amsci/2239.html

For those who needed a reminder of it, research's publish or perish
mandate is *not* self-archive or perish! Publication refers to
certification as having met the known peer-review quality standards of
a journal, not to having pressed the click button to self-archive an
unrefereed draft in an open-access archive! That meets the (trivial)
legal definition of publishing, to be sure -- even hand-writing it
on paper once and showing it to someone does! But it certainly doesn't
meet the definition of what the research community (and promotion/salary
committees, and research-funding councils) means by publication,
which is to be certified by a qualified, neutral third-party as having
met its known standards of peer review. At best, the self-archiving
of an unrefereed draft qualifies as vanity-press *self-publication* --
but that is precisely what researchers' institutions and their publish
or perish mandates are there in order to *protect* their researchers
from doing! (Or rather, to ensure that they go on to get their papers
properly peer-reviewed and certified as having met the peer-review
standards of the particular journal that accepted the paper.)

By the same token, it is each researcher's own institution -- not a
centralized entity like ArXiv -- that is in the best position to prevent
its own researchers (and themselves) from self-archiving plagiarized or
libellous papers -- and to take action if they do.

Having said that, the Physics ArXiv's legal concerns are all a tempest
in a teapot anyway. A central archive is a service provider. The service
it provides is to operate an archive for authors to self-archive in. If
an author self-archives a piece of plagiarism or libel therein, the only
legal responsibility of the archive is to *remove* that item as soon as
it is drawn to its attention. This is exactly the same rule as the one
applied to other Internet service providers: If someone posts or emails
pornography in an AOL discussion list or bulletin board, AOL does not
become liable as a pornographer if it immediately removes the item
as soon as it is drawn to its attention and blocks further postings
from the poster. (The poster, 

Re: Central vs. Distributed Archives

2003-11-06 Thread Eberhard R. Hilf
I agree with Stevan: ArXiv just needs a note clarifying that it is only
a time stamp and archiving machine, and takes no legal responsibility
for its content because it does not 'read the content' (as referees
do). It acts as a gateway provider. So the risk stays with the author.

Within-arxiv plagiarism can easily be checked within the
arxiv. Plagiarized papers will have a later time stamp, and thus the
original author can be spotted and the later one(s) blamed.

In contrast, scientific journals, whose service is to 'read, referee, and
check the content of the paper' and which take ownership of it, are
responsible if the paper turns out to be plagiarized.

So journal publishers run a real legal risk insofar as they do not check
for plagiarism -- and they would have to check this across all journals of
all publishers, since they certify the work as new.

The Schoen case and many others confirm that plagiarism in the e-age is a
real and formidable problem, because it is so easy to do. Plagiarism only
seemed to be rare because it was not checked by the journals.

A still more widespread abuse is self-plagiarism: copy-and-pasting from
one's own older papers. Easy, and 'legal', but misconduct by the
author from the standpoint of the reader.

http://www.iupap.org lists the recent London conference on plagiarism and
misconduct by authors, referees, and journal editors.

Ebs

.
Eberhard R. Hilf, Dr. Prof.;
CEO (Geschaeftsfuehrer)
Institute for Science Networking Oldenburg GmbH
an der Carl von Ossietzky Universitaet
Ammerlaender Heerstr.121; D-26129 Oldenburg
ISN-home: http://www.isn-oldenburg.de/
homepage: http://isn-oldenburg.de/~hilf
email   : h...@isn-oldenburg.de
tel : +49-441-798-2884
fax : +49-441-798-5851

On Thu, 6 Nov 2003, Stevan Harnad wrote:

 Yet another piece of evidence has appeared that seems to confirm that
 whereas central archiving was historically the way in which self-archiving
 began, it is not the fastest or best form for it to grow and spread today:

 The Nature headline is (as usual for the press) an exaggeration:

 Critical comments threaten to open libel floodgate for physics archive

http://www.nature.com/cgi-taf/Dynapage.taf?file=/nature/journal/v426/n6962/full/426007b_fs.html

 Legal concerns plague open access physics archive
 http://www.scidev.net/news/index.cfm?fuseaction=readnewsitemid=1087language=1

 but the facts seem to be that, across the years, some papers that
 contained plagiarism or libel might have found their way into ArXiv's vast
 (250,000 papers) and unvetted collection.  http://www.arxiv.org

 I said unvetted, but of course almost all those papers are
 also submitted to peer-reviewed journals, which *do* vet them,
 and when there have been any corrections to the unrefereed
 preprint, the authors self-archive the refereed postprint too:
 http://opcit.eprints.org/tdb198/opcit/

 So the (tiny) problem of plagiarism and libel is with papers that have
 *not* been peer-reviewed.

 ArXiv can make an effort to vet its daily submissions for plagiarism or
 libel, but at nearly 4000 per month, this would be quite a task:
 http://arxiv.org/show_monthly_submissions

 So the natural conclusions to draw from this seem to be the following:

 (1) OAI-interoperability has now made all OAI-compliant archives
 equivalent: They can all be harvested and jointly searched. It no
 longer makes any difference which archive a paper is actually deposited
 in: http://oaister.umdl.umich.edu/o/oaister/

 (2) Not only are institutions in the best position to vet their own
 research output before approving deposits in their own institutional
 archives (probably on a departmental basis, optimally)
 http://www.ecs.soton.ac.uk/~harnad/Temp/archpolnew.html
 but this vetting load is much better shouldered in a distributed way,
 rather than having one centralized vettor for all of the planet's research
 output (in physics, mathematics, or other disciplines).

 (3) Having institutional self-archived research output housed in the
 institution's own archives also immunizes the archive from external
 liabilities (such as plagiarizers from other institutions) but it also
 makes it even more clear that -- contrary to what the Nature article
 says it is, and perhaps contrary even to what the Physics ArXiv *thinks*
 it is -- open-access archives are not *publishers*! They are merely a
 means of providing open access to (refereed) publications (as well as
 to their precursor unrefereed preprints).

 Garfield: 'Acknowledged Self-Archiving is Not Prior Publication'
 http://www.ecs.soton.ac.uk/~harnad/Hypermail/Amsci/2239.html

 For those who needed a reminder of it, research's publish or perish
 mandate is *not* self-archive or perish! Publication refers to
 certification as having met the known peer-review quality standards of
 a journal, not to having pressed the click button to self-archive an
 unrefereed draft in an open-access archive! That meets the (trivial)
 legal 

Re: Central vs. Distributed Archives

2003-10-31 Thread Stevan Harnad
  Trends in Self-Posting of Research Material Online by Academic Staff
   Theo Andrew supplies a case study from the University of Edinburgh.
http://www.ariadne.ac.uk/issue37/andrew/

This is a survey preceding a series of SHERPA eprint self-archiving
projects http://www.sherpa.ac.uk/ to be implemented at Edinburgh.

Prior to the implementation of these projects at the University of
Edinburgh, it was decided that a baseline survey of research material
already held on departmental and personal Web pages in the ed.ac.uk
domain [would be undertaken].

The main conclusion of this advance survey was that:

(1) an unexpectedly high volume of research material (over 1000
peer-reviewed journal articles) exists online in the ed.ac.uk domain

and

(2) there is a direct correlation between willingness to self-archive
and the [prior] existence of subject-based [non-Edinburgh]
repositories

It is perhaps unsurprising that the Edinburgh disciplines that are the
most advanced in self-archiving are the ones that are also most advanced
globally, having their own central, discipline-based archives (elsewhere).
That said, 1000 is still a small number (relative to Edinburgh's annual
output), and now going on to establish departmental eprint archives at
Edinburgh will further promote self-archiving at Edinburgh, especially
if Edinburgh and the UK Research Funding Councils adopt a systematic
open-access policy along the lines of the Berlin Declaration:

http://www.ecs.soton.ac.uk/~harnad/Temp/berlin.htm
http://www.ecs.soton.ac.uk/~harnad/Temp/archpolnew.html
http://www.eprints.org/self-faq/#institution-facilitate-filling
http://www.ariadne.ac.uk/issue35/harnad/

The article goes on to note:

The big problem is that this material is widely dispersed and
therefore not easily found. This is not very useful for the wider
dissemination of scholarly work. Also, personal Web sites tend to
be ephemeral...

This refers to the 1000 articles self-archived at Edinburgh *before* the
forthcoming Edinburgh eprint archives are implemented. The upcoming
archives will presumably be OAI-compliant -- http://www.openarchives.org
-- thereby solving the problem of dispersal and interoperability that
besets arbitrary websites.

As these self-archived articles will be duplicates of the published
version, self-archived in order to provide immediate open access, the
primary preservation problem will not be theirs; it will be the problem of
the producers and purchasers of the publishers' proprietary version. The
self-archived versions in the Physics ArXiv, for example, have
lasted twelve years now, and been successfully retrofitted for
OAI-compliance. There is every reason to believe that the growth of
self-archived content itself will be the best guarantor that we will
see for its perennity.

Oddly, there is no reference in this article to Edinburgh's own
most important existing eprint archive, already OAI-compliant,
and containing 10% of Edinburgh's current self-archived articles:
http://archive.ling.ed.ac.uk/ (There seems to be some confusion
of its contents with those of a non-Edinburgh archive --
http://cogprints.ecs.soton.ac.uk/ -- which overlaps with it in subject
matter).

There is also no reference to any prior usage surveys, such as:
http://www.eprints.org/results/
http://opcit.eprints.org/opcitevaluation.shtml

It is unfortunate that the title refers to self-posting whereas the text
itself uses the more widely adopted term self-archiving throughout: Why
proliferate needless and confusing synonyms? [The title may have been
an unwise editorial suggestion that the author should have declined!]

Stevan Harnad

NOTE: Complete archive of the ongoing discussion of providing open
access to the peer-reviewed research literature online is available at
the American Scientist September Forum (98  99  00  01  02  03):

http://amsci-forum.amsci.org/archives/American-Scientist-Open-Access-Forum.html
http://www.ecs.soton.ac.uk/~harnad/Hypermail/Amsci/index.html
Posted discussion to: american-scientist-open-access-fo...@amsci.org

Dual Open-Access Strategy:
BOAI-2: Publish your article in a suitable open-access journal
whenever one exists.
BOAI-1: Otherwise, publish your article in a suitable toll-access
journal and also self-archive it.
http://www.soros.org/openaccess/read.shtml
http://www.eprints.org/signup/sign.php


Re: Central vs. Distributed Archives

2003-10-31 Thread Dr.Vinod Scaria
Stevan Harnad wrote:

 Just as it was counterproductive to vilify toll-access publishers
 (instead of either founding open-access journals or self-archiving),
 so it is counterproductive to vilify open-access publishers (instead
 of either founding competing open-access journals or self-archiving).

It is also counterproductive to ignore the authors from the developing
world, who have always been kept away from the mainstream.
I am not against the author pays model, but only against the lack of
flexibility in its operation. The majority of researchers in developing
countries have never had the luxury of being funded. [Our own study
(unpublished) of authors publishing in top Indian journals indexed in
MEDLINE shows that more than 90% have had no funding for their research,
and those who did had something like a minuscule fraction of what is
considered *funding* in the developed countries.] This would simply mean
they would never be able to pay from their funds!

There could be other viable models -- like paying a fixed percentage of
funds for publishing. This would also be more palatable to researchers.
It would also mean publishers could easily subsidize research from
developing countries, as well as researchers from developed countries who
are not funded.

 So is the monopolistic objection that BMC and PLoS have more start-up
 support, giving them an advantage over journals without that support,
 or is the objection that they have an author pays model, unaffordable
 for some authors?

The heavy start-up support gives them a clear edge over new
and existing publishers. PLoS Biology would not have received
the popularity and access it did without that support [the traffic
nearly brought down their elegant homepage to just a couple of links
on the day of inauguration]. And the PLoS funds would have been better
used to support lobbying --
http://bmj.bmjjournals.com/cgi/content/full/326/7392/766#art -- rather
than entering into a head-to-head fight with existing publishers. If it was
really interested in supporting Open Access, it should have supported
Journal of Biology, an Open Access Journal from BMC.

 And the same can be said about volunteer-service-based journals:
 It is too early to say whether they can last on volunteerism alone,
 let alone whether volunteerism can scale up to all 24,000 refereed
 journals!

Just imagine the scalability if the Internet were monopolised by
some company! The whole spectrum of resources we access with a
click was created by volunteerism, donations and public money.
Does PubMed/PubMedCentral make any profit?

 Perhaps a far better choice would have been to require all your authors
 to (1) try to self-archive their articles at their own institutions, and
 only in those cases where that failed, (2) to self-archive them in
 CogPrints or another suitable OAI-compliant archive. Offloading the
 self-archiving task onto the distributed authorship instead of the
 journal staff would take some of the load off the volunteer efforts
 (hence costs) involved!

 That policy would also have the benefit of spreading the practise of
 self-archiving by authors, as well as archive-provision by their
 institutions.

And yes! We actually plan to provide the authors with PDF reprints which
they could archive on their own. We did it ourselves just because we
needed to see the whole thing get started. We are also encouraging authors
to republish them on their institutional websites/repositories or their
own websites, in addition to our existing archive at Cogprints.

 These are the vulnerabilities of new journals; they have nothing to do
 with open-access.

The sudden disappearance of a journal website would not be so desperate
a situation if the journal were open access, because someone would have
copied it somewhere. [Some of the JMIR articles are available at
http://www.cybermedicine.netfirms.com [I own and maintain this site] after it
became open. I have also seen a number of similar websites offering JMIR
content.] This would mean one could access it just by searching for the
keywords on Google or any major search engine. That would not be the
situation for a journal which is toll-access.

Dr. Vinod Scaria
http://www.drvinod.netfirms.com
MAIL:vinodsca...@yahoo.co.in
Tel: +91 98474 65452


Re: Central vs. Distributed Archives

2003-10-30 Thread Stevan Harnad
 Online Journal of Health and Allied Sciences www.ojhas.org, India's first
 Online BioMedical journal, declared a couple of months back that they
 would go Open.

 [I am in the Editorial board of OJHAS from Sept 2003]. OJHAS is
 edited and published by a small group of scholars with no external
 support. Everything from Web Design to Editing and Review is done
 voluntarily by the Editorial team. It also stands as a fine example of
 the fact that Open Access Journals can indeed be successfully organised
 and can indeed survive without an author pays model.

There exists no firm evidence at all at the moment as to whether or
not Open Access journals can survive, with or without an author pays
model. Subsidized journals are subsidized journals: their survival depends
on the survival of the subsidy, not of the journal itself. Author pays journals have
been around for far too short a time for us to know whether they can
survive. And the same can be said about volunteer-service-based journals:
It is too early to say whether they can last on volunteerism alone,
let alone whether volunteerism can scale up to all 24,000 refereed
journals!

 Now coming to the Archival, Cogprints was our first choice for many reasons

 1] It offers interoperability [as mentioned by Harnad]
 2] It offers unmatched popularity
 3] It has been there for years and we can be sure of the permanence
 4] It is of course FREE.

Perhaps a far better choice would have been to require all your authors
to (1) try to self-archive their articles at their own institutions, and
only in those cases where that failed, (2) to self-archive them in
CogPrints or another suitable OAI-compliant archive. Offloading the
self-archiving task onto the distributed authorship instead of the
journal staff would take some of the load off the volunteer efforts
(hence costs) involved!

That policy would also have the benefit of spreading the practise of
self-archiving by authors, as well as archive-provision by their
institutions.

 And as Harnad suggested, there is no reason why Journals should not
 be archived at Open Archives, be it self maintained repositories or
 Centralised ones. In fact Open Archiving of electronic journals is
 the need of the hour because our own studies [unpublished] show that
 Electronic journals are just as ephemeral as websites. Scholarly
 communication should never be lost at the cost of copyright
 restrictions. Many of these journals have perhaps done more harm than
 good by locking the access by copyright restrictions.

This is too vague: For toll-access journals, the preservation burden for
their contents (both the paper version and the online version) is
squarely on the shoulders of the journals that sell them and the
libraries that buy them. The self-archived versions of toll-access
journal articles are merely *duplicates,* provided for access, and it is
a strategic mistake to make an issue of concerns about their long-term
preservation. Those duplicates have lasted over 12 years already and
they will continue to last long enough to be retrofitted with whatever
solution the open-access era may eventually generate, if/when it prevails.

But the fact that new journals (whether paper or online) come and go is
a different problem. Journals should be archival in the sense that they
continue to exist. If they just make an appearance for a few months or
years and then vanish, then they are merely scattered collections of
items, and the preservation of such orphan items is a problem independent
of the problem of open access.

 Moreover, electronic journals are equally vulnerable to the vagaries
 of the Internet. For example, JMIR www.jmir.org went suddenly offline
 some time back [I think it was a year or so] making the whole content
 inaccessible. [But it reappeared later and now is an Open Access Journal].

These are the vulnerabilities of new journals; they have nothing to do
with open-access.

 Thus in short, OPen Archiving of Journals as a whole is perhaps to be
 discussed in a wider perspective than just making it OPEN. The major
 emphasis should be the PERMANENCE of Open Archiving. I hope this post will
 surely trigger a debate on the topic.

Preservation and access are -- for the time being -- very different
matters. The pressing problem for authors of the toll-access literature
today is access-denial and impact-loss, not preservation. It is a
mistake to conflate the open access problem with the digital preservation
problem, and it helps neither open access nor digital preservation.

Stevan Harnad

 Kind regards

 Dr. Vinod Scaria
 Executive Editor: Calicut Medical Journal
 Assoc Editor: Online Journal of Health and Allied Sciences
 Editor in Chief: Internet He@lth

 WEB: www.drvinod.netfirms.com
 MAIL: vinodscaria@yahoo.co.in
 Mobile: +91 98474 65452

 - Original Message -
 From: Stevan Harnad
 To: AMERICAN-SCIENTIST-OPEN-ACCESS-FORUM@LISTSERVER.SIGMAXI.ORG
 Sent: Wednesday, October 29, 2003 3:38 AM
 Subject: Re: Central vs. Distributed Archives

 The two items that follow below are by Vinod Scario from Peter Suber's
 Open Access News http://www

Re: Central vs. Distributed Archives

2003-10-30 Thread Michael Eisen
I would like you to defend your claim that PLoS is crunching small
publishers. Can you provide an example?

- Original Message -
From: Dr. Vinod Scaria drvi...@hotpop.com
To: american-scientist-open-access-fo...@listserver.sigmaxi.org
Sent: Thursday, October 30, 2003 9:07 AM
Subject: Re: Central vs. Distributed Archives


 CALICUT MEDICAL JOURNAL
  http://www.calicutmedicaljournal.org
 ARCHIVES AT COGPRINTS
 ***

 As we all know, Open Access Publishing is not gaining the momentum as
 far as Journals published from Developing Countries are concerned [with
 reference to western Journals]. Many reasons can be attributed like:

 1. Monopolistic nature of Open Access Publishers like BioMedCentral
 http://www.biomedcentral.com which pursues the author pays model
 and would drive away any author from Developing countries. Thus,
 obviously, publishers from Developing countries would have second
 thoughts before starting one at BMC.

 By monopolistic, I refer to the almost complete control over open
 access publishing -- say about 75% of open Access Journals in Medicine. And
 mega organisations like PLoS are crunching the small publishers, as they
 can easily override the smaller ones with the mega funding they have.
 see: http://bmj.bmjjournals.com/cgi/content/full/326/7392/766#art

 2. As I previously stated in my Editorial in Internet Health --
 www.virtualmed.netfirms.com/internethealth/articleapril03.html --
 the fear of losing revenue, which is the sole source of sustenance
 of many Journals [though some make a meagre profit].

 3. Lack of sufficient expertise and
 exposure to Open Access Publishing. 
 www.virtualmed.netfirms.com/internethealth/opinion0303.html
 http://bmj.com/cgi/eletters/326/7382/182/b

 But recent developments are worth mentioning - at least from India. Online
 Journal of Health and Allied Sciences www.ojhas.org, India's first
 Online BioMedical journal declared a couple of months back that they
 would go Open.

 [I am in the Editorial board of OJHAS from Sept 2003]. OJHAS is
 edited and published by a small group of scholars with no external
 support. Everything from Web Design to Editing and Review is done
 voluntarily by the Editorial team. It also stands as a fine example of
 the fact that Open Access Journals can indeed be successfully organised
 and can indeed survive without an author pays model.

 Now coming to the Archival, Cogprints was our first choice for many
reasons

 1] It offers interoperability [as mentioned by Harnad]
 2] It offers unmatched popularity
 3] It has been there for years and we can be sure of the permanence
 4] It is of course FREE.

 And as Harnad suggested, there is no reason why Journals should not
 be archived at Open Archives, be it self maintained repositories or
 Centralised ones. In fact Open Archiving of electronic journals is
 the need of the hour because our own studies [unpublished] show that
 Electronic journals are just as ephemeral as websites. Scholarly
 communication should never be lost at the cost of copyright
 restrictions. Many of these journals have perhaps done more harm than
 good by locking the access by copyright restrictions.

 Moreover, electronic journals are equally vulnerable to the vagaries
 of the Internet. For example, JMIR www.jmir.org went suddenly offline
 some time back [I think it was a year or so] making the whole content
 inaccessible. [But it reappeared later and now is an Open Access Journal].

 Thus in short, OPen Archiving of Journals as a whole is perhaps to be
 discussed in a wider perspective than just making it OPEN. The major
 emphasis should be the PERMANENCE of Open Archiving. I hope this post will
 surely trigger a debate on the topic.

 Kind regards

 Dr. Vinod Scaria
 Executive Editor: Calicut Medical Journal
 Assoc Editor: Online Journal of Health and Allied Sciences
 Editor in Chief: Internet He@lth

 WEB: www.drvinod.netfirms.com
 MAIL: vinodscaria@yahoo.co.in
 Mobile: +91 98474 65452

 - Original Message -
 From: Stevan Harnad
 To: AMERICAN-SCIENTIST-OPEN-ACCESS-FORUM@LISTSERVER.SIGMAXI.ORG
 Sent: Wednesday, October 29, 2003 3:38 AM
 Subject: Re: Central vs. Distributed Archives

 The two items that follow below are by Vinod Scaria from Peter Suber's
 Open Access News http://www.earlham.edu/~peters/fos/fosblog.html

 It provides an interesting and inspiring example of the power
 and value of OAI-interoperability http://www.openarchives.org/
 and the interdependence of the two open-access strategies (open-access
 self-archiving and open-access journal publishing) that this new online
 open-access journal, produced in India, is being made accessible
 by archiving it http://calicutmedicaljournal.org/archives.html
 in a specially created sector of CogPrints in the UK,
 http://cogprints.ecs.soton.ac.uk/view/subjects/JOURNALS.html

Re: Central vs. Distributed Archives

2003-10-28 Thread Stevan Harnad
The two items that follow below are by Vinod Scaria from Peter Suber's
Open Access News http://www.earlham.edu/~peters/fos/fosblog.html

It provides an interesting and inspiring example of the power
and value of OAI-interoperability http://www.openarchives.org/
and of the interdependence of the two open-access strategies (open-access
self-archiving and open-access journal publishing): this new online
open-access journal, produced in India, is being made accessible
by archiving it http://calicutmedicaljournal.org/archives.html
in a specially created sector of CogPrints in the UK,
http://cogprints.ecs.soton.ac.uk/view/subjects/JOURNALS.html
a multidisciplinary central archive created in 1997 for author
self-archiving (which is now being done more via distributed institutional
eprint archives -- to which the CogPrints software was adapted by Rob
Tansley, creator of eprints http://software.eprints.org/#ep2 and then
of dspace http://www.dspace.org/ -- rather than via central ones like
CogPrints). Yet there is no reason a central archive like CogPrints (or,
for that matter, any of the distributed institutional archives) cannot
provide a locus for open-access journals too! OAI-interoperability
means that they will all be picked up and integrated by cross-archive
harvesters like OAIster! http://oaister.umdl.umich.edu/o/oaister/

-

1. The Editorial of the Inaugural issue of Calicut Medical
Journal- Online, open access journals: the only hope for the future
http://calicutmedicaljournal.org/2003;1(1)e1.htm discusses in detail how
and why Calicut Medical Journal supports the Open Access initiatives.In
his editorial, Dr Ramachandran, stresses the need to disseminate knowledge
in the widest possible sphere, and especially between scholars of other
developing countries and asserts that Open Access is the best possible
solution to achieve this goal.The Editorial also criticises the widely
publicised  author pays model as discouraging for scholars from
developing world and states it would badly affect the already low level
of publications from these countries. It also discusses the various
advantages of being Online and Open. He also asserts the need for more
regional Open Access Journals to meet the specific demands of scholars
and clinicians and for the maintenance and enhancement of the quality of
health services.The editorial concludes with the statement that Calicut
Medical Journal would play a dual role - being International by being
online ,Open and upholding the highest standards of publication,and at
the same time catering to the needs of Indian Scholars and Clinicians.
Posted by Vinod Scaria at 12:27 PM.


2. The Calicut Medical Journal is Online http://calicutmedicaljournal.org/
The much awaited Calicut Medical Journal is Online. The new Open Access
BioMedical Journal published by the Calicut Medical College Alumni
Association, is the second Indian Open Access BioMedical Journal. With
new Open Access medical Journals coming up in India, existing publishers
are already feeling the heat of competition. While these two Open
Access Journals offer online acceptance of manuscripts, speedy peer
review and almost instant publication, with a host of utilities, and
of course without a price tag, other publishers are still in the dark with
their outdated modes of peer review and publication. The web statistics of
these Journals are telltale signs of the fact that Open Access Publications
are widely embraced.
an International impact, which was hitherto virtually impossible in the
conventional publishing model.
Posted by Vinod Scaria at 12:22 PM.


Re: Central vs. Distributed Archives

2003-09-10 Thread Eberhard R. Hilf
Dear Stevan and the list members,
here are some arguments for the following:
1. All physicists will not be publishing in the ArXiv before the year 2050,
even though the arxiv's size is growing quadratically, not linearly, with time.
Earlier estimates [St. Harnad,
http://www.ecs.soton.ac.uk/~harnad/Temp/self-archiving.htm
slide 25] are to be revised.

2. Usage of repositories seems to be proportional to their size, i.e.,
the pageviews-per-document ratio is roughly independent of absolute size.
The full text can be found at
http://www.isn-oldenburg.de/~hilf/publications/arxiv-analyis.ps

   1. All physicists will not be publishing in the ArXiv before the year 2050
Here are some more elaborate, but rather audacious and risky, estimates
(P. Ginsparg would know better).

The ArXiv is unique in that it serves its own usage and submission logs.

At present (after 146 months of service) there are 246,555 documents
stored.
The monthly rate of incoming new documents is at present 3,500. It rises
linearly with time, see
http://arxiv.org/show_monthly_submissions
Next month, 24 more papers per month will be handed in than this
month.

This allows one to extrapolate the submission rate and estimate at which
future time virtually all physicists would be sending their primary papers
to the ArXiv.

Let us estimate the number of physicists worldwide to be 1,000,000,
of which 10 %
might be active as researchers, producing, say, 2 papers per year.
Then we have 200,000 primary physics papers per year.
Extrapolating the rising submission rate until it reaches that level
yields a point about 44 years and six months from now, that is, in the
year 2050.
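
A rough check of this arithmetic, using only the figures assumed above
(3,500 submissions per month now, rising by 24 per month each month,
against 200,000 physics papers per year), can be written in a few lines
of Python:

    # Rough check of the extrapolation, using the assumptions stated above.
    current_rate = 3_500          # ArXiv submissions per month (2003)
    monthly_increase = 24         # extra submissions per month, each month
    target_rate = 200_000 / 12    # assumed total physics output, papers/month

    months = (target_rate - current_rate) / monthly_increase
    years = months / 12
    print(f"about {years:.0f} years from 2003, i.e. around {2003 + years:.0f}")
    # Prints: about 46 years, i.e. around 2049 -- in line with the rough
    # "44 and a half years, i.e. the year 2050" estimate above.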


Clearly, by then we will have passed through further technical
revolutions, so this steady-state extrapolation is not likely to hold.

Other new developments may spread much more steeply, notably
self-archiving by the authors, their institutes or Universities,
and their libraries, forming a distributed net of repositories.

The advantages are scalability, flexibility, the business model
(distributed funding by the institutions of the creators of the
documents), the retaining of the authors' rights, the possibility of
updating, and the way acceptance spreads: convincing a large body such as
a learned community to set up a central service such as the ArXiv for
physics is much harder than convincing a percentage of local, distributed
institutions and institutes (the chance of clearing many small barriers
versus one large one).

The challenges are to set up the needed international standards,
to allow intelligent search engines to serve the retrieval,
and to stimulate discussion and communication among the authors --
who are known from the past to be very conservative, not very reflective
about their working habits and not very communicative about them, used
to being taken care of and to having someone else pay.

At present, the ArXiv is still unique in serving an unconditional time
stamp and long-term readability.

 Is the usage proportional to the size of a repository?
Reach-out to and satisfaction of the users of a repository may be
estimated by the ratio of pageviews per month
divided by the number of documents.

This ratio is astonishingly similar for different repositories, even
of widely different size, whether they contain documents or links:

For Marenet, with its   1,595 links, it is  1.9;
for MPIV,    with its   3,027 links, it is  3.6;
for Physnet, with its   5,759 links, it is  4.2;
for VAB,     with its   2,655 links, it is 10.4;
for ArXiv,   with its 245,056 docs,  it is 16.3.

All these numbers are astonishingly low, compared with what we know of
library usage of journals
and books.
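
Turning the same figures around (taking the ratio, as defined above, to
be pageviews per month divided by the number of items held), the implied
monthly pageviews behind each ratio can be recovered directly:

    # Back-compute the implied monthly pageviews from the ratios listed above.
    repositories = {
        # name: (number of documents or links held, pageviews-per-item ratio)
        "Marenet": (1_595, 1.9),
        "MPIV":    (3_027, 3.6),
        "Physnet": (5_759, 4.2),
        "VAB":     (2_655, 10.4),
        "ArXiv":   (245_056, 16.3),
    }
    for name, (size, ratio) in repositories.items():
        pageviews = size * ratio   # implied pageviews per month
        print(f"{name:8s} {size:8d} items, ratio {ratio:5.1f} "
              f"-> ~{pageviews:11,.0f} pageviews/month")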

Eberhard Hilf, h...@isn-oldenburg.de
Institute for Science Networking Oldenburg GmbH
at the Carl von Ossietzky University
http://www.isn-oldenburg.de

On Tue, 9 Sep 2003, Stevan Harnad wrote:

 On Mon, 8 Sep 2003, Eberhard R. Hilf wrote:

  the physics ArXiv has a linear increase of the number of papers put in per
  month, this gives a quadratic acceleration of the total content (growth
  rate of Data base), not linear.

 Maybe so. But slide 25 of
 http://www.ecs.soton.ac.uk/~harnad/Temp/self-archiving.htm (slide 25)
 still looks pretty linear to me. And it looks as if 100% was not only
 *not* reached at this rate 10 years after self-archiving started in
 physics in 1991, but it won't be reached for another 10 years or so...

  Total amount by now may be at 10-15 % of all papers in physics.

 I count that as appallingly low, considering what is so easily
 feasible (though stunningly higher than any other field!)...
 
  Linear growth of input rate means the number of physicists and fields
  using it rises, while in each field (and physicist) a saturation is
  reached after a first exponential individual rise.

 Interesting, but the relevant target is 100% of physics (and all other
 disciplines) -- yesterday!

  Never there will be a saturation such that all papers will go this way,
  since in different fields culture and habits and requirements are
  different. --

 I couldn't follow that: Never 100%? Even at this rate? I can't 

Re: Central vs. Distributed Archives

2003-09-10 Thread Stevan Harnad
Ebs Hilf -- who will host a meeting on the subject next week:
http://physnet.physik.uni-oldenburg.de/projects/SINN/sinn03/programme.html
-- confirms that the rate of growth of the biggest and oldest open-access
archive -- the Physics Arxiv -- is still far, far too slow. I entirely
agree.

This does not diminish the credit due to Arxiv for having been the
first; but now, 12 years down the road, this unchangingly slow rate
suggests that something more may be needed than what has been feeding
Arxiv across the years, and my own guess (and Ebs's) is that that
something more may well be distributed institution-based self-archiving,
instead of Arxiv's central discipline-based self-archiving.
http://www.ecs.soton.ac.uk/~harnad/Temp/archpolnew.html

The reason institutional self-archiving is more likely to speed up
self-archiving and to generalize it across disciplines is that
researchers and their institutions both share the benefits of the impact
of their research output, whereas researchers and their disciplines do
not. It is not the discipline that exercises the incentive of
the publish-or-perish carrot-and-stick on researchers, it is their
research institutions. As the co-investor in and co-beneficiary of the
rewards of research impact (research funding, overheads, reputation,
prizes) the researcher's institution is in a position to mandate not only
publish or perish but publish with maximal impact -- which means
maximal access, which means open access, which means self-archiving.
http://www.ariadne.ac.uk/issue35/harnad/

I think that on all this Ebs Hilf and I agree. Ebs too notes the likely
remedy for the sluggish growth rate of self-archiving in physics:
institutional (indeed, departmental) self-archiving. What is needed to
accelerate that is compelling empirical demonstrations of the correlation
between access and impact, to make researchers and their institutions
realize that self-archiving is in their own interest (and how much so)
-- in all disciplines.

There is, however, in Ebs's summary below, a rather important and
potentially misleading ambiguity: He conflates self-archiving with
publishing -- referring to depositing papers in Arxiv as publishing
them, in contrast to self-archiving them in institutional eprint
archives. But surely *both* of these are self-archiving and not
publishing! The publishing is done in the journals (in both cases). The
self-archiving is merely the provision of a supplementary version of
the paper, its full-text accessible online toll-free for all would-be
users webwide (in either a central discipline-based eprint archive or in
distributed institution-based eprint archives).

Both central disciplinary archives like Arxiv and distributed
institutional archives include, in addition to the all-important
peer-reviewed, published version of each article (the
postprint) also the pre-peer-review preprint version(s) and
sometimes also postpublication updated and enhanced versions
(post-postprints). But the critical version, and the one that
counts as the publication, is of course the published postprint:
http://www.ecs.soton.ac.uk/~harnad/Hypermail/Amsci/2239.html .
That (and not unpublished preprints or revisions) is what
publish-or-perish is all about!
http://www.ecs.soton.ac.uk/~harnad/Tp/resolution.htm#1.4

But apart from these minor points, I don't think Ebs and I disagree. Here
is the quote/commentary:

On Wed, 10 Sep 2003, Eberhard R. Hilf wrote:

 Dear Stevan and the list members,
 here are some arguments for
 1. All physicists will publish in the ArXiv not before the year 2050,
 although the arxiv size is growing quadratically, not linearly with time.
 Earlier estimates [St. Harnad,
 http://www.ecs.soton.ac.uk/~harnad/Temp/self-archiving.htm
 slide 25 are to be revised].
 [see http://isn-oldenburg.de/~hilf/ ]

If readers look at slide 25 above, they will find that according to
Ebs's estimate (which I accept!), it would have to be revised to extend
the projected linear growth from 2020 to 2050. According to Ebs, at the
present growth rate, 2050 would be the first year in which *all* physics
articles published in that year are self-archived in Arxiv.

But note that that's *self-archived* in Arxiv, not *published* in Arxiv:
There is absolutely no reason to believe that all those articles will
not continue (*exactly* as they all do now) being published in the
appropriate peer-reviewed journal for their area and their quality-level.
(Publication will continue to mean, as it does now, peer-review and
certification of having met that journal-name's quality standards.)

And the portion of the total annual published journal-article output in
physics that is self-archived will grow (linearly!) from
now till it reaches 100% in 2050, at exactly the same unchanging rate
at which it has been growing for 12 years now.

 2. Usage of repositories seems to be proportional to their size,
 but independent of absolute size.
 The full text you find at
 

Re: Central vs. Distributed Archives

2003-09-09 Thread Stevan Harnad
On Mon, 8 Sep 2003, Eberhard R. Hilf wrote:

 the physics ArXiv has a linear increase of the number of papers put in per
 month, this gives a quadratic acceleration of the total content (growth
 rate of Data base), not linear.

Maybe so. But slide 25 of
http://www.ecs.soton.ac.uk/~harnad/Temp/self-archiving.htm
still looks pretty linear to me. And it looks as if 100% was not only
*not* reached at this rate 10 years after self-archiving started in
physics in 1991, but it won't be reached for another 10 years or so...

 Total amount by now may be at 10-15 % of all papers in physics.

(10-15% of the annual output, I assume.)
I count that as appallingly low, considering what is so easily
feasible (though stunningly higher than any other field!)...

 Linear growth of input rate means the number of physicists and fields
 using it rises, while in each field (and physicist) a saturation is
 reached after a first exponential individual rise.

Interesting, but the relevant target is 100% of the annual output
of physics (and all other disciplines) -- yesterday!

 Never there will be a saturation such that all papers will go this way,
 since in different fields culture and habits and requirements are
 different. --

I couldn't follow that: Never 100%? Even at this rate? I can't imagine
why not. 

Cultural differences? Do any of the cultural differences between fields
correspond to indifference or antipathy toward research impact -- toward
having their research output read, used, cited? Unless the cultural
differences are specifically with respect to that, then they are
irrelevant.

Requirement differences? Are any universities or research funders
indifferent or averse to their researchers' impact? Unless they are,
any remaining requirement-differences are irrelevant. 

Habit differences? Well, yes, there are certainly those. But that is
just what this is all about *changing*! Are any field's current
access/impact practises optimal? or unalterable for some reason? If
not, then habit-change is (and always has been) the target!

And the point is that the rate of habit-change is still far too slow --
relative to what is not only possible, but easily done, and immensely
beneficial to research, researchers, etc. -- in all disciplines.

 [That is why it is e.g. best, to keep letter distribution by
 horses at a remote island (Juist) alive since the medieval times].

That I really couldn't follow! If you mean paper is still a useful back-up,
sure. But we're not talking about back-up. We are talking about open
online access, which has been reachable for at least a decade and a half
now, and OAI-interoperably since 1999. What more is the research cavalry
waiting for, before it will stoop to drink?

Stevan Harnad

NOTE: A complete archive of the ongoing discussion of providing open
access to the peer-reviewed research literature online is available at
the American Scientist September Forum (98  99  00  01  02  03):


http://amsci-forum.amsci.org/archives/American-Scientist-Open-Access-Forum.html
or
http://www.ecs.soton.ac.uk/~harnad/Hypermail/Amsci/index.html

Discussion can be posted to: american-scientist-open-access-fo...@amsci.org


Re: Central vs. Distributed Archives

2003-09-09 Thread Thomas Krichel
  Hugo Fjelsted Alrøe writes

 By community-building, I mean that such archives can contribute to the
 creation or development of the identity of a scholarly community in
 research areas that go across the established disciplinary matrix of the
 university world.

  This is crucial if self-archiving is to take off.


 I know the same thing can in principle be done with OAI-compliant
 university archives and a disciplinary hub or research area hub, and
 in ten years time, we may not be able to tell the difference. But today,
 it is still not quite the same thing.

  Correct. This is a point that is too many times overlooked.

  RePEc (see http://repec.org) provides an example of this in
  the area of economics. RePEc archives are not OAI compliant,
  but an OAI gateway exports all the RePEc data. Many RePEc
  services are in the business of community building. The
  crucial part, though, is RePEc's author registration service.



  Cheers,

  Thomas Krichel  mailto:kric...@openlib.org
  from Espoo, Finlandhttp://openlib.org/home/krichel
 RePEc:per:1965-06-05:thomas_krichel


Re: Central vs. Distributed Archives

2003-09-08 Thread Hugo Fjelsted Alrøe
Stevan Harnad wrote:
 Those are all OAI-compliant archives, and they include both central,
 discipline-based archives and distributed institutional archives. With
 OAI-interoperability, it doesn't matter which kind of OAI archive a
 paper is in, but I am promoting university archives
 http://www.eprints.org/self-faq/#institution-facilitate-filling
 http://www.eprints.org/
 rather than central ones (even though I founded a central one myself
 http://cogprints.ecs.soton.ac.uk/ ) because researchers' institutions
 (and their research funders) all share in the joint publish-or-perish
 interests (and rewards) of maximizing the impact of their research
 output. Central repositories and disciplines do not. (They are the
 common locus for research that is competing for impact.) Hence research
 institutions (and their funders) are in a position to encourage,
 facilitate, and even mandate (through an extension of the
 publish-or-perish carrot-and-stick) open-access self-archiving of their
 own research output in their own OAI archive by their researchers,
 whereas disciplines and central organizations (e.g., WTO, WHO, UNESCO)
 are not:
 http://www.ecs.soton.ac.uk/~harnad/Temp/archpolnew.html
 http://www.ariadne.ac.uk/issue35/harnad/

I think it is still too early to write off any of the possible paths to
open access within the field of self-archiving (not that you do that). I
see a potentially very fruitful role for community-building archives
that focus on certain research areas. These could be facilitated or
mandated by some of the specialized public research institutions that,
together with universities and private companies, inhabit the research
landscape. I think of research institutions oriented towards applied
research within for instance environmental research, agriculture, public
health, education, community development, etc. Here, there is a clear
two-sided research communication: towards the public and towards other
researchers in the field. Open access thus serves two communicative
purposes, improving scholarly communication and improving public access
to research results, besides the complementary purpose of institutional
self-promotion.

By community-building, I mean that such archives can contribute to the
creation or development of the identity of a scholarly community in
research areas that go across the established disciplinary matrix of the
university world. I have myself initiated an archive in research in
organic agriculture (http://orgprints.org), which we hope will become a
center for international communication and cooperation in this area.
Scientific papers from research in organic agriculture are published in
many different specialized disciplinary journals as well as in general
scientific journals and journals focused at organic agriculture, and it
is not easy for researchers to keep track of all that is being
published.

I know the same thing can in principle be done with OAI-compliant
university archives and a disciplinary hub or research area hub, and
in ten years time, we may not be able to tell the difference. But today,
it is still not quite the same thing. Contributing to the community
would be detached from the usage of what is there, since the depositing
of papers would take place somewhere outside the hub. This makes it
dependent on the widespread existence of university archives. So if one
wants to establish such an open-archive-based scholarly community hub,
the way to do it is to make an eprint archive with the scope that one
wants.

 Having said that, it is still a historical fact that the first and
 still-biggest open-access OAI archive is a central,
 discipline-based one,
 the Physics Archive founded in 1991 http://arxiv.org/. But
 Arxiv's growth
 rate has been steadily linear since 1991, and shows no sign of either
 accelerating or generalizing to all the other disciplines. So clearly
 something else was needed to hasten the open-access era, and my own
  hunch is that a concerted policy of university-based archiving was what
 was needed.
 http://www.ecs.soton.ac.uk/~harnad/Temp/self-archiving.ppt

What's wrong with linear growth? It must be the SIZE of the growth rate
that is important. And how long it will take to realize some satisfying
level of open access with this growth rate. When you are looking for
exponential growth, I take it that you are looking for something that
MIGHT turn out to have a higher maximum growth rate than, for instance,
arXiv. And that is all well, but it might be exponential and still have
a slower maximum growth than the linear growth we see in arXiv.

In the presentation that you refer to above, you write:
At that rate, it would still take a decade before we reach the first
year that all physics papers for that year are openly accessible.

I think that this is an impressive and very satisfying growth. And I
don't think that a decade is too long - the great news is that physics
is getting there!

Kind regards
Hugo Alroe, 

Re: Central vs. Distributed Archives

2003-09-08 Thread Eberhard R. Hilf
Dear Colleagues,
The physics ArXiv shows a linear increase in the number of papers deposited
per month; this gives quadratic growth of the total content (the cumulative
size of the database), not linear growth.
The total amount by now may be 10-15% of all papers in physics.
Linear growth of the input rate means that the number of physicists and
fields using it keeps rising, while within each field (and for each
physicist) saturation is reached after an initial exponential rise.
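
To make this arithmetic concrete: a minimal sketch (in Python), using purely
illustrative numbers rather than actual arXiv statistics, of how a linearly
rising monthly deposit rate yields a quadratically growing cumulative archive
while coverage of any single year's output rises only linearly.

    # Toy model: linear growth in monthly submissions gives quadratic growth
    # of the cumulative archive, but only linear growth in yearly coverage.
    # All parameters are hypothetical, chosen only for illustration.
    MONTHS = 12 * 12                 # simulate twelve years
    START_RATE = 200                 # assumed submissions in month 1
    MONTHLY_INCREMENT = 25           # assumed extra submissions each month
    FIELD_OUTPUT_PER_YEAR = 100_000  # assumed total physics papers per year

    cumulative = 0
    deposited_this_year = 0
    for month in range(1, MONTHS + 1):
        monthly_rate = START_RATE + MONTHLY_INCREMENT * (month - 1)  # linear input
        cumulative += monthly_rate                                   # quadratic total
        deposited_this_year += monthly_rate
        if month % 12 == 0:
            coverage = deposited_this_year / FIELD_OUTPUT_PER_YEAR
            print(f"year {month // 12:2d}: cumulative {cumulative:7d} papers, "
                  f"coverage of that year's output {coverage:5.1%}")
            deposited_this_year = 0

On these assumed figures, annual coverage climbs from about 4% in year one to
roughly 44% in year twelve; whether that pace is impressive or too slow is
exactly what is at issue in this exchange.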

There will never be saturation in the sense that all papers go this way,
since culture, habits and requirements differ from field to field. --
[That is also why it is best, for example, to keep letter delivery by
horse alive on a remote island (Juist), as it has been since medieval times.]
Ebs


.
Eberhard R. Hilf, Dr. Prof.;
CEO (Geschaeftsfuehrer)
Institute for Science Networking Oldenburg GmbH
an der Carl von Ossietzky Universitaet
Ammerlaender Heerstr.121; D-26129 Oldenburg
ISN-home: http://www.isn-oldenburg.de/
homepage: http://isn-oldenburg.de/~hilf
email   : h...@isn-oldenburg.de
tel : +49-441-798-2884
fax : +49-441-798-5851

On Mon, 8 Sep 2003, Hugo Fjelsted Alrøe wrote:

 Stevan Harnad wrote:
  Those are all OAI-compliant archives, and they include both central,
  discipline-based archives and distributed institutional archives. With
  OAI-interoperability, it doesn't matter which kind of OAI archive a
  paper is in, but I am promoting university archives
  http://www.eprints.org/self-faq/#institution-facilitate-filling
  http://www.eprints.org/
  rather than central ones (even though I founded a central one myself
  http://cogprints.ecs.soton.ac.uk/ ) because researchers'
  institutions (and
  their research funders) all share in the joint
  publish-or-perish interests
  (and rewards) of maximizing the impact of their research
  output. Central
  repositories and disciplines do not. (They are the common locus for
  research that is competing for impact.) Hence research institutions
  (and their funders) are in a position to encourage,
  facilitate, and even
  mandate (through an extension of the publish-or-perish
  carrot-and-stick)
  open-access self-archiving of their own research output in
  their own OAI
  archive by their researchers, whereas disciplines and central
  organizations (e.g., WTO, WHO, UNESCO) are not:
  http://www.ecs.soton.ac.uk/~harnad/Temp/archpolnew.html
  http://www.ariadne.ac.uk/issue35/harnad/

 I think it is still too early to write off any of the possible paths to
 open access within the field of self-archiving (not that you do that). I
 see a potentially very fruitful role for community-building archives
 that focus on certain research areas. These could be facilitated or
 mandated by some of the specialized public research institutions that,
 together with universities and private companies, inhabit the research
 landscape. I am thinking of research institutions oriented towards applied
 research in, for instance, environmental research, agriculture, public
 health, education, community development, etc. Here, there is a clear
 two-sided research communication: towards the public and towards other
 researchers in the field. Open access thus serves two communicative
 purposes, improving scholarly communication and improving public access
 to research results, besides the complementary purpose of institutional
 self-promotion.

 By community-building, I mean that such archives can contribute to the
 creation or development of the identity of a scholarly community in
 research areas that go across the established disciplinary matrix of the
 university world. I have myself initiated an archive for research in
 organic agriculture (http://orgprints.org), which we hope will become a
 center for international communication and cooperation in this area.
 Scientific papers from research in organic agriculture are published in
 many different specialized disciplinary journals as well as in general
 scientific journals and journals focused on organic agriculture, and it
 is not easy for researchers to keep track of all that is being
 published.

 I know the same thing can in principle be done with OAI-compliant
 university archives and a disciplinary hub or research area hub, and
 in ten years' time, we may not be able to tell the difference. But today,
 it is still not quite the same thing. Contributing to the community
 would be detached from the usage of what is there, since the depositing
 of papers would take place somewhere outside the hub. This makes it
 dependent on the widespread existence of university archives. So if one
 wants to establish such an open-archive-based scholarly community hub,
 the way to do it is to make an eprint archive with the scope that one
 wants.

  Having said that, it is still a historical fact that the first and
  still-biggest open-access OAI archive is a central,
  discipline-based one,
  the Physics Archive founded in 1991 

Re: Central vs. Distributed Archives

2003-09-03 Thread Stevan Harnad
On Wed, 3 Sep 2003, [identity deleted] wrote:

 Dear Mr. Harnad,

 I am also one of these stressed diploma-writers -- but very curious and
 enthusiastic. My subject is the future of institutional
 archives. I would be very pleased if you could answer my questions:

 1) Do you know anything about non-university archives, such as those of
 non-governmental organisations (e.g., WTO, WHO, UNESCO)? Do these kinds of
 repositories already exist?

There are countless digital archives. You have to specify what *content*
you have in mind. This Forum (soon to be re-named the American Scientist
Open-Access Forum) is concerned *only* with scientific and scholarly
*research*, before and after peer-review (preprints and postprints).

Assuming that that is the content you are inquiring about, I suggest
that you have a look at the archives listed by the Open Archives
Initiative:
http://oaisrv.nsdl.cornell.edu/Register/BrowseSites.pl
as well as those indexed by
http://oaister.umdl.umich.edu/o/oaister/viewcolls.html

Those are all OAI-compliant archives, and they include both central,
discipline-based archives and distributed institutional archives. With
OAI-interoperability, it doesn't matter which kind of OAI archive a
paper is in, but I am promoting university archives
http://www.eprints.org/self-faq/#institution-facilitate-filling
http://www.eprints.org/
rather than central ones (even though I founded a central one myself
http://cogprints.ecs.soton.ac.uk/ ) because researchers' institutions (and
their research funders) all share in the joint publish-or-perish interests
(and rewards) of maximizing the impact of their research output. Central
repositories and disciplines do not. (They are the common locus for
research that is competing for impact.) Hence research institutions
(and their funders) are in a position to encourage, facilitate, and even
mandate (through an extension of the publish-or-perish carrot-and-stick)
open-access self-archiving of their own research output in their own OAI
archive by their researchers, whereas disciplines and central
organizations (e.g., WTO, WHO, UNESCO) are not:
http://www.ecs.soton.ac.uk/~harnad/Temp/archpolnew.html
http://www.ariadne.ac.uk/issue35/harnad/
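
The interoperability invoked here rests on the OAI metadata-harvesting
protocol (OAI-PMH), under which every compliant archive, central or
institutional, answers the same small set of HTTP requests, so a service
provider can treat them all identically. A minimal harvesting sketch (in
Python) follows; the endpoint URL is a placeholder, not a real archive, and
error handling and resumptionToken paging are omitted.

    # Minimal OAI-PMH harvesting sketch: the same ListRecords request works
    # against any compliant archive, central or institutional. The endpoint
    # below is a placeholder; a real harvester would also follow
    # resumptionTokens to page through large result sets.
    import urllib.parse
    import urllib.request
    import xml.etree.ElementTree as ET

    BASE_URL = "http://archive.example.org/oai"   # hypothetical OAI-PMH endpoint
    OAI = "{http://www.openarchives.org/OAI/2.0/}"
    DC = "{http://purl.org/dc/elements/1.1/}"

    params = {
        "verb": "ListRecords",
        "metadataPrefix": "oai_dc",  # unqualified Dublin Core, the mandatory format
        "from": "2003-01-01",        # only records added or changed since this date
    }
    url = BASE_URL + "?" + urllib.parse.urlencode(params)
    with urllib.request.urlopen(url) as response:
        tree = ET.parse(response)

    for record in tree.iter(OAI + "record"):
        identifier = record.findtext(OAI + "header/" + OAI + "identifier")
        title = record.findtext(".//" + DC + "title")
        print(identifier, "-", title)

A harvester or search service simply runs this same loop over every registered
archive and merges the results, which is why the locus of deposit makes no
difference to the reader.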

Having said that, it is still a historical fact that the first and
still-biggest open-access OAI archive is a central, discipline-based one,
the Physics Archive founded in 1991 http://arxiv.org/. But Arxiv's growth
rate has been steadily linear since 1991, and shows no sign of either
accelerating or generalizing to all the other disciplines. So clearly
something else was needed to hasten the open-access era, and my own
hunch is that a concerted policy of university-based archiving was what
was needed.
http://www.ecs.soton.ac.uk/~harnad/Temp/self-archiving.ppt

 2) I read about the Ingenta-Southampton cooperation concerning
 eprints-software in 2002. What has happened so far? Is there a result yet?

It's still there on paper, but Ingenta has not yet made any move to
implement or promote it. The idea had been that the Ingenta option
would be for those universities that did not want to be bothered with
maintaining their own OAI archives, and preferred to outsource it to
Ingenta. This is still a good idea, but the ball is in Ingenta's court;
Southampton has plenty to do already, with optimizing and maintaining
the GNU eprints.org archive-creating software it provides free to
universities, with creating tools for measuring and demonstrating the
impact of open-access research (to help induce researchers and their
institutions to self-archive) http://citebase.eprints.org/cgi-bin/search
and with trying to shape national and international self-archiving policy.

Other archive-creating software packages have since appeared too,
but what is needed now is not more software, but more self-archiving,
and a clear, focused rationale, agenda and policy for it.
http://www.ecs.soton.ac.uk/~harnad/Hypermail/Amsci/2670.html

 3) Is there any other serious method of preservation except OAIS?

Serious method of preservation for *what*? As noted, the Physics Arxiv,
which is OAI-compliant but not OAIS-compliant
(http://www.rlg.ac.uk/longterm/oais.html), is alive and well, and has been
since 1991. But the first, second and third objective of open-access
self-archiving is *access*, right now. The main preservation burden
for all the physics journal articles that are self-archived in Arxiv as
preprints and postprints is not on Arxiv but on each physics journal
publisher's primary corpus.

http://www.ecs.soton.ac.uk/~harnad/Hypermail/Amsci/2676.html
http://www.ecs.soton.ac.uk/~harnad/Hypermail/Amsci/2678.html

Please do not conflate the problem of open-access -- which is a
*supplement* to publishing in journals, not a *substitute* for it --
with the problem of digital preservation of journal content -- which
is a problem for journals, not for authors' institutional OAI archives.
And, in the same breath, don't conflate institutional OAI archives whose
purpose is to provide 

Re: Central vs. Distributed Archives

2003-04-16 Thread Stevan Harnad
Subject Threads:
 http://www.ecs.soton.ac.uk/~harnad/Hypermail/Amsci/1583.html
 http://www.ecs.soton.ac.uk/~harnad/Hypermail/Amsci/0293.html

 From: [identity removed]

 What I wish to emphasize... is the big difference between posting
 one's production on line in one's personal site, and sending it to an
 international server such as ArXiv...

Yes, you are quite right that there is this difference. See:

Open Letter to Philip Campbell, Editor, Nature
http://www.ecs.soton.ac.uk/~harnad/Hypermail/Amsci/2601.html

in which this point is explicitly discussed. Let me point out that
this point (about central-disciplinary versus distributed-institutional
self-archiving) is one of the three reasons I switched my own support
several years ago from central, discipline-based archiving (back) to local,
institution-based archiving (where I had started:
http://www.arl.org/sc/subversive/ ).

My three reasons for switching back were:

(1) OAI-interoperability has made central and distributed self-archiving
interoperable, hence jointly harvestable, searchable and navigable,
hence equivalent.

(2) Researchers and their institutions share a common interest in
maximizing their (shared) research impact (and its rewards), whereas
researchers and their disciplines do not. Institutions are hence in a
position to use publish or perish carrots and sticks to encourage
institutional self-archiving. Disciplines cannot (although of course any
disciplinary culture of self-archiving can be equally directed toward
central or institutional self-archiving). Hence institutional
self-archiving, once it catches on, can grow far faster than
disciplinary self-archiving.
http://www.ecs.soton.ac.uk/~harnad/Temp/archpolnew.html
http://www.ecs.soton.ac.uk/~harnad/Temp/Ariadne-RAE.htm

(3) Institutional self-archiving is truly *self*-archiving -- by the
author, of his own institutional research output, in his own
institution's research archive. And it is restricted *only* to the
output from researchers of that institution, made openly accessible
purely to maximize its impact. It is hence in a position to benefit from
the growing number of progressive self-archiving policies on the part of
publishers:
http://www.lboro.ac.uk/departments/ls/disresearch/romeo/Romeo%20Publisher%20Policies.htm

In contrast, a central, 3rd-party archive runs the risk of falling under
the (understandable) efforts of the publisher not to let *other*
publishers re-publish the work to which the original publisher has added
the value. (Of course, in the online and interoperable age this is moot
for give-away open-access research, because if something is openly
accessible to one and all on the web, it makes no difference whatsoever
whether it is openly accessible from this website or from that
website!) But central, 3rd-party archives are a psychological deterrent
because, being 3rd-party rather than self, as the author's institution
is, they are -- in principle, though so far of course never in practice
-- open to publishers' claims of 3rd-party copyright infringement by
a rival publisher. The author himself (and hence his own institution)
is immune to this, and hence can be the beneficiary of the retention of
the *self-archiving* right where a 3rd-party, central archive is not.

Anyway, since all OAI archives are interoperable and equivalent, I see
no reason, at a time when self-archiving is still growing much too
slowly (compared to what would so easily be possible), to retard its
growth in any unnecessary way: Focussing on central discipline-based
archives and self-archiving is no longer necessary. Distributed
institution-based archives and self-archiving achieve the exact same end,
with at least one fewer obstacle (and at least one more incentive).

 Yes, as you say, most publishers allow authors to do the first thing
 [institution-based but not central self-archiving]: the APS, for instance,
 changed its copyright transfer form a few years ago to make this perfectly
 legal. I think that EPS did the same. But sending a document to a more
 general server such as ArXiv is another matter, and this is not permitted
 - at least for the moment (APS does not allow it for instance).

APS does not (yet) allow their *PDF files* to be
self-archived in ArXiv, but it does allow the final, revised
text to be self-archived. So this problem is trivial.
http://www.ecs.soton.ac.uk/~harnad/Hypermail/Amsci/0749.html
http://forms.aps.org/author/copytrnsfr.pdf

What is less trivial (because it is *perceived* by authors as a
deterrent) is publishers' expressed opposition to 3rd-party (i.e.,
central) self-archiving. The simple and obvious solution is distributed
institutional self-archiving, linked by the glue of OAI.

 Most private websites are not permanent; experience shows that they are
 often not updated, not stable, and that their URLs sometimes disappear
 after a few years. This is, by the way, why we need a centralized structure
 to ensure long-term preservation of all 

Re: Central vs. Distributed Archives

2003-02-24 Thread Hugo Fjelsted Alrøe
  [Thread: http://www.ecs.soton.ac.uk/~harnad/Hypermail/Amsci/0293.html]

Dear Stevan

Just a question of clarification.

I have noticed that you lately recommend exclusively institutional eprint
archives and not (inter)disciplinary archives.

Why is that? What are the reasons for not recommending disciplinary
archives? As you well know, the most successful archive we have seen
(arxiv.org) is disciplinary, and there are a few others on the way.

If I am to guess, you might be thinking that authors can be pressured to
place their papers in institutional archives by making it a condition in
their employment contracts, or something similar. This pressure can also be
applied in at least some kinds of disciplinary archives (such as
http://orgprints.org), by way of making the condition in the research grant.
And the motivation is straightforward: what the public pays for should be
made publicly available.

One possible benefit of (inter)disciplinary archives is that they can better
support a kind of 'community feeling' (which a journal can also sometimes
offer), and that this community feeling can help improve research
communication.


kind regards
Hugo Alroe


 -Original Message-
 From: Stevan Harnad [mailto:har...@ecs.soton.ac.uk]
 Sent: 19 February 2003 16:32
 To: american-scientist-open-access-fo...@listserver.sigmaxi.org
 Subject: Re: STM Talk: Open Access by Peaceful Evolution


 What researchers can and should do right now for OA is to self-archive
 their own refereed research output (Self-Archive Unto Others As Ye
 Would Have Them Self-Archive Unto You) in their own institutional
 Eprint Archives, rather than to keep scolding publishers for not doing
 it for them -- *especially* as publishers (e.g., Elsevier) are
 now coming round to recognizing their own responsible role in all
 this, by formally supporting author/institution self-archiving:

http://www.lboro.ac.uk/departments/ls/disresearch/romeo/Romeo%20Publisher%20Policies.htm

 Stevan Harnad


Re: Central vs. Distributed Archives

2003-02-24 Thread Stevan Harnad
On Mon, 24 Feb 2003, Hugo Fjelsted Alrøe wrote:

 [Thread: http://www.ecs.soton.ac.uk/~harnad/Hypermail/Amsci/0293.html]
 
 I have noticed that you lately recommend exclusively institutional eprint
 archives and not (inter)disciplinary archives. 

 Why is that? What are the reasons for not recommending disciplinary
 archives? As you well know, the most successful archive we have seen
 (arxiv.org) is disciplinary, and there are a few others on the way. 

Both institutional self-archiving and central self-archiving are
welcome and valuable contributions to open-access. Moreover, because of
OAI-compliance, they are all interoperable. So the short answer is that
it makes no difference. But there is a bit more:

Strategically, several years ago, I could see no reason why large
central archives like the Physics ArXiv should not subsume all of the
literature, in all disciplines. But gradually two problems became
apparent, along with their solutions:

Problem 1: ArXiv itself, though the biggest, is still growing too
slowly, even in Physics: It is growing linearly, which means it will
still be another decade before we arrive at a year when *all* of that
year's physics publications are self-archived.
http://arxiv.org/show_monthly_submissions

Problem 2: ArXiv-style central archiving was generalizing even more
slowly to other disciplines: CogPrints (at 5+ years), another central
archive, still only has about 1500 papers, compared with ArXiv's (at 11+
years) 200,000.
http://cogprints.ecs.soton.ac.uk/
http://www.earlham.edu/~peters/fos/timeline.htm

Solution 1: The Open Archives Initiative in 1999 provided an
interoperability protocol that effectively made all compliant archives
equivalent, whether they were central or institutional.
http://www.openarchive.org

Solution 2: What is needed to accelerate self-archiving is an *incentive*,
and it is clear that that incentive is something that is shared by a
researcher and his own institution, not a researcher and his discipline
or a central archive.
http://software.eprints.org/#ep2

The purpose of self-archiving is to maximize the visibility,
accessibility, usage and impact of one's research. In a word, to
maximize research impact. The benefits of research impact are shared by
researchers and their institutions. It is one of the main factors in
determining salaries, promotion, tenure, research funding, prizes and
prestige. These are all shared interests for researchers and their
institutions. They are behind the publish or perish injunction. This
means that the institution is not only a natural ally in self-archiving,
but it can even be the provider of the carrot and the stick, as an
extension of exactly the same considerations as those underlying
publish-or-perish: Maximize research impact.
http://www.ecs.soton.ac.uk/~harnad/Temp/self-archiving.ppt
http://www.ecs.soton.ac.uk/~harnad/Temp/unto-others.doc

It is for this reason that I think institutional self-archiving holds
greater promise for propelling open-access to critical mass than central
archiving -- or, as the effect is additive, I should really say: than
central archiving alone.

 If I am to guess, you might be thinking that authors can be pressured to
 place their papers in institutional archives by making it a condition in
 their employment contracts, or something similar. This pressure can also be
 applied in at least some kinds of disciplinary archives (such as
 http://orgprints.org), by way of making the condition in the research grant.
 And the motivation is straightforward: what the public pays for should be
 made publicly available.

I agree. And both of these pressures are welcome. But the institutional
self-archiving solution is more general, and pan-disciplinary. It is
easier to create and fill institutional archives (using local carrots and
sticks) than to create a central archive for each discipline and get all
researchers to fill it. Institutional self-archiving also benefits from
a wider institutional interest in making institutional digital output
and holdings (not just refereed research) openly accessible (though I
confess that this double mandate has been a two-edged sword, also causing
confusion about what the target contents of institutional archives
should be, and thereby slowing rather than hastening the self-archiving
of refereed research output).

I would say that when an institution has adopted a policy of mandatory
self-archiving for all its researchers, it is easier and more general
to also provide the local archives to do it in, rather than to rely
on their being spawned and sustained by some external central entity
for each discipline. The policy is then also a uniform, self-contained and
self-sufficient one, whereas "self-archive somewhere" would have
been too vague and would not fit most disciplines yet (rather the way
"publish in an open-access journal" would be a premature injunction in
most disciplines and specialties today).

Last, there is a link between self-archiving and research 

Re: Central vs. Distributed Archives

2001-11-19 Thread Eberhard R. Hilf
Dear Stevan,
Thanks a lot for your summary of the topic up to now.
I agree with what you say. All paths lead to the same destination.
Indeed, we work on all three lines: encouraging the authors and the
institutions to set up self-archiving (with our help, or via a gateway, or
neither), and promoting central archives.
I have now seen your img files.
Ebs


Re: Central vs. Distributed Archives

2001-11-18 Thread Stevan Harnad
The current topic thread begins with:

Central vs. Distributed Archives
http://www.ecs.soton.ac.uk/~harnad/Hypermail/Amsci/0950.html

See also the earlier thread:

Central vs. Distributed Archives
http://www.ecs.soton.ac.uk/~harnad/Hypermail/Amsci/0294.html

On Sat, 17 Nov 2001, Eberhard R. Hilf wrote:

http://www.ecs.soton.ac.uk/~harnad/Hypermail/Amsci/1655.html
 eh Steve said the only way is using OAi-compliance by the author to
 eh self-archive his documents before and through refereeing.
 eh 
 eh The word "only" is too much of a load.
 eh
 eh In Physics (and Mathematics), authors have long been able to self-archive
 eh their documents, without having to install any software or learn about
 eh OAi. They are automatically included into the OAi scheme by the
 eh OAi compliant service providers by using PhysDoc (or Math-Net) as gateways
 eh who take care of their document being included.

My comrade-at-arms Ebs Hilf has misinterpreted the sense of my "only".

He is of course quite right that central, discipline-based
self-archiving (in OAI-compliant Eprints Archives) is likewise an
effective and welcome form of self-archiving. However, as I wrote in
the very next posting:

http://www.ecs.soton.ac.uk/~harnad/Hypermail/Amsci/1654.html
 sh The Physics Archive [http://arxiv.org], for example, has over 150,000
 sh articles, but cumulated across 10 years! At that rate, even for this
 sh most advanced of all the self-archiving disciplines, the year 2011 will
 sh be the first in which ALL the articles published in physics that
 sh year will be accessible for free for all:
 sh 
 sh http://www.ecs.soton.ac.uk/~harnad/Tp/Digitometrics/img001.htm
 sh 
 sh http://www.ecs.soton.ac.uk/~harnad/Tp/Digitometrics/img002.htm
 sh 
 sh This is why institution-based self-archiving now needs to be vigorously
 sh supported and promoted to fast-forward us all to the optimal and
 sh inevitable for research and researchers.

It was with this fact in mind that I had written the earlier "only"
passage:

http://www.ecs.soton.ac.uk/~harnad/Hypermail/Amsci/1653.html
 sh The only sure way to free access to the entire refereed research
 sh literature online, right now, is for researchers themselves to take the
 sh initiative and self-archive it (in their own institutions' OAI-compliant
 sh Eprint Archives: http://www.arl.org/sparc/pubs/enews/aug01.html#6 )

The force of the "only" was coupled with the sense of the "right now"!

A researcher in any particular discipline today (other than Physics,
Mathematics, or Cognitive Sciences) cannot take the initiative and
self-archive his refereed research in a central archive for his discipline,
because such central archives do not yet exist for most disciplines! Nor,
where they to exist, are they filling anywhere near fast enough (see the
2 Digitometrics links above).

Researchers' individual (and thereby collective) leverage (and rewards
for publication and impact) operates largely at the level of their own
institutions. Researchers need not install any software themselves, nor
learn anything about OAi. They need only encourage their own
universities to do so, out of shared self-interest in research
visibility, uptake and impact:

7. What you can do now to free the refereed literature online
http://www.ecs.soton.ac.uk/~harnad/Tp/resolution.htm#7

Online or Invisible? (Steve Lawrence)
http://www.neci.nec.com/~lawrence/papers/online-nature01/

By way of OAI-interoperable central Eprint Archives, physicists and 
mathematicians 
today have http://arxiv.org and Ebs's PhysDoc (or Math-Net)
http://physnet.uni-oldenburg.de/PhysNet/physdoc.html and Cognitive
Scientists have http://cogprints.soton.ac.uk/

But for all the other disciplines, the fastest and surest path today is
to have their own institutions install their own OAI-compliant
Institutional Eprint Archives (using the free http://www.eprints.org
software) as a growing number of universities and research institutions
are now doing:

Institute of Education, University of London, London, England
University Library System, University of Pittsburgh 
http://philsci-archive.pitt.edu
Centre pour la Communication Scientifique Directe http://eprinttheses.in2p3.fr
Media Studies, University of Ulster, Coleraine, Northern Ireland
Formations Media Studies Archive http://formations2.ulst.ac.uk/
California Institute of Technology http://caltechcstr.library.caltech.edu/
Instituto Brasileiro de Informacao em Ciencia e Tecnologia
http://www.sbg.ibict.br
Institut Jacques Monod, Paris
Department of Philosophy, University of Vienna http://eprints.philo.at
University of Southampton, Southampton, UK http://demoprints.eprints.org/
RIACS, NASA Ames, Moffett Field CA http://horus.riacs.edu
University of Nottingham, Nottingham http://www-db.library.nottingham.ac.uk/ep1
University of Rochester Libraries http://128.151.45.180/
Sissa Multimedia Database http://mmdb.sissa.it/
University of California Digital Libraries 
http://www.escholarship.cdlib.org

Re: Central vs. Distributed Archives

2001-02-03 Thread Stevan Harnad
On Fri, 2 Feb 2001, Greg Kuperberg wrote:

 On Sun, Dec 31, 2000 at 09:57:50PM +, Stevan Harnad wrote:
 
http://www.ecs.soton.ac.uk/~harnad/Tp/resolution.htm
 
Physicists have already shown the way, but at their current
self-archiving rate, even they will take another decade to free the
entire Physics literature

 Of course you are entitled to your opinion that institution-based open
 archiving (sorry, I won't call it self-archiving) is the bugle call
 of the revolution.

Terminology is terminology, but calling one's own archiving of one's own
papers self-archiving sure sounds like calling a spade a spade...

Besides, the Open Archives Initiative (OAI http://www.openarchives.org)
has informed me in no uncertain terms that I should
NOT characterize self-archiving as open-archiving or vice versa. The
OAI is a much broader initiative than the self-archiving initiative.

OAI is dedicated to providing shared interoperability standards for the
entire on-line digital literature, whether self-archived or not,
whether for-free or for-fee, whether journal, book or other, whether
full-text or not, whether centralized or distributed.

It is true that the OAI was originally proposed as the UPS (Universal
Preprint Service), which was indeed a form of self-archiving (though a
limited form, focussing on the unrefereed preprint rather than on both
the unrefereed preprint and the refereed postprint, as self-archiving
does). But UPS was quickly dropped and the OAI has since vastly
outgrown those limited original objectives.

 In my opinion, institution-based archives are,

 o in physics, all but superseded by the arXiv,

On-Line archives (apart from the Physics arXiv) are all but non-existent.

The hope is that institution-based, distributed self-archiving (perhaps
with the newfound help of the http://www.eprints.org archive-creating
software) will now remedy this.

And, as I said above, even in Physics, self-archiving is still growing
too slowly to free the Physics literature in less than a decade. It
seems to me that the central self-archiving model, admirable and
welcome though it is, can use all the help it can get.

 o in mathematics, a politically appealing distraction, and

I have no idea why you mention politics. The only appeal is to
researchers, that they should free their refereed research from their
obsolete access- and impact-barriers by self-archiving it, now. I have
no political preference for their doing it the central way or the
distributed way: We should all just go ahead and DO it!

I used to lean towards central self-archiving myself, seeing no reason
why it should not all be subsumed under arXiv; but that just isn't
happening, and the clock is ticking; so it's time to add more powerful
and general means of self-archiving.

Besides, the whole point of OAI-compliance and interoperability is that
it should no longer MATTER which way you self-archive: centrally or
institutionally. It's all harvestable into the same global virtual
archive anyway, thanks to the OAI protocol.

Unless one's political objective becomes, publisher-like, to protect
one's own proprietary (centralized?) turf instead of to free the
research literature...

 o in computer science and economics, the inadequate status quo.

I have no idea what you mean by the above.

 As I said before, I know that NCSTRL and RePEc, which are the efforts
 in computer science and economics to make institutional archives
 interoperable, are important major projects.  I don't mean to slight
 them.  But they are not a panacea and they do not match the arXiv.

Nobody is trying to match anything. We are trying to free the research
literature, as quickly and as effectively as possible.

 Computer science has a second important project, ResearchIndex/CiteSeer,
 which has some good features that the arXiv does not.  But (a) it doesn't
 match the arXiv either, (b) it relies on search engine intelligence and
 not bureaucratic standards, and (c) an arXiv search facility could be
 made as intelligent as CiteSeer.

I really can't follow any of this, and I have no idea who you think is
competing with whom for what:

ResearchIndex/CiteSeer is a wonderful tool, harvesting and
citation-linking papers on the Web, whether in OAI-compliant archives
or not. As the OAI-compliant corpus grows (with the growth of central
and distributed self-archiving), ResearchIndex/CiteSeer's harvest will
grow, and surely we all welcome that!

I don't know what you have in mind with bureaucratic standards, but you
need not sell me on search-engine intelligence: I love it already.

Moreover, as the OAI-compliant corpus grows, it will spawn still
further and more powerful Open Archive Service Providers (e.g., OpCit
http://opcit.eprints.org and ARC http://arc.cs.odu.edu/).

But the main goal now is to do whatever can be done to make that corpus
grow into the full refereed literature in all disciplines as soon as
possible. This is not the time to squabble over who has the best

Re: Central vs. Distributed Archives

2001-02-03 Thread Greg Kuperberg
On Sat, Feb 03, 2001 at 10:28:19AM +, Stevan Harnad wrote:
 Terminology is terminology, but calling one's own archiving of one's own
 papers self-archiving sure sounds like calling a spade a spade...

In my opinion, if I submit a paper to the arXiv or to a hypothetical UC
Davis archive, that is them archiving my papers, not me archiving my own.
The arXiv has a technical staff, admittedly small, and you could fairly
call the staff members archivists.  The authors are not archivists.

 Besides, the Open Archives Initiative (OAI http://www.openarchives.org)
 has informed me in no uncertain terms that I should
 NOT characterize self-archiving as open-archiving or vice versa.

I suspect that that's because you don't take into account considerations
that they consider important.  In any case in your paper you do
still imply that the arXiv is an example of self-archiving.

Anyway, my *main* comment last time is that you don't even mention these
points of disagreement in your article.  Your article has the bias that
if people agree with you on the ends, it doesn't matter if they agree
with you on the means.

 On-Line archives (apart from the Physics arXiv) are all but non-existent.

That's not true at all.  In mathematics alone the AMS has a list of 60+
department-based and research-institute-based archives,

http://www.ams.org/global-preprints/dept-server.html

and 16 subdiscipline-based archives,

http://www.ams.org/global-preprints/special-server.html

Maybe a dozen of these independent archives are bigger, as measured by
new submissions per month, than your CogPrints archive.  The biggest one,
mp_arc, gets 30 new papers a month.  If you put them all together they
are comparable in size to the math arXiv.

But they're not growing as quickly as the math arXiv, not even those
in Germany that enjoy an interoperable metadata standard and a common
search engine called MPRESS, http://mathnet.preprints.org .  MPRESS even
includes everything in the math arXiv.  MPRESS can be useful, but it is
not the panacea that you seem to expect it to be.

  o in mathematics, a politically appealing distraction, and
 I have no idea why you mention politics.

Because deciding who gets to maintain the archives is political.
People get service credit for it and they don't want to give that up.
Some of the Europeans don't trust projects that they perceive as American.
In mathematics, the numerous institution-based archives tend to satisfy
administrators more and readers less.  They are useful, but they grow
less quickly than the arXiv because they are less useful.  They aren't
by any means the arXiv's savior.

 Besides, the whole point of OAI-compliance and interoperability is that
 it should no longer MATTER which way you self-archive: centrally or
 institutionally. It's all harvestable into the same global virtual
 archive anyway, thanks to the OAI protocol.

There lies MPRESS, the global virtual archive in mathematics,
and it still does matter.
--
  /\  Greg Kuperberg (UC Davis)
 /  \
 \  / Visit the Math ArXiv Front at http://front.math.ucdavis.edu/
  \/  * All the math that's fit to e-print *


Re: Central vs. Distributed Archives

2001-02-03 Thread Stevan Harnad
On Sat, 3 Feb 2001 Greg Kuperberg g...@math.ucdavis.edu wrote:

 if I submit a paper to the arXiv...
 that is them archiving my papers, not me archiving my own.

Sorry, Greg, I don't find these details useful. This is terminological
niggling. (As long as we're at it, I prefer the word "depositing" for an
archive, because one "submits" to a journal.)

 The arXiv has a technical staff, admittedly small, and you could fairly
 call the staff members archivists.  The authors are not archivists.

And authors are not publishers either. Yet it is quite common to say
"I've published that paper."

What was needed was a term to describe the act of depositing a paper
into a free on-line archive for yourself, rather than relying on
someone else (e.g., a publisher) to do it for you. Self-archiving
describes that quite transparently.

(If I had to vote on it, I'd say most of the work of archiving itself
was being done by the software and the hardware, not the staff. But the
supporting staff are certainly essential, as they are even for personal
web-pages...)

 in your paper you do still imply that the arXiv is an example of
 self-archiving.

And so it is. Authors can self-archive in centralized OAI-compliant
archives like arXiv or distributed institutional OAI-compliant archives
like the ones being set up using eprints.org software.

 Anyway, my *main* comment last time is that you don't even mention these
 points of disagreement in your article.  Your article has the bias that
 if people agree with you on the ends, it doesn't matter if they agree
 with you on the means.

Well it seems to me that in my article (1) I recommend self-archiving to
free the refereed research literature, and (2) I recommend self-archiving
in distributed institutional OAI-compliant Archives to complement
self-archiving in centralized OAI-compliant Archives.

Now in recommending this, what exactly do you think I should add? That
there are some people who think it's not worth complementing the former
with the latter? that they think we should just carry on with the
former as if there were no new possibilities for broadening and
accelerating the growth of self-archiving?

Why would I want to say that? Why would anyone want to say that?

  On-Line archives (apart from the Physics arXiv) are all but non-existent.

 That's not true at all.  In mathematics alone the AMS has a list of 60+
 department-based and research-institute-based archives,

Perhaps I should have said interoperable OAI-compliant archives. And if
they exist, that's splendid. I hope there will be many more.

 Maybe a dozen of these independent archives are bigger, as measured by
 new submissions per month, than your CogPrints archive.  The biggest one,
 mp_arc, gets 30 new papers a month.  If you put them all together they
 are comparable in size to the math arXiv.

Good. Let them go OAI-compliant (perhaps by installing eprints.org
software!) and they will be making a valuable contribution to freeing
the refereed research literature (assuming they are not just for
unrefereed preprints!).

 But they're not growing as quickly as the math arXiv

So what?

  I have no idea why you mention politics.

 Because deciding who gets to maintain the archives is political.
 People get service credit for it and they don't want to give that up.

Pity. Especially if it ever engenders a conflict of interest (as it has
done in journal publishing) between what's in the best interest of
research and researchers (maximizing free access) and what's in the
interests of archivists.

 Some of the Europeans don't trust projects that they perceive as American.
 In mathematics, the numerous institution-based archives tend to satisfy
 administrators more and readers less.  They are useful, but they grow
 less quickly than the arXiv because they are less useful.  They aren't
 by any means the arXiv's savior.

Make 'em all OAI-compliant and it will no longer make a bit of
difference...


Stevan Harnad har...@cogsci.soton.ac.uk
Professor of Cognitive Science    har...@princeton.edu
Department of Electronics and phone: +44 23-80 592-582
 Computer Science fax:   +44 23-80 592-865
University of Southampton http://www.ecs.soton.ac.uk/~harnad/
Highfield, Southampton    http://www.princeton.edu/~harnad/
SO17 1BJ UNITED KINGDOM

NOTE: A complete archive of the ongoing discussion of providing free
access to the refereed journal literature online is available at the
American Scientist September Forum (98  99  00  01):


http://amsci-forum.amsci.org/archives/American-Scientist-Open-Access-Forum.html

You may join the list at the site above.

Discussion can be posted to:

american-scientist-open-access-fo...@amsci.org


Re: Central vs. Distributed Archives

2001-02-03 Thread Stevan Harnad
Greg, I honestly don't know what the substantive issue is that you are
disagreeing with me about. We are both for freeing the research
literature. We are both for self-archiving. We are both for
interoperability. We both agree that the Physics arXiv was the first to
show the way. We both agree that it would be good if the pace of
self-archiving were accelerated. We both agree that it would be good if
self-archiving spread to all disciplines.

So what is at issue here? That I have suggested that distributed
OAI-compliant self-archiving may help accelerate and spread
self-archiving whereas you think it won't? Well let's just wait and
see. You seem to have some reason for wanting to nip distributed
self-archiving in the bud, a reason that I can't fathom. Could it be
because it is competing with arXiv in mathematics? Who cares?
Self-archiving is self-archiving, and free is free.

As for interoperability, the reason I stress it is that that is what
will make the locus-differences between the individual archives
irrelevant. It will all be harvested into global virtual archives, and
those, not the individual archives, will be the locus classicus for the
research literature.

On Sat, 3 Feb 2001, Greg Kuperberg wrote:

 You don't just recommend institution-based archives, you hype them as
 superior to discipline-based archives.  You describe them as a powerful
 and natural complement that you hope will broaden and accelerate the
 self-archiving literature.  I think you should add, more clearly than
 you have, that that part is only your opinion, and not that of the
 physicists and others who have shown the way.

Greg, it seems to me "hope" is already at least as subjective and
hypothetical a descriptor as "opinion". Nor does "hope" equal "hype".
Nor do I say anything about "superior". I simply state the facts (and
hopes). The facts are that it started in Physics, in the form of
centralized self-archiving; but this is only growing linearly and not
generalizing across disciplines. Enter OAI-interoperability and the
possibility of complementing central self-archiving with distributed
self-archiving.

Why, one wonders, would any disinterested party (or rather, one with
an interest solely in freeing the literature, not in characterizing one
form of self-archiving as superior) fail to welcome a complementary
form of archiving, rather than trying to dismiss it as hype and
opinion, or as contrary to the opinion of physicists?

The freeing of their present and future refereed research from all
access- and impact-barriers forever is now entirely in the hands of
researchers. Posterity is looking over our shoulders, and will not
judge us flatteringly if we continue to delay the optimal and
inevitable needlessly, now that it is clearly within our reach.
Physicists have already shown the way, but at their current
self-archiving rate, even they will take another decade to free the
entire Physics literature
(http://www.ecs.soton.ac.uk/~harnad/Tp/Tim/sld002.htm) -- with
the Cognitive Sciences (http://cogprints.soton.ac.uk) 39 times
slower still, and most of the remaining disciplines not even
started: http://www.ecs.soton.ac.uk/~harnad/Tp/Tim/sld004.htm

This is why it is hoped that (with the help of the eprints.org
institutional archive-creating software) distributed,
institution-based self-archiving, as a powerful and natural
complement to central, discipline-based self-archiving, will now
broaden and accelerate the self-archiving initiative, putting us
all over the top at last, with the entire distributed corpus
integrated by the glue of interoperability
(http://www.openarchives.org).

 sh Perhaps I should have said interoperable OAI-compliant archives.
 sh And if they exist, that's splendid. I hope there will be many more.

 This sounds like the Western leftists who insisted that China and the
 Soviet Union didn't practice true Communism.  If it is utterly irrelevant
 that many of the mathematical archives are interoperable and DC-compliant,
 why will making them interoperable and OAI-compliant make all the
 difference?  Granted, the OAI group may have made a better standard
 than the Dublin Core.  It's still insane to dismiss one as paganism and
 embrace the other as gospel.

Greg, I don't care! One of the purposes of interoperability is to make
sure it can all be harvested into global virtual archives like ARC
http://arc.cs.odu.edu/ thereby making the individual archive locus
irrelevant (and empowering distributed archiving). If DC-compliance
is enough to vouchsafe that, that's fine with me! Let 1000 flowers
bloom! *You* (not the Western leftists) are the one who seems to have
some sort of animus against these other archives!

And I think we are beginning to repeat ourselves (again). We have bet
on our respective horses. Can we now wait and see how they do in the
self-archiving sweepstakes? (I have the advantage that I win either
way, just as long as they 

Re: Central vs. Distributed Archives

2000-11-09 Thread Stevan Harnad
On Wed, 8 Nov 2000, Greg Kuperberg wrote:

 While libraries certainly should help preserve e-prints, I do not trust
 any one library, nor any other sole institution, to archive material
 single-handedly. Any caretaker can lose or destroy a unique copy of
 any document...  That is why it is important to redundantly and
 openly mirror an archive and not just allow third-party searches. The
 arXiv has 18 mirror sites on six continents

Who is disagreeing with this? All requisite redundancy is just as
desirable, and feasible, and inevitable, with institution-based
distributed archiving as with discipline-based archiving.

I think there is an incorrect analogy at the heart of Greg's frequent
use of the term "fragmented" in speaking about the institution-based
approach to self-archiving:

I think Greg continues to equate (1) archiving with publishing, and
(2) institutional digital collections with localized books-on-shelves
(ripe for a Library-of-Alexandria catastrophe; hence his example of the
lost/destroyed unique document). And (3) (unrefereed, unpublished)
PREprints continue to be treated as the paradigm for it all, whereas
it is much more informative and representative to see it in terms of
(refereed, published) POSTprints: We are, after all, aiming at freeing
the REFEREED literature -- with the prepublication embryological stages
merely an added bonus, rather than the focus of it all.

So, to summarize: whilst our refereed papers are already, as they are,
safely in the hands of journals and libraries, blissfully mirrored
(though unblissfully unfree), we need not fret about Alexandria.
Freeing a postprint (sic) via self-archiving (whether central or
institutional, interoperable or not) is a bonus, a plus, a freebie, a way
to make it accessible to those multitudes worldwide who cannot access
it because of the S/L/P firewalls surrounding the safe, Alexandria
versions.

It is inviting Zeno's Paralysis (again) to say: Keep waiting till you
have an Alexandria-proof centralized, mirrored, redundant arXiv-style
Archive to self-archive them in before you dare to self-archive your
(already safely mirrored) postprints.

Nay! Release them from their hostagehood behind obsolete,
impact-blocking, and completely surmountable access barriers online
today through self-archiving, addict fellow-researchers the world over
to that new, free form of access to it all, and the redundancies and
mirrors will come tomorrow, in plenty of time to keep the freed corpus
aloft in the skies. (And nothing is at risk: the firewalled version
remains as safe -- from catastrophic loss as well as illicit access --
as it ever was.)

If that is now transparent for postprints, it should be equally
transparent that the same applies to preprints: They are destined to
become postprints (hence secure, for the above reasons) anyway. Being
available online early is a bonus; a freebie. Moreover, it is a bonus
that has no prior history of enjoying the safe/secure status of
postprints anyway: access to preprints was always restricted and
evanescent, destined to be superseded by the secure postprint once it
was available.

Now the redundancy and mirroring that will be accorded the freed
postprint corpus, once it is freed, will also be inherited by the
preprint corpus.

So there is nothing to lose, and everything to be gained, by
self-archiving all preprints and postprints now, in either the
centralized OAI-compliant (http://www.openarchives.org) archives like
arXiv (http://arXiv.org), or in institutional OAI-compliant archives,
like Eprints (http://www.eprints.org).

Ignore Cassandras: Preservation problems are eminently soluble, once
the goods are up there: the real problem now is how to get researchers
to put them up there, at long last. Central archives have gone part of
the distance but are proving too slow. Institutional archives are natural
allies in hastening us on the road to the optimal and inevitable.

 As a rule, it is better for web sites to share the same archive than
 to each have fragments. It is better for Oxford and Cambridge to
 each have all of Shakespeare's plays than for Oxford to have only the
 comedies and Cambridge to have only the tragedies. That is why I favor
 shared interoperability, which is in some ways centralized, to fragmented
 interoperability, which is optimistically called decentralized. Massive
 redundancy is one of the few strengths of the existing paper-based system;

I am not an expert on digital storage, coding or preservation, but I am
not at all sure that Greg is technically right above (and I'm certain
that the Oxford/Cambridge hard-copy analogy is fallacious). I would
like to hear from specialists in localized vs. distributed digital
coding, redundancy, etc. -- bearing in mind that in the case of the
refereed literature, this is all moot anyway, because free access now,
is infinitely preferable to no access, no matter how short-lived it
risks being. The locus classicus is still safely ensconced behind the
toll 

Re: Central vs. Distributed Archives

2000-11-09 Thread Thomas Krichel
  Greg Kuperberg writes

 But I disagree entirely with the claim that distributed
 interoperability has never been tried before.  It has been tried several
 times, whole-heartedly with these two projects:

 MPRESS - mathnet.preprints.org
 NCSTRL - ncstrl.org

 And it has been a factor in many other projects, including Hypatia
 and the AMS preprint server.  Some of these projects are more
 successful than others, but *all* of them suffer from inconstancy
 of the underlying archives.

  The largest project that has been done with distributed
  interoperability is RePEc. RePEc catalogs 11 items now.
  There is the occasional case of an archive becoming obsolete: out of
  about 140 archives, I think 5 have been made obsolete, i.e. have been
  moved to a place outside the original archive maintainer's control.
  Thus, while it is a problem, it is not a minor one. It is by far
  outweighed by other advantages, such as distributed costs, minimum
  quality control, and wide community participation.

  Cheers,

  Thomas Krichel http://openlib.org/home/krichel
 RePEc:per:1965-06-05:thomas_krichel

  2000-10-05 to 2001-01-06:
  Institute for Economic Research / Hitotsubashi University
  2-1 Naka / Kunitachi / Tokyo 186-8603 / Japan / +81(0)42 580 8349
  tho...@micro.ier.hit-u.ac.jp


Re: Central vs. Distributed Archives

2000-11-09 Thread David Goodman
Steve, I think you misunderstand Greg's concern (and mine). We do not
We do not disagree with what you want to do; we want to add to it. We are
assuming, I think,
that something similar to the plan you advocate will be the basic process.

I do not think it enough to say distributed = secure. It's only the first step
to security.
In addition to being distributed, there also needs to be a reliable
caretaker--not just to do the housekeeping, but to ensure that the archive is
kept compatible with changing technology.
I suggested that the archives be organized redundantly both by discipline and
by university (and possibly by geographic/political entity, as well as what
anyone wants to do).

There are undoubtedly well-organized academic departments that can do this.
There are also academic departments that cannot be relied on to do this right,
because of size, interest, or finances. The same goes for professional
societies. Certainly no individual can be relied on: all humans are mortal.
All of this goes as well for refereed as for unrefereed, preprint as for
reprint, officially published as for unpublished.

As a librarian, I do not assume it is good enough that
 our refereed papers are already, as they are,
 safely in the hands of journals and libraries, ...

There are very few library copies of many journals, and though there is
excellent backup from national libraries, even their collections are
incomplete. The literature published up to now will be much more secure when
it too has been digitized and placed on free publicly available mirrored
servers, with all the additional precautions. Besides security, this will also
make them generally available with all the additional advantages of plans such
as yours.


Re: Central vs. Distributed Archives

2000-11-09 Thread Tim Brody
  Greg:
  As a rule, it is better for web sites to share the same archive than
  to each have fragments. It is better for Oxford and Cambridge to
  each have all of Shakespeare's plays than for Oxford to have only the
  comedies and Cambridge to have only the tragedies. That is why I favor
  shared interoperability, which is in some ways centralized, to fragmented
  interoperability, which is optimistically called decentralized. Massive
  redundancy is one of the few strengths of the existing paper-based system;

 Stevan:
 I am not an expert on digital storage, coding or preservation, but I am
 not at all sure that Greg is technically right above (and I'm certain
 that the Oxford/Cambridge hard-copy analogy is fallacious). I would
 like to hear from specialists in localized vs. distributed digital
 coding, redundancy, etc. -- bearing in mind that in the case of the

If I may separate the political issues from the technical.

Political:

There is a fear that a decentralised system will result in no overall
responsibility for archive continuity. But, equally, a centralised
body can decide that a system is no longer useful or is too expensive
to be free - what happens if XXX goes pay-per-view? What rights do
mirrors have to store XXX if they are told to remove their archive?

Technical:

The fear is that there will be only one copy of a paper, stored in an
institutional department or library, and that if that archive is lost the
paper disappears into digital oblivion.

Data storage is very cheap - there is little difference between storing
1 or 100 copies. Oxford and Cambridge could harvest all the world's physics
archives and store their contents. This is not currently done because
Open Archives include pay-per-view archives, where only the abstract
can be harvested - and hence there is no provision for harvesting full texts.
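
The claim that storage is very cheap is easy to sanity-check with
order-of-magnitude arithmetic. A rough sketch (in Python), taking the roughly
200,000 papers mentioned earlier in this thread and assuming, purely
hypothetically, an average of one megabyte per full-text paper:

    # Back-of-envelope check that redundant mirroring is cheap.
    # The paper count echoes the ~200,000 figure cited earlier in the thread;
    # the size per paper and the number of mirrors are assumptions.
    PAPERS = 200_000        # approximate cumulative arXiv size cited above
    MB_PER_PAPER = 1.0      # assumed average size of one full-text paper
    MIRRORS = 100           # a deliberately extravagant number of copies

    one_copy_gb = PAPERS * MB_PER_PAPER / 1024
    all_mirrors_tb = one_copy_gb * MIRRORS / 1024
    print(f"one copy of the corpus: about {one_copy_gb:.0f} GB")
    print(f"{MIRRORS} full mirrors worldwide: about {all_mirrors_tb:.1f} TB in total")

On those assumptions a single copy is a couple of hundred gigabytes, and even
a hundred mirrors spread across a hundred sites come to under twenty terabytes
in total, so the limiting factor is the rights and harvesting provisions, not
the storage.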

I may also point out that there are already archives that perform
distributed mirroring - math arXiv is primarily made up of papers that
have been archived elsewhere (judging by the lack of associated metadata
and updates).

Tim Brody
Computer Science, University of Southampton
email: tdb...@soton.ac.uk
Web: http://www.ecs.soton.ac.uk/~tdb198/


Re: Central vs. Distributed Archives

2000-11-09 Thread Stevan Harnad
On Thu, 9 Nov 2000, David Goodman wrote:

 Steve, I think you misunderstand Greg's concern (and mine). We do not
 disagree with what you want to do; we want to add to it. We are
 assuming, I think, that something similar to the plan you advocate will
 be the basic process.

 I do not think it enough to say distributed = secure. It's only the first
 step to security. In addition to being distributed, there also needs to
 be a reliable caretaker--not just to do the housekeeping, but to ensure
 that the archive is kept compatible with changing technology.

I agree completely.

I didn't say distributed = secure (there's a lot more to security than
that). I said that being freely accessible now, in distributed institutional
Eprint archives, is a powerful new way to complement being freely
accessible in centralized Eprint archives, which are still growing much
too slowly. It should not be delayed for one moment by security
concerns, not one moment.

 I suggested that the archives be organized redundantly both by
 discipline and by university (and possibly by geographic/political
 entity, as well as what anyone wants to do).

Again, complete agreement.

 There are undoubtedly well-organized academic departments that can do
 this. There are also academic departments that cannot be relied on to
 do this right, because of size, interest, or finances. The same goes
 for professional societies. Certainly no individual can be relied on:
 all humans are mortal. All of this goes as well for refereed as for
 unrefereed, preprint as for reprint, officially published as for
 unpublished.

Agreed, and digital librarians are clearly the pertinent experts.

 As a librarian, I do not assume it is good enough that our refereed
 papers are already, as they are, safely in the hands of journals and
 libraries, ...

Yes, but let us not again mix up agendas. There could have been --
independent of any movement to free the refereed literature online -- a
movement to increase the security of the on-paper corpus (both papers
and books) on-line.

That's fine, desirable, but unrelated to this Forum's agenda, which is
to FREE the refereed corpus online. Concerns about strengthening the
paper literature's current security should not be wrapped into the
freeing (now!) initiative for the refereed literature; nor should
freeing (now!) be made in any way conditional on first meeting a priori
security concerns. Although it is an oversimplification, it is best to
treat the freeing initiative as a pure freebie, a windfall, over and
above what we have already. We are talking about archiving, not
publishing, an extra version of what is already published (on-paper).

This face-valid, immediate goal should be kept as distinct from
preservation concerns as it should be kept from peer-review-reform
concerns (likewise worthy, but orthogonal, and indeed even at
cross-purposes if yoked in any way to the freeing initiative).

 There are very few library copies of many journals, and though there is
 excellent backup from national libraries, even their collections are
 incomplete. The literature published up to now will be much more secure
 when it too has been digitized and placed on free publicly available
 mirrored servers, with all the additional precautions. Besides
 security, this will also make them generally available with all the
 additional advantages of plans such as yours.

David, the securing issue is a separate one from the freeing! The
material on the shelves now is not free; nor is it, let us agree, as
secure as it might be. Increasing its security by distributed digital
back-up is one thing (and need not be freely accessible either);
freeing it online is quite another.

Please, please keep these two separate or you will only encourage more
Zeno's Paralysis!


Stevan Harnad                     har...@cogsci.soton.ac.uk
Professor of Cognitive Science    har...@princeton.edu
Department of Electronics and     phone: +44 23-80 592-582
 Computer Science                 fax:   +44 23-80 592-865
University of Southampton         http://www.ecs.soton.ac.uk/~harnad/
Highfield, Southampton            http://www.princeton.edu/~harnad/
SO17 1BJ UNITED KINGDOM

NOTE: A complete archive of the ongoing discussion of providing free
access to the refereed journal literature online is available at the
American Scientist September Forum (98 & 99 & 00):


http://amsci-forum.amsci.org/archives/American-Scientist-Open-Access-Forum.html

You may join the list at the site above.

Discussion can be posted to:

american-scientist-open-access-fo...@amsci.org


Re: Central vs. Distributed Archives

2000-11-09 Thread Greg Kuperberg
On Thu, Nov 09, 2000 at 11:16:11AM +, Stevan Harnad wrote:
 Nay! Release them from their hostagehood behind obsolete,
 impact-blocking, and completely surmountable access barriers online
 today through self-archiving, addict fellow-researchers the world over
 to that new, free form of access to it all, and the redundancies and
 mirrors will come tomorrow, in plenty of time to keep the freed corpus
 aloft in the skies.

Entirely aside from whether your proposals are the best ones, you have
previously described them as being nothing other than the Ginsparg
model.  Well I think of myself as devoted to the Ginsparg model,
but my interpretation of it is significantly different from the one
that you give here.  In 1997 my thinking was much more like yours,
but three years of direct experience with the arXiv has changed it.
My creed is, build a large, integrated, immortal archive now, and the
e-prints will come tomorrow.  I won't insist that this approach is right
for your discipline, because maybe you know your own community better
than I do.  But I do feel strongly that it is right for my discipline.
And I can't speak for Paul Ginsparg either, but I would be surprised
if he contradicted me outright, since he has influenced my thinking a
great deal through direct correspondence.

In general your liberation terminology doesn't sit so well with me.  I do
hint at liberation terminology from time to time; in fact the name of my
front end, Front for the Mathematics arXiv, is a deliberate allusion.
If the math arXiv is revolutionary, I would liken it to the American
revolution.  We are building a new system on new territory and letting
immigrants come.  I see a lot of Alexander Hamilton in our approach, and
somewhat less of Thomas Jefferson.  Your comments have some character
of Jefferson, but very little of Hamilton, and often they sound almost
Marxist.  I might compare your overall vision to the Communards of Paris.
But hey, you could be right in your own society.

You have also correctly picked up that I don't accept the dichotomy
between preprints and postprints.  My view is that the preprint
and the postprint are Tweedledum and Tweedledee.  But that is a topic
for another posting.
--
  /\  Greg Kuperberg (UC Davis)
 /  \
 \  / Visit the Math ArXiv Front at http://front.math.ucdavis.edu/
  \/  * All the math that's fit to e-print *


Re: Central vs. Distributed Archives

2000-11-09 Thread Greg Kuperberg
On Thu, Nov 09, 2000 at 05:58:14PM +, Tim Brody wrote:
 I may also point out that there are already archives that perform
 distributed mirroring - math arXiv is primarily made up of papers that
 have been archived elsewhere (judging by the lack of associated meta
 data and updates).

I don't understand this comment.  Most of the papers in the math arXiv
are eventually published, and many are in preprint series of one sort
or another.  However I conjecture that at least half of the submissions
in the most recent three months are not on any other web site, not
even on a home page.  And for those that are not published or not yet
published, the arXiv is the only project that explicitly promises to
keep them permanently.
--
  /\  Greg Kuperberg (UC Davis)
 /  \
 \  / Visit the Math ArXiv Front at http://front.math.ucdavis.edu/
  \/  * All the math that's fit to e-print *


Re: Central vs. Distributed Archives

2000-11-09 Thread Greg Kuperberg
On Thu, Nov 09, 2000 at 07:16:47PM +, Stevan Harnad wrote:
 I don't think sublinear or linear growth is right for
 your discipline (maths) either...

Of course more growth is better than less.  Several of us (both the arXiv
staff led by Paul Ginsparg and the math advisory committee chaired by Dave
Morrison, on which I serve) have worked hard to accelerate the growth of
the math arXiv.  I can report a partial victory.  The archives that we
glued together were at best growing linearly with a low slope and were
showing some signs of sublinearity.  After we put them together there was
a discontinuous increase in new submissions, and linear growth commenced
with a higher slope.  I don't have a chart but the numbers are there at

http://front.math.ucdavis.edu/math

After we had changed so much, I was surprised that growth was still
linear.  (Paul Ginsparg wasn't surprised.)  I now believe that linear
growth in e-prints is inherent.  But both the discontinuity and the
one-time change in slope were heartening.  That is a realistic goal when
you change the system.
--
  /\  Greg Kuperberg (UC Davis)
 /  \
 \  / Visit the Math ArXiv Front at http://front.math.ucdavis.edu/
  \/  * All the math that's fit to e-print *


Re: Central vs. Distributed Archives

2000-11-08 Thread Thomas Bacher
This is what University Presses need to become -- the formatter, keeper, and
distributor (with the university library) of the intellectual goods. If that
were to happen, funded of course by the university, then the university
could avoid paying twice (once to the researcher and again to the publisher)
for intellectual property. The university would also save money in the long
term. I believe it will come to this model within the next five years.

Thomas Bacher, Director, Purdue Press
1207 SCC-E, W. Lafayette, IN 47907-1207
(765)494-2038   Fax: (765)496-2442
www.thepress.purdue.edu

Be at your life-long-learning best. Read from a University Press.


Re: Central vs. Distributed Archives

2000-11-08 Thread David Goodman
Departments are not the place, for exactly the reasons John explains. More
than one of the academic depts. in more than one major university I have been
affiliated with has managed to lose unique copies of Ph.D. theses, as well as
every other possible type of item.
I think this is an appropriate role for libraries in two dimensions: each
university library should take the responsibility for all publications by its
faculty and students, AND appropriate major libraries or groups of libraries
could also take the responsibility for specific research areas that are not
being otherwise covered.
If university presses wanted to participate I think most libraries would
welcome the partnership.

The systems are inexpensive enough for redundancy to be affordable, and this
might be one solution to the refereed/nonrefereed controversy. It only
requires adequate cross-archive indexing.

A part of the savings could be used to increase the number of librarians
helping the other members of the university navigate the new system. Most
users need help in navigating the present system (the higher the academic
level the more likely they are to request it, because they know enough to
realize they need it). They will need it all the more during the period of
transition. Nothing in the prior course of human-developed systems gives
reason to suppose they will need it less even after the transition is
complete. (If the AI people think they can compete in this, I encourage them
to keep trying.)

John MacColl wrote:

 Greg Kuperberg wrote:

   So should we mathematicians trust individual math departments to
   permanently preserve their e-prints? I don't think so. Our own math
   preprint series at UC Davis is an arXiv overlay - all articles are
   automatically contributed to the math arXiv. One of my arguments for this
   arrangement is that we can't promise to babysit these preprints forever.
   We could easily forget our obligation.

 Stevan Harnad replied:
 
  The Department could easily forget; the institutional library is unlikely
  to do so. It has a lot of prior practice with stability/permanence! (And
  it has a good deal to gain from maintaining robust institutional Eprint
  Archives: The prospects of serials-crisis relief, as other
  institutional libraries do the same thing, with their own Eprint
  archives --

 I would concur with this response, and would wish to develop a couple of
 points about why libraries are important in the freed literature scenario.
 Interestingly, the notion of 'forgetting' gives a new dimension to the
 notion of libraries as 'memory organisations'. They are no longer simply
 memory organisations in the sense of storing knowledge, as in a memory, but
 particularly as that knowledge becomes networked they are becoming
 organisers of access, for which function their contribution to their parent
 institution is to understand information structures, sources and
 presentations. This requires that they are memory joggers as well as memory
 fillers. That has always been true, but internet publication has increased
 both the complexity of these structures, and the rate of publication. More
 and more the challenge for academic libraries is to preserve the roles of
 hunters and collectors of knowledge in the age of internet publishing: that
 requires that they take a much more active approach to identifying and
 maintaining knowledge than was required in the age of print, when libraries
 had adapted to the culture of publishers, and had settled into a role which
 was primarily passive.

 But as Stevan says, interoperability in the world of eprint archives has not
 been tried before (and therefore cannot be criticised as the wrong model).
 More than that, it is at present the only model really capable of surviving
 in the world of internet publishing, and it conforms to the way librarians
 see publishing culture moving, which is why the library profession is so
 concerned with metadata - the key to the knowledge structures which are in
 transition. In the passive model, academics and researchers ordered books
 and journals via the library, and the library sought to ensure that the
 material which arrived in the form of physical product was organised
 optimally. Now, we find academics and researchers creating web sites with
 links to internet sources, and themselves interacting with such sources (as
 they will with open archives) without needing to act via the library. Our
 role as librarians is to keep pace with these changes and evolve new methods
 for providing not only 'permanence and stability', but also description and
 classification to ensure that sources are findable by other researchers,
 students and teachers. So - to take Greg's point about centralisation -
 whether an institution wishes to create an open archive for itself as an
 institution, or whether a single department wishes to do it, is a matter for
 them to decide, but either way it is in their interest to let the library
 know that 

Re: Central vs. Distributed Archives

2000-11-08 Thread Greg Kuperberg
On Wed, Nov 08, 2000 at 12:30:39PM -0400, David Goodman wrote:
 Departments are not the place, for exactly the reasons John explains. More
 than one of the academic depts. in more than one major university I have been
 affiliated with has managed to lose unique copies of Ph.D. theses, as well as
 every other possible type of item.

The fact is that most math papers on the web (excluding those in the
arXiv) are on department, and not campus-wide, web servers.  This is
even true of papers that are organized into preprint series.  One of the
dangers of an interoperability approach is to hoist the e-print vision
on such an accidental foundation.  I also agree with John MacColl's
position that libraries are more reliable archivists than departments
in principle.  But I disagree entirely with the claim that distributed
interoperability has never been tried before.  It has been tried several
times, whole-heartedly with these two projects:

MPRESS - mathnet.preprints.org
NCSTRL - ncstrl.org

And it has been a factor in many other projects, including Hypatia
and the AMS preprint server.  Some of these projects are more
successful than others, but *all* of them suffer from inconstancy
of the underlying archives.

While libraries certainly should help preserve e-prints, I do not trust
any one library, nor any other sole institution, to archive material
single-handedly.  Any caretaker can lose or destroy a unique copy of
any document.  (Just last year the Boston Public Library lost thousands
of books in a flood, for example.)  That is why it is important to
redundantly and openly mirror an archive and not just allow third-party
searches.  The arXiv has 18 mirror sites on six continents, listed at:

http://arxiv.org/servers.html

That is not as many copies of the arXiv as I would like to see, although
it is enough full-fledged active mirrors.  More significantly anyone
who wants to can maintain yet another copy of the arXiv following the
instructions at:

http://front.math.ucdavis.edu/scripted

As a rule, it is better for web sites to share the same archive than
to each have fragments.  It is better for Oxford and Cambridge to
each have all of Shakespeare's plays than for Oxford to have only the
comedies and Cambridge to have only the tragedies.  That is why I favor
shared interoperability, which is in some ways centralized, to fragmented
interoperability, which is optimistically called decentralized.  Massive
redundancy is one of the few strengths of the existing paper-based system;
let's not tear up the road in addition to scrapping the horse carriage.
--
  /\  Greg Kuperberg (UC Davis)
 /  \
 \  / Visit the Math ArXiv Front at http://front.math.ucdavis.edu/
  \/  * All the math that's fit to e-print *


Re: Central vs. Distributed Archives

2000-11-07 Thread Greg Kuperberg
On Tue, Nov 07, 2000 at 03:15:36PM +, Stevan Harnad wrote:
 So the answer is: Sure I'd have been happy to have CogPrints subsumed
 by arXiv if that had proved to be the way to get the entire refereed
 corpus online and free. But now it looks as if OAI-compliant
 distributed Eprint Archiving (including arXiv) will instead be
 subsumed into the global virtual Eprint Archive.

I have learned not to claim that the arXiv is the Philosopher's Stone,
much as I would like it to be.  But if you're serious about merging
with the arXiv, let's see how well OAI is doing in a year, as measured
by the number of search queries at multiarchive OAS agents.
--
  /\  Greg Kuperberg (UC Davis)
 /  \
 \  / Visit the Math ArXiv Front at http://front.math.ucdavis.edu/
  \/  * All the math that's fit to e-print *


Re: Central vs. Distributed Archives

2000-11-06 Thread Stevan Harnad
On Fri, 3 Nov 2000, Greg Kuperberg wrote:

 It is not really a neutral statement to declare that it no longer
 matters whether a paper is in a central archive or a distributed one.
 Each archive is in a way an entrenched interest.  Each archive maintainer
 has put a lot of work into his or her project, and therefore wouldn't
 want it assimilated into a larger archive without a very good reason.

I am afraid I cannot follow this at all. Are you saying that the
maintainer of a free public archive of refereed research has an
interest in NOT having that research assimilated into still larger
public archives, if it increases their visibility, accessibility and
impact?

(If there really do exist such entrenched archive-maintainer
interests, they begin to resemble the conflict of interest that has
emerged between researchers and journal publishers, when it comes to
access-barriers to their work!)

The maintainers I have in mind are those whose interest is in freeing
this research from needless access/impact barriers, not in adding to
them!

In particular, neither universities who provide distributed
institutional Eprint Archives for self-archiving the refereed research
of their researchers, nor Learned Societies who do so for the sake of
their disciplines, in a centralized archive, have anything to gain from
preventing their respective archive contents from being harvested by
Open Archive Services into still larger virtual archives, all
seamlessly interoperable (e.g., http://arc.cs.odu.edu/).

As to justifying access-barriers on the grounds that the archive
maintainer has put a lot of work into his or her project, the Eprints
software should now make that work so minimal that this dubious
rationale becomes moot anyway: http://www.eprints.org

 This is overconfidence.  The biggest reason that it is overconfidence
 is that it defers the permanence question.  But there are other reasons
 as well.  One is that one of the most useful features of the arXiv
 (and similar services such as CogPrints) is immediate notification of
 new results.

There is no (not-readily-solvable) permanence question. At this
point, getting the literature on-line and free is the most important
thing to do, now. The collective interests that this will generate in
KEEPING it all on-line and free will ensure that all proper steps are
taken to ensure permanence.

The OAI-compliant archive-creating/maintaining Eprints software has the
same notification service as CogPrints -- indeed, it is a generic
adaptation of the CogPrints software!
http://cogprints.soton.ac.uk

 Another is non-redundancy: the arXiv almost completely
 eliminates the disarray of having many copies of a paper which may
 or may not be different versions.  The OAI standard does not address,
 and perhaps cannot address, either of these important advantages of a
 centralized system.

The OAI standard has not yet addressed version control (it will), but
the OAI-compliant Eprints software has. Moreover, version-sorting is
a natural function for an Open Archives Service that harvests all
versions of a paper and sorts them the way you like (date, archive,
use, etc.). Such a service is a natural one to go hand in hand with
citation-linking (which likewise has to sort versions):
http://opcit.eprints.org
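
By way of illustration only: a sketch of the kind of version-sorting such a
harvesting service could perform, grouping harvested records that share an
identifier and ordering the copies by datestamp (the record layout and
identifiers below are invented for the example):

    from collections import defaultdict

    # Invented records standing in for copies of one paper harvested
    # from two different OAI-compliant archives.
    harvested = [
        {"identifier": "oai:example.edu:paper-42", "archive": "institutional",
         "datestamp": "2000-10-15"},
        {"identifier": "oai:example.edu:paper-42", "archive": "central",
         "datestamp": "2000-11-02"},
    ]

    def sort_versions(records):
        """Group records by paper identifier; within each group, oldest first."""
        grouped = defaultdict(list)
        for rec in records:
            grouped[rec["identifier"]].append(rec)
        for copies in grouped.values():
            copies.sort(key=lambda r: r["datestamp"])  # ISO dates sort as strings
        return dict(grouped)

    for paper, copies in sort_versions(harvested).items():
        print(paper, "->", len(copies), "versions; latest in", copies[-1]["archive"])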

 interoperability keeps getting reinvented.

The OAI protocol is steadily being optimized (and the OAI-compliant
Archives with it): Is this a bad thing?

 Precedent suggests that if OAI succeeds, it will fade into a
 transparent layer, and that beyond it people will see incompatibility
 at a new level and invent another standard.

This sounds unduly pessimistic (and could be said against any attempt
to create interoperability standards).

 HTTP is already an interoperability standard, originally invented for
 the purpose of distributing research documents.
 And there are already HTTP-based search engines, including CiteSeer,
 which searches only for research papers.  So it's important to explain how
 OAI would go beyond HTTP+CiteSeer.

I suggest that this question be re-directed to the OAI discussion list,
which is concerned with the technical details: u...@vole.lanl.gov
http://vole.lanl.gov/pipermail/ups/


Stevan Harnad                     har...@cogsci.soton.ac.uk
Professor of Cognitive Science    har...@princeton.edu
Department of Electronics and     phone: +44 23-80 592-582
 Computer Science                 fax:   +44 23-80 592-865
University of Southampton         http://www.ecs.soton.ac.uk/~harnad/
Highfield, Southampton            http://www.princeton.edu/~harnad/
SO17 1BJ UNITED KINGDOM

NOTE: A complete archive of the ongoing discussion of providing free
access to the refereed journal literature online is available at the
American Scientist September Forum (98 & 99 & 00):


http://amsci-forum.amsci.org/archives/American-Scientist-Open-Access-Forum.html

You may join the list at the site above.

Discussion can be posted to:

american-scientist-open-access-fo...@amsci.org

Re: Central vs. Distributed Archives

2000-11-06 Thread Greg Kuperberg
On Mon, Nov 06, 2000 at 05:46:57PM +, Stevan Harnad wrote:
 I am afraid I cannot follow this at all. Are you saying that the
 maintainer of a free public archive of refereed research has an
 interest in NOT having that research assimilated into still larger
 public archives, if it increases their visibility, accessibility and
 impact?

My position is borne entirely out of practical experience and not
theory, and I am not saying exactly that.  For a subject-based archive
(as opposed to institutional), the maintainer has an interest in retaining
credit for his efforts. He may also have at least a perceived interest in
retaining control over the archival procedures.  If an outside archive is
assimilated into the huge arXiv, certainly it increases the visibility,
accessibility, you-name-it-ability, of the individual papers.  However the
former maintainer's name may well fade into the background.  At best
asking a maintainer to merge with the arXiv is asking him to change his
duties (if he stays on as an arXiv moderator or an overlay maintainer).
At worst it's asking him to retire.  The math advisory committee has had
dozens of negotiations to merge material into the arXiv.  We consider
all such negotiations to be delicate.

After all, Stevan, suppose that we told you that CogPrints would be better
off as part of the arXiv and you should surrender your collection and
your responsibilities.  Would you immediately agree, or would you want
some time to think about it?

Some might ask, what is there to decide about how to run an archive?
For example, the arXiv's policy is that DVI is unreliable as an input
format, although it does offer it as output.  The arXiv requires TeX
source for new submissions if they are written in TeX.  There are other
subject-based archives out there that accept *only* DVI as a submission
format.  The maintainers of these archives feel that TeX source is an
unreliable input format, and moreover that TeX source is confidential
for some authors.  It is very difficult to defuse this seemingly minor
issue, and it is only one of several such issues.

For institutional preprint series the issues are a little different,
but they are equally obstructive.  Usually an institutional maintainer
is less interested in retaining credit, but more concerned, sometimes
correctly, about following his mandate.  If we suggest to university
U that they contribute their papers to the arXiv, the maintainer at U
may say, "our faculty gave permission for me to list their papers in our
preprint series, but not to contribute them to your arXiv."  That can
lead to yet another bureaucratic thicket.

Right behind these superficial issues are more significant ones like
permanence.  The fact is that many institutional and subject-based
archives do not want the responsibility of permanence.  Some of them
explicitly repudiate it.  A standards-based virtual archive approach,
such as OAI, aspires to please every side and sweep all such issues under
the rug.  I wonder if this is rushing in where angels fear to tread.

 There is no (not-readily-solvable) permanence question. At this
 point, getting the literature on-line and free is the most important
 thing to do, now. The collective interests that this will generate in
 KEEPING it all on-line and free will ensure that all proper steps are
 taken to ensure permanence.

Again, experience tells me otherwise.  Thousands of math preprints have
come and gone on the web.  Let me also give you a quote from a help page
of a non-arXiv math archive:

When your paper is ultimately published we would greatly appreciate
being informed. At that time we will remove the preprint and leave
a pointer to the journal in which it was published.

This flatly contradicts your vision of freeing the literature.  But OAI
itself does not pass judgement on such policies.

 The OAI-compliant archive-creating/maintaining Eprints software has the
 same notification service as CogPrints -- indeed, it is a generic
 adaptation of the CogPrints software!

Yes, but it *only* notifies the subscribers of that one little archive.
The OAI standard leaves OAS agents with no clear notification mechanism,
because there is no guarantee that the agent will be notified in a
timely manner by the foundational archives.
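
Concretely, the best a harvesting agent can do under such a standard is
approximate notification by polling: asking each archive, at whatever
interval it can afford, for records added or changed since its last visit.
A minimal sketch, assuming a hypothetical OAI endpoint (the URL and date are
placeholders):

    from urllib.request import urlopen
    import xml.etree.ElementTree as ET

    NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}

    def new_since(base_url, last_visit):
        """List identifiers of records added or changed on or after last_visit.

        Timeliness depends entirely on how often this poll runs and on how
        promptly the source archive datestamps its new records.
        """
        url = (base_url + "?verb=ListIdentifiers&metadataPrefix=oai_dc"
               + "&from=" + last_visit)
        tree = ET.parse(urlopen(url))
        return [header.findtext("oai:identifier", default="", namespaces=NS)
                for header in tree.iterfind(".//oai:header", namespaces=NS)]

    # e.g. new_since("https://example.org/oai", "2000-11-01")
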
--
  /\  Greg Kuperberg (UC Davis)
 /  \
 \  / Visit the Math ArXiv Front at http://front.math.ucdavis.edu/
  \/  * All the math that's fit to e-print *


Re: Central vs. Distributed Archives

2000-11-03 Thread Stevan Harnad
On Thu, 2 Nov 2000, Greg Kuperberg wrote:

 We have had much more success by moving in the opposite direction,
 i.e., by strengthening distributed open archival with a centralized
 foundation.

And continued good success to the math arXiv project!

But why restrict efforts to centralized ones only? The whole point of
OAI interoperability is that it should no longer make any difference
whether a refereed paper is archived in a central archive or a
distributed archive or both! (The only alternative we want to avoid is
neither!)

By way of example of how it no longer makes any difference, CogPrints
http://cogprints.soton.ac.uk is a centralized archive for
cognitive science -- but it is using EXACTLY the same OAI-compliant
Eprints architecture as has been developed for distributed,
institution-based archiving by http://www.eprints.org. In fact, the
OAI-compliant Eprints software was DERIVED from the prior centralized
CogPrints software!

And institutions are institutions, whether they mount centralized
disciplinary archives or distributed institutional ones.

And mirroring and harvesting for reliability and permanence are
available to both.

So why keep repeating that centralized archiving helped accelerate math
archiving more quickly than the prior (pre-OAI) distributed archiving?
True, but things didn't stop there. And linear growth is still linear
growth, whereas what we need is exponential growth, across all
disciplines, if we are to reach the optimal and inevitable before we
expire!

So let 1000 flowers bloom, central and distributed. Interoperability
will harvest them all.

 The MPRESS project (http://mathnet.preprints.org/)
 has a lot in common with OAI, and it was started before the universal
 math arXiv.  It has its own metadata standard, Dublin Core, and it
 has a number of institutional preprint series among its data feeds.
 But it hasn't yet caught on.

Maybe that was because it was going it alone, instead of distributing
its efforts across disciplines, as the Open Archives Initiative is
doing. It's one thing to adopt a standard, quite another to get others
to adopt it too.

(This is why your advocacy of centralized archiving and anti-advocacy
of distributed archiving is divisive and counterproductive: We should
be supporting every effort that gets all the refereed literature up
there, online, accessible, searchable, navigable, and free for all.
Centralized archiving has not managed this alone, so let it now benefit
from the help of Distributed Archiving!)

 It doesn't seem to make much difference to
 authors whether a preprint series is indexed by MPRESS or not.

I don't understand this point. It may be another symptom of the
conflation between publishing and archiving, and between preprints and
postprints: What authors are choosing when they PUBLISH a paper is a
journal, i.e., a quality-certifier with a known level of quality, a
trusted brand. What authors are choosing when they ARCHIVE their
eprint -- whether the journal-certified, refereed POSTprint or the
unrefereed PREprint -- is a means of making their paper maximally
visible and accessible online, for free for all. OAI-interoperability
provides that, provided the metadata-protocol is shared by all
archives, irrespective of whether they are centralized or
institutional.

MPRESS apparently did not become such a universal (we might even call it
distributed) standard. Perhaps this was in part because it did not
initially adopt OAI's strategy of minimalism: Pick the minimal
functional metadata set, to maximize the ease of compliance, rather than
going all the way to Dublin Core from the outset. (OAI is inching
towards Dublin Core too, but thanks to minimalism and proselytising
across disciplines, it may manage to bring everyone else along with
it.)
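
As a rough illustration of what a minimal functional metadata set amounts to
for one self-archived postprint (the field names follow unqualified Dublin
Core; the values are invented):

    # One self-archived paper described with a handful of unqualified
    # Dublin Core elements -- about the least an archive must expose to be
    # harvestable, which is what keeps the cost of compliance low.
    minimal_record = {
        "title":      "An Example Postprint",
        "creator":    "A. Author",
        "date":       "2000-11-03",
        "identifier": "https://eprints.example.edu/1234/",  # placeholder URL
        "type":       "refereed postprint",                 # preprint vs. postprint
        "source":     "Journal-Name 12(3) 45-67",           # the certifying journal
    }

    for element, value in minimal_record.items():
        print("dc:" + element + ":", value)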

 Part of
 the trouble with MPRESS is that not all of its sources are providing
 as good metadata as they promised.  Ironically the lion's share of good
 metadata in MPRESS comes from the math arXiv.

 I would like to know where OAI thinks that MPRESS went wrong.  In fact
 since I maintain a service provider for the math arXiv, I looked into
 using OA-compliant metadata instead of the ad hoc metadata that I get from
 the arXiv.  I discovered that the OA standard is an oversimplification
 of the full arXiv metadata record, to the point that I can't use the
 OA format.

I will have to leave this to OAI experts to reply to.

 But don't get me wrong.  I am in favor of fragmented interoperability if
 you really can't hope for something better.  And as I said, the overall
 STM literature might well have to be fragmented, for now, down to the
 level of individual disciplines (e.g. chemistry) or small groups of
 disciplines (physics+math+cs).

Fragmented interoperability is a tautology: The whole point of
interoperability is shared metadata standards unifying distributed
(fragmented) systems.

As to hopes: The only pertinent hope is the freeing of the entire
refereed literature online. Centralized self-archiving alone 

Re: Central vs. Distributed Archives

2000-11-03 Thread Greg Kuperberg
On Fri, Nov 03, 2000 at 08:24:44AM +, Stevan Harnad wrote:
 But why restrict efforts to centralized ones only? The whole point of
 OAI interoperability is that it should no longer make any difference
 whether a refereed paper is archived in a central archive or a
 distributed archive or both! (The only alternative we want to avoid is
 neither!)

It is not really a neutral statement to declare that it no longer
matters whether a paper is in a central archive or a distributed one.
Each archive is in a way an entrenched interest.  Each archive maintainer
has put a lot of work into his or her project, and therefore wouldn't
want it assimilated into a larger archive without a very good reason.
So saying that it no longer matters whether it is centralized or
distributed is like saying that it no longer matters whether states
answer to Washington.

This is overconfidence.  The biggest reason that it is overconfidence
is that it defers the permanence question.  But there are other reasons
as well.  One is that one of the most useful features of the arXiv
(and similar services such as CogPrints) is immediate notification of
new results.  Another is non-redundancy: the arXiv almost completely
eliminates the disarray of having many copies of a paper which may
or may not be different versions.  The OAI standard does not address,
and perhaps cannot address, either of these important advantages of a
centralized system.

A more balanced point of view would be to recognize that while a
standards-based distributed system may be much better than anarchy,
it doesn't finish the job.

I also note that interoperability keeps getting reinvented.  Precedent
suggests that if OAI succeeds, it will fade into a transparent layer,
and that beyond it people will see incompatibility at a new level and
invent another standard.  HTTP is already an interoperability standard,
originally invented for the purpose of distributing research documents.
And there are already HTTP-based search engines, including CiteSeer,
which searches only for research papers.  So it's important to explain how
OAI would go beyond HTTP+CiteSeer.
--
  /\  Greg Kuperberg (UC Davis)
 /  \
 \  / Visit the Math ArXiv Front at http://front.math.ucdavis.edu/
  \/  * All the math that's fit to e-print *


Re: Central vs. Distributed Archives

2000-11-02 Thread Greg Kuperberg
I have been skimming the September98 forum on and off for a few months.
As a cursory Internet search will demonstrate, I strongly support
what I consider the Ginsparg model, especially in my own discipline,
mathematics.  I would call it the arXiv model.  But while I agree in
outline with Stevan Harnad et al, I disagree in some of the details.
(And that's where the devil is.) Here is my take on three issues in
particular.

1) I have mixed feelings about the grass-roots connotations of the Open
Archives Initiative and even more in Harnad's phrase self-archiving.
I do believe that the research literature should be electronic and free,
and it is possible that each discipline must pass through an anarchic,
do-it-yourself phase of open archival before moving on to a more
organized stage.  However, when I started archive work in mathematics,
we already had an array of separate preprint servers cum e-print archives.
The effort since then has been to reorganize much of this jumble into the
math arXiv.  Having many copies of one huge archive is superior to having
many little archives, no matter how interoperable.  Serious permanence
and stability requires closer cooperation than that.

At the overall STM level the literature may have to be divided
into single-discipline or few-discipline fragments for some time.
The Los-Alamos based arXiv works well for the TeX-based e-print culture
in mathematics, physics, and parts of computer science.  But it is
not clear how to extend that particular system to the rest of science.
If you have to have disjoint archives, fragmented interoperability is
then a good goal to work towards.  But you have to realize that it is
only a partial solution.  And I have reservations about encouraging every
tenth researcher to set up yet another archive, because that can lead to
entrenched Lilliputian fiefdoms of e-prints.  By my standards the physics
part of the arXiv, with 130,000 e-prints, is large; the math arXiv,
with 13,000, is medium-sized; and an archive with 1,300 or fewer is tiny.

2) I have been accused, sometimes correctly, of being overzealous in
my support of the arXiv.  I see that Stevan Harnad has about as much
enthusiasm as I do, and I can't criticize that.  But if the September98
forum has strong advocacy in favor of open archives, it doesn't make sense
to limit criticism.  Because then you're just preaching to the choir.
If you don't want to debate whether or not open archives are a good idea,
maybe that makes sense.  But then you shouldn't dwell on how fantastic
open archives are; instead you should steer the discussion to practical
plans.

3) I also can't criticize Elsevier's Chemistry Preprint Server project.
In a way I can't even criticize commercial publishers with high journal
prices, even though I believe that the mathematical literature should
be free.  A for-profit company is entitled to maximize profit.  If it is
publicly traded, it is legally required to do so up to a point.  (By the
same token, the customer, academia, is entitled to minimize expenses.)
I'm against Napster-style copyright infringement and I have mixed
feelings about journal boycotts.  My approach is less confrontational.
My own recent papers lie permanently in the arXiv, I keep the copyright,
and I will publish in any journal that wants the papers on those terms.

From this point of view, I am not sure about the Chemistry Preprint
Server, because I don't see the business model for it.  But then, I
don't see the business model for Google either, and I think that Google
is great.  It is possible that the Chemistry Preprint Server will be
an important gift from Elsevier to the chemistry research community.
Arguably the chemists should have done it for themselves, but maybe they
lack leadership and need Elsevier to do it for them.
--
  /\  Greg Kuperberg (UC Davis)
 /  \
 \  / Visit the Math ArXiv Front at http://front.math.ucdavis.edu/
  \/  * All the math that's fit to e-print *


Re: Central vs. Distributed Archives

2000-11-02 Thread Stevan Harnad
On Thu, 2 Nov 2000, Greg Kuperberg wrote:

 1) I have mixed feelings about the grass-roots connotations of the
 Open Archives Initiative and even more in Harnad's phrase
 self-archiving.

You have to distinguish between the Open Archives Initiative (OAI) and
the (Author/Institution) Self-Archiving (Sub-)Initiative.

OAI has now evolved into an initiative for shared standards and
interoperability in the metadata tagging of the contents of online
archives -- WHETHER OR NOT the contents (i.e., apart from the metadata)
of the archives are full-text or free: http://www.openarchives.org

A commercial publisher, for example, can establish an OAI-compliant
Open Archive as readily as any other institution or individual, and
would benefit from the increased visibility provided by the
OAI-compliant interoperability for the contents of the Archive, even if
the full-texts were kept behind an S/L/P financial firewall.

A journal publisher can also establish an OAI-compliant FREE Open
Archive, if they do wish to give away their full-text contents at this
time (as around 400 biomedical publishers are currently willing to do,
as indicated in a very recent posting:
http://www.freemedicaljournals.com
-- although most of those archives are not yet OAI-compliant).

Nor is the OAI particularly committed to either centralized,
discipline-based Open Archiving (e.g. ArXiv, CogPrints) or distributed,
institution-based Open Archiving (Eprints): It is developing
interoperability standards that apply to both, with the objective of
making the difference between them less significant, eventually perhaps
even irrelevant.

The (Author/Institution) Self-Archiving (Sub-)Initiative, however, is
SPECIFICALLY concerned with freeing the refereed research literature
through author/institution self-archiving (in OAI-compliant Open
Archives): http://www.eprints.org

 I do believe that the research literature should be
 electronic and free, and it is possible that each discipline must pass
 through an anarchic, do-it-yourself phase of open archival before
 moving on to a more organized stage.

It is not at all clear why you describe open archiving as anarchic!
It was precisely in order to put order into distributed online digital
archiving resources through interoperability that the OAI was
initiated!

And the other aspect of the order is the order already provided by the
refereed journals, in the form of peer review and its certification.
That order is medium-independent, and will be preserved in a
well-tagged Open Archive: Journal-Name will be a field, etc.

The only do-it-yourself issue is self-archiving itself. And the issue
is very clear: If researchers want the refereed literature freed, now,
then they can do it themselves, by self-archiving, now. Otherwise, they
have to wait until someone else (the journal publishers?) decides to
free it for them -- and that could prove to be a very long wait
indeed.

Harnad, S. (1999) Free at Last: The Future of Peer-Reviewed
Journals.  D-Lib Magazine 5(12) December 1999
http://www.dlib.org/dlib/december99/12harnad.html

 However, when I started archive work in mathematics, we already had an
 array of separate preprint servers cum e-print archives. The effort
 since then has been to reorganize much of this jumble into the math
 arXiv. Having many copies of one huge archive is superior to having
 many little archives, no matter how interoperable. Serious permanence
 and stability requires closer cooperation than that.

Again, it is a question of how long the researcher community is willing
to wait for the optimal and inevitable: It is now within immediate
reach to eliminate all the research access/impact-barriers, now,
through self-archiving. Interoperability will integrate the results
into a global Archive of the entire refereed research literature, in
all disciplines, as searchable as the Institute for Scientific
Information's Current Contents Database -- but including the full-texts
themselves (and free). (See ARC as a prototype and fore-taste of this
capability:  http://arc.cs.odu.edu/)

But note that arXiv-style centralized, discipline-based self-archiving
in Physics, the most advanced self-archiving on the planet -- with
130,000 archived papers in 10 years -- has only freed 30-40% of the
Physics literature so far, and will take 10 more years to free it all
at the present steady linear growth rate:
http://arXiv.org/cgi-bin/show_monthly_submissions

Note that I used to cite the above graph repeatedly as evidence that
the self-archiving cup is half-full. But it is also evidence that it is
still half-empty -- and taking another 10 years to fill.

So the idea is that distributed, pan-disciplinary, institution-based
self-archiving (OAI-compliant, of course) may be what is needed to get
this growth rate into the exponential range for Physics, as well as to
carry it over into all the other disciplines.

Of course multiple copies and mirroring (and harvesting and caching)
will be as important for 

Re: Central vs. Distributed Archives

2000-11-02 Thread Greg Kuperberg
On Thu, Nov 02, 2000 at 03:07:58PM +, Stevan Harnad wrote:
 It is not at all clear why you describe open archiving as anarchic!
 It was precisely in order to put order into distributed online digital
 archiving resources through interoperability that the OAI was
 initiated!

I certainly think that a standard for interoperability could be useful,
but it is wishful thinking to suppose that it can tame an anarchy of many
tiny little e-print archives.  In my discipline, when the literature
is excessively decentralized, as it was entirely before 1998 and still
largely is, neither authors nor readers have any confidence that papers
floating around on the Net are permanent.  And they are right, because
no one could promise to keep those papers forever with any credibility.
Any given paper could be erased accidentally if it is in one tiny
archive somewhere.  Or maybe the maintainer of that particular archive
never explicitly promised permanence anyway; if so he could shut down
his archive when he gets tired.  The fact that the arXiv is so large
and so widely used and mirrored is a necessary ingredient for assuring
permanence.

 The only do-it-yourself issue is self-archiving itself. And the issue
 is very clear: If researchers want the refereed literature freed, now,
 then they can do it themselves, by self-archiving, now.

The self in self-archiving could mean individuals acting for themselves,
or it could mean the research community acting for itself by directly
supporting one or a few archives.  I have the feeling that you don't
see this as an important distinction.  I'll give you an analogy to show
you what I mean.  I use Linux, which is an open, standards-based operating
system.  It would be absurd to call my use of Linux self-programming,
even though Linux is maintained by some of its users.  I see the arXiv as
highly analogous to Linux.  This is why I am reluctant to use the phrase
self-archiving.

 Again, it is a question of how long the researcher community is willing
 to wait for the optimal and inevitable: It is now within immediate
 reach to eliminate all the research access/impact-barriers, now,
 through self-archiving.

I can't say that this ambitious goal is within immediate reach in
mathematics, because many of us have worked hard to make it happen and
we see a lot of work ahead.  We can't expect all mathematicians to change
their minds in one day.  I have no desire to believe, as I once did,
that the exponential rocket is about to blast off.

If you think that encouraging many small archives to spring up is the
magic step, then I simply disagree.  Because when we glued together
many small archives into the math arXiv, the whole was much more than
the sum of the parts.  Even though the math arXiv has only 5% of new
math papers, and even though it will take years for it to get to even
50%, it is at least growing more quickly than all of the Lilliputian
mathematical archives put together.

  The Los-Alamos based arXiv works well for the TeX-based e-print culture
  in mathematics, physics, and parts of computer science. But it is not
  clear how to extend that particular system to the rest of science.

 Why? This formula has been repeated so many times that people are
 actually believing it, without anyone ever having explained why it
 should be thought to be true!

I don't mean to say that other disciplines can't have an open archive
that's *like* the arXiv.  I certainly think that they can.  I mean that
other disciplines are sufficiently different that their open archives
might need separate administration.  And that would lead to fragmentation,
which concerns me more than it does you.
--
  /\  Greg Kuperberg (UC Davis)
 /  \
 \  / Visit the Math ArXiv Front at http://front.math.ucdavis.edu/
  \/  * All the math that's fit to e-print *


Re: Central vs. Distributed Archives

2000-11-02 Thread Stevan Harnad
On Thu, 2 Nov 2000, Greg Kuperberg wrote:

 I certainly think that a standard for interoperability could be useful,
 but it is wishful thinking to suppose that it can tame an anarchy of many
 tiny little e-print archives. In my discipline, when the literature
 is excessively decentralized, as it was entirely before 1998 and still
 largely is, neither authors nor readers have any confidence that papers
 floating around on the Net are permanent. And they are right, because
 no one could promise to keep those papers forever with any credibility.
 ... The fact that the arXiv is so large and so widely used and
 mirrored is a necessary ingredient for assuring permanence.

(1) Archives meeting the conditions to be registered OAI-compliant
data-providers http://www.openarchives.org/sfc/sfc_archives.htm are
not likely to be tiny little ones (though it is no problem if some of
them are).

(2) Most Eprints Archives are likely to be university-based archives, for
all the university's refereed research, in all its disciplines. That's
hardly tiny (or impermanent) either.

(3) The goal is to free the refereed literature, across disciplines,
now. Once the literature is thus freed the process will be irreversible.

(4) The mechanisms for preserving and navigating it will continue to
evolve and improve, with the whole world's refereed assets in this
distributed basket (suitably mirrored, harvested, cached, backed up,
etc.).

(5) The immediate issue is hence not the PERMANENCE of the
self-archived drafts but their EXISTENCE, free for all, now. The
permanence will take care of itself.

 The self in self-archiving could mean individuals acting for themselves,
 or it could mean the research community acting for itself by directly
 supporting one or a few archives. I have the feeling that you don't
 see this as an important distinction.

You are right; I think it is a red herring. Most of the individuals in
question (the authors of the refereed literature) are researchers at
universities and research institutions. In principle each of them could
set up his own Eprints Archive and register it with the OAI (and that
would be fine as a start, and would free the literature irreversibly).

But of course the likely, practical strategy is for the researchers'
universities and research institutions (or, more specifically, their
libraries) to create and administer their institutional Eprint Archives
for all their researchers' refereed output, in all disciplines. (We can
have at least as abiding a faith in the durability of the collections
on universities' airwaves, then, as we now have in the durability of
the collections on their shelves).

 I can't say that this ambitious goal is within immediate reach in
 mathematics, because many of us have worked hard to make it happen and
 we see a lot of work ahead. We can't expect all mathematicians to change
 their minds in one day.

You are now talking about something else: You are talking about what it
will take to induce the research cavalry to drink, once they have been led
to the waters of self-archiving.

There's no second-guessing human nature, but my own hunch is that the
motivational structure at the researchers' own institution -- the one
that benefits from (and rewards) the impact of its own researchers'
refereed output, and the one that is today weighed down by the serials
crisis and the limitations that that puts on its own researchers'
access to the refereed output of researchers at other institutions --
may provide just the kind of local incentive for self-archiving that a
centralized, discipline based entity so far seems unable to provide.

In any case, these two routes to the liberation of the refereed corpus
(centralized and distributed) are complementary (and interoperable!).

 If you think that encouraging many small archives to spring up is the
 magic step, then I simply disagree. Because when we glued together
 many small archives into the math arXiv, the whole was much more than
 the sum of the parts. Even though the math arXiv has only 5% of new
 math papers, and even though it will take years for it to get to even
 50%, it is at least growing more quickly than all of the Lilliputian
 mathematical archives put together.

I am not a mathematician, but this "whole is greater than the sum of its
parts" argument does not add up for me!

Centralized archiving in maths is at 5% and will take years to get to
50%. What possible reason would there be not to encourage complementing it
by institutional Eprint Archives immediately -- given that they will all be
co-harvested (and mirrored, and cached, etc.) in global virtual archives
anyway, thanks to interoperability?

 other disciplines are sufficiently different that their open archives
 might need separate administration. And that would lead to fragmentation,
 which concerns me more than it does you.

My concern is freeing the refereed literature online, now. There is no
reason it should stay hostage to S/L/P barriers for another 

Re: Central vs. Distributed Archives

2000-11-02 Thread Greg Kuperberg
On my other points:

On Thu, Nov 02, 2000 at 03:07:58PM +, Stevan Harnad wrote:
 I have, as moderator, terminated discussion on a few irrelevant or
 saturated topics (is there a conspiracy of university administrators to
 control researchers' intellectual property? is the library serials
 crisis simply a consequence of under-funding the libraries? how can we
 reform or abandon peer review?), but comments, whether supportive or
 critical, on the Forum's central theme -- How to free the refereed
 literature online, now? -- have never been suppressed.

You may see it as closing discussion of all sides of a topic, but I see
some character of closing down just one side of a debate.  Obviously you
are referring to Al Henderson's argument that free scholarly communication
is a stress response to penny-pinching by university administrations.
I'll grant that he has said that many times, and I'll also grant that the
argument sounds absurd to me.  (I am one of the researchers supposedly
bullied by the administration, and if anything my complaint is that
the higher-ups are biased in favor of the historical subscription-based
system.)  But even though I don't agree with him at all, he is no more
repetitive than you are or I am.  Invoking cloture strikes me as an
overreaction.

 I couldn't agree with you more! But what gives you the impression that
 this Forum is trying to prevent companies from doing whatever they
 like?

What you said originally was:

   The Elsevier policy of publicly archiving pre-refereeing preprints
   could be a good first step towards the optimal and inevitable, but it
   is also possible that it is intended as a Trojan Horse,...

I think it's divisive to speculate that someone else's e-print archive is
a Trojan Horse.  It's true that I'm not sure that the CPS is compatible
with Elsevier's mission of maximizing profit.  But let's give it the
benefit of the doubt.
--
  /\  Greg Kuperberg (UC Davis)
 /  \
 \  / Visit the Math ArXiv Front at http://front.math.ucdavis.edu/
  \/  * All the math that's fit to e-print *


Re: Central vs. Distributed Archives

2000-11-02 Thread Stevan Harnad
On Thu, 2 Nov 2000, Greg Kuperberg wrote:

  what gives you the impression that
  this Forum is trying to prevent companies from doing whatever they
  like?

 What you said originally was:

sh The Elsevier policy of publicly archiving pre-refereeing preprints
sh could be a good first step towards the optimal and inevitable, but it
sh is also possible that it is intended as a Trojan Horse,...

 I think it's divisive to speculate that someone else's e-print archive is
 a Trojan Horse.  It's true that I'm not sure that the CPS is compatible
 with Elsevier's mission of maximizing profit.  But let's give it the
 benefit of the doubt.

Good. Both sides of the question have been aired.

(Please distinguish my actions as moderator, when I invoke cloture,
from the expression of my own views on this topic -- which carry no
more weight then anyone else's ex officio.)

Stevan Harnad


Re: Central vs. Distributed Archives

2000-11-02 Thread Stuart A Yeates
Stevan Harnad wrote:

 (3) The goal is to free the refereed literature, across
 disciplines, now. Once the literature is thus freed the
 process will be irreversible.

Do you mean "free" as in liberty or "free" as in free beer?

This particular bone of contention has effectively split what used to be
known as the free software movement, but is now known as the free software /
open source movement.


--stuart yeates s.yea...@cs.waikato.ac.nz


Re: Central vs. Distributed Archives

2000-11-02 Thread Stevan Harnad
On Fri, 3 Nov 2000, Stuart A Yeates wrote:

  (3) The goal is to free the refereed literature, across
  disciplines, now. Once the literature is thus freed the
  process will be irreversible.

 Do you mean "free" as in liberty or "free" as in free beer?

 This particular bone of contention has effectively split what used to be be
 known as a free software movement, but is now known as the free software/open
 source movement.

Free in the way advertisements are free (which I suppose is more like
free beer -- when you're giving away your own home-brew).

But this refereed brew is definitely not free in the sense of liberty
(that would be the vanity press). It is constrained by and answerable to
peer review. Hence it is not relevantly like software either.

But once it successfully passes that quality-control process, and is
certified as such, the author can and should maximize the access to,
and hence the impact of, this give-away refereed research by
self-archiving it online, free for all.

http://www.arl.org/sc/subversive/


Stevan Harnad                     har...@cogsci.soton.ac.uk
Professor of Cognitive Science    har...@princeton.edu
Department of Electronics and     phone: +44 23-80 592-582
 Computer Science                 fax:   +44 23-80 592-865
University of Southampton         http://www.ecs.soton.ac.uk/~harnad/
Highfield, Southampton            http://www.princeton.edu/~harnad/
SO17 1BJ UNITED KINGDOM


Re: Central vs. Distributed Archives

2000-11-02 Thread Stevan Harnad
I like Greg Kuperberg's postings, even though we disagree. Greg too is an
advocate of freeing the literature through author self-archiving, but he
prefers centralized archives, whereas I think both centralized and
distributed archiving are welcome and should be encouraged, as both can
hasten the freeing of the refereed literature.

Centralized archiving has been with us for over 10 years, and at its
current rates it will take 10 more years to free the Physics literature
alone, where it is most advanced. In Greg's own field of mathematics,
it might be going even more slowly. It looks to me as if centralized
self-archiving can now use the help of distributed institutional
self-archiving.

By way of counterevidence, Greg cites the fact that in mathematics
institutional self-archiving predated centralized self-archiving
and was unreliable. It was centralized self-archiving that accelerated
and stabilized the process.

What Greg seems to overlook is that the institutional self-archiving he
describes PRE-DATED the Open Archives Initiative (OAI), with its
interoperability. Hence the question of whether or not distributed
self-archiving in OAI-compliant Institutional Eprint Archives will
accelerate the freeing of the literature has not yet been tested.

Greg also seems to conflate, at some junctures, the self-archiving of
unrefereed preprints with the self-archiving of refereed postprints,
as if self-archiving were in some sense a rival to or substitute for
refereed publication (which I certainly do not think it is);
self-archiving is merely a way to free the refereed literature.

On Thu, 2 Nov 2000, Greg Kuperberg wrote:

 In 1997, the year before the universal math arXiv was started, there
 were already some 10 or 20 thousand research papers freely available on
 the web. Most of them were on personal home pages, but thousands were
 in institutional and subject-based preprint series.

This is irrelevant, as noted above. These archives were not
OAI-compliant and hence could not be integrated or navigated in a
useful way.

 Nonetheless the vast majority of these papers were still eventually
 sold as published papers.

This too is irrelevant. The initiative to free the refereed literature
is a PRO-RESEARCHER and PRO-RESEARCH initiative, not an anti-publisher
initiative (nor even particularly a pro-library initiative):

The goal is to free the refereed literature for one and all online.
That is what self-archiving does.

The goal is NOT to prevent other versions of the refereed literature
from being sold, on-paper or on-line, if there is a market for them.
(Why would we want to do that?)

 So what were the publishers selling? Not peer review, because you
 can learn from Math Reviews where a paper has been published without
 subscribing to the journal. To a large extent the journal system was
 selling, and is still selling, stability and permanence.

Fine. Let it continue to do so (whether the stability/permanence is real
or merely imagined). As long as another version is online and free, the
goal is met.

 So that has been the fundamental question of open archival in
 mathematics for years. That is why some of the recalcitrant math
 publishers say that the arXiv is just a preprint server and not a
 permanent e-print archive. Of course I don't agree with them; I
 choose the arXiv over subscription journals as the future route to
 permanent archival.

I'm afraid that this is not making sense to me. What is the argument?
That the jeering of some publishers nullifies the fact that that portion
of the refereed literature that has been freed is indeed free?

The substantive question is: Are the refereed papers online and free? If
they are, who cares if some people keep calling them preprints, when in
reality they include both, pre-refereeing preprints + post-refereeing
postprints (= eprints)?

But I sense another point of disagreement with Greg: Earlier he said
it's not the peer-review that makes people keep paying for the for-fee
(refereed) version despite the availability of the for-free (refereed)
version, but the stability and permanence. Perhaps. But if the
implementation of the peer-review were no longer paid for by the
continued support for the publishers' version, perhaps the true value
and causal role of peer-review in all of this would become clearer.

Moreover, for now, it is not true stability/permanence that
distinguishes the publishers' for-fee version and the archives'
for-free version, but mere PERCEIVED stability/permanence.

With time, that may change. But for now it certainly isn't any reason to
deter us from self-archiving, either centrally or institutionally. On
the contrary; as long as the publishers' for-fee version is seen as the
guarantor of the stability/permanence, there is no reason whatever NOT
to SUPPLEMENT that with the self-archived free version -- without giving
the stability/permanence issue another thought!

 As a practical matter most of the institutional preprint series in
 mathematics 

Re: Central vs. Distributed Archives

2000-11-02 Thread Stevan Harnad
On Fri, 3 Nov 2000, Stuart A Yeates wrote:

 So if I hear you correctly, OAI will have no traffic with technical reports or
 technical report servers? These _are_ vanity press.

Incorrect. Eprints Archives are for both unrefereed preprints and
refereed postprints, suitably tagged as such.

Stevan Harnad


Re: Central vs. Distributed Archives

2000-11-02 Thread Greg Kuperberg
On Thu, Nov 02, 2000 at 09:29:24PM +, Stevan Harnad wrote:
 Centralized archiving has been with us for over 10 years, and at its
 current rates it will take 10 more years to free the Physics literature
 alone, where it is most advanced. In Greg's own field of mathematics,
 it might be going even more slowly. It looks to me as if centralized
 self-archiving can now use the help of distributed institutional
 self-archiving.

Actually the main difference in math is that we in effect started
later than physics did.  Part of the reason for that is that some of
the mathematicians involved, including me but not mainly me by any
means, instead devoted effort to umbrella archive projects (i.e.,
global virtual archives) that ultimately failed.  We have had much
more success by moving in the opposite direction, i.e., by strengthening
distributed open archival with a centralized foundation.

 What Greg seems to overlook is that the institutional self-archiving he
 describes PRE-DATED the Open Archives Initiative (OAI), with its
 interoperability.

This is partly untrue.  The MPRESS project (http://mathnet.preprints.org/)
has a lot in common with OAI, and it was started before the universal
math arXiv.  It has its own metadata standard, Dublin Core, and it
has a number of institutional preprint series among its data feeds.
But it hasn't yet caught on.  It doesn't seem to make much difference to
authors whether a preprint series is indexed by MPRESS or not.  Part of
the trouble with MPRESS is that not all of its sources are providing
as good metadata as they promised.  Ironically the lion's share of good
metadata in MPRESS comes from the math arXiv.

I would like to know where OAI thinks that MPRESS went wrong.  In fact
since I maintain a service provider for the math arXiv, I looked into
using OAI-compliant metadata instead of the ad hoc metadata that I get from
the arXiv.  I discovered that the OAI standard is an oversimplification
of the full arXiv metadata record, to the point that I can't use the
OAI format.
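
A minimal sketch of the kind of flattening at issue, in Python, assuming
illustrative field names rather than the arXiv's actual abs format or the
exact Dublin Core element set in use at the time:

    # Hypothetical illustration: a rich arXiv-style record squeezed into
    # unqualified Dublin Core.  Field names on the arXiv side are assumed
    # for illustration only.

    arxiv_record = {
        "id": "math.GT/0000000",            # placeholder identifier
        "title": "An Example Paper",
        "authors": ["A. Author", "B. Author"],
        "abstract": "...",
        "msc-class": "57M25",               # Mathematics Subject Classification
        "journal-ref": "Example J. Math. 1 (2000) 1-10",
        "versions": ["v1", "v2"],           # revision history
    }

    def to_oai_dc(rec):
        """Collapse the rich record into flat, repeatable Dublin Core
        elements.  The MSC code, journal-ref and version history have no
        dedicated element, so they are dropped or lumped into free text."""
        return {
            "dc:title": rec["title"],
            "dc:creator": rec["authors"],
            "dc:description": rec["abstract"],
            "dc:subject": [rec["msc-class"]],   # bare keyword, context lost
            "dc:identifier": rec["id"],
            # journal-ref and per-version dates do not survive the mapping
        }

    print(to_oai_dc(arxiv_record))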

But don't get me wrong.  I am in favor of fragmented interoperability if
you really can't hope for something better.  And as I said, the overall
STM literature might well have to be fragmented, for now, down to the
level of individual disciplines (e.g. chemistry) or small groups of
disciplines (physics+math+cs).
--
  /\  Greg Kuperberg (UC Davis)
 /  \
 \  / Visit the Math ArXiv Front at http://front.math.ucdavis.edu/
  \/  * All the math that's fit to e-print *


Re: Central vs. Distributed Archives

2000-11-02 Thread Steve Hitchcock

At 21:29 02/11/00 +, Stevan Harnad wrote:

 Obviously I'm not a conservative offering rationales for inaction.
 And my worry is not a priori. NCSTRL and MPRESS are two long-standing
 attempts at standards-based fragmented interoperability. Neither one
 has as much readership as the younger, fully integrated math arXiv.

They pre-dated OAI and Eprints. Have just a bit more patience; but be
prepared to set aside prior prejudices or you will obstruct precisely
what we both want to facilitate!


NCSTRL was effectively the model for OAi. Greg Kuperberg suggests that
NCSTRL has not been successful. It would be useful to have some meaningful
measure of whether NCSTRL has been successful or not, and to hear the views
of the NCSTRL developers (who are also involved in OAi). Maybe real
evidence will yield clues to the ultimate destiny of OAi - central or
distributed.

The Harnad-Kuperberg dialogue has been fascinating but, to my mind, hasn't
resolved the issue conclusively. It will be critical to understand what the
user wants.

Steve


Re: Central vs. Distributed Archives

2000-11-02 Thread Michael L. Nelson
(note: I'm not sure this will get through all the aliases -- I don't think
this email addr is registered with the UPS list, for example)

On Thu, 2 Nov 2000, Steve Hitchcock wrote:

 NCSTRL was effectively the model for OAi. Greg Kuperberg suggests that
 NCSTRL has not been successful. It would be useful to have some meaningful
 measure of whether NCSTRL has been successful or not, and to hear the views
 of the NCSTRL developers (who are also involved in OAi). Maybe real
 evidence will yield clues to the ultimate destiny of OAi - central or
 distributed.


just a point of clarification:  NCSTRL was not directly the model for OAI,
at least architecturally.

OAI has more in common with:

- RePEc (http://www.repec.org/)
- SODA (http://www.dlib.org/dlib/march99/maly/03maly.html)

and similar architectures.

A subset of the Dienst protocol gave us a starting ground for defining a
harvesting protocol, but even that has been relaxed to allow Dienst and
OAI to progress independently.

Most OAI service providers will probably assume a distributed storage
model, because it is certainly easier to build.  But technically OAI is
agnostic with respect to centralized vs. distributed storage of data.
OAI focuses only on metadata.
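
A minimal sketch, from the service-provider side, of what harvesting
metadata and nothing else looks like (Python, with a placeholder repository
URL and OAI-PMH-style request arguments; not any particular provider's
actual code):

    # Minimal sketch of metadata-only harvesting, as seen from a service
    # provider.  The base URL is a placeholder and the arguments follow
    # OAI-PMH-style conventions (verb=ListRecords, metadataPrefix=oai_dc);
    # where the full texts are stored -- centrally or in distributed
    # archives -- never enters into it.

    import urllib.request
    import xml.etree.ElementTree as ET

    BASE = "http://archive.example.org/oai"      # hypothetical repository
    DC = "{http://purl.org/dc/elements/1.1/}"    # Dublin Core namespace

    def harvest_titles(base_url):
        url = base_url + "?verb=ListRecords&metadataPrefix=oai_dc"
        with urllib.request.urlopen(url) as resp:
            tree = ET.parse(resp)
        # The harvester sees only metadata; dc:identifier may point to a
        # copy held anywhere.
        for title in tree.iter(DC + "title"):
            print(title.text)

    if __name__ == "__main__":
        harvest_titles(BASE)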

Regarding centralized vs. distributed, I would submit CiteSeer

http://citeseer.nj.nec.com/cs

as an exemplary DL that seems to have resolved the tension between the two
models - providing both links to distributed copies and cached centralized
copies.

regards,

Michael

 The Harnad-Kuperberg dialogue has been fascinating but, to my mind, hasn't
 resolved the issue conclusively. It will be critical to understand what the
 user wants.

 Steve


 --
 UPS mail list
 Mail submissions to u...@vole.lanl.gov
 To subscribe or unsubscribe visit http://vole.lanl.gov/mailman/listinfo/ups


---
Michael L. Nelson
207 Manning Hall, School of Information and Library Science
University of North Carolina    m...@ils.unc.edu
Chapel Hill, NC 27599   http://ils.unc.edu/~mln/
+1 919 966 5042 +1 919 962 8071 (f)


Re: Central vs. Distributed Archives

2000-11-02 Thread Greg Kuperberg
On Thu, Nov 02, 2000 at 10:08:09PM +, Steve Hitchcock wrote:
 NCSTRL was effectively the model for OAi. Greg Kuperberg suggests that
 NCSTRL has not been successful.

I don't want to disparage a project as big and difficult as NCSTRL.
It has had some success.  It's important.  But I don't think that it's
nearly as successful as the arXiv.  I guess I said something stronger
before, that NCSTRL is not as heavily read as the math arXiv, which
is much smaller than the whole arXiv system.  Well possibly I'm wrong
on that.  But I note that the math arXiv is just as heavily read on a
per-paper basis as the larger parent arXiv system.
--
  /\  Greg Kuperberg (UC Davis)
 /  \
 \  / Visit the Math ArXiv Front at http://front.math.ucdavis.edu/
  \/  * All the math that's fit to e-print *


Re: Central vs. Distributed Archives

1999-06-29 Thread J.W.T.Smith
Professor Harnad,

On Mon, 28 Jun 1999, Stevan Harnad wrote:

 On Mon, 28 Jun 1999, J.W.T.Smith wrote:

  My objection to the Los Alamos Archive model is that it is centralised and
  such a model can easily degenerate into a monopoly.

 A monopoly of what PRODUCT, on behalf of what PROVIDER relative to what
 MARKET? For Los Alamos is in the (government-supported) business of
 making it possible for authors to give away reports of their own
 scientific research to one and all for free.

A monopoly in the sense that it could become 'the place' where readers
look for items relevant to their subject. The non-presence of an article
in a recognised subject specific archive could imply it is not relevant to
the subject. More on this later.

 And what do you mean centralised? Los Alamos is open to one and all,
 reader and author alike, the world over; it is mirrored in 15
 countries, cached in who knows how many other places and ways,
 incorporated into further Gateways such as NCSTRL and Spires, and there
 integrated with other archives. Anyone else can make copies of the
 archive too (that's part of what 'make the product free' entails), and
 the authors who self-archive in it are encouraged to archive their
 papers elsewhere too, if they wish, including in their own
 institutional servers, which can then be gathered together as another
 backup of the central archive.

You are missing the point. I am not concerned with its availability, I am
concerned with the implied validation of the presence of an item in a
given archive. Even if the archive is mirrored it is a mirror of somewhere
and the address of that somewhere has value. If this has no value why do
we need an archive at all? Why don't we all mount our papers on our
University servers? There are two advantages that I can see of a subject
specific archive:

- It can be properly maintained (it is a true archive)
- It can be a 'one stop shop' of where to look for items on a specific
  subject.

I have no problem with the first role. It is the second that carries the
possibility of monopoly. As long as the archive is maintained by a neutral
organisation (like a large University) this is OK but what if it should
become privatised? Once an archive (or its mirrors) is seen as 'the place'
to search for items of interest and access to that archive can be
controlled, it might be tempting to place some restriction on access like
payment of a fee (for purely reasonable reasons like getting enough money
to maintain the archive). Now I know the actual quality control/validation
is provided elsewhere (maybe by the 'old' journals, maybe by other
players) but from the point of view of the author they may also need to be
in the archive as well as have the validation/stamp of approval of an
external organisation.

 As I have noted before, this central/distributed issue is a red
 herring, based in part on papyrocentric thinking (we are in reality
 talking about a distributed virtual library where locus has little
 meaning)

You seem to contradict yourself here. If 'locus' (I don't mean physical
position) has no meaning why do we need a Physics archive, or a Biomed
archive, or any other subject archive? Why can't we either have one
universal archive which simply stores and serves on request (at no cost
and forever) any item sent to it, or no archive at all with items being
stored on a user site or a University site or a commercial site (or all
three or some other option/permutation)?

 Stop thinking in terms of a reader-end product, with competition
 among access-blockers, and think instead in terms of a platform for
 author-end freebies, with collaboration among access-providers, and
 things will come into better focus. This is the refereed journal
 literature, not trade books or magazines.

You are preaching to the converted. I have been aware the trade model is
wrong for academic publishing for many years. There have been proposals to
replace this model going back to the 1920s or before. Nothing new here.

  Summary: It is possible to escape the problems of the 'trade model' of
  current academic publishing without running headlong into the possibly
  equally constraining model of a monopolistic central archive.

Yes. Change the vocabulary.

Why don't you drop the word 'journal' then? Why not use 'validator' or
some other word that indicates the role and doesn't carry over
connotations from the old papyrocentric model?

John Smith,
University of Kent at Canterbury, UK.


Re: Central vs. Distributed Archives

1999-06-29 Thread Stevan Harnad
On Tue, 29 Jun 1999, J.W.T.Smith wrote:

 A monopoly in the sense that it could become 'the place' where readers
 look for items relevant to their subject. The non-presence of an article
 in a recognised subject specific archive could imply it is not relevant to
 the subject. More on this later.

Papyrocentric thinking. We live in the era of metadata tagging and
search engines that trawl it all.

 I am not concerned with its availability, I am
 concerned with the implied validation of the presence of an item in a
 given archive.

Don't be. The validator is the journal, as it always was. The Archive is
only the free cosmic bookshelf in the Sky...

 Even if the archive is mirrored it is a mirror of somewhere
 and the address of that somewhere has value. If this has no value why do
 we need an archive at all? Why don't we all mount our papers on our
 University servers?

We should! That was the gist of my 1994 Subversive Proposal:

http://www.arl.org/sc/subversive/

But there are currently still interoperability problems with
institutional servers, so the colossal success of Los Alamos has shown
that we will reach the optimal and inevitable faster by taking both
routes, the centralised and the distributed one:

http://xxx.lanl.gov/cgi-bin/show_monthly_submissions

 There are two advantages that I can see of a subject
 specific archive:

 - It can be properly maintained (it is a true archive)
 - It can be a 'one stop shop' of where to look for items on a specific
   subject.

 I have no problem with the first role. It is the second that carries the
 possibility of monopoly. As long as the archive is maintained by a neutral
 organisation (like a large University) this is OK but what if it should
 become privatised?

EVERYTHING runs the risk of being privatized: Universities, Los
Alamos, NIH. Fighting against the privatization-frenzy in whose grip the
entire planet seems to be at the moment is a worthy enough mission, but
it is completely irrelevant to the centralization/monopoly red herring
that I believe you are preoccupied with -- for the simple reason that
the menace of privatization is completely nonspecific, and afflicts ALL
options, in principle.

In practice, I would not worry too much about a hostile take-over of NIH
by the private sector in the near future, nor about NSF tossing the
Los Alamos Archive to the Trade Winds. Besides, one of the STRENGTHS of
centralization is that the authors that have put their precious eggs
in the collective basket and the users who forage them tend to monitor
them zealously day and night, and are likely to squawk vociferously if
they sense any threat:

Taubes, Gary.  E-mail withdrawal prompts spasm. (temporary
shut-down of Los Alamos Laboratory e-print archives succeeds in
raising funds) Science v262, n5131 (Oct 8, 1993):173 (2 pages).

ABSTRACT: Paul Ginsparg shut down the e-print archives of Los
Alamos National Laboratory, the physicists' pre-publication
bulletin board for a few days.  The closure incited users to
petition the Department of Energy and National Science Foundation
for funds and secured official funding from Los Alamos.

 Once an archive (or its mirrors) is seen as 'the place'
 to search for items of interest and access to that archive can be
 controlled, it might be tempting to place some restriction on access like
 payment of a fee (for purely reasonable reasons like getting enough money
 to maintain the archive).

A lot of other networked services are likely to get a price tag before
the tiny refereed literature archive is likely to: It is the flea on the
tail of the dog, and we will all be best served if it is given a free
ride. Again, this worry is papyrocentric and misplaced.

 Now I know the actual quality control/validation
 is provided elsewhere (maybe by the 'old' journals, maybe by other
 players) but from the point of view of the author they may also need to be
 in the archive as well as have the validation/stamp of approval of an
 external organisation.

This sentence was a bit difficult to decode, but from what I can make of
it, one entity (the established journals -- why on earth not?) can
continue to do the quality controlling and certification-tagging, and
another (new, virtual) one, the Archive, can provide free access to the
texts.

What is the problem?

  As I have noted before, this central/distributed issue is a red
  herring, based in part on papyrocentric thinking (we are in reality
  talking about a distributed virtual library where locus has little
  meaning)

 You seem to contradict yourself here. If 'locus' (I don't mean physical
 position) has no meaning why do we need a Physics archive, or a Biomed
 archive, or any other subject archive? Why can't we either have one
 universal archive which simply stores and serves on request (at no cost
 and forever) any item sent to it, or no archive at all with items being
 stored on a user site or a University site or a commercial site (or all
 three or some 

Re: Central vs. Distributed Archives

1999-06-29 Thread Stevan Harnad
On Tue, 29 Jun 1999, J.W.T.Smith wrote:

 I don't see what is 'papyrocentric' about... the idea of an item
 gaining some kudos by being in a certain archive...
 A similar situation occurs when a journal
 gains kudos from being indexed in a specific online bibliographic database.
 No paper involved here.

Forget about indexing in databases. (If the primary journal publishers
need to think about what their new niche will be in the online world of
free, full-text self-archiving by authors, the secondaries and
tertiaries will unfortunately have more serious worries!)

The kudos comes from (P) the prestige (peer-review rigour, quality,
impact factor) of the journal that accepts the paper and (I) the impact
that it makes on research, in the form of further work citing and
building upon it. The potential impact will be made incomparably
greater by free online access for one and all.

Where is the papyrocentric thinking? In the thought that the paper's
locus on the Web is the source of the kudos (as the source of a paper
paper's kudos was the paper journal in which it appeared).

The accepting journal's imprimatur will shrink to a quality control
metadata tag, like a brand-name; the locus (virtual or real) of the
bytes will be of no consequence whatsoever.

Papyrocentric too is the idea that there is something to compete for in
being the locus of a paper. Nothing to sell, nothing to compete for.

 The subject specific archive seems an unnecessary complication. Applying
 Occam's razor it seems we can chop it off and the system can run happily
 without it.

The Los Alamos Archive has demonstrated that (at least in Physics), the
centralized end of the candle managed to free the literature before
the distributed end did. Occam says: Hedge your bets and do both:
Deposit in your local server AND the global one.

Harnad, S. (1998) On-Line Journals and Financial Fire-Walls. Nature
395: 127-128. http://www.ecs.soton.ac.uk/~harnad/nature.html

All authors should continue to entrust their work to the paper
journals of their choice. But if, in addition, they were to
publicly archive their pre-refereeing preprints and then their
post-refereeing reprints on-line on their Home Servers, for free
for all, then the de facto practises of the reader community would
take care of the rest (irrespective of their reservations about
bed/bath/beach reading); library serial cancellations, the collapse
of the paper cardhouse, publisher perestroika, and a free for all,
e-only serial corpus financed by author-end page charges would soon
follow suit.

A centralised variant of this subversion scenario,
http://xxx.lanl.gov, has already passed the point of no return in
Physics and some allied disciplines in the form of Paul Ginsparg's
(1994, 1996) U.S. NSF- (National Science Foundation) and DOE-
(Department of Energy) supported Physics Eprint Archive at Los
Alamos National Laboratory; as history will confirm, he
single-handedly set the world Learned Community on its inexorable
course toward the optimal and the inevitable in August 1991.

 js Why don't you drop the word 'journal' then? Why not use 'validator' or
 js some other word that indicates the role and doesn't carry over
 js connotations from the old papyrocentric model?
 
sh Suit yourself. But I think Physical Review Letters will continue to
sh prefer to call itself by its current familiar and trusted brand name --
sh  and why on earth shouldn't it?

 I'm not saying we shouldn't have Physical Review Letters (or any other
 title) just that in the new model we should stop calling it a 'journal'.

Suit yourself. Maybe we should stop calling the contents articles too.
But what's the point?

 The problem with the word 'journal' is that it carries connotations from
 the papyrocentric world. For example - the idea that an item can only be
 in one 'journal'.

And a good connotation too! We have already gone round this one before:

Referees are a scarce and overworked resource. There is no justification
for asking anyone to referee an already-refereed, already-accepted
paper yet again, for acceptance yet again, elsewhere.

See the two reasonable sources of kudos above: (P) is acceptance by a
peer-reviewed Journal; (I) (and more important) is acceptance by
one's peers through the paper's impact on their reading, research and
citations. No more need for infinite rounds of peer reviewing and
re-reviewing. Otherwise it's like going back to school for more and more
exams instead of getting on with it! We haven't the time or the manpower
for such an orgy of endless assessment (even in the UK!).

 This does not need to be the case in a net-based model.
 Your descriptions of your model seem to contain a papyrocentric
 influence since there still seems to be a close relationship between an
 item and the 'journal' that validates it.

There is nothing papyrocentric about quality control and certification.
Even eggs go through that 

Re: Central vs. Distributed Archives

1999-06-29 Thread J.W.T.Smith
Professor Harnad,

On Tue, 29 Jun 1999, Stevan Harnad wrote:

 On Tue, 29 Jun 1999, J.W.T.Smith wrote:

  A monopoly in the sense that it could become 'the place' where readers
  look for items relevant to their subject. The non-presence of an article
  in a recognised subject specific archive could imply it is not relevant to
  the subject. More on this later.

 Papyrocentric thinking. We live in the era of metadata tagging and
 search engines that trawl it all.

I don't see what is 'papyrocentric' about this since the idea of an item
gaining some kudos by being in a certain archive has no necessary
connection to the paper world. A similar situation occurs when a journal
gains kudos from being indexed in a specific online bibliographic database.
No paper involved here.

  Once an archive (or its mirrors) is seen as 'the place'
  to search for items of interest and access to that archive can be
  controlled, it might be tempting to place some restriction on access like
  payment of a fee (for purely reasonable reasons like getting enough money
  to maintain the archive).

 A lot of other networked services are likely to get a price tag before
 the tiny refereed literature archive is likely to: It is the flea on the
 tail of the dog, and we will all be best served if it is given a free
 ride. Again, this worry is papyrocentric and misplaced.


Again I don't see why this is 'papyrocentric'. It may be paranoid but it
is not 'papyrocentric' :-) .

  Now I know the actual quality control/validation
  is provided elsewhere (maybe by the 'old' journals, maybe by other
  players) but from the point of view of the author they may also need to be
  in the archive as well as have the validation/stamp of approval of an
  external organisation.

 This sentence was a bit difficult to decode, but from what I can make of
 it, one entity (the established journals -- why on earth not?) can
 continue to do the quality controlling and certification-tagging, and
 another (new, virtual) one, the Archive, can provide free access to the
 texts.

 What is the problem?

The subject specific archive seems an unnecessary complication. Applying
Occam's razor it seems we can chop it off and the system can run happily
without it.

  Why don't you drop the word 'journal' then? Why not use 'validator' or
  some other word that indicates the role and doesn't carry over
  connotations from the old papyrocentric model?

 Suit yourself. But I think Physical Review Letters will continue to
 prefer to call itself by its current familiar and trusted brand name --
 and why on earth shouldn't it?

I'm not saying we shouldn't have Physical Review Letters (or any other
title) just that in the new model we should stop calling it a 'journal'.
The problem with the word 'journal' is that it carries connotations from
the papyrocentric world. For example - the idea that an item can only be
in one 'journal'. This does not need to be the case in a net-based model.
Your descriptions of your model seem to contain a papyrocentric
influence since there still seems to be a close relationship between an
item and the 'journal' that validates it. There is no reason why an item
could not be validated by more than one validator - especially if it
crosses current subject boundaries.

John Smith,
University of Kent at Canterbury, UK.


Central vs. Distributed Archives

1999-06-28 Thread Stevan Harnad
On Mon, 28 Jun 1999, J.W.T.Smith wrote:

 This entire debate seems to have become hung up on whether or not the Los
 Alamos Archive model is applicable to e-publishing or e-archiving in other
 subject areas (especially biomed). This has obscured the fact it is
 perfectly possible to believe, as I do, that the Los Alamos Archive model
 is not the way to go for many subjects yet also believe in a model where
 the role of current journals is reduced to that of quality control only.

 My objection to the Los Alamos Archive model is that it is centralised and
 such a model can easily degenerate into a monopoly.

A monopoly of what PRODUCT, on behalf of what PROVIDER relative to what
MARKET? For Los Alamos is in the (government-supported) business of
making it possible for authors to give away reports of their own
scientific research to one and all for free.

And what do you mean centralised? Los Alamos is open to one and all,
reader and author alike, the world over; it is mirrored in 15
countries, cached in who knows how many other places and ways,
incorporated into further Gateways such as NCSTRL and Spires, and there
integrated with other archives. Anyone else can make copies of the
archive too (that's part of what 'make the product free' entails), and
the authors who self-archive in it are encouraged to archive their
papers elsewhere too, if they wish, including in their own
institutional servers, which can then be gathered together as another
backup of the central archive.

http://xxx.soton.ac.uk/servers.html
http://ncstrl.cs.cornell.edu/
http://www.slac.stanford.edu/spires/about_spireshep.html

As I have noted before, this central/distributed issue is a red
herring, based in part on papyrocentric thinking (we are in reality
talking about a distributed virtual library where locus has little
meaning) and in part on proprietary thinking, based on the reader-end,
access-blockage trade model (whereas we are talking about a self-archiving
facility in which authors distribute their own products for free).

This has all been discussed in:

http://amsci-forum.amsci.org/archives/september-forum.html

See:

HTTP://AMSCI-FORUM.AMSCI.ORG/scripts/wa.exe?A1=ind99&L=september-forum&F=lf&O=T&H=0&D=0&T=1#5

 You asserted in a
 recent note (27 June) that there was no intention that any archive become
 a 'mega-journal'. However if it becomes the place where academics in a
 given subject expect to find relevant articles it will have become just
 that and it will become *necessary* for authors to place their work there.

Nothing of the sort! The journal is the quality controller and
certifier. There will continue to be the full spectrum and hierarchy of
journals, varying in quality and impact factor, each with its own
distinctive brand name. In the virtual archive, this will be
designated by tags, so you can restrict your search engine to the
refereed literature appearing in, say, American Physical Society
journals only, if you wish.
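
A toy sketch of that kind of tag-based restriction on the service side
(Python; the record fields and journal names are illustrative assumptions,
not any archive's actual schema):

    # Toy illustration: restrict a search to refereed papers certified by
    # particular journals, using metadata tags.  Fields and journal names
    # are made up for illustration.

    records = [
        {"title": "Paper A", "status": "refereed",
         "journal": "Physical Review Letters"},
        {"title": "Paper B", "status": "preprint", "journal": None},
        {"title": "Paper C", "status": "refereed",
         "journal": "Physical Review D"},
    ]

    APS_JOURNALS = {"Physical Review Letters", "Physical Review D"}

    def refereed_only(recs, journals):
        """Keep records tagged as refereed and certified by one of the
        given journals; where the bytes sit plays no role in the filter."""
        return [r for r in recs
                if r["status"] == "refereed" and r["journal"] in journals]

    for rec in refereed_only(records, APS_JOURNALS):
        print(rec["title"], "--", rec["journal"])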

An Author Archive is hence, as I said, not a Mega-Journal: It is an
archive, in which the entire refereed journal literature (as well as the
unrefereed preprint literature) is available for free for all.

Now who is monopolizing what for whom?

 Although I have long argued, e.g.,
http://www.ukc.ac.uk/library/papers/jwts/d-journal.htm
 for the separation of the quality control role of the traditional journal
 from the publication role I have always advocated a 'distributed' model
 over a 'centralised' model for 'publication/archiving'. This at least
 escapes the possibility of a monopoly by the operators of the central
 archive. It also echoes the argument in Stuart Weibel's earlier note (11
 June) about the redundancy inherent in the multiple copies of
 books/journals in the current paper library model. That model may be
 inefficient (too many duplicates are kept) but its robustness is clear.

Redundancy is a non-problem; we know all about backups, mirrors,
distributedness, and even distributed coding. It is a waste of time to
keep dwelling on these solved problems. Moreover, they have nothing to do
with the monopoly issue, which is likewise a red herring.

Stop thinking in terms of a reader-end product, with competition
among access-blockers, and think instead in terms of a platform for
author-end freebies, with collaboration among access-providers, and
things will come into better focus. This is the refereed journal
literature, not trade books or magazines.

 we should take from past publishing models that which is
 clearly of value like peer review (and maybe distributed archiving?) but
 discard that which is clearly constraining (due probably to some feature
 of the underlying medium of the old model) like the linking of quality
 control and distribution.

Correct, but then what is all this needless fuss about centralisation
and monopoly?

 Summary: It is possible to escape the problems of the 'trade model' of
 current academic publishing without running headlong into the possibly
 equally