Yeah, that's a good point, Eric.

I am, however, worried that I can't do what I want to do without
exceeding 500 queries a day, and my institution is not going to be
willing to pay for it. So I'm interested in exploring other
opportunities. (Does Umlaut really not exceed 500 queries a day, for
instance?)

I am also interested in publicly shared, open-source algorithms
for workset grouping that we can all collectively work on to improve
the state of our collective knowledge.  I am unhappy that 'our'
collective institution (OCLC) keeps the products of its research (such
as the workset algorithm currently being used, but there are other
significant examples many of us know of) as trade secrets, and am
interested in a research project that would not do so.

If 'our' collective institution, OCLC, would share the results of its
research as open-source algorithms, and would provide the services I
need at more affordable costs, then of course neither of those would be
necessary. One option is certainly spending time on trying to lobby OCLC
to behave differently. Another option is creating an alternative. To me,
both are legitimate options.

Jonathan

Eric Hellman wrote:
Jonathan,

It's worth noting that OCLC *is* the "we" you are talking about.

OCLC member libraries contribute resources to do exactly what you
suggest, and to do it in a way that is sustainable for the long term.
WorldCat is created and maintained by libraries and by librarians.
I'm the last to suggest that OCLC is the best possible instantiation
of libraries-working-together, but we do try.


Eric



At 3:01 PM -0400 5/9/07, Jonathan Rochkind wrote:
2) More interesting---OCLC's _initial_ work set grouping algorithm is
public. However, we know they've done a lot of additional work to
fine-tune the work set grouping algorithms.
(http://www.frbr.org/2007/01/16/midwinter-implementers).  Some of these
algorithms probably take advantage of all the cool data OCLC has that we
don't, okay.

But how about we start working to re-create this algorithm? "Re-create"
isn't a good word, because we aren't going to violate any NDAs; we're
going to develop/invent our own algorithm, but this one is going to be
open source, not a trade secret like OCLC's.

So we develop an algorithm on our own, and we run that algorithm on our
own data. Our own local catalog. Union catalogs. Conglomerations of
different catalogs that we do ourselves. Even reproductions of the OCLC
corpus (or significant subsets thereof) that we manage to assemble in
ways that don't violate copyright or license agreements.
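
As a rough illustration of what "our own algorithm on our own data"
could look like, here is a minimal Python sketch. It is not OCLC's
algorithm (neither the published one nor the fine-tuned one); it simply
assumes that records sharing a normalized author surname and a
normalized main title belong to the same workset. The record fields
(author, title, isbn), the function names, and the sample ISBNs are all
made up for the example.

```python
import re
import unicodedata
from collections import defaultdict


def normalize(text):
    """Lowercase, strip accents and punctuation, collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", text)
    no_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    cleaned = re.sub(r"[^a-z0-9 ]+", " ", no_accents.lower())
    return " ".join(cleaned.split())


def work_key(record):
    """Hypothetical grouping key: author surname plus main title.

    Assumes 'author' is in 'Surname, Forename' form and that any subtitle
    or statement of responsibility follows a ':' or '/' in 'title'.
    """
    author = normalize(record["author"].split(",")[0])
    title = normalize(re.split(r"[:/]", record["title"])[0])
    return (author, title)


def build_worksets(records):
    """Map each ISBN to the set of ISBNs that share its work key."""
    groups = defaultdict(set)
    for record in records:
        groups[work_key(record)].add(record["isbn"])
    index = {}
    for isbns in groups.values():
        for isbn in isbns:
            index[isbn] = isbns
    return index


def related_isbns(index, isbn):
    """Return the other ISBNs in the same workset, xISBN-style."""
    return sorted(index.get(isbn, set()) - {isbn})


# Made-up sample records; the ISBNs are placeholders, not real ones.
SAMPLE_RECORDS = [
    {"isbn": "0000000001", "author": "Melville, Herman, 1819-1891",
     "title": "Moby-Dick : or, The whale"},
    {"isbn": "0000000002", "author": "Melville, Herman",
     "title": "Moby Dick"},
    {"isbn": "0000000003", "author": "Melville, Herman",
     "title": "Typee"},
]
```

Calling build_worksets(SAMPLE_RECORDS) and then related_isbns on the
first placeholder ISBN returns the second one. A key this crude will
both over-merge and under-merge on real catalog data, which is exactly
the kind of thing an open, shared effort could refine.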

And then we've got our own workset grouping service. Which is really all
xISBN is.  What is OCLC providing that is so special? Well, if what I've
just outlined above is so much work that we _can't_ pull it off, then I
guess we've got to pay OCLC, and if we are willing to do so (rather than
go without the service), then I guess OCLC has correctly pegged their
market price.

But our field is not a healthy field if all research is being done by
OCLC and other vendors. We need research from other places, we need
research that produces public domain results, not proprietary trade
secrets.


--

Eric Hellman, Director                 OCLC Openly Informatics Division
[EMAIL PROTECTED]                                    2 Broad St., Suite 208
tel 1-973-509-7800 fax 1-734-468-6216              Bloomfield, NJ 07003
http://openly.oclc.org/1cate/      1 Click Access To Everything


--
Jonathan Rochkind
Sr. Programmer/Analyst
The Sheridan Libraries
Johns Hopkins University
410.516.8886
rochkind (at) jhu.edu
