Jonathan, while having these thoughts your Umlaut service did come to mind. If you ever have time to expand on how it could work in a wide open web environment, I'd love to hear it. (I know you explain below, but I don't know enough about link resolvers to understand what it really means from a short explanation. Diagrams are always welcome!)

kc

On 2/23/12 12:37 PM, Jonathan Rochkind wrote:
On 2/23/2012 2:45 PM, Karen Coyle wrote:
This links to thoughts I've had about linked data and finding a way to
use library holdings over the Web. Obviously, bibliographic data alone
is not a full service: people want to get the stuff once they've found out
that such stuff exists. So how do we get users from the retrieval of a
bibliographic record to a place where they have access to the stuff?

I see two options: the WorldCat model, where people get sent to a
central database where they input their zip code, or a URL-like model
where they get a link on retrievals that has knowledge about their
preferred institution and access.

I think we need both of those, and mixtures between the two, and more.

OCLC is trying to do the second one too. For instance with their link
resolver redirector. But it requires link resolvers being registered,
link resolvers working, and link resolvers working for print materials,
etc.

Of course "get a link on retrievals" begs the question of from where
they are retrieving and who is generating this link? But in theory,
anyone with a retrieval system could give you a link through OCLC's link
resolver redirector. Which isn't quite fleshed out yet, but
theoretically could then redirect you to the link resolver of your
choice based on preferences or proximity. Except, well, it doesn't work
that well, for a variety of reasons both under and not under OCLC's
control. But it's the sort of architecture we're talking about, I think.
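
A minimal sketch of the redirector idea described above, assuming a
hypothetical registry that maps an institution to its link resolver's
base URL. The institution ids, hosts, and function names are invented
for illustration and are not OCLC's actual service; only the OpenURL
key names (url_ver, rft_val_fmt, rft.*) come from the Z39.88-2004
key/encoded-value format.

```python
from urllib.parse import urlencode

# Hypothetical registry: institution id -> link resolver base URL.
# These hosts are placeholders, not real resolvers.
RESOLVERS = {
    "inst-a": "https://resolver.inst-a.example/openurl",
    "inst-b": "https://resolver.inst-b.example/openurl",
}


def build_openurl_query(citation):
    """Encode a book citation as OpenURL 1.0 key/encoded-value pairs."""
    params = {
        "url_ver": "Z39.88-2004",
        "rft_val_fmt": "info:ofi/fmt:kev:mtx:book",
    }
    for key, value in citation.items():
        params["rft." + key] = value
    return urlencode(params)


def redirect_target(citation, preferred, default="inst-a"):
    """Return the URL a redirector would send this user to.

    Falls back to the default institution when the preferred one is
    not registered.
    """
    base = RESOLVERS.get(preferred, RESOLVERS[default])
    return base + "?" + build_openurl_query(citation)
```

Any retrieval system could emit such a link without knowing anything
about the user's library; the preference lookup happens only at the
redirector.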

(Now if there was a common machine-readable response for link resolver
type requests, an OCLC-like service could even aggregate the responses
from _several_ "preferred institutions" on one page. Umlaut originally
tried to do that with SFX link resolvers, but it never really went
anywhere).
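
To make the aggregation idea concrete, here is a rough sketch. It
assumes each resolver returned a JSON body in a made-up common format,
{"options": [{"type": ..., "url": ...}]}. No such shared format exists
in the thread; the field names are purely hypothetical.

```python
import json


def aggregate(responses):
    """Merge per-institution resolver responses into one list of options.

    `responses` maps an institution name to the raw JSON string its
    resolver returned, in the hypothetical common format described above.
    """
    merged = []
    for institution, body in responses.items():
        try:
            options = json.loads(body).get("options", [])
        except (ValueError, AttributeError, TypeError):
            continue  # skip resolvers that returned something unusable
        for option in options:
            # Tag each option with the institution it came from.
            merged.append({"institution": institution, **option})
    return merged
```

The merged list is what a single results page would render, grouped or
sorted however the service prefers.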

Anyhow, yeah, both of those, and more. They definitely aren't mutually
exclusive, and the sorts of technologies and metadata ecologies that are
needed to support each one have a whole lot of overlap.

Incidentally, my Umlaut software, mostly targeted at academic
libraries, is really focused on that exact problem: "people want to get
the stuff once they've found out that such stuff exists. So how do we
get users from the retrieval of a bibliographic record to a place where
they have access to the stuff?" But it's definitely not done yet, it's
my goal with Umlaut, but there's still a lot left to do to get there.
(Ultimately, you need some kind of LibX-type approach, browser plugin or
javascript bookmarklet, to get people to a place where they have access
from third parties that have absolutely no interest in collaborating on
this plan. Amazon doesn't want to help you go anywhere other than Amazon
to acquire a book). Definitely a work in progress, but the goal it's
oriented to is exactly what you say. https://github.com/team-umlaut/umlaut

Jonathan





I have no idea if the latter is feasible on a true "web scale," but it
would be my ideal solution. We know that search engines keep track of
your location and tailor retrievals based on that. Could libraries get
into that loop?

kc

On 2/23/12 11:35 AM, Eoghan Ó Carragáin wrote:
That's true, but since Blacklight/Vufind often sit over
digital/institutional repositories as well as ILS systems & subscription
resources, at least some public domain content gets found that otherwise
wouldn't be. As you said, even if the item isn't available digitally,
for Special Collections libraries unique materials are exposed to
potential researchers who'd never have known about them.
Eoghan

On 23 February 2012 19:25, Sean Hannan <[email protected]> wrote:

It's hard to say. Going off of the numbers that I have, I'd say that
they do find what they are looking for, but unless they are a JHU
affiliate, they are unable to access it.

Our bounce rate for Google searches is 76%. Which is not necessarily
bad, because we put a lot of information on our item record pages--we
don't make you dig for anything.

On the other hand, 9% of visits coming to us through Google searches
are return visits. To me, that says that the other 91% are not JHU
affiliates, and that's 91% of Google searchers that won't have access
to materials.

I know from monitoring our feedback form, we have gotten an increase in
requests from far-flung places for access to things we have in special
collections from non-affiliates.

So, we get lots of exposure via searches, but due to the nature of how
libraries work with subscriptions, licensing, membership and such, we
close lots of doors once they get there.

-Sean

On 2/23/12 1:55 PM, "Schneider, Wayne" <[email protected]> wrote:

This is really interesting. Do you have evidence (anecdotally or
otherwise) that the people coming to you via search engines found what
they were looking for? Sorry, I don't know exactly how to phrase this.
To put it another way - are your patrons finding you this way?

wayne

-----Original Message-----
From: Code for Libraries [mailto:[email protected]] On Behalf Of Sean Hannan
Sent: Thursday, February 23, 2012 12:37 PM
To: [email protected]
Subject: Re: [CODE4LIB] Local catalog records and Google, Bing, Yahoo!

Our Blacklight-powered catalog (https://catalyst.library.jhu.edu/) comes
up a lot in google search results (try gil scott heron circle of stone).

Some numbers:

59% of our total catalog traffic comes from google searches
0.04% of our total catalog traffic comes from yahoo searches
0.03% of our total catalog traffic comes from bing searches

For context, 32.96% of our total catalog traffic is direct traffic and
referrals from all of the library websites combined.

Anecdotally, it would appear that bing (and bing-using yahoo) seem to
drastically play down catalog records in their results. We're not doing
anything to favor a particular search engine; we have a completely open
robots.txt file.
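
For readers unsure what a "completely open" robots.txt means in
practice, a small sketch using Python's standard urllib.robotparser.
The file contents below are a generic fully permissive example, not a
copy of the JHU file, and the catalog URL is a placeholder.

```python
from urllib.robotparser import RobotFileParser

# A fully permissive robots.txt: an empty Disallow blocks nothing.
OPEN_ROBOTS_TXT = """\
User-agent: *
Disallow:
"""

# For contrast, a file that blocks every crawler from everything.
CLOSED_ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def allowed(robots_txt, user_agent, url):
    """Report whether `user_agent` may crawl `url` under `robots_txt`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)
```

Calling allowed() with the same record URL against both files shows the
difference: the open file admits any crawler, the closed one admits none.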

Google regularly indexes our catalog. Every couple days or so. I haven't
checked in a while.

We're not doing any fancy SEO here (though, I'd like to implement some
of the microdata stuff). It's just a function of how the site works. We
link a lot of our catalog results to further searches (clicking on an
author name takes you to an author search with that name, etc). Google
*loves* that type of intertextual website linking (see also: Wikipedia).
We also have stable URLs. Search URLs will always return searches with
those parameters; item URLs are based on an ID that does not change.
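
A sketch of the two URL properties described above: item URLs built
only from an unchanging ID, and search URLs fully determined by their
parameters. The host, the /catalog path, and the parameter names are
assumptions for illustration, not taken from the JHU catalog.

```python
from urllib.parse import quote, urlencode

BASE = "https://catalog.example.edu"  # placeholder host


def item_url(record_id):
    """Permanent item URL derived only from an ID that never changes."""
    return f"{BASE}/catalog/{quote(str(record_id), safe='')}"


def search_url(**params):
    """Search URL with sorted parameters, so equal searches share one URL."""
    return f"{BASE}/catalog?{urlencode(sorted(params.items()))}"


def author_link(author):
    """Link an author name on a record page to a search for that author."""
    return search_url(q=author, search_field="author")
```

Sorting the parameters is a design choice of this sketch: it keeps a
crawler from seeing the same search under several different URLs.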

All of that good stuff doesn't help us with bing, though. ...But I'm not
really concerned with remedying that, right this moment.

-Sean

On 2/23/12 12:37 PM, "[email protected]" <[email protected]> wrote:

First of all, I'm going to say I know little in this area. I've done
some preliminary research about search indexing (Google's) and
investigated a few OPAC robots.txt files. Now to my questions:

- Can someone explain to me or point me to research as to why local
library catalog records do not show up in Google, Bing, or Yahoo! search
results?
- Is there a general prohibition by libraries for search engines to
crawl their public records?
- Do the search engines not index these records actively?
- Is it a matter of SEO/promoted results?
- Is it because some systems don't mint URLs for each record?

I haven't seen a lot of discussion about this recently and I know
Jason Ronallo has done a lot of work in this area and gave a great
talk at code4lib Seattle on microdata/Schema.org, so I figured this
could be part of that continuing conversation.

I look forward to being educated by you all,

Tod



--
Karen Coyle
[email protected] http://kcoyle.net
ph: 1-510-540-7596
m: 1-510-435-8234
skype: kcoylenet
