I think I like your term "aggregated index" even better than "local
index", thanks Peter. You're right that "local" can be confusing as far
as "local to WHAT".
So that's my new choice of terminology with the highest chance of being
understood and least chance of being misconstrued: "broadcast search"
vs. "aggregated index".
As we've discovered in this thread, if you say "federated search"
without qualification, different people _will_ have different ideas of
what you're talking about, as apparently the phrase has been
historically used differently by different people/communities.
I think "broadcast search" and "aggregated index" are specific enough
that it would be harder for reasonable people to misconstrue them -- and
they don't (yet?) have a history of being used to refer to different things
by different people. So that's what I'm going to use.
Jonathan
Peter Noerr wrote:
From the perspective of one of the Federated Search vendors...
It seems that in the broader web world we in the library world have lost
"metasearch". That has become the province of those systems (mamma, dogpile,
etc.) which search the big web search engines (G, Y, M, etc.), primarily for
shoppers and travelers (kayak, mobissimo, etc.), and so on. One of the original
differences between these engines and the library/information world ones was
that they presented results by Source - not combined. This is still evident,
after a fashion, in the travel sites, where you can start multiple search
sessions on the individual sites.
We use "Federated Search" for what we do in the library/information space. It
equates directly to Jonathan's "broadcast search", which was the original term
I used for it about 10 years ago. Broadcast is more descriptive, and I prefer
it, but it seems an uphill struggle to get it accepted.
Fed Search runs into the problem of Ray's definition of Federated, meaning "a
bunch of things brought together". It can be broadcast search (real time
searching of remote Sources and aggregation of a virtual result set), or
searching of a local (to the searcher) index which is composed of material
federated from multiple Sources at some previous time. We tend to use the term
"Aggregate Index" for the latter (and for the Summon-type index). Mixed content
is almost a given, so that is not an issue. And Federated Search systems have to
undertake in real time the normalization and other tasks that Summon will
(presumably) be putting into its aggregate index.
A problem in terminology we come across is the use of "local" (notice my
careful caveat in its use above). It is used to mean local to the searcher (as in the
aggregate/meta index above), or it is used to mean local to the original documents (i.e.
at the native Source).
I can't imagine this has done more than confirm that there is no agreed
terminology - which we sort of all knew. So we just do a lot of explaining -
with pictures - to people.
Peter Noerr
Dr Peter Noerr
CTO, MuseGlobal, Inc.
+1 415 896 6873 (office)
+1 415 793 6547 (mobile)
www.museglobal.com
-----Original Message-----
From: Code for Libraries [mailto:code4...@listserv.nd.edu] On Behalf Of
Jonathan Rochkind
Sent: Tuesday, April 21, 2009 08:59
To: CODE4LIB@LISTSERV.ND.EDU
Subject: Re: [CODE4LIB] Serials Solutions Summon
Ray Denenberg, Library of Congress wrote:
Leaving aside metasearch and broadcast search (terms invented more
recently), it is a shame if "federated" has really lost its distinction
from "distributed". Historically, a federated database is one that
integrates multiple (autonomous) databases so it is in effect a virtual
distributed database, though a single database. I don't think that's a
hard concept and I don't think it is a trivial distinction.
For at least 10 years vendors in the library market have been selling us
products called "federated search" which are in fact
distributed/broadcast search products.
If you want to reclaim the term "federated" to mean a local index, I
think you have a losing battle in front of you.
So I'm sticking with "broadcast search" and "local index". Sometimes
you need to use terms invented more recently when the older terms have
been used ambiguously or contradictorily. To me, understanding the two
different techniques and their differences is more important than the
terminology -- it's just important that the terminology be understood.
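For anyone who finds code easier to follow than terminology, here is a minimal sketch of the two techniques as described in this thread. Everything in it is an assumption made up for illustration: the two in-memory "Sources", their record formats, and all of the function names. It is not how Summon or any vendor's product actually works; it only shows where the searching and the normalization happen in each approach.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-ins for remote Sources; each uses its own record format.
SOURCE_A = [{"Title": "Federated Databases"}, {"Title": "Cataloging Basics"}]
SOURCE_B = [{"ti": "federated search in libraries"}, {"ti": "Serials Management"}]


def search_source_a(term):
    """Simulate a remote query against Source A's native search."""
    return [r for r in SOURCE_A if term.lower() in r["Title"].lower()]


def search_source_b(term):
    """Simulate a remote query against Source B's native search."""
    return [r for r in SOURCE_B if term.lower() in r["ti"].lower()]


def normalize(record, source):
    """Map a source-specific record onto one common shape (assumed schema)."""
    title = record.get("Title") or record.get("ti")
    return {"title": title.title(), "source": source}


def broadcast_search(term):
    """Query every Source at search time, then normalize and merge the hits."""
    searchers = {"A": search_source_a, "B": search_source_b}
    merged = []
    with ThreadPoolExecutor() as pool:
        futures = {name: pool.submit(fn, term) for name, fn in searchers.items()}
        for name, future in futures.items():
            merged.extend(normalize(r, name) for r in future.result())
    return sorted(merged, key=lambda r: r["title"])


def build_aggregated_index():
    """Harvest and normalize everything ahead of time into one local index."""
    index = [normalize(r, "A") for r in SOURCE_A]
    index += [normalize(r, "B") for r in SOURCE_B]
    return index


def search_aggregated_index(index, term):
    """Search only the pre-built index; no Source is contacted at search time."""
    hits = [r for r in index if term.lower() in r["title"].lower()]
    return sorted(hits, key=lambda r: r["title"])
```

In this toy setup both approaches return the same records for the same term. The difference is when the work happens: broadcast search contacts and normalizes every Source on each query, while the aggregated index does that work once, up front, and then answers queries locally.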