Don't forget Sphinx.
http://sphinxsearch.com/
Sphinx is a full-text search engine, distributed under GPL version
2. A commercial license is also available for embedded use.
Generally, it's a standalone search engine, meant to provide fast,
size-efficient and relevant full-text search functions to other
applications. Sphinx was specially designed to integrate well with
SQL databases and scripting languages. Currently built-in data
sources support fetching data either via direct connection to
MySQL, or from an XML pipe.
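
For what it's worth, the XML pipe route can be sketched roughly as below.
This is only an illustration, not something taken from the Sphinx
distribution: it assumes the xmlpipe2 element names (sphinx:docset,
sphinx:schema, sphinx:field, sphinx:document) documented for later Sphinx
releases, and the record keys (id, title, abstract) and the helper name
to_xmlpipe2 are made up for the example. Check the docs for the version
you install before relying on it.

```python
from xml.sax.saxutils import escape, quoteattr


def to_xmlpipe2(records, fields=("title", "abstract")):
    """Render records as an xmlpipe2-style document stream (a sketch).

    `records` is an iterable of dicts with a numeric "id" key plus one
    key per full-text field. The field names here are hypothetical.
    """
    out = ['<?xml version="1.0" encoding="utf-8"?>', "<sphinx:docset>"]

    # Declare the full-text fields once, up front.
    out.append("<sphinx:schema>")
    for name in fields:
        out.append("<sphinx:field name=%s/>" % quoteattr(name))
    out.append("</sphinx:schema>")

    # One <sphinx:document> per record; text is XML-escaped.
    for rec in records:
        out.append("<sphinx:document id=%s>" % quoteattr(str(rec["id"])))
        for name in fields:
            out.append("<%s>%s</%s>" % (name, escape(rec.get(name, "")), name))
        out.append("</sphinx:document>")

    out.append("</sphinx:docset>")
    return "\n".join(out)


if __name__ == "__main__":
    sample = [
        {"id": 1, "title": "Dust & gas in galaxies", "abstract": "z < 1 sample"},
        {"id": 2, "title": "Stellar populations", "abstract": ""},
    ]
    print(to_xmlpipe2(sample))
```

The idea is that the indexer runs a script like this and reads the
documents from its standard output, so the records can come from any
store you can script against, not just MySQL.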
It's fast, and it builds easily just about everywhere.
--Casey
On Jul 18, 2007, at 7:45 AM, Alberto Accomazzi wrote:
Our project is looking to transition to a new search engine to handle
our bibliographic databases (5.5M records of bibliographic article
metadata + 0.6M fulltext articles). What we are looking for is
something easily tweakable, which offers fielded searches,
boolean/simple search logic, customizable relevance ranking, proximity,
highlighting, synonym/stemming matching. Needs to run on a Linux 64-bit
box. The packages I am aware of are:
1. lucene/clucene/lucy
2. kinosearch
3. xapian
4. zebra
5. invenio
Am I missing any from the list? Are any of these to be excluded based
on our requirements? I'd like to hear experiences from people who are
using or have used these packages.
TIA
-- Alberto
********************************************************************
Dr. Alberto Accomazzi aaccomazzi(at)cfa harvard edu
NASA Astrophysics Data System ads.harvard.edu
Harvard-Smithsonian Center for Astrophysics www.cfa.harvard.edu
60 Garden St, MS 67, Cambridge, MA 02138, USA
********************************************************************