* Relational databases don't do any tokenizing/stemming/normalization
of keywords (e.g., sleeping, sleep, sleeps, Sleep, Slept, slept
should all match a search for "sleep").
* Relational databases do not have a concept of "relevance" (e.g.,
the nutch demo will order results based on how often the term occurs
in a document, what the term frequency is in the corpus, what the
length of the field is, etc.).
* Relational databases have a rigid schema. If you want to make
information from two databases available through one search interface
you're looking at some non-trivial logic.
There are probably other reasons, but these three come to mind most
quickly. Again, you might look into SOLR; this project focuses on
making enterprise data available through a search engine.
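
To make the first two points concrete, here is a minimal Python sketch of
what a search engine adds on top of exact keyword matching: it normalizes
terms before matching and ranks hits by a relevance score. It rests on
several assumptions. The suffix stripping is a crude stand-in for a real
stemmer, and the score is a simplified TF-IDF-style formula, not the one
Nutch or Lucene actually uses. The names DOCS, normalize, and search and the
sample documents are made up for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical toy corpus; ids and texts are made up for illustration.
DOCS = {
    "a": "Sleeping late: she sleeps and sleeps",
    "b": "The cat sleeps on the sofa while the dog waits outside "
         "by the old garden gate",
    "c": "Databases store rows in tables",
}


def normalize(text):
    """Lowercase, split on non-letters, and strip a few common suffixes.

    A crude stand-in for a real stemmer: it does not map an irregular
    form such as "slept" to "sleep".
    """
    tokens = []
    for raw in re.findall(r"[a-z]+", text.lower()):
        for suffix in ("ing", "s"):
            # Only strip when at least three letters remain.
            if raw.endswith(suffix) and len(raw) - len(suffix) >= 3:
                raw = raw[: -len(suffix)]
                break
        tokens.append(raw)
    return tokens


def search(query, docs):
    """Return ids of matching documents, best match first.

    Score per query term: (term count / sqrt(document length)) * idf,
    so frequent terms in short documents rank higher, and terms that are
    rare across the corpus weigh more.
    """
    counts = {doc_id: Counter(normalize(text)) for doc_id, text in docs.items()}
    scores = {}
    for term in normalize(query):
        doc_freq = sum(1 for c in counts.values() if term in c)
        if doc_freq == 0:
            continue
        idf = 1.0 + math.log(len(docs) / doc_freq)
        for doc_id, c in counts.items():
            if term in c:
                length = sum(c.values())
                weight = c[term] / math.sqrt(length)
                scores[doc_id] = scores.get(doc_id, 0.0) + weight * idf
    return sorted(scores, key=scores.get, reverse=True)


if __name__ == "__main__":
    print(normalize("sleeping sleep sleeps Sleep"))
    print(search("Sleep", DOCS))
```

Document "a" ranks ahead of "b" because the normalized term occurs more
often in a shorter text, and "c" is not returned at all. A plain
WHERE text LIKE '%Sleep%' would give neither the case and suffix handling
nor the ordering.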
On Mar 4, 2008, at 10:31 AM, Duan, Nick wrote:
Could anyone provide any insight on why someone would use nutch/lucene
or any other search engine to index relational databases? With use
cases if possible? Shouldn't the database's own indexing mechanism be
used since it is more efficient?
If there is a need to index the database content using a search
engine, what would be the best approach other than de-normalizing the
database?
Thanks a lot in advance!
ND
-----Original Message-----
From: payo [mailto:[EMAIL PROTECTED]
Sent: Tuesday, March 04, 2008 12:36 PM
To: [email protected]
Subject: indexing database
Hi all,
Can I index a database with Nutch?
I am using Nutch 0.8.1.
Thanks
--
View this message in context:
http://www.nabble.com/indexing-database-tp15832696p15832696.html
Sent from the Nutch - User mailing list archive at Nabble.com.