Thanks for your comments and suggestions everyone :)
It looks like the general trend is to be in favour of (2): splitting
the frontend web application from the searching application.
Solr looks a lot like what we would have liked, but unfortunately we
finished our application a while before Solr's initial release.
Basically, you need to separate your web app from your searching to get a
scalable solution. Searching is a different concern, and once it is split
out you can develop more kinds of search as new requirements come in.
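For example, with the older Lucene 1.4/2.x API, one way to move searching
onto its own machine is to expose the index over RMI with Lucene's
RemoteSearchable. This is just a sketch under that assumption; the index
path, host name, binding name and query field are placeholders, not
anything from your actual setup.

// On the dedicated search machine (one file):
import java.rmi.Naming;
import java.rmi.registry.LocateRegistry;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.RemoteSearchable;
import org.apache.lucene.search.Searchable;

public class SearchServer {
    public static void main(String[] args) throws Exception {
        // Open the existing index locally and publish it over RMI.
        Searchable local = new IndexSearcher("/data/lucene/index");
        LocateRegistry.createRegistry(1099); // start an RMI registry in-process
        Naming.rebind("//localhost/LuceneSearch", new RemoteSearchable(local));
        System.out.println("Remote searcher bound; waiting for queries...");
    }
}

// In the web application (another file, running on the front-end machine):
import java.rmi.Naming;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.queryParser.QueryParser;
import org.apache.lucene.search.Hits;
import org.apache.lucene.search.MultiSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.Searchable;
import org.apache.lucene.search.Searcher;

public class SearchClient {
    public static void main(String[] args) throws Exception {
        // Look up the remote searcher and query it like a local one.
        Searchable remote = (Searchable) Naming.lookup("//search-host/LuceneSearch");
        Searcher searcher = new MultiSearcher(new Searchable[] { remote });
        Query q = new QueryParser("content", new StandardAnalyzer()).parse("distributed search");
        Hits hits = searcher.search(q);
        System.out.println(hits.length() + " hits");
        searcher.close();
    }
}

The web tier then only talks to the search box over the network, so the two
tiers can be sized and restarted independently.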
Technorati's setup is very similar to one of DBSight's configurations: one
machine is dedicated to indexing, and the others serve the searches.
Hadoop is not designed for this type of scenario.
Have a look at Solr (http://lucene.apache.org/solr); this is pretty
much one of its main use cases. I think it will do what you need to
do and will more than likely work with minimal configuration on
your existing index (but don't hold me to that).
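If you do try Solr, the web tier only needs to hit Solr's HTTP search
handler. A minimal sketch, assuming a default single-core Solr listening on
port 8983; the host name and field names are placeholders:

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.URL;
import java.net.URLEncoder;

public class SolrSearchExample {
    public static void main(String[] args) throws Exception {
        // Query the standard /select handler and print the raw XML response.
        String q = URLEncoder.encode("title:lucene AND body:scaling", "UTF-8");
        URL url = new URL("http://solr-host:8983/solr/select?q=" + q + "&rows=10");
        BufferedReader in = new BufferedReader(
                new InputStreamReader(url.openStream(), "UTF-8"));
        String line;
        while ((line = in.readLine()) != null) {
            System.out.println(line); // parse the XML as needed in your app
        }
        in.close();
    }
}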
Samuel LEMOINE wrote:
> I'm acutely interested in this issue too, as I'm working on
> distributed Lucene architecture. I'm only at the very beginning of
> my study, so I can't help you much, but Hadoop might fit
> your requirements. It's a sub-project of Lucene aiming to parallelize
> processing across many machines.
Chun Wei Ho wrote:
Hi,
We are currently running a Tomcat web application serving searches
over our Lucene index (10GB) on a single server machine (Dual 3GHz
CPU, 4GB RAM). Due to performance issues and to scale up to handle
more traffic/search requests, we are getting another server machine.
Server One handles the website.
Server Two is a lighter Tomcat instance which handles the Lucene searches.
In front, a lighttpd instance uses Server Two for /search, and Server One
for everything else.
You can add more Lucene search servers with round-robin balancing in
lighttpd using this scheme.
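A lighttpd mod_proxy block for that kind of routing might look roughly like
this (a sketch only; the host names and ports are placeholders, and the
exact directives depend on your lighttpd version):

server.modules += ( "mod_proxy" )

# Send /search to the Lucene/Tomcat search back ends, balanced round-robin.
$HTTP["url"] =~ "^/search" {
    proxy.balance = "round-robin"
    proxy.server  = ( "" => (
        ( "host" => "10.0.0.2", "port" => 8080 ),
        ( "host" => "10.0.0.3", "port" => 8080 )
    ) )
}

# Everything else goes to the web application on Server One.
$HTTP["url"] !~ "^/search" {
    proxy.server = ( "" => ( ( "host" => "10.0.0.1", "port" => 8080 ) ) )
}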
Be careful with fault tolerance and index replication across the search
servers.