Make sure you are not going to "reinvent the wheel" here ;). A lot of work has already been done on the problem of distributed search engines. This thread might be useful to you: http://search-hadoop.com/m/ARlbS1MiTNY
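On the specific question of whether appending new indexes has to block searching: one common pattern is to let searches read an immutable snapshot while the newly received index data is merged into a copy, and then swap the reference in one step. Below is a minimal sketch of that idea with a toy in-memory inverted index. The class and method names (`SearchableIndex`, `append`, `search`) are made up for illustration, and this is not how Lucene or Solr implement it. It also assumes CPython, where rebinding an attribute is a single atomic step as seen by other threads.

```python
import threading


class SearchableIndex:
    """Toy inverted index with snapshot swapping (hypothetical, for illustration)."""

    def __init__(self):
        # term -> frozenset of document ids; never mutated once published
        self._snapshot = {}
        # Serializes writers only; readers never take this lock
        self._write_lock = threading.Lock()

    def search(self, term):
        # Grab the current snapshot reference once and read only from it.
        snapshot = self._snapshot
        return sorted(snapshot.get(term, frozenset()))

    def append(self, delta):
        """Merge a delta index (term -> iterable of doc ids) from the crawl server."""
        with self._write_lock:
            # Shallow copy is enough because the values are immutable frozensets.
            merged = dict(self._snapshot)
            for term, doc_ids in delta.items():
                merged[term] = merged.get(term, frozenset()) | frozenset(doc_ids)
            # One reference assignment: a search sees the old or the new
            # snapshot, never a half-merged one.
            self._snapshot = merged


index = SearchableIndex()
index.append({"hadoop": [1, 2], "lucene": [2]})  # first batch from the crawler
index.append({"hadoop": [3], "solr": [3]})       # later incremental batch
print(index.search("hadoop"))
```

The trade-off in this sketch is memory and copy time on every append in exchange for lock-free reads. A real engine would work with on-disk segments rather than copying a dictionary.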
Alex Baranau
----
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch - Hadoop - HBase

On Fri, Nov 19, 2010 at 5:52 PM, Bing Li <lbl...@gmail.com> wrote:

> Hi, all,
>
> I am working on a distributed searching system. Now I have one server only.
> It has to crawl pages from the Web, generate indexes locally and respond to
> users' queries. I think this is too much work for it to run smoothly.
>
> I plan to use at least two servers. The jobs of crawling pages and generating
> indexes are done by one of them. After that, the newly available indexes
> should be transmitted to another one, which is responsible for responding to
> users' queries. From the users' point of view, this system must be fast.
> However, I don't know how I can get the additional indexes which I can
> transmit. After transmission, how do I append them to the old indexes? Does
> the appending block searching?
>
> Thanks so much for your help!
>
> Bing Li
>