Index updates between machines

Chun Wei Ho Tue, 03 Apr 2007 08:42:45 -0700

We are running a search service on the internet using two machines. We
have a crawler machine which crawls the web and merges new documents
found into the Lucene index. We have a searcher machine which allows
users to perform searches on the Lucene index.


Periodically, we would copy the newest version of the index from the
crawler machine over to the searcher machine (via copy over a NFS
mount). The searcher would then detect the new version, close the old
index, open the new index and resume the search service.

As the index have been growing in size, we have been noticing that the
search response time on the searcher machine increases drastically
when an index (about 15GB) is being copied from the crawler to the
searcher. Both machines run Fedora Core 4 and are on a gbps lan.

We've tried a number of ways to reduce the impact of the copy over NFS
on searching performance, such as "nice"ing the copy process, but to
no avail. I wonder if anyone is running a lucene search service over a
similar architecture and how you are managing the updates to the
lucene index.

Thanks!

Regards,
CW

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

Index updates between machines

Reply via email to