Rajesh parab wrote:
Hi,

We are currently using Lucene 2.0 for full-text
searches within our enterprise application, which can
be deployed in a clustered environment. We generate
the Lucene index for data stored inside a relational
database.

As Lucene 2.0 did not have solid NFS support and as we
wanted Lucene-based searches to work properly in a
clustered environment, we decided on the following
approach:
1. The index generation happens on a machine (could be
one of the cluster nodes or a separate machine) and
once the Lucene index is generated, we copy all the
index files to the database.

Note that you can also do incremental replication: often, the Lucene index changes only in minor ways (e.g. a single new segment is flushed and a new segments_N and segments.gen are written), so you only need to sync the files that are new (and remove the ones that are now gone). Lucene's write-once approach makes this very simple: you just have to compare file names, not the contents of each file.
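
A minimal sketch of that file-name comparison, in Python purely for illustration (the same logic carries over to Java or a shell script). The function names and directory arguments are hypothetical. Two details are assumptions on my part rather than anything stated above: segments.gen is rewritten in place, so it is always re-copied, and the segments* files are copied last so a reader never sees a commit that points at files that have not arrived yet.

```python
import shutil
from pathlib import Path


def plan_sync(source_names, replica_names):
    """Return (to_copy, to_delete) by comparing file names only."""
    source, replica = set(source_names), set(replica_names)
    to_copy = source - replica
    # Assumption: segments.gen is not write-once, so always refresh it.
    to_copy |= {"segments.gen"} & source
    to_delete = sorted(replica - source)
    # Assumption: copy commit files (segments_N, segments.gen) last.
    ordered = sorted(to_copy)
    ordered.sort(key=lambda name: name.startswith("segments"))
    return ordered, to_delete


def sync_index(source_dir, replica_dir):
    """Bring replica_dir in line with source_dir (hypothetical helper)."""
    source_dir, replica_dir = Path(source_dir), Path(replica_dir)
    replica_dir.mkdir(parents=True, exist_ok=True)
    source_names = [p.name for p in source_dir.iterdir() if p.is_file()]
    replica_names = [p.name for p in replica_dir.iterdir() if p.is_file()]
    to_copy, to_delete = plan_sync(source_names, replica_names)
    for name in to_copy:
        shutil.copy2(source_dir / name, replica_dir / name)
    for name in to_delete:
        (replica_dir / name).unlink()
    return to_copy, to_delete
```

Note that on Windows, deleting a file that a searcher still has open will fail, so the delete step may need a retry there.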

It's also possible to replicate without using a DB. For example, rsync does a great job.
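
As a rough sketch, an rsync push could be driven from a small script like the one below. The paths and host name are hypothetical, and the flag choice is my assumption about a reasonable setup: -a preserves file attributes, and --delete-after removes files on the replica that are gone from the source only once the transfer has finished.

```python
import subprocess


def rsync_command(source_dir, destination):
    """Build the rsync argument list for mirroring an index directory."""
    # A trailing slash on the source makes rsync copy the directory's
    # contents rather than the directory itself.
    src = source_dir.rstrip("/") + "/"
    return ["rsync", "-a", "--delete-after", src, destination]


def replicate(source_dir, destination):
    """Run rsync and raise CalledProcessError if it exits non-zero."""
    subprocess.run(rsync_command(source_dir, destination), check=True)
```

For instance, replicate("/data/index", "search-node-1:/data/index/") would mirror the index to one (hypothetical) search node.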

2. The index search request on each cluster node
retrieves the index files from the database (during the
first search or after an index update), copies them to
the file system and uses them for searches.
3. Thus, each cluster node has its own copy of the
index and keeps picking up the latest version when one
is available inside the database.
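
One way a node could decide whether a newer index version is available is to compare the generation encoded in the segments_N file name instead of fetching file contents. The sketch below assumes, as I understand the index file format, that N is the commit generation written in base 36; the function names are hypothetical.

```python
import re

# Matches commit files such as segments_2 or segments_1z (not segments.gen).
_SEGMENTS_RE = re.compile(r"^segments_([0-9a-z]+)$")


def latest_generation(file_names):
    """Return the highest commit generation found, or -1 if there is none."""
    generations = []
    for name in file_names:
        match = _SEGMENTS_RE.match(name)
        if match:
            # Assumption: the suffix is the generation in base 36.
            generations.append(int(match.group(1), 36))
    return max(generations, default=-1)


def needs_refresh(local_names, published_names):
    """True when the published index has a newer commit than the local copy."""
    return latest_generation(published_names) > latest_generation(local_names)
```

A node would call needs_refresh with its local file listing and the listing of the published index, and only pull files when it returns True.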

This has worked fine for us until now, though we will
not be able to continue with this model in the future, as
we want to support Lucene-based searches across our
application and also want to index large components
inside our application, like the Wiki, forums, etc. As the
index grows, storing and retrieving index files
from the database will not be an efficient operation.

My questions are:
- Will we be able to use NFS if we move to Lucene
2.3.0?

Make sure you update to 2.3.1, not 2.3.0.

- Will there be any significant performance impact on
index generation and searches if we move to NFS?
- Is the Lucene + NFS combination supported on all
operating systems? (We support Windows, Solaris, AIX,
HP-UX, Red Hat Linux)

NFS *should* work, however:

  * It's not widely used, so test thoroughly in your particular setup.

  * The setup most likely to work is a single machine writing to the index, with many readers.

  * Performance is likely not great, especially for searching, but you should test in your specific situation.

- Is there any other alternative available other than
NFS?


It's also possible to replicate without using a DB. For example, rsync does a great job.

You should look at Solr, since it already has all the infrastructure to accept updates, replicate index changes to remote machines, etc.

I would really appreciate your comments/thoughts on
this topic.

Regards,
Rajesh



---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]


