Re: Clustered Indexing on common network filesystem

Rajesh parab Thu, 02 Aug 2007 08:54:26 -0700

One more alternative, though I am not sure if anyone
is using it.

Apache Compass has added a plug-in to allow storing
Lucene index files inside the database. This should
work in clustered environment as all nodes will share
the same database instance.


I am not sure the impact it will have on performance.

Is anyone using DB for index storage? Any drawbacks of
this approach?

Regards,
Rajesh

--- Zach Bailey <[EMAIL PROTECTED]> wrote:

> Thanks for your response --
> 
> Based on my understanding, hadoop and nutch are
> essentially the same 
> thing, with nutch being derived from hadoop, and are
> primarily intended 
> to be standalone applications.
> 
> We are not looking for a standalone application,
> rather we must use a 
> framework to implement search inside our current
> content management 
> application. Currently the application search
> functionality is designed 
> and built around Lucene, so migrating frameworks at
> this point is not 
> feasible.
> 
> We are currently re-working our back-end to support
> clustering (in 
> tomcat) and we are looking for information on the
> migration of Lucene 
> from a single node filesystem index (which is what
> we use now and hope 
> to continue to use for clients with a single-node
> deployment) to a 
> shared filesystem index on a mounted network share.
> 
> We prefer to use this strategy because it means we
> do not have to have 
> two disparate methods of managing indexes for
> clients who run in a 
> single-node, non-clustered environment versus
> clients who run in a 
> multiple-node, clustered environment.
> 
> So, hopefully here are some easy questions someone
> could shed some light on:
> 
> Is this not a recommended method of managing indexes
> across multiple nodes?
> 
> At this point would people recommend storing an
> individual index on each 
> node and propagating index updates via a JMS
> framework rather than 
> attempting to handle it transparently with a single
> shared index?
> 
> Is the Lucene index code so intimately tied to
> filesystem semantics that 
> using a shared/networked file system is infeasible
> at this point in time?
> 
> What would be the quickest time-to-implementation of
> these strategies 
> (JMS vs. shared FS)? The most robust/least
> error-prone?
> 
> I really appreciate any insight or response anyone
> can provide, even if 
> it is a short answer to any of the related topics,
> "i.e. we implemented 
> clustered search using per-node indexing with JMS
> update propagation and 
> it works great", or even something as simple as
> "don't use a shared 
> filesystem at this point".
> 
> Cheers,
> -Zach
> 
> testn wrote:
> > Why don't you check out Hadoop and Nutch? It
> should provide what you are
> > looking for.
> 
>
---------------------------------------------------------------------
> To unsubscribe, e-mail:
> [EMAIL PROTECTED]
> For additional commands, e-mail:
> [EMAIL PROTECTED]
> 
> 



       
____________________________________________________________________________________
Building a website is a piece of cake. Yahoo! Small Business gives you all the 
tools to get online.
http://smallbusiness.yahoo.com/webhosting 

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

Re: Clustered Indexing on common network filesystem

Reply via email to