RE: Lucene scalability/clustering
Anson,

One way of doing it is to keep subsets of your indexes/data on different machines. Each machine indexes its own data. You implement a system that distributes queries to the various machines and merges the results back. How well it works depends entirely on your implementation of the distributed search.

I believe there was also a discussion somewhere about implementing this using a MultiSearcher.

Cheers!
Jochen

-Original Message-
From: Anson Lau [mailto:[EMAIL PROTECTED]
Sent: Sunday, February 22, 2004 2:17 PM
To: 'Lucene Users List'
Subject: RE: Lucene scalability/clustering

Further on this topic - has anyone tried implementing a distributed search with Lucene? How does it work and does it work well?

Anson

-Original Message-
From: Hamish Carpenter [mailto:[EMAIL PROTECTED]
Sent: Monday, February 23, 2004 5:24 AM
To: Lucene Users List
Subject: Re: Lucene scalability/clustering

Hi All,

I'm Hamish Carpenter who contributed the benchmarks with the comment about the IndexSearcherCache. Using this solved our issues with too many files open under linux.

The original IndexSearcherCache email is here:
http://www.mail-archive.com/[EMAIL PROTECTED]/msg01967.html

See here for a copy of the above message and a download link:
http://www.geocities.com/haytona/lucene/

The mailing list doesn't like attachments. The source is 10K in size.

HTH
Hamish Carpenter.

[EMAIL PROTECTED] wrote:
BTW, where can I get Peter Halacsy's IndexSearcherCache?
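The distribute-and-merge step Jochen describes could look roughly like the sketch below. It is only an illustration under stated assumptions: `ShardHit` and `ResultMerger` are hypothetical names, not Lucene classes, and each machine is assumed to have already returned its own scored hits. Note that scores computed independently on separate indexes may not be directly comparable, so a real merge may need more care than a plain sort.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical types for illustration only; these are not Lucene classes.
final class ShardHit {
    final String shard;  // machine that produced the hit
    final String docId;  // document identifier local to that machine
    final float score;   // relevance score reported by that machine

    ShardHit(String shard, String docId, float score) {
        this.shard = shard;
        this.docId = docId;
        this.score = score;
    }
}

final class ResultMerger {
    // Combine the per-machine result lists and keep the k best hits overall.
    static List<ShardHit> mergeTopK(List<List<ShardHit>> perShard, int k) {
        List<ShardHit> all = new ArrayList<>();
        for (List<ShardHit> hits : perShard) {
            all.addAll(hits);
        }
        // Highest score first; ties broken by shard, then docId, for a stable order.
        all.sort(Comparator.comparingDouble((ShardHit h) -> -h.score)
                .thenComparing(h -> h.shard)
                .thenComparing(h -> h.docId));
        return new ArrayList<>(all.subList(0, Math.min(k, all.size())));
    }

    public static void main(String[] args) {
        List<List<ShardHit>> perShard = Arrays.asList(
                Arrays.asList(new ShardHit("a", "1", 0.9f), new ShardHit("a", "2", 0.2f)),
                Arrays.asList(new ShardHit("b", "7", 0.5f)));
        for (ShardHit h : mergeTopK(perShard, 2)) {
            System.out.println(h.shard + "/" + h.docId + " " + h.score);
        }
    }
}
```

Sorting the concatenated lists is the simplest thing that works; with many machines or large result pages, a bounded priority queue would avoid sorting hits that are never returned.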
RE: Lucene scalability/clustering
I tend to think of scaling in two dimensions: scaling by volumes of users and scaling by volumes of data. The former is addressed through replicated indexes and the latter by segmented indexes.

Distribute replicated segments across multiple boxes and create a broker which:
a) Determines which segments to query
b) Load balances query requests across the replicated servers for each segment
c) Merges responses

Make sure your communications are batched to avoid too much fine-grained chatter. This is the basis of a scalable architecture.

Cheers
Mark
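Steps (a) and (b) of the broker Mark outlines might be sketched as follows. Everything here is an assumption made for illustration: the `Broker` class, the segment names, and the host names are hypothetical, replica selection is plain round-robin, and the merge step (c) is left out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical broker for illustration; segment and host names are made up.
// Segments are registered up front; the maps are not modified afterwards.
final class Broker {
    // segment name -> hosts that each hold a full replica of that segment
    private final Map<String, List<String>> replicas = new LinkedHashMap<>();
    // segment name -> counter driving round-robin replica selection
    private final Map<String, AtomicInteger> cursors = new LinkedHashMap<>();

    void addSegment(String segment, List<String> hosts) {
        replicas.put(segment, new ArrayList<>(hosts));
        cursors.put(segment, new AtomicInteger());
    }

    // (b) Pick the next replica of a segment in round-robin order.
    String nextReplica(String segment) {
        List<String> hosts = replicas.get(segment);
        int i = Math.floorMod(cursors.get(segment).getAndIncrement(), hosts.size());
        return hosts.get(i);
    }

    // (a) + (b) For the segments a query needs, choose one host per segment.
    // The resulting plan can be sent as one batched request per host.
    Map<String, String> plan(List<String> segments) {
        Map<String, String> plan = new LinkedHashMap<>();
        for (String segment : segments) {
            plan.put(segment, nextReplica(segment));
        }
        return plan;
    }

    public static void main(String[] args) {
        Broker broker = new Broker();
        broker.addSegment("s1", Arrays.asList("hostA", "hostB"));
        broker.addSegment("s2", Arrays.asList("hostC"));
        System.out.println(broker.plan(Arrays.asList("s1", "s2")));
        System.out.println(broker.plan(Arrays.asList("s1", "s2")));
    }
}
```

Round-robin ignores how busy each replica is; a broker under real load would probably want health checks and some measure of outstanding requests per host.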
RE : RE : Lucene scalability/clustering
I'm trying to see what are some common ways to scale lucene onto multiple boxes. Is RMI based search and using a MultiSearcher the general approach?

More details about what you are attempting would be helpful.

RBP
Re: RE : Lucene scalability/clustering
Anson Lau wrote:
I'm trying to see what are some common ways to scale lucene onto multiple boxes. Is RMI based search and using a MultiSearcher the general approach?

Yes, although you probably want to use ParallelMultiSearcher.

Doug
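The point of a parallel searcher is that the sub-searches run concurrently rather than one after another, so the total latency is closer to that of the slowest box than to the sum of all of them. The sketch below shows that fan-out pattern with the Java standard library only. It does not use the Lucene API: `ParallelFanOut` is a hypothetical name, and each remote index is modelled as a plain function from a query string to matching document ids.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

// Hypothetical helper for illustration; this is not a Lucene class.
final class ParallelFanOut {
    // Send the same query to every backend at once and collect all hits.
    static List<String> searchAll(List<Function<String, List<String>>> backends, String query) {
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, backends.size()));
        try {
            List<Callable<List<String>>> tasks = new ArrayList<>();
            for (Function<String, List<String>> backend : backends) {
                tasks.add(() -> backend.apply(query));
            }
            List<String> merged = new ArrayList<>();
            // invokeAll blocks until every task is done and returns futures in task order.
            for (Future<List<String>> f : pool.invokeAll(tasks)) {
                merged.addAll(f.get());
            }
            return merged;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while searching", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a backend search failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<Function<String, List<String>>> backends = new ArrayList<>();
        backends.add(q -> Arrays.asList("box1:" + q));
        backends.add(q -> Arrays.asList("box2:" + q));
        System.out.println(searchAll(backends, "lucene"));
    }
}
```

Creating a thread pool per query keeps the sketch short; a long-running service would normally share one pool and add a timeout so that a single slow box cannot stall every search.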
RE : Lucene scalability/clustering
RBP,

I'm implementing a search engine for a project at work. It's going to index approx 1.5 rows in a database.

I am trying to get a feel for what my options are when scalability becomes an issue. I also want to know whether those options require me to implement my app in a different way right from the start.

Anson

-Original Message-
From: Rasik Pandey [mailto:[EMAIL PROTECTED]
Sent: Tuesday, February 24, 2004 9:34 PM
To: 'Lucene Users List'
Subject: RE : RE : Lucene scalability/clustering

I'm trying to see what are some common ways to scale lucene onto multiple boxes. Is RMI based search and using a MultiSearcher the general approach?

More details about what you are attempting would be helpful.

RBP
RE : Lucene scalability/clustering
Further on this topic - has anyone tried implementing a distributed search with Lucene? How does it work and does it work well?

I assume you are referring to RMI based search? It works well, as does MultiSearcher.

RBP
RE: RE : Lucene scalability/clustering
I'm trying to see what are some common ways to scale Lucene onto multiple boxes. Is RMI based search and using a MultiSearcher the general approach?

There don't seem to be many articles on the web on how to implement a Lucene search cluster. If anyone knows a good article, can you please post it here?

Thanks,
Anson

-Original Message-
From: Rasik Pandey [mailto:[EMAIL PROTECTED]
Sent: Monday, February 23, 2004 9:46 PM
To: 'Lucene Users List'
Subject: RE : Lucene scalability/clustering

Further on this topic - has anyone tried implementing a distributed search with Lucene? How does it work and does it work well?

I assume you are referring to RMI based search? It works well as does MultiSearcher.

RBP
Re: Lucene scalability/clustering
On Saturday 21 February 2004 20:24, Otis Gospodnetic wrote:
http://jakarta.apache.org/lucene/docs/benchmarks.html

BTW, where can I get Peter Halacsy's IndexSearcherCache?
Re: Lucene scalability/clustering
Hi All,

I'm Hamish Carpenter who contributed the benchmarks with the comment about the IndexSearcherCache. Using this solved our issues with too many files open under linux.

The original IndexSearcherCache email is here:
http://www.mail-archive.com/[EMAIL PROTECTED]/msg01967.html

See here for a copy of the above message and a download link:
http://www.geocities.com/haytona/lucene/

The mailing list doesn't like attachments. The source is 10K in size.

HTH
Hamish Carpenter.

[EMAIL PROTECTED] wrote:
BTW, where can I get Peter Halacsy's IndexSearcherCache?
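For readers who have not seen the IndexSearcherCache source, the general idea of caching searchers can be sketched as below. This is not Peter Halacsy's code and makes no claim about how his class works; `SearcherCache` is a hypothetical name, and the type parameter `S` stands in for whatever searcher object the application uses. The assumption is that opening a fresh searcher per request consumes file handles each time, while sharing one searcher per index keeps the number of open files bounded.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical cache for illustration; not the IndexSearcherCache from the list.
// S stands in for the application's searcher type.
final class SearcherCache<S> {
    private final Map<String, S> open = new ConcurrentHashMap<>();
    private final Function<String, S> opener;

    // opener knows how to open a searcher for a given index path.
    SearcherCache(Function<String, S> opener) {
        this.opener = opener;
    }

    // Return the shared searcher for this index, opening it only on first use.
    S get(String indexPath) {
        return open.computeIfAbsent(indexPath, opener);
    }

    // Number of searchers currently held open by the cache.
    int openCount() {
        return open.size();
    }

    public static void main(String[] args) {
        SearcherCache<Object> cache = new SearcherCache<>(path -> new Object());
        Object first = cache.get("/indexes/a");
        Object second = cache.get("/indexes/a");
        System.out.println(first == second);   // same instance is reused
        System.out.println(cache.openCount());
    }
}
```

The sketch never closes or refreshes a searcher, so it would keep serving a stale view after the index changes; a usable cache needs some way to detect updates and swap in a new searcher.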
RE: Lucene scalability/clustering
Further on this topic - has anyone tried implementing a distributed search with Lucene? How does it work and does it work well?

Anson

-Original Message-
From: Hamish Carpenter [mailto:[EMAIL PROTECTED]
Sent: Monday, February 23, 2004 5:24 AM
To: Lucene Users List
Subject: Re: Lucene scalability/clustering

Hi All,

I'm Hamish Carpenter who contributed the benchmarks with the comment about the IndexSearcherCache. Using this solved our issues with too many files open under linux.

The original IndexSearcherCache email is here:
http://www.mail-archive.com/[EMAIL PROTECTED]/msg01967.html

See here for a copy of the above message and a download link:
http://www.geocities.com/haytona/lucene/

The mailing list doesn't like attachments. The source is 10K in size.

HTH
Hamish Carpenter.

[EMAIL PROTECTED] wrote:
BTW, where can I get Peter Halacsy's IndexSearcherCache?
Re: Lucene scalability/clustering
http://jakarta.apache.org/lucene/docs/benchmarks.html

--- [EMAIL PROTECTED] wrote:
Hi!

How well does Lucene scale? Is it able to handle 100,000 (more or less complex) queries a day (i.e. 9 to 5) on an index with half a million docs? What hardware is recommended for that demand? What to do if it cannot handle it quickly enough?

Regards,
Timo
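To put the load in Timo's question into perspective, 100,000 queries spread evenly over an eight-hour day comes to about 3.5 queries per second on average. The small calculation below shows the arithmetic; `LoadEstimate` is a hypothetical helper, and the even spread is an assumption, since real traffic has peaks that sit well above the average.

```java
// Hypothetical helper for illustration: average query rate over a working day.
final class LoadEstimate {
    // queriesPerDay spread evenly over activeHours, expressed per second.
    static double averageQps(long queriesPerDay, double activeHours) {
        return queriesPerDay / (activeHours * 3600.0);
    }

    public static void main(String[] args) {
        // 100,000 queries between 9 and 5, i.e. 8 hours = 28,800 seconds.
        System.out.println(averageQps(100_000, 8));
    }
}
```

Sizing hardware for the average alone is risky; it is usually the peak rate and the slowest queries that decide whether one box is enough.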