help required: how to design a large scale solr system

2008-09-24 Thread Ben Shlomo, Yatir
Hi! I am already using Solr 1.2 and am happy with it. In a new project with a very tight deadline (10 development days from today) I need to set up a more ambitious system in terms of scale. Here is the spec: * I need to index about 60,000,000 documents *

Re: help required: how to design a large scale solr system

2008-09-24 Thread Mark Miller
From my limited experience: I think you might have a bit of trouble getting 60 million docs on a single machine. Cached queries will probably still be *very* fast, but non-cached queries are going to be very slow in many cases. Is that 5 seconds for all queries? You will never meet that on first

RE: help required: how to design a large scale solr system

2008-09-24 Thread Ben Shlomo, Yatir
From my limited experience: I think you might have a bit of trouble getting 60 million docs on a single machine. Cached queries will probably still be *very* fast, but non-cached queries are going to be very slow

Re: help required: how to design a large scale solr system

2008-09-24 Thread Martin Iwanowski
Hi, I'm very new to search engines in general. I've used the Zend_Search_Lucene class before to try out Lucene, and though it surely works, it's not what I'm looking for performance-wise. I recently installed Solr on a newly installed Ubuntu (Hardy Heron) machine. I have about

Re: help required: how to design a large scale solr system

2008-09-24 Thread Mark Miller
files as opposed to directly indexing each document via HTTP POST? -----Original Message----- From: Mark Miller [mailto:[EMAIL PROTECTED] Sent: Wednesday, September 24, 2008 2:12 PM To: solr-user@lucene.apache.org Subject: Re: help required: how to design a large scale solr system From my

Re: help required: how to design a large scale solr system

2008-09-24 Thread Norberto Meijome
On Wed, 24 Sep 2008 07:46:57 -0400 Mark Miller [EMAIL PROTECTED] wrote: Yes. You will definitely see a speed increase by avoiding HTTP (especially doc-at-a-time HTTP) and using the direct CSV loader. http://wiki.apache.org/solr/UpdateCSV and the obvious reason that if, for whatever reason,

Re: help required: how to design a large scale solr system

2008-09-24 Thread Mark Miller
Norberto Meijome wrote: On Wed, 24 Sep 2008 07:46:57 -0400 Mark Miller [EMAIL PROTECTED] wrote: Yes. You will definitely see a speed increase by avoiding HTTP (especially doc-at-a-time HTTP) and using the direct CSV loader. http://wiki.apache.org/solr/UpdateCSV and the obvious reason

Re: help required: how to design a large scale solr system

2008-09-24 Thread Norberto Meijome
On Wed, 24 Sep 2008 11:45:34 -0400 Mark Miller [EMAIL PROTECTED] wrote: Nothing to stop you from breaking up the TSV/CSV files into multiple TSV/CSV files. Absolutely agreeing with you ... in one system where I implemented Solr, I have a process run through the file system and lazily pick up
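
The idea of breaking one big CSV file into several smaller ones can be sketched roughly as follows. This is only an illustration, not anything posted in the thread: the function names `split_csv` and `_render` and the chunk size are made up for the example, and it assumes the first row is a header that every chunk should repeat so each piece can be loaded on its own.

```python
import csv
import io
from typing import Iterator, List


def _render(header: List[str], rows: List[List[str]]) -> str:
    """Serialize a header plus rows back into CSV text."""
    out = io.StringIO(newline="")
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return out.getvalue()


def split_csv(text: str, rows_per_chunk: int) -> Iterator[str]:
    """Yield CSV chunks of at most rows_per_chunk data rows.

    Hypothetical helper: assumes the first row of `text` is a header,
    which is repeated at the top of every chunk.
    """
    if rows_per_chunk < 1:
        raise ValueError("rows_per_chunk must be at least 1")
    reader = csv.reader(io.StringIO(text, newline=""))
    header = next(reader, None)
    if header is None:  # empty input: nothing to emit
        return
    batch: List[List[str]] = []
    for row in reader:
        batch.append(row)
        if len(batch) == rows_per_chunk:
            yield _render(header, batch)
            batch = []
    if batch:  # trailing partial chunk
        yield _render(header, batch)


if __name__ == "__main__":
    sample = "id,title\n1,a\n2,b\n3,c\n"
    for i, chunk in enumerate(split_csv(sample, rows_per_chunk=2)):
        print(f"--- chunk {i} ---")
        print(chunk, end="")
```

Each chunk could then be written to its own file and fed to the loader separately, so a failure partway through only means re-sending one piece rather than the whole export.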

Re: help required: how to design a large scale solr system

2008-09-24 Thread Otis Gospodnetic
----- Original Message ----- From: Ben Shlomo, Yatir [EMAIL PROTECTED] To: solr-user@lucene.apache.org Sent: Wednesday, September 24, 2008 2:50:54 AM Subject: help required: how to design a large scale solr system Hi! I am already using Solr 1.2 and am happy with it. In a new project with a very tight

Re: help required: how to design a large scale solr system

2008-09-24 Thread Jon Drukman
Martin Iwanowski wrote: How can I set up Solr to run as a service, so I don't need to keep an SSH connection open? The advice that I was given on this very list was to use daemontools. I set it up and it is really great - starts when the machine boots, auto-restarts on failures, easy to bring