Re: solr benchmarks

2011-01-03 Thread dc tech
Tri:
What is the volume of content (# of documents) and index size you are
expecting? What about the document complexity in terms of # of fields, what
are you storing in the index, complexity of the queries etc?

We have used SOLR with 10m documents and see 1-3 second response times on the
front end. This is with minimal tuning, 4-5 facet fields, large blobs of
content in the index, JRuby on Rails, complex queries, and low load
conditions (hence the caches are probably not warmed much).

We have an external search application almost fully powered by SOLR (except for
the web crawl), and the response time is typically less than 1 second with
about 100k documents. Solr time is probably 100-200 ms of this.
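
One way to check how much of the front-end time is spent inside Solr is to
compare the wall-clock time of a request with the QTime value that Solr
reports in its response header. The following is only a minimal sketch under
stated assumptions: it assumes the JSON response writer (wt=json), and the
sample payload and the helper name split_latency are made up for illustration.

```python
import json


def split_latency(response_body: str, wall_clock_ms: float) -> dict:
    """Split a measured request time into Solr time and everything else.

    response_body is the JSON text returned by Solr (wt=json);
    wall_clock_ms is the time the caller measured around the whole request.
    """
    header = json.loads(response_body)["responseHeader"]
    solr_ms = header["QTime"]  # query time in milliseconds, as reported by Solr
    return {
        "solr_ms": solr_ms,
        "other_ms": wall_clock_ms - solr_ms,  # everything outside QTime
        "solr_share": solr_ms / wall_clock_ms,
    }


# Hypothetical response, trimmed to the fields used here.
sample = (
    '{"responseHeader": {"status": 0, "QTime": 150},'
    ' "response": {"numFound": 42, "docs": []}}'
)
result = split_latency(sample, wall_clock_ms=900.0)
print(result)
```

If the Solr share turns out to be small, tuning the front end is likely to pay
off more than tuning or sharding the index.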

My sense is that SOLR is as fast as it gets and scales very, very well. On
the user group, I have seen references to people using SOLR for 100m
documents or more. It would be useful to get your use case(s).





On Mon, Jan 3, 2011 at 10:44 AM, Jak Akdemir jakde...@gmail.com wrote:

 Hi,
 You can find benchmark results but these are not directly based on index
 size vs. response time
 http://wiki.apache.org/solr/SolrPerformanceData




Re: solr benchmarks

2011-01-02 Thread Toke Eskildsen
On Sat, 2011-01-01 at 03:06 +0100, Tri Nguyen wrote:
 I remember going through some page that had graphs of response times based on 
 index size for solr.
  
 Anyone know of such pages?

Sorry, no. Some small scale tests with our corpus showed that response
times suffered less than proportionally to index size, with regard to
the raw searches: Doubling the index size did not double the response
time. On the other hand, faceting time was proportional to the index
size. As always, your mileage will vary.

 Internally, we have some requirements for response times and I'm trying to 
 figure out when to shard the index.

If you discover that your searches are primarily IO-bound, which is
often the case, and if you're still using spinning disks, I highly
recommend that you upgrade to SSDs. They are very cheap compared to
RAM, you don't need to change your code or workflow, and they work
beautifully with Lucene/SOLR: they gave us a 2-4 times speedup compared
to 2 * 15,000 RPM hard disks in RAID 1. Compared to holding the index
fully in RAM (with a 14GB index), they gave us 80% of the performance on a
dual core machine - more CPU cores might benefit more from the RAM solution.



solr benchmarks

2010-12-31 Thread Tri Nguyen
Hi,
 
I remember going through some page that had graphs of response times based on 
index size for solr.
 
Anyone know of such pages?
 
Internally, we have some requirements for response times and I'm trying to 
figure out when to shard the index.
 
Thanks,
 
Tri

Re: solr benchmarks

2010-12-31 Thread François Schiettecatte
I would shard the index so that each shard is no larger than the memory of the
machine it sits on. That way your entire index will be in memory all the time.
When I was at Feedster (I wrote the search engine), my rule of thumb was
to have 14GB of index on a 16GB machine.
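
As a rough illustration of that rule of thumb, the shard count can be estimated
from the total index size. This is only a sketch: the function name
shards_needed and its parameters are hypothetical, and treating 14GB per 16GB
machine as a fixed ratio is an assumption drawn from the figures above.

```python
import math


def shards_needed(index_gb: float,
                  ram_gb: float = 16.0,
                  index_per_machine_gb: float = 14.0) -> int:
    """Estimate how many shards keep each shard fully in memory.

    index_per_machine_gb is the share of RAM given to the index
    (14GB on a 16GB machine in the rule of thumb); the rest is left
    for the OS and the JVM.
    """
    if index_per_machine_gb > ram_gb:
        raise ValueError("index share cannot exceed the machine's RAM")
    return max(1, math.ceil(index_gb / index_per_machine_gb))


for size_gb in (10, 14, 50, 100):
    print(f"{size_gb}GB index -> {shards_needed(size_gb)} shard(s)")
```

Real sizing also depends on query load and faceting cost, so treat the result
as a starting point rather than a target.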

François




Re: Solr Benchmarks

2006-11-09 Thread Joachim Martin

Hi Walter,

Thunderbird shows that there is an attachment to this message in the 
message list, but when I view
the message, no attachment is available.  Could you try sending this 
attachment again?


Thanks --Joachim

Walter Underwood wrote:


I've done some testing using JMeter. I followed the instructions
in the JMeter FAQ for "How do I use external data files in my
test scripts?"

  http://wiki.apache.org/jakarta-jmeter/JMeterFAQ

I'm attaching the script I built with this. A few notes:

 



Re: Solr Benchmarks

2006-11-09 Thread Walter Underwood
Here it is again, but the mailing list might strip attachments.
It is very easy to build your own using the instructions in the FAQ.

wunder




Re: Solr Benchmarks

2006-11-09 Thread Chris Hostetter

: Here it is again, but the mailing list might strip attachments.
: It is very easy to build your own using the instructions in the FAQ.

in general, the Apache mailing lists strip attachments.  In my experience
plain text attachments seem to be okay, as long as they aren't too big and
have the mime type set properly by your mail sender.

in practice: it's usually better to just cut/paste in the body of your
message, or send a URL to an external resource.




-Hoss



Re: Solr Benchmarks

2006-11-06 Thread Walter Underwood
On 11/6/06 6:28 AM, Nicolas St-Laurent [EMAIL PROTECTED] wrote:
 
 Are there any Solr benchmarks available somewhere? I would like to
 know how well it performs. I understand that it depends on the
 hardware config and on the application server used. Just to get an
 idea...

With search engines, you really need to test with your documents,
your queries, and your settings. Performance might vary by a
factor of ten or more.

I've done some testing using JMeter. I followed the instructions
in the JMeter FAQ for "How do I use external data files in my
test scripts?"

   http://wiki.apache.org/jakarta-jmeter/JMeterFAQ

I'm attaching the script I built with this. A few notes:

* The queries should be one per line in a file named query.txt
  in the JMeter bin directory.
* This test will use HTTP 1.1 persistent connections, so it
  is faster than a bunch of different clients. It should be
  fairly accurate if search is front-ended by another app.
* It helps to have a lot of queries, maybe 50K or more.
  I've seen other search engines run entirely from cache with a
  1000 query test set.
* JMeter can use a lot of CPU, so it might hit the limit before
  Solr does. Watch the CPU usage on both systems (JMeter and Solr)
  to see which one is the bottleneck.
* The display graphs can slow down JMeter on long tests. I was
  seeing spots of low CPU usage on the Solr server and those
  went away when I cleared the graph.
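
For readers who cannot get the JMeter script, the same idea can be sketched in
a few lines of Python. This is not the attached script and not a replacement
for JMeter: it is a single-threaded sketch that reads one query per line from
query.txt and reuses one HTTP/1.1 connection. The host, port, and
/solr/select path are assumptions, as are all the function names.

```python
import http.client
import time
from urllib.parse import quote_plus


def load_queries(path="query.txt"):
    """Read one query per line, skipping blank lines."""
    with open(path, encoding="utf-8") as f:
        return [line.strip() for line in f if line.strip()]


def select_path(query, handler="/solr/select"):
    """Build the request path for one query (the handler path is assumed)."""
    return f"{handler}?q={quote_plus(query)}"


def make_keepalive_fetch(host="localhost", port=8983):
    """Return a fetch function that reuses one persistent connection."""
    conn = http.client.HTTPConnection(host, port)  # connects on first request

    def fetch(path):
        conn.request("GET", path)
        resp = conn.getresponse()
        resp.read()  # drain the body so the connection can be reused
        return resp.status

    return fetch


def run_benchmark(queries, fetch):
    """Send every query through fetch(path) and return queries per second."""
    start = time.perf_counter()
    for query in queries:
        fetch(select_path(query))
    elapsed = time.perf_counter() - start
    return len(queries) / elapsed if elapsed > 0 else float("inf")


# Example run against a live server (commented out, needs Solr running):
# qps = run_benchmark(load_queries(), make_keepalive_fetch())
# print(f"{qps:.1f} qps")
```

Because it uses a single client thread, this measures the throughput of one
connection only. As with JMeter, watch CPU on the client side as well, since
the load generator may be the bottleneck.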

I was very pleased with the Solr performance in my testing.
With our small corpus (65K docs) I was seeing over 240 qps
on my dev box (dual 3 GHz Xeon). I expect that it didn't touch
the disk at all, since the index is only 50 Meg.

wunder
-- 
Walter Underwood
Search Guru, Netflix

 
  



Re: Solr Benchmarks

2006-11-06 Thread Nicolas St-Laurent


On 06-11-06 at 12:50, Walter Underwood wrote:



   http://wiki.apache.org/jakarta-jmeter/JMeterFAQ

I'm attaching the script I built with this. A few notes:


Well, I didn't get the script...


I was very pleased with the Solr performance in my testing.
With our small corpus (65K docs) I was seeing over 240 qps
on my dev box (dual 3 GHz Xeon). I expect that it didn't touch
the disk at all, since the index is only 50 Meg.

wunder


Thank you wunder. It gives me a good idea of what to expect from Solr.
I understand that performance changes a lot depending on the context
of execution. It's a good idea to use JMeter to get a performance
report. I will try this.


Nicolas



Re: Solr Benchmarks

2006-11-06 Thread Nicolas St-Laurent


On 06-11-06 at 12:21, Kevin Lewandowski wrote:


As of today Solr is running under Tomcat on a single dedicated box.
It's a 2.66 GHz P4 with 1 GB of RAM. The index has about 1.2 million
documents and is 1.2 GB in size. This machine handles 250,000
queries per day with no problem. CPU load stays around 0.15 most of
the time.
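
For a sense of scale, 250,000 queries per day works out to about 2.9 queries
per second on average. The short calculation below shows the arithmetic. The
peak factor is purely an assumption for illustration, since the daily traffic
shape is not given here.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def average_qps(queries_per_day: float) -> float:
    """Convert a daily query count into an average rate per second."""
    return queries_per_day / SECONDS_PER_DAY


avg = average_qps(250_000)
print(f"average: {avg:.2f} qps")

# Assumed ratio of peak to average traffic; not a measured value.
ASSUMED_PEAK_FACTOR = 5
print(f"assumed peak: {avg * ASSUMED_PEAK_FACTOR:.1f} qps")
```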

I hope that is helpful to you.

Kevin

Thank you Kevin. It gives me a good idea. I use a simple socket  
server right now in front of Lucene. I will give Solr a try.