Hi Erick

Thanks for the extensive answers. I will try to tune my Solr
installation according to your advice and the wiki page you
mentioned.

Best regards,
Volodymyr

2012/4/27 Jeremy Taylor <[email protected]>:
> DataStax offers a Solr integration that isn't master/slave and is
> Near Real Time.  Essentially, the software offers the great features of
> Solr without the major shortcomings.
>
> Jeremy
>
> -----Original Message-----
> From: Erick Erickson [mailto:[email protected]]
> Sent: Friday, April 27, 2012 5:26 AM
> To: [email protected]
> Subject: Re: Benchmark Solr vs Elastic Search vs Sensei
>
> Some observations:
> 1> I suspect some of your queries aren't doing what you expect, but
>     I'm not sure if that matters. e.g. !tags:chick magnet will be parsed
>     as -tags:chick defaultField:magnet.
> 2> Typical Solr setups in production are usually master/slave
>     setups. Your indexing process (the commits) is causing
>     new searchers to be opened/warmed/etc. quite regularly,
>     reducing your throughput. It's not surprising at all that
>     your QPS rate increases when not indexing.
> 3> The trunk Near Real Time with "soft commits" should change
>     the characteristics of the test with background indexing. You
>     might try that.
> 4> Examine your cache usage, see the Solr admin page. Caches
>     are quite important. Also consider autowarming characteristics.
> 5> There's a ton of stuff you can do to tune query rate. Unfortunately,
>     which specific thing would help your situation is hard to
>     say. You might start with:
>    http://wiki.apache.org/lucene-java/ImproveSearchingSpeed
>
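One way to avoid the parsing surprise in point 1 is to quote multi-word values before URL-encoding the query. What follows is only a minimal Python sketch, not the benchmark's actual client: the helper names `term`, `tag_query` and `build_query_string` are made up for illustration, and the query is flattened to `(...) AND NOT ...` rather than reproducing the original nesting. Operators are written in upper case because the Lucene query syntax documents boolean operators as all caps.

```python
from urllib.parse import urlencode


def term(field, value):
    """Return field:value, quoting the value if it contains whitespace."""
    if any(ch.isspace() for ch in value):
        # Escape backslashes and double quotes inside the quoted phrase.
        escaped = value.replace("\\", "\\\\").replace('"', '\\"')
        return f'{field}:"{escaped}"'
    return f"{field}:{value}"


def tag_query(wanted, unwanted, field="tags"):
    """Match any wanted value and exclude every unwanted value."""
    positive = "(" + " OR ".join(term(field, v) for v in wanted) + ")"
    negative = "".join(f" AND NOT {term(field, v)}" for v in unwanted)
    return positive + negative


def build_query_string(q, facet_fields=("tags", "color")):
    """URL-encode the q parameter together with the facet parameters."""
    params = [("q", q), ("facet", "true")]
    params += [("facet.field", f) for f in facet_fields]
    return urlencode(params)


q = tag_query(["moon-roof", "hybrid"], ["family", "chick magnet", "soccer mom"])
print(q)
print(build_query_string(q))
```

Quoting keeps `chick magnet` attached to the `tags` field instead of letting `magnet` fall through to the default field. Whether the phrase then matches depends on how the `tags` field is analyzed in the schema.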
> Best
> Erick
>
> On Thu, Apr 26, 2012 at 9:50 PM, Volodymyr Zhabiuk <[email protected]>
> wrote:
>> Hi Solr users
>>
>> I've implemented a project to compare the performance of Solr,
>> Elastic Search and SenseiDB: https://github.com/vzhabiuk/search-perf
>> Solr version 3.5.0 was used. I used the default configuration,
>> just enabled JSON updates and used the following schema:
>> https://github.com/vzhabiuk/search-perf/blob/master/configs/solr/schema.xml
>> 2.5 mln documents were put into the index; after that I launched
>> the indexing process to add another 500k docs. I was issuing a commit
>> after each 500-doc batch. At the same time I launched a
>> concurrent client that sent the following type of query:
>> ((tags:moon-roof%20or%20tags:electric%20or%20tags:highend%20or%20tags:hybrid)
>> %20AND%20(!tags:family%20AND%20!tags:chick%20magnet%20AND%20!tags:soccer%20mom))%20
>> OR%20((color:red%20or%20color:green%20or%20color:white%20or%20color:yellow)
>> %20AND%20(!color:gold%20AND%20!color:silver%20AND%20!color:black))%20
>> OR%20mileage:[15001%20TO%2017500]%20OR%20mileage:[17501%20TO%20*]%20
>> OR%20city:u.s.a.*
>> &facet=true&facet.field=tags&facet.field=color
>> The query is a high-level "OR" consisting of 2 terms, 2 ranges and
>> 1 prefix. It is designed to hit ~60-70% of all the docs.
>> Here are the performance results:
>> #Threads   min        median     mean       75%        qps
>>   1        208.95ms   332.66ms   350.48ms   422.92ms   2.8
>>   2        188.68ms   338.09ms   339.22ms   402.15ms   5.9
>>   3        151.06ms   326.64ms   336.20ms   418.61ms   8.8
>>   4        125.13ms   332.90ms   332.18ms   396.14ms   12.0
>>
>> If there is no indexing process in the background, the results are as
>> follows for 2.6 mln docs:
>> #Threads   min        median     mean       75%        qps
>>   1        106.70ms   199.66ms   199.40ms   234.89ms   5.1
>>   2        128.61ms   199.12ms   201.81ms   229.89ms   9.9
>>   3        110.99ms   197.43ms   203.13ms   232.25ms   14.7
>>   4        90.24ms    201.46ms   200.46ms   227.75ms   19.9
>>   5        106.14ms   208.75ms   207.69ms   242.88ms   24.0
>>   6        103.75ms   208.91ms   211.23ms   238.60ms   28.3
>>   7        113.54ms   207.07ms   209.69ms   239.99ms   33.3
>>   8        117.32ms   216.38ms   224.74ms   258.74ms   35.5
>>
>> I've got three questions so far:
>> 1. With background indexing the latency is almost 2 times
>> higher. Is there any way to overcome this?
>> 2. How can we tune Solr to get better results?
>> 3. What, in your opinion, is the preferred type of query to
>> use for the benchmark?
>>
>> With many thanks,
>> Volodymyr
>>
>>
>> BTW here is the spec of my machine
>> RedHat 6.1 64bit
>> Intel XEON e5620 @2.40 GHz, 8 cores
>> 63 GB RAM
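
A quick sanity check on the two result tables: if each client thread sends its next query only after the previous one returns (an assumption about the benchmark client, not something stated in the post), throughput should be roughly threads / mean latency. The Python sketch below recomputes that estimate from the reported means and compares mean latency with and without background indexing. The function names are illustrative only.

```python
# (threads, mean latency in ms, reported qps), copied from the two tables.
WITH_INDEXING = [
    (1, 350.48, 2.8), (2, 339.22, 5.9), (3, 336.20, 8.8), (4, 332.18, 12.0),
]
WITHOUT_INDEXING = [
    (1, 199.40, 5.1), (2, 201.81, 9.9), (3, 203.13, 14.7), (4, 200.46, 19.9),
    (5, 207.69, 24.0), (6, 211.23, 28.3), (7, 209.69, 33.3), (8, 224.74, 35.5),
]


def estimated_qps(threads, mean_latency_ms):
    """Closed-loop estimate: each thread completes 1000 / latency_ms queries per second."""
    return threads * 1000.0 / mean_latency_ms


def max_qps_gap(rows):
    """Largest absolute difference between estimated and reported qps."""
    return max(abs(estimated_qps(t, mean) - qps) for t, mean, qps in rows)


def latency_ratios(busy_rows, idle_rows):
    """Mean latency while indexing divided by idle mean latency, per thread count."""
    idle = {t: mean for t, mean, _ in idle_rows}
    return {t: mean / idle[t] for t, mean, _ in busy_rows if t in idle}


for t, mean, qps in WITH_INDEXING + WITHOUT_INDEXING:
    print(f"{t} threads: estimated {estimated_qps(t, mean):.2f} qps, reported {qps}")
print(latency_ratios(WITH_INDEXING, WITHOUT_INDEXING))
```

With these numbers the estimate stays within about 0.15 qps of every reported value, and mean latency with background indexing comes out at roughly 1.65 to 1.76 times the idle figure for 1 to 4 threads.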
