Re: Solr PHP highload search
Consider just looking at it with jconsole (it should be in your Java release) to get a sense of the memory usage and garbage collection. How much physical memory do you have overall? I ask because this is not what I'd expect. Your CPU load is actually reasonably high, so it doesn't look like you're swapping.

By and large, trying to use RAMDirectories isn't a good solution. Between them, the OS and Solr read the necessary parts of your index into memory and use that.

Best
Erick

On Wed, Jun 13, 2012 at 7:13 AM, Alexandr Bocharov wrote:
> Thank you for the help :)
>
> I'm giving the JVM 2048M on each node.
> CPU load jumps between 70% and 90%.
> Memory usage climbs to the maximum during testing (probably the caches
> are filling).
> I didn't monitor I/O.
>
> I'd also like to see answers to my other questions.
Re: Solr PHP highload search
Thank you for the help :)

I'm giving the JVM 2048M on each node.
CPU load jumps between 70% and 90%.
Memory usage climbs to the maximum during testing (probably the caches are filling).
I didn't monitor I/O.

I'd also like to see answers to my other questions.

2012/6/13 Erick Erickson
> How much memory are you giving the JVM? Have you put a performance
> monitor on the running process to see what resources have been
> exhausted (i.e. are you I/O bound? CPU bound?)
>
> Best
> Erick
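The memory figures in this thread can be put into a rough budget. The sketch below is only an illustration: the thread never states the physical memory, so the machine sizes are hypothetical, the 512 MB reserved for everything else is a guess, and 3.2G is assumed to be the index size on each of the 2 nodes. It estimates how much memory is left for the OS to cache the index once the 2048M heap is taken out.

```python
def page_cache_headroom_mb(physical_mb, heap_mb_per_node, nodes_on_machine, other_mb=512):
    """Rough memory left for the OS page cache on one machine, in MB.

    other_mb is a guessed allowance for the OS, other processes, and
    JVM memory outside the heap.
    """
    return physical_mb - heap_mb_per_node * nodes_on_machine - other_mb


def index_fits_in_cache(index_mb, headroom_mb):
    """True if the whole index could stay cached by the OS."""
    return index_mb <= headroom_mb


HEAP_MB = 2048                # from the thread: 2048M per node
INDEX_MB = round(3.2 * 1024)  # from the thread: 3.2G, assumed to be per node

# Hypothetical machine sizes; the thread never states the physical memory.
for physical_mb in (4096, 8192):
    headroom = page_cache_headroom_mb(physical_mb, HEAP_MB, nodes_on_machine=1)
    print(physical_mb, headroom, index_fits_in_cache(INDEX_MB, headroom))
```

Under these assumptions a 4096 MB machine leaves 1536 MB for caching, less than the index, while an 8192 MB machine leaves 5632 MB. Running more nodes per machine with the same heap size shrinks that headroom further.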
Re: Solr PHP highload search
How much memory are you giving the JVM? Have you put a performance monitor on the running process to see which resources have been exhausted (i.e., are you I/O bound or CPU bound)?

Best
Erick
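It may also help to restate the load figures from the original post as throughput. The sketch below is a back-of-the-envelope calculation, not a measurement: it assumes the test kept 10 queries in flight continuously, and it uses QTime as the full request latency even though QTime leaves out network transfer and response writing. A daily average also understates peak traffic.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def average_qps(queries_per_day):
    """Average queries per second implied by a daily total."""
    return queries_per_day / SECONDS_PER_DAY


def sustained_qps(concurrent_queries, latency_ms):
    """Little's law: throughput = requests in flight / time per request."""
    return concurrent_queries / (latency_ms / 1000.0)


target = average_qps(700_000)         # daily load handled by the previous engine
under_load = sustained_qps(10, 3000)  # 10 concurrent queries at ~3000 ms QTime

print(f"target average: {target:.1f} queries/sec")        # 8.1
print(f"observed under load: {under_load:.1f} queries/sec")  # 3.3
```

On these assumptions the tested setup sustains roughly 3.3 queries/sec against an average target of about 8.1, so latency under concurrency is the number to bring down, not the single-query QTime.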
Re: Solr PHP highload search
Add "&debugQuery=true" to your query and look at the "timing" section that comes back with the response to see a breakdown of QTime. It should offer some insight into which search component(s) are taking the most time. That might point you in the right direction for improvements.

Also, see how much JVM memory is available when you are running queries. Maybe memory is low and garbage collections are occurring too frequently.

-- Jack Krupansky

-----Original Message-----
From: Alexandr Bocharov
Sent: Tuesday, June 12, 2012 3:40 AM
To: solr-user@lucene.apache.org
Subject: Solr PHP highload search

Hi, all.

I need advice on configuring Solr search for use in high-load production.

I've written a user search engine (a PHP class) that uses over 70 parameters for searching users.
The user database has over 30 million records.
Total index size is 6.4G when I use 1 node and 3.2G when I use 2 nodes.
The previous search engine can handle 700,000 user-search queries per day, which is ~8 queries/sec (4 MySQL servers with manual sharding via Gearman).

An example query:

[responseHeader] => SolrObject Object
    (
        [status] => 0
        [QTime] => 517
        [params] => SolrObject Object
            (
                [bq] => Array
                    (
                        [0] => bool_field1:1^30
                        [1] => str_field1:str_value1^15
                        [2] => tint_field1:tint_field1^5
                        [3] => bool_field2:1^6
                        [4] => date_field1:[NOW-14DAYS TO NOW]^20
                        [5] => date_field2:[NOW-14DAYS TO NOW]^5
                    )

                [indent] => on
                [start] => 0
                [q.alt] => *:*
                [wt] => xml
                [fq] => Array
                    (
                        [0] => tint_field2:[tint_value2 TO tint_value22]
                        [1] => str_field1:str_value1
                        [2] => str_field2:str_value2
                        [3] => tint_field3:(tint_value3 OR tint_value32 OR tint_value33 OR tint_value34 OR tint_value5)
                        [4] => tint_field4:tint_value4
                        [5] => -bool_field1:[* TO *]
                    )

                [version] => 2.2
                [defType] => dismax
                [rows] => 10
            )

    )

I tested my PHP search API and found that concurrent random queries (for example, 10 queries at one time) increase QTime from an average of 500 ms to 3000 ms on 2 nodes.

1. How can I tweak my queries, parameters, or Solr's config to decrease QTime?
2. If I put my index data in an emulated RAM directory, can it greatly increase performance?
3. Sorting by boost queries has a great influence on QTime. How can I optimize boost queries?
4. If I split my 2 nodes on 2 machines into 6 nodes on 2 machines (3 nodes per machine), will it increase performance?
5. What is a "multi-core query", how can I configure it, and will it increase performance?

Thank you!
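Jack's debugQuery suggestion can be sketched roughly as follows. This is a minimal Python illustration, not the poster's PHP code. The endpoint URL is a placeholder (the thread gives no host or core path), `wt=json` is swapped in for the `wt=xml` used in the thread to keep the parsing short, and the sample response is invented, with made-up numbers, in the shape I understand the debug "timing" section to have.

```python
import json
from urllib.parse import urlencode

# Placeholder endpoint: the thread does not give the real host or core path.
SOLR_SELECT_URL = "http://localhost:8983/solr/select"


def build_debug_url(params):
    """Return a select URL with debugQuery=true and JSON output appended."""
    query = list(params) + [("debugQuery", "true"), ("wt", "json")]
    return SOLR_SELECT_URL + "?" + urlencode(query)


def slowest_components(response, phase="process"):
    """Return (component, milliseconds) pairs for one timing phase, slowest first."""
    timing = response["debug"]["timing"][phase]
    # Each phase holds its own total under "time" plus one entry per component.
    parts = [(name, info["time"]) for name, info in timing.items()
             if isinstance(info, dict)]
    return sorted(parts, key=lambda pair: pair[1], reverse=True)


# A few parameters taken from the example query in the thread.
params = [
    ("q.alt", "*:*"),
    ("defType", "dismax"),
    ("fq", "str_field1:str_value1"),
    ("bq", "bool_field1:1^30"),
    ("rows", "10"),
]
url = build_debug_url(params)

# Invented sample response (numbers are illustrative, not measured).
sample = json.loads("""
{"debug": {"timing": {"time": 517.0,
  "prepare": {"time": 2.0, "query": {"time": 2.0}, "facet": {"time": 0.0}},
  "process": {"time": 515.0, "query": {"time": 500.0}, "facet": {"time": 0.0},
              "debug": {"time": 15.0}}}}}
""")
for name, ms in slowest_components(sample):
    print(f"{name}: {ms} ms")
```

Comparing the breakdown for a single query with the breakdown for the same query under 10 concurrent requests should show whether one component grows or whether everything slows down evenly, which would point at a shared resource such as CPU or memory.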