Re: Solr Shards multi core slower then single big core
On 14.05.2012 05:56, arjit wrote:
> Thanks Erick for the reply. I have 6 cores which don't contain duplicated data; every core has some unique data. What I thought was that a read would go to the 6 cores in parallel, join the results and return them, and that this would be more efficient than reading one big core.

No, it's not. When you request 10 documents from Solr, it can't know in advance which shards contain how many of those documents. It could be that each shard only needs to contribute one or two documents to the result, but it might be that a single shard contains all ten documents. Therefore, Solr needs to request 10 documents from each shard, then take only the top 10 documents from those 60 and drop the rest. And it gets worse when you set an offset of, say, 100: each shard then has to return its top 110 documents, so 660 candidates are merged to fill a page of 10.

Sharding is (nearly) always slower than using one big index with sufficient hardware resources. Only use sharding when your index is too large to fit on a single machine.

Greetings,
Kuli
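The merge step described above can be sketched in plain Java. This is only a simplified model under stated assumptions, not Solr code: the class `ShardMergeSketch` and its methods are hypothetical names, documents are reduced to bare scores, and each shard is assumed to return its best `start + rows` entries to a coordinator.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Simplified model of a distributed top-N query (hypothetical, not Solr's implementation).
class ShardMergeSketch {

    // A shard cannot know how many of its hits survive the merge,
    // so it returns its best (start + rows) scores, highest first.
    static List<Double> shardTop(List<Double> shardScores, int start, int rows) {
        List<Double> sorted = new ArrayList<>(shardScores);
        sorted.sort(Collections.reverseOrder());
        return new ArrayList<>(sorted.subList(0, Math.min(start + rows, sorted.size())));
    }

    // Coordinator: gather the candidates of all shards, sort them,
    // then apply the offset and the page size.
    static List<Double> mergeTopN(List<List<Double>> shards, int start, int rows) {
        List<Double> candidates = new ArrayList<>();
        for (List<Double> shard : shards) {
            candidates.addAll(shardTop(shard, start, rows));
        }
        candidates.sort(Collections.reverseOrder());
        int from = Math.min(start, candidates.size());
        int to = Math.min(start + rows, candidates.size());
        return new ArrayList<>(candidates.subList(from, to));
    }

    // Upper bound on the number of candidates sent to the coordinator.
    static int maxCandidates(int shardCount, int start, int rows) {
        return shardCount * (start + rows);
    }

    public static void main(String[] args) {
        List<List<Double>> shards = Arrays.asList(
            Arrays.asList(9.0, 8.0, 7.0),
            Arrays.asList(1.0, 0.5),
            Arrays.asList(6.0, 5.0, 4.0));
        System.out.println(mergeTopN(shards, 0, 3));   // all three winners come from the first shard
        System.out.println(maxCandidates(6, 0, 10));   // 60
        System.out.println(maxCandidates(6, 100, 10)); // 660
    }
}
```

In this toy data set the whole result page comes from one shard, which is exactly the case the coordinator cannot rule out beforehand.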
Re: Solr Shards multi core slower then single big core
> Sharding is (nearly) always slower than using one big index with sufficient hardware resources. Only use sharding when your index is too huge to fit into one single machine.

If you're not constrained by CPU or IO, in other words if you have plenty of CPU cores available together with, for example, separate hard disks for each shard, splitting your index into smaller shards can in some cases make a huge difference on one box too.

--
Sami Siren
Re: Solr Shards multi core slower then single big core
On 14.05.2012 13:22, Sami Siren wrote:
> If you're not constrained by CPU or IO, in other words have plenty of CPU cores available together with for example separate hard discs for each shard splitting your index into smaller shards can in some cases make a huge difference in one box too.

Do you have an example? This is hard to believe.

If you have several shards on the same machine, you'll need enough memory for each shard to hold all its caches and such. With that much memory, a single Solr core should be really fast.

If splitting the index across disks is the reason, then a software RAID 0 (striping) should do much better.

The only benefit I can see is the concurrent search within one request. For large requests this might outweigh the sharding overhead, but only for long-running requests without disk I/O. I can only imagine that being the case with very complicated query functions. And it only stays true as long as you don't run multiple concurrent requests.

Greetings,
Kuli
Re: Solr Shards multi core slower then single big core
Hi Kuli,

In a client engagement, I did see this (N shards on 1 beefy box with lots of RAM and CPU cores) be faster than 1 big index.

Otis

Performance Monitoring for Solr / ElasticSearch / HBase - http://sematext.com/spm
Re: Solr Shards multi core slower then single big core
On 14.05.2012 16:18, Otis Gospodnetic wrote:
> In a client engagement, I did see this (N shards on 1 beefy box with lots of RAM and CPU cores) be faster than 1 big index.

I want to believe you, but I also want to understand. Can you explain why?

And did this only happen for single requests, or also under heavy load?

Greetings,
Kuli
Re: Solr Shards multi core slower then single big core
Hi Kuli,

As long as there are enough CPUs with spare cycles and disk IO is not a bottleneck, this works faster. This was 12+ months ago.

Otis
Re: Solr Shards multi core slower then single big core
Hi, all,

I've been running into murmurs about this idea elsewhere:

http://stackoverflow.com/questions/8698762/run-multiple-big-solr-shard-instances-on-one-physical-machine
http://java.dzone.com/articles/optimizing-solr-or-how-7x-your?mz=33057-solr_lucene

Michael
Re: Solr Shards multi core slower then single big core
We used to have one large index, then moved to 10 shards (7 million docs each) with parallel search across all shards, and we get better performance that way. We use a 40-core box with 128 GB of RAM. We do a lot of faceting, so maybe that is why, since facets can be built in parallel on different threads/cores. We also keep the indexes on fast local disks (six 15K RPM disks, striped with RAID).
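To illustrate why faceting might benefit from shards on a many-core box, here is a rough Java sketch. It is not how Solr implements faceting: `ParallelFacetSketch` and its methods are hypothetical, each shard is reduced to a list of facet values, and a parallel stream stands in for the per-shard worker threads.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Function;
import java.util.stream.Collectors;

// Toy model: per-shard facet counting in parallel, followed by a cheap merge.
class ParallelFacetSketch {

    // Count how often each facet value occurs within a single shard.
    static Map<String, Long> countShard(List<String> facetValues) {
        return facetValues.stream()
            .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
    }

    // Shards are counted independently (potentially on different threads),
    // then the per-shard maps are summed into one sorted result.
    static Map<String, Long> facetCounts(List<List<String>> shards) {
        List<Map<String, Long>> perShard = shards.parallelStream()
            .map(ParallelFacetSketch::countShard)
            .collect(Collectors.toList());
        Map<String, Long> merged = new TreeMap<>();
        for (Map<String, Long> counts : perShard) {
            counts.forEach((value, n) -> merged.merge(value, n, Long::sum));
        }
        return merged;
    }

    public static void main(String[] args) {
        List<List<String>> shards = Arrays.asList(
            Arrays.asList("fiction", "history"),
            Arrays.asList("fiction", "science"),
            Arrays.asList("fiction", "history"));
        System.out.println(facetCounts(shards)); // {fiction=3, history=2, science=1}
    }
}
```

The expensive part (counting) is spread over the cores, while the merge only touches one small map per shard. Whether this outweighs the sharding overhead in a real installation depends on the hardware and the query load, as discussed in this thread.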
Re: Solr Shards multi core slower then single big core
Robert, can you explain what you mean when you say

> We do a lot of faceting so maybe that is why since facets can be built in parallel on different threads/cores.

I am a novice in Solr. Can you tell me where I can read about it?

Thanks,
Arjit

--
View this message in context: http://lucene.472066.n3.nabble.com/Solr-Shards-multi-core-slower-then-single-big-core-tp3979115p3983697.html
Sent from the Solr - User mailing list archive at Nabble.com.
Re: Solr Shards multi core slower then single big core
Aha! See, Kuli, I wasn't making it up! ;)

Otis
Re: Solr Shards multi core slower then single big core
Thanks Erick for the reply.

I have 6 cores which don't contain duplicated data; every core has some unique data. What I thought was that a read would go to the 6 cores in parallel, join the results and return them, and that this would be more efficient than reading one big core.

My question is: wouldn't Solr read from the shards in parallel when a query is fired at it? Please let me know if I am assuming something wrong.

Thanks,
Arjit

On Sun, May 13, 2012 at 12:44 AM, Erick Erickson wrote:
> One of the points of sharding is to use more _machines_. Running multiple shards on a single machine is not magically going to make things faster. In fact I'd expect your process to consume more resources since the cores are now not sharing common data (i.e. having a single word in more than one core will use two instances of that word).
>
> Best
> Erick
Re: Solr Shards multi core slower then single big core
My query is:

    SolrQuery sQuery = new SolrQuery(query.getQueryStr());
    // Route the request to the "dismax" handler (sets the qt parameter).
    sQuery.setQueryType("dismax");
    sQuery.setRows(100);
    if (!query.isSearchOnDefaultField()) {
        sQuery.setParam("qf", queryFields.toArray(new String[queryFields.size()]));
    }
    sQuery.setFields(visibleFields.toArray(new String[visibleFields.size()]));
    if (query.isORQuery()) {
        // Minimum-should-match of 1: a single matching clause is enough.
        sQuery.setParam("mm", "1");
    }

My search handler is:

    <requestHandler name="dismax" class="solr.SearchHandler">
      <lst name="defaults">
        <str name="defType">dismax</str>
        <str name="echoParams">explicit</str>
        <float name="tie">0.01</float>
        <str name="shards">localhost:9090/solr/book1,localhost:9090/solr/book2,localhost:9090/solr/book3,localhost:9090/solr/book4,localhost:9090/solr/book5,localhost:9090/solr/book6</str>
        <str name="qf">text^2.0</str>
        <str name="fl">title item_id author titleMinusAuthor</str>
        <int name="ps">4</int>
        <str name="q.alt">*:*</str>
        <str name="hl.fl">text features name</str>
        <str name="f.name.hl.fragsize">0</str>
        <str name="f.name.hl.alternateField">name</str>
        <str name="f.text.hl.fragmenter">regex</str>
      </lst>
    </requestHandler>
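For experiments with different shard counts, the long `shards` value from the handler above could be generated rather than typed by hand. The following is a minimal sketch, assuming all cores live under one host and the `/solr/<core>` path layout shown in the configuration; `ShardsParamSketch` and `shardsParam` are hypothetical names, not part of SolrJ.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper that builds the comma-separated value of the "shards" parameter.
class ShardsParamSketch {

    // Produces "host:port/solr/core1,host:port/solr/core2,..." for the given cores.
    static String shardsParam(String hostAndPort, List<String> coreNames) {
        List<String> urls = new ArrayList<>();
        for (String core : coreNames) {
            urls.add(hostAndPort + "/solr/" + core);
        }
        return String.join(",", urls);
    }

    public static void main(String[] args) {
        String value = shardsParam("localhost:9090", Arrays.asList("book1", "book2", "book3"));
        // prints localhost:9090/solr/book1,localhost:9090/solr/book2,localhost:9090/solr/book3
        System.out.println(value);
    }
}
```

Since `shards` is an ordinary request parameter, the resulting string could presumably also be sent per request with `sQuery.setParam("shards", value)` instead of being fixed in the handler defaults, which makes it easier to compare timings for one, three or six cores.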