Re: Scaling SolrCloud
It's not a typo; I was wrong. For ZooKeeper, 2 of 3 nodes still count as a majority. It's not the desirable configuration, but it is tolerable.

Thanks Erick.

--
Yago Riveiro

> On Jan 21 2016, at 4:15 am, Erick Erickson erickerick...@gmail.com wrote:
>
> bq: "3 is too risky: lose one and you lose quorum"
>
> Typo? You need to lose two. [...]
Re: Scaling SolrCloud
NP. My usual question though is "how often do you expect to lose a second ZK node before you can replace the first one that died?"

My tongue-in-cheek statement is often "If you're losing two nodes regularly, you have problems with your hardware that you're not really going to address by adding more ZK nodes" ;).

And do note that even if you lose quorum, SolrCloud will continue to serve _queries_, although the "picture" each individual Solr node has of the current state of all the Solr nodes will get stale. You won't be able to index, though. That said, the internal Solr load balancers auto-distribute queries to live nodes anyway, so things can limp along.

As always, it's a tradeoff between expense/complexity and robustness, and each and every situation is different in how much risk it can tolerate.

FWIW,
Erick

On Thu, Jan 21, 2016 at 1:49 AM, Yago Riveiro wrote:
> It's not a typo; I was wrong. For ZooKeeper, 2 of 3 nodes still count as a
> majority. It's not the desirable configuration, but it is tolerable. [...]
Re: Scaling SolrCloud
Alternatively, do you still want to be protected against a single failure during scheduled maintenance? With a three node ensemble, when one Zookeeper node is being updated or moved to a new instance, one more failure means it does not have a quorum. With a five node ensemble, three nodes would still be up.

If you are OK with that risk, run three nodes. If not, run five.

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)

> On Jan 21, 2016, at 9:27 AM, Erick Erickson wrote:
>
> NP. My usual question though is "how often do you expect to lose a second
> ZK node before you can replace the first one that died?" [...]
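To make the failure-tolerance arithmetic in this exchange concrete: a ZooKeeper ensemble of n voting nodes needs a strict majority (floor(n/2) + 1) to keep quorum. A minimal sketch of that standard rule (not code from the thread):

    # ZooKeeper quorum arithmetic: an ensemble keeps quorum as long as a
    # strict majority of its voting nodes is alive.
    def quorum_size(n: int) -> int:
        return n // 2 + 1

    def tolerated_failures(n: int) -> int:
        return n - quorum_size(n)

    for n in (3, 5, 7):
        print(f"{n}-node ensemble: quorum={quorum_size(n)}, "
              f"tolerates {tolerated_failures(n)} failure(s)")

    # 3-node ensemble: quorum=2, tolerates 1 failure(s)
    # 5-node ensemble: quorum=3, tolerates 2 failure(s)
    # 7-node ensemble: quorum=4, tolerates 3 failure(s)

This is why a 3-node ensemble survives one loss (Erick's correction), and why Walter's maintenance scenario favors 5: with one node deliberately down, a 5-node ensemble can still absorb a second, unplanned failure.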
Re: Scaling SolrCloud
bq: "3 is too risky: lose one and you lose quorum"

Typo? You need to lose two.

On Wed, Jan 20, 2016 at 6:25 AM, Yago Riveiro wrote:
> Our Zookeeper cluster is an ensemble of 5 machines, which is a good starting
> point; 3 is too risky (lose one and you lose quorum), and with 7 the sync
> cost increases. [...]
Re: Scaling SolrCloud
Our Zookeeper cluster is an ensemble of 5 machines, which is a good starting point; 3 is too risky (lose one and you lose quorum), and with 7 the sync cost increases.

The ZK cluster runs on machines with no other IO load and rotational HDDs (we don't use SSDs to gain IO performance; ZooKeeper is optimized for spinning disks). The ZK cluster behaves without problems. The first deploy of ZK was on the same machines as the Solr cluster (with the ZK log on its own HDD) and that didn't work very well: the CPU and network IO from the Solr cluster was too much.

About schema modifications:

Modifying the schema to add new fields is relatively simple with the new API. In the past all the work was manual: uploading the schema to ZK and reloading all collections (indexing must be disabled, or timeouts and funny errors happen). With the new Schema API this is more user friendly. Anyway, I still stop indexing before reloading the collections (I don't know if that's necessary nowadays).

About indexing data:

We have a self-made data importer; it's not Java and it doesn't perform batch indexing (with 500 collections, buffering data and building batches is expensive and complicates error handling). We use regular HTTP POSTs of JSON. Our throughput is about 1000 docs/s without any kind of optimization.

Sometimes we have issues with replication: the slave can't keep pace with insertions on the leader and a full sync is requested. This is bad because syncing the replica again implies a lot of IO wait and CPU, and with 100G replicas it takes an hour or more (normally when this happens, we disable indexing to release IO and CPU and not kill the node with a load of 50 or 60).

In this department my advice is "keep it simple": in the end it's an HTTP POST to a node of the cluster.

--
Yago Riveiro

> On Jan 20 2016, at 1:39 pm, Troy Edwards tedwards415...@gmail.com wrote:
>
> Yago since you have 8 billion documents over 500 collections, can you share
> what/how you do index maintenance (e.g. add field)? And how are you loading
> data into the index? [...]
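Both operations Yago describes map to plain HTTP calls against standard Solr endpoints: the Schema API for adding fields and the JSON update handler for indexing. A minimal sketch with Python's requests library (the host, collection name, and field are placeholder assumptions, not details from the thread):

    import requests

    SOLR = "http://localhost:8983/solr"   # placeholder node address
    COLLECTION = "events"                 # placeholder collection name

    # 1. Add a field through the Schema API, no manual ZK upload or
    #    collection-reload script required.
    add_field = {
        "add-field": {
            "name": "category",
            "type": "string",
            "stored": True,
            "docValues": True,  # per the thread's advice: docValues spare the heap
        }
    }
    requests.post(f"{SOLR}/{COLLECTION}/schema", json=add_field).raise_for_status()

    # 2. Index documents with a regular HTTP POST of JSON ("keep it simple").
    docs = [
        {"id": "doc-1", "category": "sales"},
        {"id": "doc-2", "category": "support"},
    ]
    requests.post(
        f"{SOLR}/{COLLECTION}/update",
        params={"commit": "true"},  # for real throughput, rely on autoCommit instead
        json=docs,
    ).raise_for_status()

The same update endpoint accepts larger batches; Yago's team keeps requests small because, across 500 collections, buffering and building batches complicates error handling.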
Re: Scaling SolrCloud
Thank you for sharing your experiences/ideas.

Yago, since you have 8 billion documents over 500 collections, can you share what/how you do index maintenance (e.g. add field)? And how are you loading data into the index? Any experiences around how the Zookeeper ensemble behaves with so many collections?

Best,

On Tue, Jan 19, 2016 at 6:05 PM, Yago Riveiro yago.rive...@gmail.com wrote:
> What I can say is:
>
> * SSD (crucial for performance if the index doesn't fit in memory, and it
> will not fit). [...]
Scaling SolrCloud
We are currently "beta testing" a SolrCloud with 2 nodes and 2 shards with 2 replicas each. The number of documents is about 125,000.

We now want to scale this to about 10 billion documents.

What are the steps to prototyping, hardware estimation and stress testing?

Thanks
Re: Scaling SolrCloud
On 1/19/2016 1:30 PM, Troy Edwards wrote:
> We are currently "beta testing" a SolrCloud with 2 nodes and 2 shards with
> 2 replicas each. The number of documents is about 125,000.
>
> We now want to scale this to about 10 billion documents.
>
> What are the steps to prototyping, hardware estimation and stress testing?

There is no general information available for sizing, because there are too many factors that will affect the answers. Some of the important information that you need will be impossible to predict until you actually build it and subject it to a real query load.

https://lucidworks.com/blog/sizing-hardware-in-the-abstract-why-we-dont-have-a-definitive-answer/

With an index of 10 billion documents, you may not be able to precisely predict performance and hardware requirements from a small-scale prototype. You'll likely need to build a full-scale system on a small testbed, look for bottlenecks, ask for advice, and plan on a larger system for production.

The hard limit for documents on a single shard is slightly less than Java's Integer.MAX_VALUE -- just over two billion. Because deleted documents count against this max, about one billion documents per shard is the absolute max that should be loaded in practice.

BUT, if you actually try to put one billion documents in a single server, performance will likely be awful. A more reasonable limit per machine is 100 million ... but even this is quite large. You might need smaller shards, or you might be able to get good performance with larger shards. It all depends on things that you may not even know yet.

Memory is always a strong driver for Solr performance, and I am speaking specifically of OS disk cache -- memory that has not been allocated by any program. With 10 billion documents, your total index size will likely be hundreds of gigabytes, and might even reach terabyte scale. Good performance with indexes this large will require a lot of total memory, which probably means that you will need a lot of servers and many shards. SSD storage is strongly recommended.

For extreme scaling on Solr, especially if the query rate will be high, it is recommended to only have one shard replica per server.

I have just added an "extreme scaling" section to the following wiki page, but it's mostly a placeholder right now. I would like to have a discussion with people who operate very large indexes so I can put real usable information in this section. I'm on IRC quite frequently in the #solr channel.

https://wiki.apache.org/solr/SolrPerformanceProblems

Thanks,
Shawn
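Shawn's per-shard ceiling turns into a quick capacity estimate. A back-of-the-envelope sketch (the document and per-shard counts come from his message; the bytes-per-document figure is an assumption you would replace with a measurement from your own prototype):

    # Back-of-the-envelope shard estimate for the 10-billion-document target.
    TOTAL_DOCS = 10_000_000_000
    DOCS_PER_SHARD = 100_000_000   # Shawn's "more reasonable limit per machine"
    BYTES_PER_DOC = 500            # assumption: measure this on a real prototype

    shards = -(-TOTAL_DOCS // DOCS_PER_SHARD)   # ceiling division -> 100 shards
    index_tb = TOTAL_DOCS * BYTES_PER_DOC / 1e12
    print(f"shards needed: {shards}")
    print(f"index size per copy: {index_tb:.1f} TB")

    # With one replica per server (Shawn's high-query-rate advice) and a
    # replication factor of 2, that is 200 servers, each wanting enough free
    # RAM for the OS disk cache to keep its ~50 GB slice of the index hot.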
Re: Scaling SolrCloud
What I can say is:

* SSD (crucial for performance if the index doesn't fit in memory, and it will not fit).
* Divide and conquer: for that volume of docs you will need more than 6 nodes.
* DocValues, to not stress the Java heap.
* Will you aggregate data? If yes, what is your max cardinality? This question is the most important one for sizing the memory needs correctly.
* Latency is important too: what threshold is acceptable before you consider a query slow?

At my company we are running a 12 terabyte (2 replicas) Solr cluster with 8 billion documents spread over 500 collections. For this we have about 12 machines with SSDs and 32G of RAM each (~24G for the heap).

We don't have a strict need for speed; a 30 second query aggregating 100 million documents with 1M unique keys is fast enough for us. Normally aggregation performance decreases as the number of unique keys increases; with a low unique-key factor, queries take less than 2 seconds if the data is in the OS cache.

Personal recommendations:

* Sharding is important, and smart sharding is crucial: you don't want to run queries over data that is not interesting (this slows queries down when the dataset is big).
* If you want to measure speed, do it with about 1 billion documents to simulate something real (real for a 10 billion document world).
* Index with re-indexing in mind: with 10 billion docs, re-indexing takes months. This is important if you use non-standard features of Solr. In my case I configured DocValues with the disk format (not a standard feature in 4.x) and at some point that format was deprecated. Upgrading Solr to 5.x was an epic 3 month battle to do without full downtime.
* Solr is like your girlfriend: it will demand love and care, and plenty of space to fully recover replicas that at some point end up out of sync, which happens a lot when restarting nodes (this is annoying with 100G replicas). Don't underestimate this point; free space can save your life.

--
Yago Riveiro

> On Jan 19 2016, at 11:26 pm, Shawn Heisey apa...@elyograg.org wrote:
>
> There is no general information available for sizing, because there are too
> many factors that will affect the answers. [...]
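Yago's "smart sharding" advice is about never touching data a query can't match. With hundreds of collections, a common pattern is time-based partitioning: one collection per period, with the client routing each request only to the periods it covers via SolrCloud's collection parameter. A hedged sketch (the naming scheme and host are assumptions, not Yago's actual setup):

    import requests

    SOLR = "http://localhost:8983/solr"   # placeholder node address

    def collections_for(year_from: int, year_to: int) -> str:
        # Assumes one collection per year, named logs_2014, logs_2015, ...
        return ",".join(f"logs_{y}" for y in range(year_from, year_to + 1))

    # Query only the 2015-2016 data instead of all 500 collections; Solr fans
    # the request out solely to the shards of the collections listed here.
    r = requests.get(
        f"{SOLR}/logs_2016/select",
        params={
            "q": "status:error",
            "collection": collections_for(2015, 2016),
            "rows": 10,
        },
    )
    r.raise_for_status()
    print(r.json()["response"]["numFound"])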
Scaling SolrCloud and DIH
I'm curious how people are using DIH with SolrCloud. I have cron jobs set up to trigger the dataimports, which come from both XML files and a SQL database. Some are frequent small delta imports while others are larger daily XML imports.

Here's what I've tried:

1. Set up a micro box that sends the dataimport requests to a load balancer using cron. This didn't work because frequent requests would get spread around, and at one point all my nodes were doing dataimport requests at the same time.

2. Designate one box as the indexer and call dataimport via localhost. The problem here is that I now have a single point of failure for indexing: I always have to have that box running. I love that SolrCloud is distributed so I can have 3 boxes in my cluster and not care which one goes down.

I don't really know what the solution is, but I guess it would be nice if the dataimport was cloud aware, meaning that the cluster knows an update is happening on one of the boxes and won't let another one start. That way I could just send the dataimport request up through the load balancer and forget about it.

Anyway, I thought I would see how others are handling this issue.

Cheers,
Jim
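A common stopgap for the overlap problem Jim describes is to poll DIH's status before triggering a new import, so a cron job can hit any node without stacking imports. A minimal sketch (node addresses and core name are placeholders; command=status and command=delta-import are DIH's standard request-handler commands):

    import requests

    NODES = ["http://solr1:8983", "http://solr2:8983"]  # placeholder node addresses
    CORE = "products"                                   # placeholder core name

    def dih_busy(base: str) -> bool:
        # DIH reports status "busy" while an import is running, "idle" otherwise.
        r = requests.get(f"{base}/solr/{CORE}/dataimport",
                         params={"command": "status", "wt": "json"})
        r.raise_for_status()
        return r.json().get("status") == "busy"

    def trigger_delta() -> bool:
        # Skip this run entirely if any node is already importing; otherwise
        # start the delta import on the first node.
        if any(dih_busy(node) for node in NODES):
            return False
        requests.get(f"{NODES[0]}/solr/{CORE}/dataimport",
                     params={"command": "delta-import", "wt": "json"}).raise_for_status()
        return True

    if __name__ == "__main__":
        print("started" if trigger_delta() else "skipped: an import is already running")

Note the obvious race between the status check and the trigger: this reduces overlapping imports rather than eliminating them, which is why a truly cloud-aware DIH (see the reply below) is the real fix.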
Re: Scaling SolrCloud and DIH
There is still some work to be done to make DIH play nicely with SolrCloud in terms of failover. https://issues.apache.org/jira/browse/SOLR-4058 is one of the issues that should be addressed. I think I made another issue or two, but I don't remember them offhand.

- Mark

On Mar 13, 2013, at 11:39 AM, jimtronic jimtro...@gmail.com wrote:

> I'm curious how people are using DIH with SolrCloud. I have cron jobs set up
> to trigger the dataimports, which come from both XML files and a SQL
> database. [...]