Re: Cold replication
The fact that your index is 200GB is meaningless, assuming you're talking about disk size. Please measure before you make assumptions about what will work; it'll save you a world of hurt.

I'm not claiming that just using EBS will satisfy your need, but if you're swapping, your search speed will suffer a lot whether you're swapping to SSD or EBS. Well, it does matter, but either way I'm 90% sure you won't be satisfied with performance. It's just that you'll be _less_ unhappy with SSD. If your index is changing rapidly, SSDs can be really useful as they make loading parts of the index into memory faster.

I say that the disk size of your index is meaningless, and by that I mean that, for instance, if you've set stored="true" for a field, a verbatim copy of that field is stored on disk. That copy is largely irrelevant in terms of the memory required, as it's only read when assembling the final doc to return to the user, not for finding matches. The stored data is held in *.fdt files, which may be a very small or a very large percentage of the disk space; there's no way to know from the total size alone. Other options also have disk vs. memory implications.

As for autowarming, that's simply a way to replay some of the recent queries and filter queries when a new searcher is opened (i.e. when you commit new documents). It's intended to smooth over spikes by moving relevant parts of the index from disk into memory.

Best,
Erick

On Tue, Jul 19, 2016 at 1:16 AM, Emir Arnautovic <emir.arnauto...@sematext.com> wrote:
> Hi Mahmoud,
> What you can do is use the local SSD as a cache for EBS. You can try lvmcache
> or bcache. It will boost your performance while the data remains on EBS.
>
> Thanks,
> Emir
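One way to act on the "measure first" advice for the stored-data point is to total the index files by extension and see what share *.fdt takes. The following is only a minimal sketch: the function names are made up for illustration, the index directory path is an assumption you would replace with your own core's data/index directory, and segments written in compound-file format (*.cfs) would hide the per-type breakdown.

```python
import os
from collections import defaultdict


def size_by_extension(index_dir):
    """Return {extension: total_bytes} for every file under index_dir."""
    totals = defaultdict(int)
    for root, _dirs, files in os.walk(index_dir):
        for name in files:
            # Files such as segments_N have no extension; group them together.
            ext = os.path.splitext(name)[1] or "(none)"
            totals[ext] += os.path.getsize(os.path.join(root, name))
    return dict(totals)


def stored_fraction(totals, stored_exts=(".fdt",)):
    """Fraction of the total bytes taken by stored-field data files."""
    total = sum(totals.values())
    if total == 0:
        return 0.0
    stored = sum(totals.get(ext, 0) for ext in stored_exts)
    return stored / total


# Example usage (hypothetical path):
#   totals = size_by_extension("/var/solr/data/mycore/data/index")
#   print(sorted(totals.items(), key=lambda kv: -kv[1]))
#   print(stored_fraction(totals))
```

A high *.fdt share would suggest the on-disk size overstates what needs to be in memory for searching; it does not replace measuring query latency under load.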
Re: Cold replication
Hi Mahmoud,

What you can do is use the local SSD as a cache for EBS. You can try lvmcache or bcache. It will boost your performance while the data remains on EBS.

Thanks,
Emir

--
Monitoring * Alerting * Anomaly Detection * Centralized Log Management
Solr & Elasticsearch Support * http://sematext.com/

On 18.07.2016 19:34, Mahmoud Almokadem wrote:
> Thanks Erick,
>
> I'll take a look at the replication on Solr. But I don't know if it will
> support incremental backup or not.
>
> And I want to use SSD because my index cannot be held in memory. The index
> is about 200GB on each instance, the RAM is 61GB, and the update
> frequency is high. So, I want to use the SSDs that come with the servers
> instead of EBS.
>
> Would you explain what you mean by proper warming?
>
> Thanks,
> Mahmoud
Re: Cold replication
Thanks Erick,

I'll take a look at the replication on Solr. But I don't know if it will support incremental backup or not.

And I want to use SSD because my index cannot be held in memory. The index is about 200GB on each instance, the RAM is 61GB, and the update frequency is high. So, I want to use the SSDs that come with the servers instead of EBS.

Would you explain what you mean by proper warming?

Thanks,
Mahmoud

On Mon, Jul 18, 2016 at 5:46 PM, Erick Erickson <erickerick...@gmail.com> wrote:
> Have you tried the replication API backup command here?
>
> https://cwiki.apache.org/confluence/display/solr/Index+Replication#IndexReplication-HTTPAPICommandsfortheReplicationHandler
>
> Warning: I haven't worked with this personally in this
> situation, so test it.
>
> I do have to ask why you think SSDs are required here and
> if you've measured. With proper warming, most of the
> index is held in memory anyway and the source of
> the data (SSD or spinning) is not a huge issue. SSDs
> certainly are better/faster, but have you measured whether
> they are _enough_ faster to be worth the added
> complexity?
>
> Best,
> Erick
Re: Cold replication
Have you tried the replication API backup command here?

https://cwiki.apache.org/confluence/display/solr/Index+Replication#IndexReplication-HTTPAPICommandsfortheReplicationHandler

Warning: I haven't worked with this personally in this situation, so test it.

I do have to ask why you think SSDs are required here and if you've measured. With proper warming, most of the index is held in memory anyway and the source of the data (SSD or spinning) is not a huge issue. SSDs certainly are better/faster, but have you measured whether they are _enough_ faster to be worth the added complexity?

Best,
Erick

On Mon, Jul 18, 2016 at 4:05 AM, Mahmoud Almokadem <prog.mahm...@gmail.com> wrote:
> Hi,
>
> We have SolrCloud 6.0 installed on 4 i2.2xlarge instances with 4 shards. We
> store the indices on EBS attached to these instances. Fortunately these
> instances are equipped with TEMPORARY SSDs. We need to store the indices
> on the SSDs, but they are not safe.
>
> The index is updated every five minutes.
>
> Could we use the SSDs to store the indices and create an incremental backup
> or cold replication on the EBS? That way we would use EBS only for storing
> the indices, not for serving the data to Solr.
>
> In case of losing the data on the SSDs, we could restore a backup from the
> EBS. Is it possible?
>
> Thanks,
> Mahmoud
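For reference, the backup command on the linked page is a plain HTTP request to a core's /replication handler, so it could be scripted roughly as below. This is a sketch under assumptions, not a tested procedure: the host, core name, backup location and snapshot name are hypothetical, the helper names are made up, and, as noted above, whether this meets the incremental-backup requirement needs testing. In SolrCloud each shard's core would need its own call.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def replication_url(base_url, core, command, **params):
    """Build a ReplicationHandler URL such as .../replication?command=backup."""
    query = urlencode({"command": command, **params})
    return f"{base_url.rstrip('/')}/{core}/replication?{query}"


def trigger_backup(base_url, core, location, name):
    """Ask one core to write a snapshot to `location`; returns the raw response body."""
    url = replication_url(base_url, core, "backup", location=location, name=name)
    with urlopen(url) as resp:
        return resp.read().decode("utf-8")


# Example usage (hypothetical host, core and EBS mount point):
#   trigger_backup("http://localhost:8983/solr", "collection1_shard1_replica1",
#                  "/mnt/ebs/backups", "snap1")
#   # command=details on the same handler reports the backup status.
```

Run from cron or a similar scheduler, this would keep the serving index on the SSDs and only the snapshots on EBS, which is the arrangement the original question asks about.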
Cold replication
Hi,

We have SolrCloud 6.0 installed on 4 i2.2xlarge instances with 4 shards. We store the indices on EBS attached to these instances. Fortunately these instances are equipped with TEMPORARY SSDs. We need to store the indices on the SSDs, but they are not safe.

The index is updated every five minutes.

Could we use the SSDs to store the indices and create an incremental backup or cold replication on the EBS? That way we would use EBS only for storing the indices, not for serving the data to Solr.

In case of losing the data on the SSDs, we could restore a backup from the EBS. Is it possible?

Thanks,
Mahmoud