ES was definitely not built for cross-DC clusters. You can aim for cross-DC redundancy with the snapshot and restore functionality, though.
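As a rough illustration of the snapshot and restore approach, here is a minimal sketch that only builds the three REST requests involved (register a repository, take a snapshot, restore it on the other cluster). It sends nothing over the network. The repository name `dc_backup`, the path `/mnt/es_backups`, the snapshot name and the index name are hypothetical, and it assumes a shared-filesystem (`fs`) repository whose contents are made available to the cluster in the second DC by some means outside this sketch.

```python
import json


def register_repo_request(repo, location):
    """Request that registers a shared-filesystem snapshot repository."""
    body = {"type": "fs", "settings": {"location": location}}
    return ("PUT", f"/_snapshot/{repo}", body)


def create_snapshot_request(repo, snapshot, indices):
    """Request that snapshots the given indices into the repository."""
    body = {"indices": ",".join(indices)}
    path = f"/_snapshot/{repo}/{snapshot}?wait_for_completion=true"
    return ("PUT", path, body)


def restore_snapshot_request(repo, snapshot, indices):
    """Request that restores the given indices, run against the other cluster."""
    body = {"indices": ",".join(indices)}
    return ("POST", f"/_snapshot/{repo}/{snapshot}/_restore", body)


if __name__ == "__main__":
    # All names below are hypothetical placeholders.
    indices = ["logstash-2014.07.13"]
    plan = [
        register_repo_request("dc_backup", "/mnt/es_backups"),
        create_snapshot_request("dc_backup", "snap_2014_07_13", indices),
        restore_snapshot_request("dc_backup", "snap_2014_07_13", indices),
    ]
    for method, path, body in plan:
        print(method, path, json.dumps(body))
```

The first two requests would go to the cluster in the primary DC, and the restore request to the cluster in the other DC once it can read the same repository.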
Regarding shards, it's best practice to have one primary shard per node. How many replicas you set up is more of a personal choice: the more replicas you have, the more redundancy and search throughput you get, but also the more storage space and memory you use.

Regarding load balancing/redundancy for LS, there are a few options. You can look at HAProxy or Zookeeper, for example, or you could try using an anycast endpoint. This is a bit more of an open solution based on your requirements and systems.

Regards,
Mark Walkom

Infrastructure Engineer
Campaign Monitor
email: [email protected]
web: www.campaignmonitor.com

On 13 July 2014 16:31, Stefano Ruggiero <[email protected]> wrote:

> Thanks a lot for the answer. You confirm my worries about a
> cross-datacenter cluster, even though I think ES has also been built for
> this type of situation. The problem I see is that with tribe nodes we
> can't have full HA across the 2 DCs: even if it is a good solution for
> searching in all DCs, it isn't a good solution for replicating indexed
> data, am I right?
>
> So you suggest having 4 shards and one replica, so 1 primary and 1
> replica per node? (Obviously, if we have installed 4 nodes in 2 DCs.)
>
> What do you use to load balance indexing across nodes when one of them
> goes down, given that Logstash allows only 1 IP or domain in the output
> configuration?
>
> Regards
>
> On Sunday, 13 July 2014 02:28:01 UTC+2, Mark Walkom wrote:
>>
>> As you may have read, ES is latency sensitive and so having a cluster
>> across your DCs isn't recommended.
>> You may want to look at tribe nodes and then have two separate clusters;
>> that way you get around your problems with wanting all data available in
>> both DCs and also cross-DC load balancing.
>>
>> Around shard counts, to ensure you balance load you ideally want one
>> shard per node, then create replicas based on what you require. Trying to
>> set up replica-only nodes isn't worth the trouble though.
>>
>> Security-wise, the base setup you have is good, but you may want to have a
>> look at some of the community-based solutions for ACLs if that's what you
>> want.
>>
>> Regards,
>> Mark Walkom
>>
>> Infrastructure Engineer
>> Campaign Monitor
>> email: [email protected]
>> web: www.campaignmonitor.com
>>
>>
>> On 12 July 2014 21:02, Stefano Ruggiero <[email protected]> wrote:
>>
>>> Hi all,
>>>
>>> I would like to start this conversation to discuss the best
>>> architecture for ELK based on our hardware and needs for a test
>>> environment.
>>>
>>> What we have:
>>>
>>> - 4+ ES nodes
>>>   - x2 with 24 GB of RAM and 800 GB of HD SAS 4 x2 CPU
>>>   - x2 with 16 GB of RAM and 500 GB of HD SAS 4 x2 CPU
>>> - 10+ LS collectors
>>> - 2+ Kibana instances
>>>
>>> We have 2 separate datacentres; in fact, the resources in the list
>>> above are mirrored across them, so for example we have 2 ES nodes in the
>>> first location and the other 2 in the second location, linked with
>>> double redundant 10 Gbit fiber.
>>>
>>> Our test is to understand how the ELK stack performs when indexing all
>>> application and server events, so we are talking about 200 events per
>>> second in the test lab. We would like to have a retention of 2 or 3
>>> months, searching those logs with Kibana, and then close and back up old
>>> indices, which in our tests works well with the curator plugin.
>>>
>>> *What is the best configuration for load balancing events across the
>>> two locations?* I mean every collector should have 2 available choices
>>> for the output in case one node goes down or is performing badly. What
>>> do you suggest?
>>> We tried Nginx with health checks, but I think ES should do something
>>> similar for load balancing the indexing process with a node set to
>>> master false, data false, even though we read in the community that this
>>> type of node is reserved for balancing search and not indexing, which
>>> always goes through the master of the cluster. Am I right?
>>>
>>> *What is the best configuration that you have tested?* I mean how many
>>> shards and how many replicas for a fully highly available and redundant
>>> solution?
>>> We tried playing with 2 shards and one replica for 4 data nodes because,
>>> as we see it, replicas are involved in the search process, so it could
>>> be a good solution to reserve some nodes only for replicas. What we are
>>> missing is this: if a node goes down or a datacentre dies, can we have
>>> all data automatically on the other side (just with replicas)? (We know
>>> that by the golden rule we need 5 nodes and a minimum of 3 master nodes
>>> for a cluster, so having only 2 DCs could be critical because one DC
>>> needs to have more nodes and become the leader of the whole cluster...)
>>>
>>> *What is your best configuration from a security perspective?*
>>> We tested nginx also as a reverse proxy with standard authentication to
>>> prevent unwanted DELETE and PUT, but we are looking for a stronger
>>> solution with more flexibility and roles/permissions configuration, like
>>> a standard SQL DB. Our network layer is really strong: every ELK layer
>>> has its own DMZ, ACL and firewall rules.
>>>
>>> I am worried especially about the ES configuration, like shards,
>>> replicas and load balancing. I think this conversation should be helpful
>>> for a very large community audience that has some doubts about ES and
>>> the ELK stack in general.
>>>
>>> Best Regards,
>>> Stefano
>>>
>>> --
>>> You received this message because you are subscribed to the Google
>>> Groups "elasticsearch" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to [email protected].
>>> To view this discussion on the web visit
>>> https://groups.google.com/d/msgid/elasticsearch/d3ca4aaf-a1ee-4732-9ae5-629dd8198e7b%40googlegroups.com
>>> .
>>> For more options, visit https://groups.google.com/d/optout.
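
To make the shard and replica arithmetic discussed in this thread concrete, here is a small sketch, not an official sizing tool. The function names `shard_layout` and `index_settings` are made up for illustration; `number_of_shards` and `number_of_replicas` are the index settings involved. It assumes shards are spread evenly across nodes and are all roughly the same size.

```python
def shard_layout(nodes, primaries, replicas):
    """Count shard copies for an index, assuming an even spread over nodes."""
    total_copies = primaries * (1 + replicas)
    return {
        "total_shard_copies": total_copies,
        "copies_per_node": total_copies / nodes,
        # Each replica is a full copy, so storage grows by this factor.
        "storage_multiplier": 1 + replicas,
    }


def index_settings(primaries, replicas):
    """Settings body that could be supplied when creating an index."""
    return {
        "settings": {
            "number_of_shards": primaries,
            "number_of_replicas": replicas,
        }
    }


if __name__ == "__main__":
    # The 4-node case from the thread: one primary per node, one replica.
    print(shard_layout(nodes=4, primaries=4, replicas=1))
    print(index_settings(primaries=4, replicas=1))
```

With 4 nodes, 4 primaries and 1 replica this gives 8 shard copies, 2 per node (one primary and one replica on each when evenly spread), at twice the storage of the primaries alone.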
