Re: Suggestion for a Test ELK Deployment ( Best Performance/ Load Balancing / Security / HA)

Mark Walkom Sun, 13 Jul 2014 02:51:19 -0700

ES was definitely not built for cross-DC clusters.
You can aim for cross-DC redundancy with the snapshot and restore
functionality, though.
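
As a rough illustration of the snapshot and restore route, the sketch below
only builds the REST calls (it does not send them). The repository name,
snapshot name and filesystem location are hypothetical placeholders, and a
shared filesystem ("fs") repository is assumed; the second cluster would
also need the same repository registered before it can restore.

```python
import json


def snapshot_requests(repo, location, snapshot):
    """Return (method, path, body) tuples: register repo, snapshot, restore."""
    # Register a shared-filesystem repository (location is an assumption;
    # it must be a path the nodes can actually reach).
    register = ("PUT", "/_snapshot/%s" % repo,
                {"type": "fs", "settings": {"location": location}})
    # Take a snapshot in the source cluster and wait for it to finish.
    take = ("PUT",
            "/_snapshot/%s/%s?wait_for_completion=true" % (repo, snapshot),
            None)
    # Restore that snapshot in the cluster at the other DC.
    restore = ("POST", "/_snapshot/%s/%s/_restore" % (repo, snapshot), None)
    return [register, take, restore]


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    for method, path, body in snapshot_requests(
            "dc1_backup", "/mnt/es_backups", "snapshot_1"):
        print(method, path, json.dumps(body) if body else "")
```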


Regarding shards, it's best practice to have one primary shard per node.
How many replicas you set up is more of a personal choice: the more replicas
you have, the more redundancy and search throughput you get, but also the
more storage space and memory you use.

Regarding load balancing/redundancy for LS, there are a few options. You
can look at HAProxy or ZooKeeper, for example, or you could try using an
anycast endpoint. The right choice is fairly open and depends on your
requirements and systems.
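
For illustration only, the sketch below shows the general idea behind a
health-checked round-robin balancer: rotate through the endpoints and skip
any that fail a health check. It is not how HAProxy is implemented or
configured; the class name, host names and health-check function are all
hypothetical.

```python
class RoundRobinBalancer(object):
    """Rotate through endpoints, skipping those that fail a health check."""

    def __init__(self, endpoints, is_healthy):
        self._endpoints = list(endpoints)
        self._is_healthy = is_healthy  # callable: endpoint -> bool
        self._next = 0                 # index to try first on the next pick

    def pick(self):
        count = len(self._endpoints)
        for offset in range(count):
            index = (self._next + offset) % count
            candidate = self._endpoints[index]
            if self._is_healthy(candidate):
                # Start after the chosen endpoint next time.
                self._next = (index + 1) % count
                return candidate
        raise RuntimeError("no healthy endpoint available")


if __name__ == "__main__":
    # Hypothetical ES endpoints; pretend es2 is down.
    down = {"es2:9200"}
    balancer = RoundRobinBalancer(
        ["es1:9200", "es2:9200", "es3:9200"],
        is_healthy=lambda endpoint: endpoint not in down)
    print([balancer.pick() for _ in range(4)])
```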

Regards,
Mark Walkom

Infrastructure Engineer
Campaign Monitor
email: [email protected]
web: www.campaignmonitor.com


On 13 July 2014 16:31, Stefano Ruggiero <[email protected]> wrote:

> Thanks a lot for the answer. You confirm my worries about a cross-DC
> cluster, even though I thought ES had also been built for this type of
> situation. The problem I see is that with tribe nodes we can't have full
> HA across the 2 DCs: even if they are a good solution for searching across
> all DCs, they aren't a good solution for replicating indexed data. Am I right?
>
> So you suggest having 4 shards and one replica, i.e. 1 primary and 1
> replica per node? (Obviously, if we have 4 nodes installed across 2 DCs.)
>
> What do you use to load balance indexing across nodes when one of them
> goes down, given that Logstash allows only 1 IP or domain in the output
> configuration?
>
> Regards
>
> On Sunday, 13 July 2014 02:28:01 UTC+2, Mark Walkom wrote:
>>
>> As you may have read, ES is latency sensitive and so having a cluster
>> across your DCs isn't recommended.
>> You may want to look at tribe nodes and then have two separate clusters,
>> that way you get around your problems with wanting all data available in
>> both DCs and also cross DC load balancing.
>>
>> Around shard counts, to ensure you balance load you ideally want one
>> shard per node, then create replicas based on what you require. Trying to
>> set up replica-only nodes isn't worth the trouble, though.
>>
>> Security-wise, the base setup you have is good, but you may want to have a
>> look at some of the community-based solutions for ACLs if that's what you
>> want.
>>
>> Regards,
>> Mark Walkom
>>
>> Infrastructure Engineer
>> Campaign Monitor
>> email: [email protected]
>> web: www.campaignmonitor.com
>>
>>
>> On 12 July 2014 21:02, Stefano Ruggiero <[email protected]> wrote:
>>
>>> Hi all,
>>>
>>> I would like to start this conversation to discuss the best ELK
>>> architecture for a test environment, based on our hardware and needs.
>>>
>>> What we have:
>>>
>>>    - 4+ ES nodes
>>>       - x2 with 24 GB of RAM and 800 GB of HD SAS 4 x2 CPU
>>>       - x2 with 16 GB of RAM and 500 GB of HD SAS 4 x2 CPU
>>>    - 10+ LS Collectors
>>>    - 2+ Kibana instances
>>>
>>> We actually have 2 separate datacentres, and the resources in the list
>>> above are mirrored across them. For example, we have 2 ES nodes in the
>>> first location and the other 2 in the second; the locations are linked by
>>> double redundant 10 Gbit fibre.
>>>
>>> Our test is to understand how the ELK stack performs when indexing all
>>> application and server events, so we are talking about 200 events per
>>> second in the test lab. We would like a retention of 2 or 3 months for
>>> searching those logs with Kibana, and then to close and back up old
>>> indices, which we have tested and which works well with the curator plugin.
>>>
>>> *What is the best configuration for load balancing events across the
>>> two locations?* I mean every collector should have 2 available choices
>>> for the output in case one node goes down or is performing badly. What do
>>> you suggest?
>>> We tried Nginx with health checks, but I think ES should be able to do
>>> something similar for load balancing the indexing process with a node set
>>> to master false, data false, even though we read in the community that this
>>> type of node is reserved for balancing searches and not indexing, which
>>> always goes through the master of the cluster. Am I right?
>>>
>>> *What is the best configuration that you have tested?* I mean how many
>>> shards and how many replicas for a fully highly available and redundant
>>> solution?
>>> We tried playing with 2 shards and one replica for 4 data nodes because, as
>>> far as we can see, replicas are involved in the search process, so it could
>>> be a good solution to reserve some nodes only for replicas. What we are
>>> missing is this: if a node goes down or a datacentre dies, can we have all
>>> the data automatically on the other side (just with replicas)? (We know
>>> that by the golden rule we need 5 nodes and a minimum of 3 master nodes
>>> for a cluster, so having only 2 DCs could be critical, because one DC needs
>>> to have more nodes and becomes the leader of the whole cluster...)
>>>
>>> *What is your best configuration from a security perspective?*
>>> We also tested Nginx as a reverse proxy with standard authentication to
>>> prevent unwanted DELETE and PUT requests, but we are looking for a stronger
>>> solution with more flexibility and roles/permissions configuration, like a
>>> standard SQL DB. Our network layer is really strong: every ELK layer has
>>> its own DMZ, ACL and firewall rules.
>>>
>>> I am especially worried about the ES configuration, such as shards,
>>> replicas and load balancing. I think this conversation should be helpful
>>> for the very large community audience that has some doubts about ES and
>>> the ELK stack in general.
>>>
>>> Best Regards,
>>> Stefano
>>>
>>> --
>>> You received this message because you are subscribed to the Google
>>> Groups "elasticsearch" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to [email protected].
>>> To view this discussion on the web visit https://groups.google.com/d/
>>> msgid/elasticsearch/d3ca4aaf-a1ee-4732-9ae5-629dd8198e7b%
>>> 40googlegroups.com
>>> <https://groups.google.com/d/msgid/elasticsearch/d3ca4aaf-a1ee-4732-9ae5-629dd8198e7b%40googlegroups.com?utm_medium=email&utm_source=footer>
>>> .
>>> For more options, visit https://groups.google.com/d/optout.
>>>
>>

