1 - Looks ok, but why two replicas? You're chewing up disk for what reason?
Extra comments below.
2 - It's really personal preference and depends on how your endpoints send
to Redis.
3 - 4GB for Redis will cache quite a lot of data if you're only doing 50
events p/s, i.e. hours or even days of buffer based on what I've seen (rough
numbers in the sketch below the answers).
4 - No, spread it out to all the nodes. More on that below though.
5 - No, it will handle that itself. Again, more on that below though.
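
To put rough numbers on the Redis sizing (assuming ~1KB per event on average;
your Java traces and big XML responses will push that up): 50 events p/s is
about 50KB/s, or roughly 180MB per hour of buffer, so 4GB covers many hours of
indexer downtime. Something like this in redis.conf keeps it bounded (the
figures are only examples, tune them to your boxes):

# redis.conf (sketch): cap memory so a stalled indexer can't exhaust the box
maxmemory 4gb
# for a queue you don't want Redis silently evicting entries once the cap is
# hit, so fail writes instead
maxmemory-policy noeviction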

Suggestions:
Set your indexes to (a multiple of) 6 shards, i.e. at least one per node, so
query load is spread across every node. I say "a multiple of" because you can
start with 12 shards per index and still spread the load evenly if you later
grow the node count.
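
As a sketch, an index template along these lines makes that the default for
every new daily index (the template name, index pattern and replica count are
just examples, match them to whatever your Logstash output writes to):

# applied automatically to any new index matching the pattern
curl -XPUT 'http://localhost:9200/_template/logstash_shards' -d '
{
  "template": "logstash-*",
  "settings": {
    "number_of_shards": 12,
    "number_of_replicas": 1
  }
}'
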
Split your stats and your log data into different indexes; it'll make
management and retention easier.
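
In the indexer config that's just a conditional on the output. A rough sketch,
assuming your shippers set type to "stats" for the server metrics (the host
and index names are made up):

output {
  if [type] == "stats" {
    elasticsearch { host => "es-node1" index => "stats-%{+YYYY.MM.dd}" }
  } else {
    elasticsearch { host => "es-node1" index => "logs-%{+YYYY.MM.dd}" }
  }
}

Dropping 30-day-old stats is then just a delete of that day's stats index (or
a tool like curator), without touching the log indexes.
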
You can consider a single master-only node or (ideally) three that also handle
queries.
Preferably have an odd number of master-eligible nodes, whether you make
them VMs or physicals; that way you can ensure quorum is reached with
minimal fuss and avoid split-brain.
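
For example, with three dedicated master-eligible boxes the relevant
elasticsearch.yml settings would look something like this (the quorum formula
is (master-eligible nodes / 2) + 1, so 2 when you have 3):

# elasticsearch.yml on the three master-eligible (non-data) nodes
node.master: true
node.data: false
# quorum of 3 master-eligible nodes = (3 / 2) + 1 = 2
discovery.zen.minimum_master_nodes: 2

The data nodes get the inverse (node.master: false, node.data: true) and the
same minimum_master_nodes value.
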
If you use VMs for the master + query nodes, then you might want to look at
load balancing the queries via an external service.
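
Any plain HTTP load balancer in front of the query nodes will do. A sketch
with HAProxy, purely as an example (hostnames are made up):

# haproxy.cfg (sketch): round-robin Kibana/query traffic across the query nodes
defaults
    mode http
    timeout connect 5s
    timeout client  60s
    timeout server  60s

frontend es_queries
    bind *:9200
    default_backend es_query_nodes

backend es_query_nodes
    balance roundrobin
    server es-master1 es-master1:9200 check
    server es-master2 es-master2:9200 check
    server es-master3 es-master3:9200 check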

To give you an idea, we have a 27-node cluster: 3 masters that also handle
queries and 24 data nodes. The masters are 8GB RAM with small disks; the data
nodes are 60GB RAM (30GB heap) with 512GB of disk.
We're running with one replica and have 11TB of logging data. At a high level
we're running out of disk rather than heap or CPU, and we're very write-heavy,
averaging 1K events p/s with comparatively minimal reads.

Regards,
Mark Walkom

Infrastructure Engineer
Campaign Monitor
email: ma...@campaignmonitor.com
web: www.campaignmonitor.com


On 31 July 2014 01:35, Alex <alex.mon...@gmail.com> wrote:

> Hello,
>
> We wish to set up an entire ELK system with the following features:
>
>    - Input from Logstash shippers located on 400 Linux VMs. Only a
>    handful of log sources on each VM.
>    - Data retention for 30 days, which is roughly 2TB of data in indexed
>    ES JSON form (not including replica shards)
>    - Estimated input data rate of 50 messages per second at peak hours.
>    Mostly short or medium length one-line messages but there will be Java
>    traces and very large service responses (in the form of XML) to deal with
>    too.
>    - The entire system would be on our company LAN.
>    - The stored data will be a mix of application logs (info, errors etc)
>    and server stats (CPU, memory usage etc) and would mostly be accessed
>    through Kibana.
>
> This is our current plan:
>
>    - Have the LS shippers perform minimal parsing (but would do
>    multiline). Have them point to two load-balanced servers containing Redis
>    and LS indexers (which would do all parsing).
>    - 2 replica shards for each index, which ramps the total data storage
>    up to 6TB
>    - ES cluster spread over 6 nodes. Each node is 1TB in size
>    - LS indexers pointing to cluster.
>
> So I have a couple questions regarding the setup and would greatly
> appreciate the advice of someone with experience!
>
>    1. Does the balance between the number of nodes, the number of replica
>    shards, and storage size of each node seem about right? We use
>    high-performance equipment and would expect minimal downtime.
>
>    2. What is your recommendation for the system design of the LS
>    indexers and Redis? I've seen various designs with each indexer assigned to
>    a single Redis, or all indexers reading from all Redises.
>
>    3. Leading on from the previous question, what would your recommended
>    data size for the Redis servers be?
>
>    4. Not sure what to do about master/data nodes. Assuming all the nodes
>    are on identical hardware, would it be beneficial to have a node that is
>    only a master and would only handle requests?
>
>    5. Do we need to do any additional load balancing on the ES nodes?
>
> We are open to any and all suggestions. We have not yet committed to any
> particular design so can change if needed.
>
> Thank you for your time and responses,
> Alex
>
