Re: Recommendations needed for large ELK system design

2014-08-13 Thread Alex
Hi Mark, I've done more investigating and it seems that a Client (AKA Query) node cannot also be a Master node. As it says here http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/modules-discovery-zen.html#master-election *Nodes can be excluded from becoming a master by

Re: Recommendations needed for large ELK system design

2014-08-01 Thread Alex
Ok thank you Mark, you've been extremely helpful and we now have a better idea about what we're doing! -Alex On Thursday, 31 July 2014 23:57:26 UTC+1, Mark Walkom wrote: 1 - Curator FTW. 2 - Masters handle cluster state, shard allocation and a whole bunch of other stuff around managing the

Re: Recommendations needed for large ELK system design

2014-07-31 Thread Alex
Hello Mark, Thank you for your reply, it certainly helps to clarify many things. Of course I have some new questions for you! 1. I haven't looked into it much yet but I'm guessing Curator can handle different index naming schemes. E.g. logs-2014.06.30 and stats-2014.06.30. We'd

Re: Recommendations needed for large ELK system design

2014-07-31 Thread Mark Walkom
1 - Curator FTW. 2 - Masters handle cluster state, shard allocation and a whole bunch of other stuff around managing the cluster and its members and data. A node with master and data both set to false is considered a search node. But the role of being a master is not onerous, so it made sense for
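A minimal sketch of how the two flags mentioned above combine, assuming they correspond to the Elasticsearch 1.x settings `node.master` and `node.data`. The function name and the role labels are hypothetical and only for illustration; they are not taken from the thread.

```python
def classify_node(master: bool, data: bool) -> str:
    """Label a node by its node.master / node.data flags.

    The labels are illustrative names, not official Elasticsearch terms.
    """
    if master and data:
        return "master-eligible data node (default)"
    if master:
        return "dedicated master-eligible node"
    if data:
        return "data-only node"
    # Both flags false: the node only routes requests and merges results.
    return "search (client) node"


if __name__ == "__main__":
    for m in (True, False):
        for d in (True, False):
            print(f"node.master={m}, node.data={d}: {classify_node(m, d)}")
```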

Re: Recommendations needed for large ELK system design

2014-07-31 Thread Otis Gospodnetic
You can further simplify your architecture by using rsyslog with omelasticsearch instead of LS. This might be handy: http://blog.sematext.com/2013/07/01/recipe-rsyslog-elasticsearch-kibana/ Otis -- Performance Monitoring * Log Analytics * Search Analytics Solr Elasticsearch Support *

Recommendations needed for large ELK system design

2014-07-30 Thread Alex
Hello, We wish to set up an entire ELK system with the following features: - Input from Logstash shippers located on 400 Linux VMs. Only a handful of log sources on each VM. - Data retention for 30 days, which is roughly 2TB of data in indexed ES JSON form (not including replica
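A rough back-of-envelope sketch of the disk arithmetic behind these figures, assuming the 2TB covers primary shards only and is spread evenly over the 30 days. The replica count of two is taken from Mark's reply in this thread; the function name is hypothetical, and the result ignores filesystem and Lucene overhead.

```python
def storage_estimate(primary_tb: float, retention_days: int, replicas: int):
    """Return (daily primary data in GB, total disk in TB including replicas)."""
    # Assumes data arrives at an even rate across the retention window.
    daily_gb = primary_tb * 1024 / retention_days
    # Each replica is a full extra copy of the primary data.
    total_tb = primary_tb * (1 + replicas)
    return daily_gb, total_tb


if __name__ == "__main__":
    daily_gb, total_tb = storage_estimate(primary_tb=2.0, retention_days=30, replicas=2)
    print(f"about {daily_gb:.1f} GB of primary data per day")
    print(f"about {total_tb:.1f} TB of disk with replicas")
```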

Re: Recommendations needed for large ELK system design

2014-07-30 Thread Mark Walkom
1 - Looks ok, but why two replicas? You're chewing up disk for what reason? Extra comments below. 2 - It's personal preference really and depends on how your end points send to redis. 3 - 4GB for redis will cache quite a lot of data if you're only doing 50 events per second (i.e. hours or even days based
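To put a number on the Redis point, here is a small sketch of how long a 4GB buffer lasts at 50 events per second. The average event size of 1 KiB is an assumption, not a figure from the thread, and Redis's own per-entry memory overhead is ignored; the function name is hypothetical.

```python
def buffer_hours(capacity_bytes: int, events_per_sec: float, avg_event_bytes: int) -> float:
    """Hours of events a buffer can hold if nothing is consumed from it."""
    bytes_per_sec = events_per_sec * avg_event_bytes
    return capacity_bytes / bytes_per_sec / 3600


if __name__ == "__main__":
    four_gib = 4 * 1024**3
    # avg_event_bytes=1024 is an assumed event size; adjust to your own logs.
    hours = buffer_hours(four_gib, events_per_sec=50, avg_event_bytes=1024)
    print(f"about {hours:.1f} hours of buffering")
```

Smaller events stretch this toward days, larger ones shrink it, so it is worth measuring a sample of real log lines before sizing the buffer.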