Hey guys,

First of all, our Elasticsearch setup:

- 1 node
- 16 GB RAM
- 4 CPUs
- Version 0.9.7
- 5 shards, 1 replica
- Types of logs: WinEvent logs, Unix system logs, Cisco device logs, 
firewall logs, etc.
- About 3 million logs per day

We use Logstash to collect the logs and Kibana to access them.

Today we started inserting our NetFlow data into Elasticsearch. We have 
a big production environment, so we ended up with about 25,000 logs per 
second being inserted into Elasticsearch.

The system had no problem handling this load, but the index grows pretty 
fast: after 1 hour of testing we had 800 MB of data. That would be 
19.2 GB of data per day and, with a log retention of 30 days, 576 GB 
of data.
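
The arithmetic behind that estimate can be sketched as follows. This is 
only a back-of-the-envelope projection: it assumes the growth rate measured 
in the one-hour test stays constant and that 1 GB = 1000 MB; the function 
and variable names are made up for illustration.

```python
# Back-of-the-envelope projection from the one-hour test.
MB_PER_HOUR = 800      # index growth measured during the test
RETENTION_DAYS = 30    # planned retention for the active indices


def projected_gb(mb_per_hour, days):
    """Project index size in GB, assuming constant growth (1 GB = 1000 MB)."""
    return mb_per_hour * 24 * days / 1000


per_day_gb = projected_gb(MB_PER_HOUR, 1)
retention_gb = projected_gb(MB_PER_HOUR, RETENTION_DAYS)
print(per_day_gb, retention_gb)
```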

Because that much data is unacceptable for our system, I would really like 
to find ways to reduce the disk space requirements.

I've tried reducing the disk space by using the compression method built 
into Elasticsearch, setting _source to compress. Unfortunately this didn't 
help much.
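
For readers unfamiliar with that setting, here is a minimal sketch of what 
such a mapping body might look like. It assumes the "compress" option of 
the _source field as documented for older Elasticsearch releases; whether 
it still applies depends on the version, so check the documentation for 
yours. The type name "netflow" is hypothetical.

```python
import json


def source_compress_mapping(type_name):
    """Build a mapping body that asks for the stored _source to be compressed.

    Assumption: the version in use supports the "compress" option on _source.
    """
    return {type_name: {"_source": {"compress": True}}}


# "netflow" is a made-up type name, used only for illustration.
print(json.dumps(source_compress_mapping("netflow"), indent=2))
```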

I also tried the _optimize command, since someone wrote that this would 
help reduce the disk space. It had no effect.
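
A minimal sketch of how such a call can be addressed, assuming the 
_optimize endpoint and its max_num_segments parameter as found in 
Elasticsearch versions of that era. The host and the index name are 
hypothetical, and the request itself would be sent as an HTTP POST.

```python
from urllib.parse import urlencode


def optimize_url(host, index, max_num_segments=1):
    """Build the URL for an _optimize request on a single index."""
    query = urlencode({"max_num_segments": max_num_segments})
    return "http://{}/{}/_optimize?{}".format(host, index, query)


# Hypothetical host and daily index name, for illustration only.
print(optimize_url("localhost:9200", "netflow-2014.01.01"))
```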

The goal is to reduce the 576 GB of data to something around 80-100 GB.

The first thing I could do is reduce the number of shards to 2, which 
I estimate would reduce the storage to about 220 GB. But I really don't 
want to do that in case we add more nodes to the system.

The next thing I thought about was adding a deduplicating file system to 
the ES node, but I don't think dedup has much effect on an ES index. 
Does anyone have experience with that?

The last and most obvious thing is to pack the indices into a tarball or 
.zip. I think that's our solution for long-term storage (up to 2 years), but 
it's no solution for the active indices (the 30 days), since they would not 
be searchable through Kibana.
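
The archiving step could look roughly like the sketch below. It assumes 
the index is closed (or the node is stopped) while its directory is being 
read, so the files do not change underneath; the paths in the usage comment 
are hypothetical and depend on the local data directory layout.

```python
import os
import tarfile


def archive_index(index_dir, archive_path):
    """Pack one index directory into a gzip-compressed tarball."""
    # Store the files under the index name rather than the full path.
    name = os.path.basename(os.path.normpath(index_dir))
    with tarfile.open(archive_path, "w:gz") as tar:
        tar.add(index_dir, arcname=name)
    return archive_path


# Usage with hypothetical paths:
# archive_index("/data/es/indices/netflow-2014.01.01",
#               "/archive/netflow-2014.01.01.tar.gz")
```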

So, do any of you guys have suggestions for us?

Cheers





