Did you calculate the volume of 25,000 logs per second? Given an estimated average log entry size of 100 bytes:
25,000 logs/sec × 100 bytes = 2,500,000 bytes/sec
                            = 9,000,000,000 bytes/hour
                            = 216,000,000,000 bytes/day
                            = 6,480,000,000,000 bytes/month
                            = 6,328,125,000 KB/month
                            = 6,179,809.57 MB/month
                            = 6,034.97 GB/month
                            = 5.89 TB/month

You can expect about 6 TB of input data volume, so a 576 GB index size per month is very small.

Active ES indices are LZF-compressed by default. There is no way to reduce them much further except by reducing the input size, reducing replicas, or using special mappings. Special mappings may help if you are willing to drop some Kibana features: e.g. you could use the keyword analyzer for all strings, throw away field norms, term vectors, etc., plus disable _all and _source. This saves some space, but the price to pay is less "searchability"; careful testing is required to verify that your search requirements are still met. And I am sure you will not get down to 80-100 GB a month.

So my personal recommendation is: always plan with TB-scale storage for Elasticsearch indices in log analysis applications.

Jörg

On Tue, Apr 22, 2014 at 10:25 AM, horst knete <[email protected]> wrote:

> Hey Guys,
>
> First of all, our setup of Elasticsearch:
>
> - 1 node
> - 16 GB RAM
> - 4 CPUs
> - Version 0.9.7
> - 5 shards, 1 replica
> - Types of logs: WinEvent logs, Unix system logs, Cisco device logs,
>   firewall logs, etc.
> - About 3 million logs per day
>
> We use Logstash to collect the logs and Kibana to access them.
>
> Today we started inserting our NetFlow data into Elasticsearch. In fact we
> have a big production environment, so what we got were about 25,000 logs
> per second inserted into Elasticsearch.
>
> It was no problem for the system to handle this much load, but the index
> grows pretty fast: after 1 hour of testing we had 800 MB of data (this
> would be 19.2 GB of data per day and, with a log retention of 30 days,
> 576 GB of data).
>
> Because this much data is unacceptable for our system, I would really like
> to find ways to reduce the disk space requirements.
> I've tried reducing the disk space by using the compression method built
> into Elasticsearch, setting _source to compressed. Unfortunately this
> didn't help much.
>
> I also tried the _optimize command, since someone wrote it would help
> reduce disk space. It had no effect.
>
> The goal is to reduce the 576 GB of data to something around 80-100 GB.
>
> The first thing I could do is reduce the number of shards to 2, which
> would cut the storage to about 220 GB. But I really don't want to do that,
> in case we add more nodes to the system.
>
> The next thing I thought about was adding a deduplication file system to
> the ES node, but I don't think dedup has much effect on an ES index. Does
> anyone have experience with that?
>
> The last and most obvious option is to zip the indices into a tarball or
> .zip. I think that's our solution for long-term storage (up to 2 years),
> but it is no solution for the active indices (the 30 days), since they
> would no longer be searchable by Kibana.
>
> So, do any of you guys have suggestions for us?
>
> Cheers
>
> --
> You received this message because you are subscribed to the Google Groups
> "elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/elasticsearch/a5e95978-cadd-4953-98b1-52af9a8c84ce%40googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.
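Jörg's back-of-envelope volume arithmetic above can be double-checked with a short script. The 25,000 logs/sec and 100-byte figures are the estimates from the thread; the month is taken as 30 days, and TB here means binary terabytes (1024^4 bytes), matching the figures in the reply:

```python
# Back-of-envelope check of the input volume estimate from the thread:
# 25,000 logs/sec at an assumed average of 100 bytes per log entry.
logs_per_sec = 25_000
bytes_per_log = 100

bytes_per_sec = logs_per_sec * bytes_per_log        # 2,500,000
bytes_per_hour = bytes_per_sec * 3600               # 9,000,000,000
bytes_per_day = bytes_per_hour * 24                 # 216,000,000,000
bytes_per_month = bytes_per_day * 30                # 6,480,000,000,000

# Convert to binary units, as in the reply's KB/MB/GB/TB chain.
tb_per_month = bytes_per_month / 1024**4
print(f"{tb_per_month:.2f} TB/month")               # ~5.89
```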
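The space-saving mapping Jörg sketches (keyword analyzer for all strings, no field norms or term vectors, _all and _source disabled) could look roughly like the following in the 0.90-era mapping syntax. This is only an illustrative sketch: the type name "netflow" and the field names "message" and "src_ip" are hypothetical examples, not from the thread:

```python
import json

# Sketch of a space-saving mapping along the lines described above,
# using 0.90-era string-field options. Type and field names are
# hypothetical.
mapping = {
    "netflow": {
        "_all": {"enabled": False},      # drop the catch-all _all field
        "_source": {"enabled": False},   # don't store the original document
        "properties": {
            "message": {
                "type": "string",
                "analyzer": "keyword",   # index each string as one token
                "omit_norms": True,      # no field-length normalization data
                "term_vector": "no",     # no per-document term vectors
            },
            "src_ip": {"type": "ip"},
        },
    }
}

# This would go under "mappings" in an index-creation request.
print(json.dumps({"mappings": mapping}, indent=2))
```

Note that with _source disabled the original JSON document can no longer be retrieved, which limits what Kibana can display and rules out reindexing from the index itself; as the reply says, careful testing is required to confirm your search requirements are still met.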
