On Tue, Apr 14, 2015 at 7:35 PM, <[email protected]> wrote:

> Hi Adrian,
>
> Thanks for the quick response.
>
> When I loaded nearly 45M documents of test data with 3 replicas (each
> document approx 2K+ bytes in size), I got the following info on storage:
>
> health status index       pri rep docs.count docs.deleted store.size pri.store.size
> green  open   test_insert   5   3   44985382            0    414.9gb        106.4gb
>
> This indicates there was hardly any compression on the physical storage,
> hence my question: how do I find/estimate how much storage would be used
> for X documents with an average size of Y kilobytes each? From the above
> result, there appears to be no compression at all on the stored data.

Compression ratios depend so much on the data that you can't really know what the compression ratio will be without indexing sample documents. However, once you have indexed enough documents (e.g. 100k), you can expect the store size to keep growing linearly with the number of documents.
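If it helps, here is a rough sketch of that linear extrapolation, reading the primary store size of an already-indexed sample from the _stats API and scaling it to the target document count. The host, index name, target count, and the use of the requests library are just placeholders for illustration, not a prescribed procedure:

import requests

ES_URL = "http://localhost:9200"   # assumed cluster address
INDEX = "test_insert"              # index holding the sample data
TARGET_DOCS = 45000000             # number of documents you plan to index
REPLICAS = 3

# Read the document count and primary store size for the sample.
stats = requests.get("{}/{}/_stats/docs,store".format(ES_URL, INDEX)).json()
primaries = stats["indices"][INDEX]["primaries"]
doc_count = primaries["docs"]["count"]
primary_bytes = primaries["store"]["size_in_bytes"]

# Extrapolate linearly: bytes per document stays roughly constant once the
# sample is large enough (e.g. 100k documents).
bytes_per_doc = float(primary_bytes) / doc_count
primary_estimate = bytes_per_doc * TARGET_DOCS
total_estimate = primary_estimate * (1 + REPLICAS)  # each replica stores a full copy

print("~%.0f bytes per document (primaries only)" % bytes_per_doc)
print("estimated primary store: %.1f GB" % (primary_estimate / 1024.0 ** 3))
print("estimated total store with %d replicas: %.1f GB" % (REPLICAS, total_estimate / 1024.0 ** 3))

For what it's worth, your own numbers already give you the per-document figure: 106.4gb of primary storage over ~45M documents is roughly 2.5 KB per document before replication, and the 414.9gb total is about four times the primary size, which is what you would expect with 3 replicas.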
Most of the time the largest part of the index is the document store. In your case I assume LZ4 is too lightweight a compression algorithm to compress your data efficiently. The high-compression option that is coming in elasticsearch 2.0 might help.
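As a hedged sketch of what that might look like once 2.0 is out, assuming the option is exposed as the index.codec setting with a best_compression value (DEFLATE for stored fields instead of LZ4) and that it has to be set at index-creation time, you would create a new index with the setting and reindex into it. Host, index name, and shard counts below are placeholders:

import requests

ES_URL = "http://localhost:9200"   # assumed cluster address
NEW_INDEX = "test_insert_v2"       # hypothetical new index to reindex into

# Assumption: high compression is enabled per index via index.codec;
# the shard and replica counts simply mirror the existing test index.
resp = requests.put(
    "{}/{}".format(ES_URL, NEW_INDEX),
    json={
        "settings": {
            "index": {
                "codec": "best_compression",
                "number_of_shards": 5,
                "number_of_replicas": 3,
            }
        }
    },
)
print(resp.json())

--
Adrien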
