On Tue, Apr 14, 2015 at 7:35 PM, <[email protected]> wrote:

> Hi Adrian,
>
> Thanks for the quick response.
>
> When I loaded nearly 45M documents of test data with 3 replicas (each
> document approx 2K+ bytes in size), I got the following info on storage:
>
> health status index       pri rep docs.count docs.deleted store.size pri.store.size
> green  open   test_insert   5   3   44985382            0    414.9gb        106.4gb
>
> This indicates there was hardly any compression on the physical storage,
> hence my question: how do I find/estimate how much storage would be used
> for X documents with an average size of Y kilobytes each? From the above
> result, there appears to be no compression at all on the stored data.

Compression ratios depend so much on the data that you can't really know what the compression ratio will be without indexing sample documents. However, once you have indexed enough documents (e.g. 100k), you can expect the store size to keep growing linearly with the number of documents.
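If it helps, here is a rough sketch of that linear extrapolation, reading the primary store size of an already-indexed sample from the _stats API and scaling it to the target document count. The host, index name, target count, and the use of the requests library are just placeholders for illustration, not a prescribed procedure:

import requests

ES_URL = "http://localhost:9200"   # assumed cluster address
INDEX = "test_insert"              # index holding the sample data
TARGET_DOCS = 45000000             # number of documents you plan to index
REPLICAS = 3

# Read the document count and primary store size for the sample.
stats = requests.get("{}/{}/_stats/docs,store".format(ES_URL, INDEX)).json()
primaries = stats["indices"][INDEX]["primaries"]
doc_count = primaries["docs"]["count"]
primary_bytes = primaries["store"]["size_in_bytes"]

# Extrapolate linearly: bytes per document stays roughly constant once the
# sample is large enough (e.g. 100k documents).
bytes_per_doc = float(primary_bytes) / doc_count
primary_estimate = bytes_per_doc * TARGET_DOCS
total_estimate = primary_estimate * (1 + REPLICAS)  # each replica stores a full copy

print("~%.0f bytes per document (primaries only)" % bytes_per_doc)
print("estimated primary store: %.1f GB" % (primary_estimate / 1024.0 ** 3))
print("estimated total store with %d replicas: %.1f GB" % (REPLICAS, total_estimate / 1024.0 ** 3))

For what it's worth, your own numbers already give you the per-document figure: 106.4gb of primary storage over ~45M documents is roughly 2.5 KB per document before replication, and the 414.9gb total is about four times the primary size, which is what you would expect with 3 replicas.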
Most of the time the largest part of the index is the document store. In your case I assume LZ4 is too lightweight a compression algorithm to compress your data efficiently. The high-compression option that is coming in elasticsearch 2.0 might help.
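As a hedged sketch of what that might look like once 2.0 is out, assuming the option is exposed as the index.codec setting with a best_compression value (DEFLATE for stored fields instead of LZ4) and that it has to be set at index-creation time, you would create a new index with the setting and reindex into it. Host, index name, and shard counts below are placeholders:

import requests

ES_URL = "http://localhost:9200"   # assumed cluster address
NEW_INDEX = "test_insert_v2"       # hypothetical new index to reindex into

# Assumption: high compression is enabled per index via index.codec;
# the shard and replica counts simply mirror the existing test index.
resp = requests.put(
    "{}/{}".format(ES_URL, NEW_INDEX),
    json={
        "settings": {
            "index": {
                "codec": "best_compression",
                "number_of_shards": 5,
                "number_of_replicas": 3,
            }
        }
    },
)
print(resp.json())

--
Adrien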
