Thanks Bruno.

On Thu, Feb 18, 2021 at 9:19 AM Bruno Cadonna <br...@confluent.io> wrote:

> Hi Chris,
>
> Your estimate looks correct to me.
>
> I do not know how big M might be. Maybe the following link can help you
> estimate it:
>
> https://github.com/facebook/rocksdb/wiki/Rocksdb-BlockBasedTable-Format
>
> There are also some additional files that RocksDB keeps in its
> directory. I guess the best way to estimate the space is to measure it
> experimentally.
>
> Also take into account that you will have one state store per partition.
>
> If you want to save disk space, you could try Leveled compaction
> (https://github.com/facebook/rocksdb/wiki/Leveled-Compaction) instead,
> since it has a space amplification of about 10%, versus 100% with
> Universal compaction. That is, you can replace the 2 in your formula
> with 1.1.
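>
> For example, here is a minimal, untested sketch of a
> RocksDBConfigSetter that switches the stores to Leveled compaction
> (the class name is just an example):
>
> import java.util.Map;
> import org.apache.kafka.streams.state.RocksDBConfigSetter;
> import org.rocksdb.CompactionStyle;
> import org.rocksdb.Options;
>
> public class LeveledCompactionConfigSetter implements RocksDBConfigSetter {
>
>     @Override
>     public void setConfig(final String storeName, final Options options,
>                           final Map<String, Object> configs) {
>         // Override the Kafka Streams default (Universal compaction)
>         // to reduce space amplification.
>         options.setCompactionStyle(CompactionStyle.LEVEL);
>     }
>
>     @Override
>     public void close(final String storeName, final Options options) {
>         // Nothing to close; no RocksDB objects were allocated here.
>     }
> }
>
> You register it with the rocksdb.config.setter config:
>
> props.put(StreamsConfig.ROCKSDB_CONFIG_SETTER_CLASS_CONFIG,
>           LeveledCompactionConfigSetter.class);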
>
> Since AK 2.7, you can also monitor the sizes of your RocksDB state
> stores with the metric total-sst-files-size
> (https://kafka.apache.org/documentation/#kafka_streams_rocksdb_monitoring)
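>
> As a rough sketch (assuming a running KafkaStreams instance called
> "streams"), you could read that metric programmatically and sum it
> over all state stores of the instance:
>
> import java.util.Map;
> import org.apache.kafka.common.Metric;
> import org.apache.kafka.common.MetricName;
>
> long totalSstBytes = 0L;
> for (final Map.Entry<MetricName, ? extends Metric> entry :
>         streams.metrics().entrySet()) {
>     // The metric is reported once per state store.
>     if ("total-sst-files-size".equals(entry.getKey().name())) {
>         totalSstBytes += ((Number) entry.getValue().metricValue()).longValue();
>     }
> }
> System.out.println("total SST files size: " + totalSstBytes + " bytes");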
>
> Best,
> Bruno
>
> On 18.02.21 17:43, Chris Toomey wrote:
> > We're using RocksDB as a persistent Kafka state store for compacted
> > topics and need to be able to estimate the maximum disk space required.
> >
> > We're using the default config settings provided by Kafka Streams, which
> > include Universal compaction, no compression, and a 4 KB block size.
> >
> > Given these settings and a topic with key size K, value size V, and
> > number of records R, I'd assume a rough disk space estimate would be of
> > the form
> >
> > max. disk space = (K+V)*R*M*2
> >
> > where M is an unknown DB-size-to-disk-size multiplier and the *2 allows
> > for full compaction, as per here
> > <https://github.com/facebook/rocksdb/wiki/Universal-Compaction>.
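> >
> > For example (made-up numbers): with K = 100 bytes, V = 1,000 bytes,
> > R = 10 million records, and a guessed M of 1.5, that would give
> > (100 + 1000) * 10,000,000 * 1.5 * 2 ≈ 33 GB per partition.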
> >
> > Does this look right, and can anyone provide a ballpark range for the
> > multiplier M and/or some guidelines for how to estimate it?
> >
> > Many thanks,
> > Chris
> >
>
