Hello.

I'm using Apache Kudu 1.2 on CDH 1.2.

I'm estimating how many servers are needed to store my data.

After loading my test data sets, the
total_kudu_on_disk_size_across_kudu_replicas metric in the CDH chart library
reports 27.9TB, whereas the sum of `du -sh /path/to/tablet_data/data` across
all nodes is 39.9TB, which is 43% larger than the chart library value.
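
For reference, the 43% figure is just the ratio of the two totals. A minimal sketch of that arithmetic in Python (the function name `overhead_percent` is only illustrative, and the inputs are the two totals quoted above):

```python
def overhead_percent(reported_tb: float, on_disk_tb: float) -> float:
    """Return how much larger the on-disk total is than the reported total, in percent."""
    return (on_disk_tb / reported_tb - 1.0) * 100.0


# Totals from this post: the chart library metric vs. the summed `du -sh` output.
reported_tb = 27.9
on_disk_tb = 39.9

print(f"{overhead_percent(reported_tb, on_disk_tb):.0f}% larger on disk")
```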

I also observed the same difference on another Kudu test cluster.

I'm curious whether this is normal, and I'd like to know whether there is a
way to reduce the physical file size.

Thanks,

Jason.
