[
https://issues.apache.org/jira/browse/KUDU-2071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16087949#comment-16087949
]
Jean-Daniel Cryans commented on KUDU-2071:
------------------------------------------
Hey [~King Lee], this looks like a classic case of KUDU-1943.
> disk size is much large than actually data size
> -----------------------------------------------
>
> Key: KUDU-2071
> URL: https://issues.apache.org/jira/browse/KUDU-2071
> Project: Kudu
> Issue Type: Improvement
> Components: tserver
> Affects Versions: 1.3.0
> Environment: system version
> 4.9.20-11.31.amzn1.x86_64 #1 SMP Thu Apr 13 01:53:57 UTC 2017 x86_64 x86_64
> x86_64 GNU/Linux
> software version:
> kudu 1.3.0-cdh5.11.0
> revision 4dcf4a9d516865d249f4cb9b07f93c67e84614ae
> build type RELEASE
> built by jenkins at 12 Apr 2017 14:02:51 PST on
> kudu-centos66-046c.vpc.cloudera.com
> build id 2017-04-12_13-25-42
> Reporter: KingLee
> Labels: patch
> Original Estimate: 168h
> Remaining Estimate: 168h
>
> I ran rm -rf on all the data dirs before reinstalling the cluster, then
> inserted 1000000 records into the cluster using YCSB. The data size is about
> 5GB, but it uses 260GB of disk. Disk usage on one of the nodes is as follows:
> before writing data:
> [root@ip-10-1-42-124 ~]# du -sh /data1/server/kudu/tserver_wal/wals/
> /data2/server/kudu/tserver_data/ /data3/server/kudu/tserver_data/data/
> /data4/server/kudu/tserver_data/data/
> 4.0K /data1/server/kudu/tserver_wal/wals/
> 24K /data2/server/kudu/tserver_data/
> 8.0K /data3/server/kudu/tserver_data/data/
> 8.0K /data4/server/kudu/tserver_data/data/
> after writing data:
> [root@ip-10-1-42-124 ~]# du -sh /data1/server/kudu/tserver_wal/wals/
> /data2/server/kudu/tserver_data/ /data3/server/kudu/tserver_data/data/
> /data4/server/kudu/tserver_data/data/
> 2.7G /data1/server/kudu/tserver_wal/wals/
> 29G /data2/server/kudu/tserver_data/
> 29G /data3/server/kudu/tserver_data/data/
> 27G /data4/server/kudu/tserver_data/data/
> actual data size:
> 9b137115cfaa427a9106c87086f41957 5041MBytes
> Kudu tserver configuration:
> --fs_wal_dir=/var/lib/kudu/tserver
> --fs_data_dirs=/var/lib/kudu/tserver
> --default_num_replicas=3
> --tserver_master_addrs=192.168.1.22:7051,1192.168.1.23:7051,192.168.1.24:7051,192.168.1.25:7051,192.168.1.26:7051
> --maintenance_manager_num_threads=4
> --block_cache_capacity_mb=10240
> --memory_limit_hard_bytes=60000000000
> --fs_wal_dir=/data1/server/kudu/tserver_wal
> --fs_data_dirs=/data2/server/kudu/tserver_data,/data3/server/kudu/tserver_data,/data4/server/kudu/tserver_data
> --fs_data_dirs_reserved_bytes=10000000000
> --log_segment_size_mb=8
> Our production environment holds 25TB of data but uses 45TB of disk. Where is
> the extra disk space going?
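To put the reported figures side by side, here is a rough sketch of the arithmetic. It assumes the 5041 MBytes figure is the pre-replication size of the table, that "MBytes" and the du "G" suffix are both binary units, and that all three replicas are fully written. The function names are made up for illustration and are not part of any Kudu tool.

```python
# Figures copied from the du output and sizes quoted in the report above.
node_usage_gb = {
    "/data1/server/kudu/tserver_wal/wals/": 2.7,
    "/data2/server/kudu/tserver_data/": 29.0,
    "/data3/server/kudu/tserver_data/data/": 29.0,
    "/data4/server/kudu/tserver_data/data/": 27.0,
}
logical_data_mb = 5041    # reported data size (assumed pre-replication)
cluster_usage_gb = 260    # reported total disk usage across the cluster
num_replicas = 3          # from --default_num_replicas=3


def replicated_size_gb(logical_mb, replicas):
    """Expected on-disk size if every replica stored exactly the logical data."""
    return logical_mb / 1024 * replicas


def amplification(used_gb, logical_mb, replicas):
    """Ratio of observed disk usage to the expected replicated size."""
    return used_gb / replicated_size_gb(logical_mb, replicas)


# Total usage on the single node shown in the report (WAL plus data dirs).
node_total_gb = sum(node_usage_gb.values())

expected_gb = replicated_size_gb(logical_data_mb, num_replicas)
ratio = amplification(cluster_usage_gb, logical_data_mb, num_replicas)

print(f"one node, WAL + data dirs: {node_total_gb:.1f} GB")
print(f"expected with {num_replicas} replicas: {expected_gb:.1f} GB")
print(f"observed / expected: {ratio:.1f}x")
```

Under those assumptions, replication alone accounts for only a small part of the 260GB, so the overhead has to come from somewhere else on disk.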
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)