[ 
https://issues.apache.org/jira/browse/KUDU-3318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yingchun Lai resolved KUDU-3318.
--------------------------------
    Fix Version/s: 1.16.0
       Resolution: Fixed

> Log Block Container metadata consumed too much disk space
> ---------------------------------------------------------
>
>                 Key: KUDU-3318
>                 URL: https://issues.apache.org/jira/browse/KUDU-3318
>             Project: Kudu
>          Issue Type: Improvement
>          Components: fs
>            Reporter: Yingchun Lai
>            Priority: Major
>             Fix For: 1.16.0
>
>
> In a log block container, blocks in the .data file are append-only. A 
> companion append-only .metadata file tracks the blocks in .data: a CREATE 
> entry records a new block, and a DELETE entry marks the corresponding CREATE 
> block as deleted.
> When a block id has both a CREATE and a DELETE entry, the LBM uses hole 
> punching to reclaim disk space in the .data file, but the entries in 
> .metadata are not compacted except at bootstrap.
> Metadata growth is otherwise bounded only when the .data file offset reaches 
> its size limit (10GB by default) or the number of blocks in the metadata 
> reaches its limit (no limit by default).
> I found a case in a production environment where the metadata consumed too 
> much disk space, close to the disk space used by .data. This is wasteful, and 
> it confuses users, who complain that the actual disk usage is far larger than 
> their data.
>  
> {code:java}
> [root@hybrid01 data]# du -cs *.metadata | sort -n | tail
> 19072 fb58e00979914e95aae7184e3189c8c6.metadata
> 19092 5bbf54294d5948c4a695e240e81d5f80.metadata
> 19168 89da5f3c4dfa469a9935f091bced1856.metadata
> 19200 f27e6ff14bd44fd1838f63f1be35ee64.metadata
> 19256 7b87a5e3c7fa4d3d86dcd3945d6741e1.metadata
> 19256 cf054d1aa7cb4f5cbbbce3b99189bbe1.metadata
> 19496 a6cbb4a284b842deafe6939be051c77c.metadata
> 19568 ba749640df684cb8868d6e51ea3d1b17.metadata
> 19924 e5469080934746e58b0fd2ba29d69c9d.metadata
> 148954280 total    // all metadata size ~149GB
> [root@hybrid01 data]# du -cs *.data | sort -n | tail
> 64568 46dfbc5ac94d429b8d79a536727495df.data
> 64568 b4abc59d4eb2473ca267e0b057c8fad7.data
> 65728 576e09ed7e164ddebe5b1702be296619.data
> 66368 88d295f38dec4197bfbc6927e0528bde.data
> 90904 7291e10aafe74f2792168f6146738c5d.data
> 96788 6e72381ae95840f99864baacbc9169af.data
> 98060 c413553491764d039e702577606bac02.data
> 103556 a5db7a9c2e93457aa06103e45f59d8b4.data
> 138200 3876af02694643d49b19b39789460759.data
> 176443948 total // all data size ~176GB
> [root@hybrid01 data]# kudu pbc dump e5469080934746e58b0fd2ba29d69c9d.metadata --oneline | awk '{print $5}' | sort | uniq -c | egrep -v " 2 "
>      1 6165611810     // low live ratio, only 1 live block
> {code}
>  
>  
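The CREATE/DELETE bookkeeping described above can be illustrated with a minimal sketch. It assumes metadata records can be modelled as (type, block id) pairs held in memory; the names Record, live_block_ids, compact and live_ratio are hypothetical and are not Kudu's API or on-disk format. It only shows why a container whose block ids almost all appear twice (one CREATE plus one DELETE) has a low live ratio, and what dropping the matched pairs would leave behind.

```python
from typing import Iterable, List, Tuple

# Hypothetical in-memory form of one metadata entry: (record_type, block_id).
# This is an illustration only, not Kudu's real record layout.
Record = Tuple[str, int]


def live_block_ids(records: Iterable[Record]) -> List[int]:
    """Return block ids that have a CREATE entry but no DELETE entry."""
    created: List[int] = []
    deleted = set()
    for record_type, block_id in records:
        if record_type == "CREATE":
            created.append(block_id)
        elif record_type == "DELETE":
            deleted.add(block_id)
        else:
            raise ValueError(f"unknown record type: {record_type!r}")
    return [block_id for block_id in created if block_id not in deleted]


def compact(records: Iterable[Record]) -> List[Record]:
    """Drop every matched CREATE/DELETE pair, keeping entries of live blocks."""
    records = list(records)
    live = set(live_block_ids(records))
    return [(t, b) for t, b in records if b in live]


def live_ratio(records: Iterable[Record]) -> float:
    """Fraction of metadata entries that still describe a live block."""
    records = list(records)
    if not records:
        return 1.0
    return len(compact(records)) / len(records)


if __name__ == "__main__":
    entries = [
        ("CREATE", 1),
        ("CREATE", 2),
        ("DELETE", 1),
        ("CREATE", 3),
        ("DELETE", 3),
    ]
    print(live_block_ids(entries))
    print(compact(entries))
    print(live_ratio(entries))
```

In this toy input only block 2 is live, so four of the five entries are dead weight that a compaction pass could discard.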



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
