[ https://issues.apache.org/jira/browse/HDDS-7321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17685698#comment-17685698 ]

Sammi Chen commented on HDDS-7321:
----------------------------------

[~sadanand_shenoy] The level size threshold that triggers auto-compaction is 
usually high, say 1GB, while the imported sst file of a container is generally 
very small, say several KB. So there can be thousands or even more small sst 
files that are still far from the auto-compaction threshold; that's why we need 
this auto small sst file compaction. Too many small sst files will reduce 
RocksDB's performance and stability.
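
As a rough illustration, a minimal RocksJava sketch of the options involved; 
the 1GB figure, the path, and the option values are assumptions for 
illustration only, not Ozone's actual configuration:

    import org.rocksdb.Options;
    import org.rocksdb.RocksDB;
    import org.rocksdb.RocksDBException;

    public class LevelThresholdSketch {
        public static void main(String[] args) throws RocksDBException {
            RocksDB.loadLibrary();
            try (Options options = new Options()
                    .setCreateIfMissing(true)
                    // Target size of level 1; a level is only scheduled for
                    // compaction once its accumulated size exceeds its target.
                    .setMaxBytesForLevelBase(1024L * 1024 * 1024)  // e.g. 1 GB
                    // -1 means no limit on open files (the RocksDB default).
                    .setMaxOpenFiles(-1);
                 RocksDB db = RocksDB.open(options, "/tmp/example-db")) {
                // Ingested container sst files of a few KB each land far below
                // the level target, so no compaction is scheduled for them.
            }
        }
    }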

> Auto rocksDB small sst files compaction
> ---------------------------------------
>
>                 Key: HDDS-7321
>                 URL: https://issues.apache.org/jira/browse/HDDS-7321
>             Project: Apache Ozone
>          Issue Type: Sub-task
>            Reporter: Sammi Chen
>            Assignee: Sammi Chen
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 1.4.0
>
>
> RocksDB has its own auto compaction, which is triggered by the total level 
> file size. 
> Once the total level file size exceeds the threshold, RocksDB schedules a 
> compaction in the background.
>  
> When replicating containers between datanodes, the current implementation 
> leverages RocksDB SstFileWriter to export container metadata to individual 
> sst files, and leverages RocksDB ingestExternalFile to import the container 
> metadata sst files directly into the target datanode's RocksDB. If the 
> imported container metadata keys don't overlap with other sst files (with the 
> Merge RocksDB design, the container ID is used as the prefix of each metadata 
> key, so this is true most of the time), the imported sst file is kept as-is 
> without being compacted with other existing sst files.
> In the worst case, if thousands or tens of thousands of containers are 
> imported on one datanode, there would be tens of thousands of small sst 
> files under one RocksDB, or across all RocksDB instances of one datanode. By 
> default, RocksDB has no limit on open files. Tens of thousands of small sst 
> files would exhaust the process's open file quota, and service stability 
> would be impacted.
> This task aims to provide a way to auto compact all these small sst files 
> into merged big ones. Of course, the compaction will have an impact on user 
> data read/write performance on this datanode.
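
For reference, a minimal RocksJava sketch of the export/import flow the 
description refers to. Paths, keys, and the container ID are hypothetical, and 
the full-range compactRange() at the end only stands in for the auto small-file 
compaction this task adds; it is not the actual Ozone datanode code:

    import java.util.Arrays;

    import org.rocksdb.EnvOptions;
    import org.rocksdb.IngestExternalFileOptions;
    import org.rocksdb.Options;
    import org.rocksdb.RocksDB;
    import org.rocksdb.RocksDBException;
    import org.rocksdb.SstFileWriter;

    public class ContainerSstFlowSketch {
        public static void main(String[] args) throws RocksDBException {
            RocksDB.loadLibrary();

            // Export: write one container's metadata keys into a standalone sst file.
            try (Options options = new Options();
                 EnvOptions envOptions = new EnvOptions();
                 SstFileWriter writer = new SstFileWriter(envOptions, options)) {
                writer.open("/tmp/container-42.sst");            // hypothetical path
                // Keys must be added in ascending order; using the container ID
                // as the key prefix keeps each container in its own key range.
                writer.put("42#block-a".getBytes(), "meta-a".getBytes());
                writer.put("42#block-b".getBytes(), "meta-b".getBytes());
                writer.finish();
            }

            // Import: ingest the file directly into the target RocksDB instance.
            try (Options options = new Options().setCreateIfMissing(true);
                 RocksDB db = RocksDB.open(options, "/tmp/target-db");
                 IngestExternalFileOptions ingestOptions =
                         new IngestExternalFileOptions().setMoveFiles(true)) {
                db.ingestExternalFile(
                        Arrays.asList("/tmp/container-42.sst"), ingestOptions);

                // The ingested keys don't overlap existing files, so the few-KB
                // sst stays as-is; a manual full-range compaction is one way to
                // merge many such small files into larger ones.
                db.compactRange();
            }
        }
    }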


