[jira] [Commented] (HBASE-29185) How to ensure the atomicity of data files during compaction

yuting sun (Jira) Thu, 27 Mar 2025 02:30:04 -0700


    [ 
https://issues.apache.org/jira/browse/HBASE-29185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17937282#comment-17937282
 ]


yuting sun commented on HBASE-29185:
------------------------------------

Yes, I understand the logic here. Recording the compacted file in the 
store file metadata makes the commit of that file atomic. However, there is 
still a gap in the overall atomicity in the following scenario:

If a compaction aborts, the newly generated file remains in the data 
directory, and there may be a long interval before openStoreFiles is next 
executed. During this period, if the files recorded in the metadata go 
through a major compaction and are subsequently cleaned up, and the user has 
performed a delete, then the leftover file, once loaded, could affect data 
consistency (for example, by bringing back cells that the delete should have 
removed).
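
To make the scenario concrete, here is a minimal toy model in plain Java. It 
is not HBase code: the class OrphanFileSketch, its ToyFile type, and the 
reloadFromDirectory method are hypothetical names, and the sketch assumes 
that a reload simply picks up every file found in the data directory.

```java
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

/** Toy model only; none of these names exist in HBase. */
public class OrphanFileSketch {
  /** Toy store file: a sequence id plus row -> value; a null value is a delete marker. */
  static final class ToyFile {
    final long seqId;
    final Map<String, String> cells = new HashMap<>();
    ToyFile(long seqId) { this.seqId = seqId; }
  }

  final Map<String, ToyFile> dataDir = new HashMap<>(); // files present on disk
  final Set<String> tracked = new LinkedHashSet<>();    // files the store actually reads

  /** Puts a file into the data directory; a null row creates an empty file. */
  void addToDir(String name, long seqId, String row, String value) {
    ToyFile f = new ToyFile(seqId);
    if (row != null) {
      f.cells.put(row, value);
    }
    dataDir.put(name, f);
  }

  /** Reads a row from the tracked files; the file with the highest seqId wins. */
  String read(String row) {
    ToyFile newest = null;
    for (String name : tracked) {
      ToyFile f = dataDir.get(name);
      if (f.cells.containsKey(row) && (newest == null || f.seqId > newest.seqId)) {
        newest = f;
      }
    }
    return newest == null ? null : newest.cells.get(row);
  }

  /** Assumption: a reload trusts the directory listing and tracks everything in it. */
  void reloadFromDirectory() {
    tracked.clear();
    tracked.addAll(dataDir.keySet());
  }

  /** Returns what a read of row1 sees after the delete, the major compaction, the reload. */
  static String[] runScenario() {
    OrphanFileSketch s = new OrphanFileSketch();
    s.addToDir("f1", 1, "row1", "v1");
    s.tracked.add("f1");

    // Compaction of f1 writes c1 into the data directory, then aborts
    // before c1 replaces f1 in the tracked set.
    s.addToDir("c1", 1, "row1", "v1");

    // The user deletes row1; the delete marker is flushed as f2.
    s.addToDir("f2", 2, "row1", null);
    s.tracked.add("f2");
    String afterDelete = s.read("row1");

    // Major compaction of f1 + f2 drops both the value and the delete marker.
    s.addToDir("m1", 2, null, null);
    s.tracked.clear();
    s.tracked.add("m1");
    s.dataDir.remove("f1");
    s.dataDir.remove("f2");
    String afterMajor = s.read("row1");

    // A later reload picks up the leftover c1 as well.
    s.reloadFromDirectory();
    String afterReload = s.read("row1");
    return new String[] { afterDelete, afterMajor, afterReload };
  }

  public static void main(String[] args) {
    String[] r = runScenario();
    System.out.println("after delete: " + r[0]);
    System.out.println("after major compaction: " + r[1]);
    System.out.println("after reload: " + r[2]);
  }
}
```

In this model the read returns null after the delete and after the major 
compaction, but returns the old value again after the reload, because the 
delete marker that would have masked the leftover file no longer exists.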

Can we enable the *hbase.regionserver.storefile.refresh.period* configuration 
to activate the StorefileRefresherChore scheduled task, which would 
periodically refresh the store's files?

> How to ensure the atomicity of data files during compaction
> -----------------------------------------------------------
>
>                 Key: HBASE-29185
>                 URL: https://issues.apache.org/jira/browse/HBASE-29185
>             Project: HBase
>          Issue Type: Sub-task
>          Components: Compaction
>    Affects Versions: 2.6.0
>            Reporter: yuting sun
>            Priority: Major
>
> {code:java}
> protected List<HStoreFile> doCompaction(CompactionRequestImpl cr,
>     Collection<HStoreFile> filesToCompact, User user, long compactionStartTime,
>     List<Path> newFiles) throws IOException {
>   // Do the steps necessary to complete the compaction.
>   setStoragePolicyFromFileName(newFiles);
>   List<HStoreFile> sfs = storeEngine.commitStoreFiles(newFiles, true);
>   if (this.getCoprocessorHost() != null) {
>     for (HStoreFile sf : sfs) {
>       getCoprocessorHost().postCompact(this, sf, cr.getTracker(), cr, user);
>     }
>   }
>   replaceStoreFiles(filesToCompact, sfs, true);
>   long outputBytes = getTotalSize(sfs);
>   // At this point the store will use new files for all new scanners.
>   refreshStoreSizeAndTotalBytes(); // update store size.
>   long now = EnvironmentEdgeManager.currentTime();
>   if (
>     region.getRegionServerServices() != null
>       && region.getRegionServerServices().getMetrics() != null
>   ) {
>     region.getRegionServerServices().getMetrics().updateCompaction(
>       region.getTableDescriptor().getTableName().getNameAsString(), cr.isMajor(),
>       now - compactionStartTime, cr.getFiles().size(), newFiles.size(),
>       cr.getSize(), outputBytes);
>   }
>   logCompactionEndMessage(cr, sfs, now, compactionStartTime);
>   return sfs;
> }
> {code}
> Between commitStoreFiles and replaceStoreFiles, that is, after the files in 
> sfs have been moved into the data directory but before the store has loaded 
> them, how are the sfs files handled if the compaction aborts? It seems that 
> they stay in the data directory and wait for the next load.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
