[ 
https://issues.apache.org/jira/browse/HBASE-29185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17937282#comment-17937282
 ] 

yuting sun edited comment on HBASE-29185 at 3/21/25 6:25 AM:
-------------------------------------------------------------

Yes, I understand the logic here. Recording the compacted files in the 
storefile metadata ensures the atomicity of the file set. However, there is 
still a window where overall atomicity is not guaranteed, in the following 
scenario:

If a compact operation aborts, the generated newfile remains in the data 
directory, and there may be a long interval before the openStoreFiles action 
runs. During that interval, if the corresponding files recorded in the 
metadata are picked up by a major compact and subsequently cleaned up, and 
the user then performs a delete operation, the orphaned newfile, once loaded, 
could break data consistency.

It looks like we can set the hbase.regionserver.storefile.refresh.period 
configuration to activate the StorefileRefresherChore scheduled task, which 
refreshes the store files periodically.


was (Author: JIRAUSER283580):
Yes, I understand the logic here. Recording the compressed file in the 
storefile metadata ensures the atomicity of the file. However, there is still a 
lag in handling the overall atomicity of the file in the following scenario:

If a compact operation aborts, the newfile generated remains in the file 
directory, and there is a long interval before the openStoreFiles action is 
executed. During this period, if the corresponding file in the metadata 
triggers a major compact and is subsequently cleaned up, and if the user 
performs a delete operation, the newfile, once loaded, could affect data 
consistency.

Can we enable the *hbase.regionserver.storefile.refresh.period* configuration 
to activate the StorefileRefresherChore scheduled task, which would 
periodically refresh the store's files?

> How to ensure the atomicity of data files during compaction
> -----------------------------------------------------------
>
>                 Key: HBASE-29185
>                 URL: https://issues.apache.org/jira/browse/HBASE-29185
>             Project: HBase
>          Issue Type: Sub-task
>          Components: Compaction
>    Affects Versions: 2.6.0
>            Reporter: yuting sun
>            Priority: Major
>
> {code:java}
> protected List<HStoreFile> doCompaction(CompactionRequestImpl cr,
>   Collection<HStoreFile> filesToCompact, User user, long compactionStartTime,
>   List<Path> newFiles) throws IOException {
>   // Do the steps necessary to complete the compaction.
>   setStoragePolicyFromFileName(newFiles);
>   List<HStoreFile> sfs = storeEngine.commitStoreFiles(newFiles, true);
>   if (this.getCoprocessorHost() != null) {
>     for (HStoreFile sf : sfs) {
>       getCoprocessorHost().postCompact(this, sf, cr.getTracker(), cr, user);
>     }
>   }
>   replaceStoreFiles(filesToCompact, sfs, true);
>   long outputBytes = getTotalSize(sfs);
>   // At this point the store will use new files for all new scanners.
>   refreshStoreSizeAndTotalBytes(); // update store size.
>   long now = EnvironmentEdgeManager.currentTime();
>   if (
>     region.getRegionServerServices() != null
>       && region.getRegionServerServices().getMetrics() != null
>   ) {
>     region.getRegionServerServices().getMetrics().updateCompaction(
>       region.getTableDescriptor().getTableName().getNameAsString(),
>       cr.isMajor(), now - compactionStartTime, cr.getFiles().size(),
>       newFiles.size(), cr.getSize(), outputBytes);
>   }
>   logCompactionEndMessage(cr, sfs, now, compactionStartTime);
>   return sfs;
> }
> {code}
> In the commitStoreFiles and replaceStoreFiles logic, after the sfs files 
> have been moved to the data directory but before they are loaded as store 
> files, if a compaction abort occurs, how should the sfs collection be 
> handled? It appears they are retained in the data directory and wait for 
> the next loading.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
