[
https://issues.apache.org/jira/browse/HBASE-29185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17937282#comment-17937282
]
yuting sun edited comment on HBASE-29185 at 3/21/25 6:25 AM:
-------------------------------------------------------------
Yes, I understand the logic here. Recording the compacted files in the
storefile metadata ensures the atomicity of each individual file. However,
overall atomicity still has a gap in the following scenario: if a compaction
aborts, the new file it generated remains in the data directory, and a long
interval may pass before openStoreFiles runs again. If, during that interval,
the corresponding files recorded in the metadata are major-compacted and then
cleaned up, and the user also performs delete operations, the leftover new
file could break data consistency once it is eventually loaded.
It looks like we can enable the *hbase.regionserver.storefile.refresh.period*
configuration to activate the StorefileRefresherChore scheduled task, which
refreshes the store files periodically.
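The risky window described above can be illustrated with a small self-contained sketch (this is not HBase code; all names in it are hypothetical): an aborted compaction leaves an output file on disk that the store file tracker metadata does not record, and a loader that trusts the raw directory listing would pick that orphan up.
{code:java}
import java.util.*;

public class OrphanStoreFileSketch {
  // Files present on disk but absent from the tracker metadata are "orphans":
  // leftovers of an aborted compaction that should not be blindly loaded.
  static Set<String> findOrphans(Set<String> filesOnDisk, Set<String> trackedFiles) {
    Set<String> orphans = new TreeSet<>(filesOnDisk);
    orphans.removeAll(trackedFiles);
    return orphans;
  }

  public static void main(String[] args) {
    // The metadata records only the committed store files...
    Set<String> tracked = new HashSet<>(Arrays.asList("hfile-a", "hfile-b"));
    // ...while the aborted compaction left "hfile-new" in the data directory.
    Set<String> onDisk = new HashSet<>(Arrays.asList("hfile-a", "hfile-b", "hfile-new"));
    System.out.println(findOrphans(onDisk, tracked)); // prints [hfile-new]
  }
}
{code}
Comparing the directory listing against the tracked set, rather than loading everything in the directory, is the kind of check that would let such leftovers be cleaned up instead of reloaded.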
> How to ensure the atomicity of data files during compaction
> -----------------------------------------------------------
>
> Key: HBASE-29185
> URL: https://issues.apache.org/jira/browse/HBASE-29185
> Project: HBase
> Issue Type: Sub-task
> Components: Compaction
> Affects Versions: 2.6.0
> Reporter: yuting sun
> Priority: Major
>
> {code:java}
> protected List<HStoreFile> doCompaction(CompactionRequestImpl cr,
>   Collection<HStoreFile> filesToCompact, User user, long compactionStartTime,
>   List<Path> newFiles) throws IOException {
>   // Do the steps necessary to complete the compaction.
>   setStoragePolicyFromFileName(newFiles);
>   List<HStoreFile> sfs = storeEngine.commitStoreFiles(newFiles, true);
>   if (this.getCoprocessorHost() != null) {
>     for (HStoreFile sf : sfs) {
>       getCoprocessorHost().postCompact(this, sf, cr.getTracker(), cr, user);
>     }
>   }
>   replaceStoreFiles(filesToCompact, sfs, true);
>   long outputBytes = getTotalSize(sfs);
>   // At this point the store will use new files for all new scanners.
>   refreshStoreSizeAndTotalBytes(); // update store size.
>   long now = EnvironmentEdgeManager.currentTime();
>   if (
>     region.getRegionServerServices() != null
>       && region.getRegionServerServices().getMetrics() != null
>   ) {
>     region.getRegionServerServices().getMetrics().updateCompaction(
>       region.getTableDescriptor().getTableName().getNameAsString(), cr.isMajor(),
>       now - compactionStartTime, cr.getFiles().size(), newFiles.size(),
>       cr.getSize(), outputBytes);
>   }
>   logCompactionEndMessage(cr, sfs, now, compactionStartTime);
>   return sfs;
> }
> {code}
> In the commitStoreFiles and replaceStoreFiles logic, after the sfs files have
> been moved into the data directory but before they are loaded by the store,
> how should the sfs collection be handled if the compaction aborts? It appears
> the files are simply retained in the data directory, waiting to be loaded on
> the next open.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)