BukrosSzabolcs commented on code in PR #4418:
URL: https://github.com/apache/hbase/pull/4418#discussion_r874096575
##########
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java:
##########
@@ -1867,6 +1870,10 @@ executorService.new
ExecutorConfig().setExecutorType(ExecutorType.RS_SNAPSHOT_OP
choreService.scheduleChore(brokenStoreFileCleaner);
}
+ if (this.rsMobFileCleanerChore != null) {
+ choreService.scheduleChore(rsMobFileCleanerChore);
Review Comment:
Let me clarify the differences between the two cleaners.
RSMobFileCleanerChore:
- runs on the RS so that it has access to the files currently being written
and to the active storefile list
- can only archive mob files created by regions hosted on the current RS
- only reads hfiles belonging to regions hosted by the current RS when
looking for references
This allows it to do the majority of the necessary cleanup as efficiently as
possible.
MobFileCleanerChore:
- runs on Master
- can only archive mob files created by archived regions (regions that no
longer exist in the /data folder). Because of this, these mob files can no
longer be "currently written", so we do not need the data that is only
available on the RS
- reads every single hfile in /data with a mob-enabled CF. This is
necessary because it is the only way to find out whether any of these mob
files still has active references
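
To make the master-side rule concrete, here is a minimal sketch of the check described above. It is not the actual MobFileCleanerChore code: the class name, method name, and the plain-map model of hfiles and mob files are all hypothetical, and it ignores everything except the two conditions listed (creator region archived, no active reference found).

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical model only; none of these names exist in HBase.
class MasterMobScanSketch {

    /**
     * @param hfileToMobRefs   every hfile under /data in a mob-enabled CF,
     *                         mapped to the mob file names it references
     * @param mobToCreator     every mob file, mapped to the region that created it
     * @param regionsInDataDir regions that still exist under /data
     * @return mob files the master-side cleaner would be allowed to archive
     */
    static Set<String> archivable(Map<String, Set<String>> hfileToMobRefs,
                                  Map<String, String> mobToCreator,
                                  Set<String> regionsInDataDir) {
        // Without centralized mob tracking, the only way to learn which mob
        // files are referenced is to read every hfile and collect its references.
        Set<String> referenced = new HashSet<>();
        for (Set<String> refs : hfileToMobRefs.values()) {
            referenced.addAll(refs);
        }

        Set<String> result = new TreeSet<>();
        for (Map.Entry<String, String> e : mobToCreator.entrySet()) {
            boolean creatorArchived = !regionsInDataDir.contains(e.getValue());
            // Only mob files of archived regions qualify, and only when no
            // hfile still points at them.
            if (creatorArchived && !referenced.contains(e.getKey())) {
                result.add(e.getKey());
            }
        }
        return result;
    }
}
```

The full scan over `hfileToMobRefs` is the wasteful part: its cost grows with the number of hfiles in /data, not with the number of candidate mob files.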
So yes, there is an overlap: the same hfiles could be read by both cleaners,
but a given mob file would be archived by one or the other of them. The
cleaner on the master is wasteful and not especially elegant, but because we
lack centralized mob tracking we do not have a better way of collecting the
references.
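
As a rough illustration of how the archiving responsibility is split, the sketch below models the decision from the point of view of one RS plus the master. All names are hypothetical and the region sets are plain strings. The third outcome (a region that still exists in /data but is not hosted on this RS) is my assumption about what falls outside both scopes described above.

```java
import java.util.Set;

// Hypothetical model only; none of these names exist in HBase.
class MobCleanerScopeSketch {

    enum Cleaner { RS_CHORE_ON_THIS_RS, MASTER_CHORE, NOT_THIS_RS_NOR_MASTER }

    /** Decides which chore may archive a mob file created by creatorRegion. */
    static Cleaner whoMayArchive(String creatorRegion,
                                 Set<String> regionsOnThisRs,
                                 Set<String> regionsInDataDir) {
        if (!regionsInDataDir.contains(creatorRegion)) {
            // Region is archived: nothing can be writing its mob files,
            // so the master does not need RS-local state.
            return Cleaner.MASTER_CHORE;
        }
        if (regionsOnThisRs.contains(creatorRegion)) {
            // Region is hosted here: this RS knows about in-flight writes
            // and the active storefile list.
            return Cleaner.RS_CHORE_ON_THIS_RS;
        }
        // Assumption: a live region hosted elsewhere is out of scope for
        // both this RS and the master.
        return Cleaner.NOT_THIS_RS_NOR_MASTER;
    }
}
```

Under this model the two scopes never select the same mob file, even though both chores may read the same hfiles while looking for references.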