I guess the problem is that only HDFS supports append, while HBase is designed to store HFiles on many other types of storage, such as S3.
And keeping the MOB file open for writing also only works on HDFS, since only HDFS supports hflush/hsync; on S3-like storage a file becomes visible only once it is closed (a minimal sketch of that difference is at the end of this mail).

Anyway, the MOB feature was not designed or implemented by me; these are just my thoughts on this area. I'm not sure whether anyone in the community uses it in production.

Thanks.

On Thu, Jun 19, 2025 at 09:28 Xinyu Tan <tanxi...@apache.org> wrote:
>
> Hello everyone
>
> I am a developer from the IoTDB and Ratis communities, and I am familiar with
> distributed systems and storage engines. Recently, I have been studying the
> MOBV2 feature in HBase.
>
> I found that when hbase.mob.compaction.type is set to optimized, it is
> possible for multiple files, each not exceeding a specific threshold, to be
> generated in a single compaction. However, I also noticed that each time the
> memstore is flushed, it can generate a new mob hfile, and since the default
> flush threshold for each memstore is 128MB, many small MOB files are created.
> Given that the default merge period for mob files is one week, does this mean
> that these newly generated small MOB files have to wait a week before they
> can be merged into a larger file? I am not sure if my code interpretation is
> correct, so is this reasoning accurate?
>
> If this is the case, I am curious as to why large files in the mob region
> aren't reused across different flushes and switched after reaching a certain
> size. This approach doesn’t seem to have any downsides, but it could reduce
> write amplification. Single-node storage engines like Badger/Titan operate
> this way; otherwise, the merging of these small mob HFiles would still cause
> write amplification. Was there any specific consideration during the design
> that led to this approach?
>
> Additionally, I would like to understand the current state of the MOB feature
> and whether it has reached a production-ready level.
>
> Thank you!
>
> Best
> ------------------
> Xinyu Tan
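
P.S. To make the visibility point above concrete, here is a minimal sketch against the generic Hadoop FileSystem API (not HBase's actual MOB writer code); the hdfs:///tmp/mob-sketch and s3a://my-bucket paths are just placeholders for illustration.

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataOutputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class MobVisibilitySketch {
      public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();

        // On HDFS, hflush()/hsync() make the bytes written so far visible to
        // readers while the file is still open -- the behavior that keeping a
        // MOB file open across flushes would have to rely on.
        Path hdfsPath = new Path("hdfs:///tmp/mob-sketch"); // placeholder path
        FileSystem hdfs = hdfsPath.getFileSystem(conf);
        try (FSDataOutputStream out = hdfs.create(hdfsPath)) {
          out.write("mob cell value".getBytes("UTF-8"));
          out.hflush(); // data is now readable by others, file remains open
        }

        // On S3A there is no real append, and hflush() is at best a no-op
        // (depending on configuration it may even throw). The object only
        // becomes visible once close() completes the upload, so a long-lived
        // open MOB file would stay invisible until then.
        Path s3Path = new Path("s3a://my-bucket/tmp/mob-sketch"); // placeholder bucket
        FileSystem s3 = s3Path.getFileSystem(conf);
        try (FSDataOutputStream out = s3.create(s3Path)) {
          out.write("mob cell value".getBytes("UTF-8"));
        } // only at close() does the object appear in the bucket
      }
    }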
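
And for reference on the defaults discussed in the quoted mail, a small sketch that just reads the relevant settings from an HBase Configuration; the 128 MB fallback reflects the stock default for hbase.hregion.memstore.flush.size, but please verify the values against the version you actually run.

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;

    public class MobDefaultsSketch {
      public static void main(String[] args) {
        Configuration conf = HBaseConfiguration.create();

        // Default memstore flush threshold (128 MB). Every flush of a
        // MOB-enabled column family also writes a new mob hfile, which is
        // where the many small MOB files in the question come from.
        long flushSize = conf.getLong("hbase.hregion.memstore.flush.size",
            128L * 1024 * 1024);

        // The compaction mode the question refers to; "optimized" lets a
        // single mob compaction emit several bounded-size files.
        String mobCompactionType = conf.get("hbase.mob.compaction.type");

        System.out.println("memstore flush size = " + flushSize);
        System.out.println("mob compaction type = " + mobCompactionType);
      }
    }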